I want to select every row of a pandas.DataFrame whose df['Title'] contains one (or more) of the elements of keywords.
Treat this list as the keywords:
keywords = ['k_1', 'k_2', 'k_3', 'k_4']
I tried the following, but it did not work for me:
df[df['Title'].str.contains(keywords)]
A uj5u.com user replied:
df[df["Title"].apply(lambda x: any(k in x for k in keywords))]
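A self-contained sketch of the same idea, in case it helps to run it end to end. The sample frame, the helper name has_keyword, and the None row are assumptions added for illustration; they are not from the question. Note that `k in x` is a plain substring test, so 'k_1' would also match a title containing 'k_10'.

```python
import pandas as pd

# Sample data is hypothetical; the None row shows how missing titles behave.
df = pd.DataFrame({'Title': ['k_1 and k_2', 'k_3 alone', 'k_z not here', None]})
keywords = ['k_1', 'k_2', 'k_3', 'k_4']

def has_keyword(title):
    # Non-string cells (None/NaN) never match; otherwise do a substring test.
    return isinstance(title, str) and any(k in title for k in keywords)

mask = df['Title'].apply(has_keyword)   # boolean Series aligned with df
selected = df[mask]                     # rows whose Title holds a keyword
print(selected)
```

The isinstance guard is there because the one-line lambda would raise a TypeError on a missing value, since `in` cannot be applied to None or NaN.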
A uj5u.com user replied:
Build a regular expression pattern and use str.findall:
Setup:
import pandas as pd
df = pd.DataFrame({'Title': ['k_1 and k_2', 'k_3 alone', 'k_z not here']})
keywords = ['k_1', 'k_2', 'k_3', 'k_4']
pattern = fr"\b({'|'.join(keywords)})\b"
df['Keywords'] = df['Title'].str.findall(pattern)
Output:
>>> df
          Title    Keywords
0   k_1 and k_2  [k_1, k_2]
1     k_3 alone       [k_3]
2  k_z not here          []
>>> print(pattern)
\b(k_1|k_2|k_3|k_4)\b
Select the rows:
>>> df[df['Title'].str.findall(pattern).astype(bool)]
         Title
0  k_1 and k_2
1    k_3 alone
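If only the row filter is needed, str.contains can also work once the list is joined into a single pattern; the original attempt most likely failed because str.contains expects one string pattern, not a list. The following is a minimal sketch under that assumption, not necessarily what either answerer had in mind. The re.escape call and the non-capturing group are additions of this sketch.

```python
import re
import pandas as pd

df = pd.DataFrame({'Title': ['k_1 and k_2', 'k_3 alone', 'k_z not here']})
keywords = ['k_1', 'k_2', 'k_3', 'k_4']

# Escape each keyword so regex metacharacters are matched literally, and use
# a non-capturing group (?:...) so str.contains does not warn about match groups.
pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"

# na=False treats missing titles as "no match" instead of propagating NaN.
mask = df['Title'].str.contains(pattern, regex=True, na=False)
result = df[mask]
print(result)
```

Compared with the str.findall version, this skips building the list of matches for every row, which may matter on large frames when only the boolean mask is used.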
