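I have a dictionary and a DataFrame column whose entries are lists of strings.
If any value under a dictionary key matches a string in a row's list, that row should be tagged with the key's name.
For example, input: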
text_column=[['grapes','are','good','for','health'],['banana','is','not','good','for','health'],
['apple','keeps','the','doctor','away'],['automobile','industry','is','in','top','position','from','recent','times']]
dict={ "fruit_name":['apple','grapes','lemon','cherry'],
"profession":['health','manufacturing','automobiles']
}
Output:
1) fruit_name
2) fruit_name
3) profession
4) profession
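Answer from a uj5u.com user:
You can invert the dictionary to build reverse_dct, then map the words in 'text_column' to a new 'word_type' column. (By the way, dict is Python's built-in dictionary constructor, so don't use it as a variable name; the code below calls the dictionary dct instead.)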
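import pandas as pd

# dct is the question's dictionary, renamed so it does not shadow the built-in dict.
# Invert it: map each word to the key (category) it is listed under.
reverse_dct = {}
for k, v in dct.items():
    for i in v:
        reverse_dct[i] = k

df = pd.DataFrame({'text_column': text_column})
# One row per word, look up its category, drop misses, then rejoin per original row.
df['word_type'] = df['text_column'].explode().map(reverse_dct).dropna().groupby(level=0).apply(','.join)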
Output:
text_column word_type
0 [grapes, are, good, for, health] fruit_name,profession
1 [banana, is, not, good, for, health] profession
2 [apple, keeps, the, doctor, away] fruit_name
3 [automobile, industry, is, in, top, position, ... NaN
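Note that the last row has no type, because the dictionary contains automobiles while text_column contains automobile. If you want the program to treat these as the same word, you need to normalize the spelling.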
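One possible way to normalize the spelling is sketched below in plain Python. It is not part of the original answer: the normalize and label helpers are hypothetical names, and the rule (lowercase, then strip a single trailing "s" from words longer than three characters) is a deliberately crude assumption that will mishandle words such as "bus" or "series". Unlike the pandas version above, it also removes duplicate categories within a row and yields an empty string rather than NaN when nothing matches.

```python
def normalize(word):
    """Lowercase a token and strip one trailing 's' (a crude plural rule)."""
    word = word.lower()
    # Assumption: a longer word ending in a single 's' is treated as a plural.
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


dct = {
    "fruit_name": ["apple", "grapes", "lemon", "cherry"],
    "profession": ["health", "manufacturing", "automobiles"],
}

# Reverse lookup keyed by the normalized form of each dictionary value.
reverse_dct = {normalize(v): k for k, values in dct.items() for v in values}


def label(tokens):
    """Return the distinct categories found in tokens, in order of first match."""
    found = []
    for tok in tokens:
        cat = reverse_dct.get(normalize(tok))
        if cat is not None and cat not in found:
            found.append(cat)
    return ",".join(found)


text_column = [
    ["grapes", "are", "good", "for", "health"],
    ["banana", "is", "not", "good", "for", "health"],
    ["apple", "keeps", "the", "doctor", "away"],
    ["automobile", "industry", "is", "in", "top", "position",
     "from", "recent", "times"],
]

labels = [label(tokens) for tokens in text_column]
print(labels)
```

With this rule, automobile in the last row and automobiles in the dictionary both reduce to the same key, so the last row is labeled profession. For real text, a proper stemmer or lemmatizer would likely be a better choice than stripping "s".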
