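I'm trying to insert dictionary values into a separate DataFrame column when an existing column contains the dictionary's keys. I tried the code below, but it returns empty lists ([]):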
import pandas as pd
import numpy as np
df = pd.DataFrame({'key' : ["vs, vscode", "jupyter, jupyterlab", "python, vs", "python", "it was spyder before dawn"]})
my_dict = {'vscode' : 'is gross',
           'jupyter' : 'is not so awesome, but hes ok, ig',
           'vs' : 'is awesome',
           'jupyterlab' : 'is rad',
           'python' : "booya"}
def cascade_col(row_value):
    cvc_row = []
    for word in row_value:
        if word in my_dict:
            cvc_row.append(my_dict[word])
    return cvc_row
df['dict value'] = df['key'].apply(cascade_col)
print(df)
My expected output is as follows:
df = pd.DataFrame({'key' : ["vs, vscode", "jupyter, jupyterlab", "python, vs", "python", "it was spyder before dawn"],
                   'Corresponding Value(s)' : ['is awesome, is gross', 'is not so awesome, but hes ok, ig, is rad', 'booya, is awesome', 'booya', np.nan]})
df
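Thank you for taking my question.
I tried to solve this myself but got stuck. I've described the problem and the code I tried, and I'm looking for further help. Thank you.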
A uj5u.com user replied:
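You can combine regex extraction with a dictionary mapping. Build the pattern with the longest keys first, so that 'jupyterlab' is not matched as 'jupyter':
import re
# Longest keys first: otherwise 'jupyter' wins over 'jupyterlab' in the alternation
regex = '|'.join(map(re.escape, sorted(my_dict, key=len, reverse=True)))
df['dict value'] = (df['key'].str.extractall(f'({regex})')[0]
                    .map(my_dict)
                    .groupby(level=0).agg(', '.join)
                   )
Output:
                         key                                 dict value
0                 vs, vscode                       is awesome, is gross
1        jupyter, jupyterlab  is not so awesome, but hes ok, ig, is rad
2                 python, vs                          booya, is awesome
3                     python                                      booya
4  it was spyder before dawn                                        NaN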
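One thing to watch when a regex alternation is built from dictionary keys: if one key is a prefix of another (here 'jupyter' and 'jupyterlab'), the regex engine takes the first alternative that matches, so the order of the keys matters. The following is a minimal sketch of that pitfall using only the standard `re` module; the variable names `naive`, `ordered`, `naive_hits` and `ordered_hits` are illustrative and not part of the answer above, and the word boundaries (`\b`) are an optional extra assumption that keys should only match as whole words.

```python
import re

my_dict = {'vscode': 'is gross',
           'jupyter': 'is not so awesome, but hes ok, ig',
           'vs': 'is awesome',
           'jupyterlab': 'is rad',
           'python': 'booya'}

# Alternation in dictionary order: 'jupyter' is tried before 'jupyterlab'.
naive = '|'.join(map(re.escape, my_dict))

# Longest keys first, wrapped in word boundaries so keys match whole words only.
keys_longest_first = sorted(my_dict, key=len, reverse=True)
ordered = r'\b(?:' + '|'.join(map(re.escape, keys_longest_first)) + r')\b'

text = "jupyter, jupyterlab"
naive_hits = re.findall(naive, text)
ordered_hits = re.findall(ordered, text)

print(naive_hits)    # ['jupyter', 'jupyter']
print(ordered_hits)  # ['jupyter', 'jupyterlab']
```

The same `ordered` pattern could be passed to `str.extractall` inside a capturing group, as in the answer above.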
A uj5u.com user replied:
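Code:
def cascade_col(row_value):
    cvc_row = []
    for word in row_value.split(','):
        word = word.strip()  # drop the space left after each comma
        if word in my_dict:
            cvc_row.append(my_dict[word])
    # Rows without any matching key produce an empty string, not NaN
    return ', '.join(cvc_row)
df['Corresponding Value(s)'] = df['key'].apply(cascade_col)
Using a lambda:
df['Corresponding Value(s)'] = df['key'].apply(lambda row: ', '.join([my_dict[i] for i in [l.strip() for l in row.split(',')] if i in my_dict]))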
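The same split, strip and look-up steps can also be expressed with pandas string methods instead of `apply`. This is only a sketch, assuming the keys are always comma-separated as in the sample data; the intermediate name `values` is illustrative. Because rows without a match are absent from the grouped result, they end up as NaN after index alignment, which matches the expected output in the question.

```python
import pandas as pd

df = pd.DataFrame({'key': ["vs, vscode", "jupyter, jupyterlab", "python, vs",
                           "python", "it was spyder before dawn"]})
my_dict = {'vscode': 'is gross',
           'jupyter': 'is not so awesome, but hes ok, ig',
           'vs': 'is awesome',
           'jupyterlab': 'is rad',
           'python': 'booya'}

values = (df['key'].str.split(',')   # each row becomes a list of raw tokens
          .explode()                 # one token per row, original index repeated
          .str.strip()               # remove the space left after each comma
          .map(my_dict)              # token -> value, NaN when not a key
          .dropna()                  # keep only tokens that were keys
          .groupby(level=0)          # regroup by the original row index
          .agg(', '.join))

# Rows with no matching key are missing from `values` and align to NaN.
df['Corresponding Value(s)'] = values
print(df)
```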
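A uj5u.com user replied:
The function needs a few changes. First, the string in each row has to be split into a list of words; otherwise the loop iterates over individual characters, none of which is a dictionary key, so the result is always []. Second, the expected output asks for a string in the new column, so the return statement now joins the list into a string.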
import numpy as np
def cascade_col(row_value):
    cvc_row = []
    for word in list(row_value.split(", ")):  # ----> string to list
        if word in list(my_dict.keys()):  # ----> dictionary keys to list
            cvc_row.append(my_dict[word])
    return ','.join(cvc_row)  # ----> list to string
df['dict_value'] = df['key'].apply(lambda x: cascade_col(x)).replace("", np.nan)  # fill empty rows with NaN
Output:
                         key                                dict_value
0                 vs, vscode                       is awesome,is gross
1        jupyter, jupyterlab  is not so awesome, but hes ok, ig,is rad
2                 python, vs                          booya,is awesome
3                     python                                     booya
4  it was spyder before dawn                                       NaN
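To see why the function in the question returned empty lists, it helps to compare what a `for` loop sees when it iterates over the raw string with what it sees after splitting. A small sketch in plain Python follows; the names `chars`, `words`, `from_chars` and `from_words` are illustrative only.

```python
my_dict = {'vscode': 'is gross',
           'jupyter': 'is not so awesome, but hes ok, ig',
           'vs': 'is awesome',
           'jupyterlab': 'is rad',
           'python': 'booya'}

row_value = "python, vs"

# Iterating over a string yields single characters: 'p', 'y', 't', ...
chars = list(row_value)
# Splitting on commas and stripping spaces yields whole words.
words = [w.strip() for w in row_value.split(',')]

from_chars = [my_dict[c] for c in chars if c in my_dict]
from_words = [my_dict[w] for w in words if w in my_dict]

print(from_chars)  # []
print(from_words)  # ['booya', 'is awesome']
```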
