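I need to add a new key-value pair to a Pandas DataFrame column based on a condition. The target column holds dictionaries. If the condition is true, the pair must be created; otherwise nothing needs to be done. I am trying to do this with np.where: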
df = pd.DataFrame({"amenity": ["1","2","3","4"], "tags": [{"building":"yes"},{"entrance": "yes"},{},{}], "sport": [None, "hockey", "football", None], "leisure":["multi", "some", "field", "wake"]})
leisure_var_add = ["field", "multi"]
df['tags']['sport'] = np.where((df['sport'] != None) | (df['leisure'].isin(leisure_var_add)), df['sport'], None)
df['tags']['leisure'] = np.where((df['sport'] == None) & (df['leisure'] !=None) & (~df['leisure'].isin(leisure_var_add)), df['leisure'], None)
I would like to get something like this:
   amenity                                       tags     sport  leisure
0        1      {'building': 'yes', 'sport': 'multi'}      None    multi
1        2     {'entrance': 'yes', 'sport': 'hockey'}    hockey     some
2        3  {'sport': 'football', 'leisure': 'field'}  football    field
3        4                        {'leisure': 'wake'}      None     wake
I have already implemented this by looping over each row and working with the index, but that way I lose all the benefits of Pandas. Do you know how to do this?
A helpful uj5u.com user replied:
Use a dict comprehension:
df['tags'] = df[['sport', 'leisure']] \
.apply(lambda x: {k: v for k, v in x[x.notna()].items()}, axis=1)
Output:
>>> df
   amenity                                       tags     sport  leisure
0        1                       {'leisure': 'multi'}      None    multi
1        2     {'sport': 'hockey', 'leisure': 'some'}    hockey     some
2        3  {'sport': 'football', 'leisure': 'field'}  football    field
3        4                        {'leisure': 'wake'}      None     wake
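Note that the comprehension above rebuilds `tags` from `sport` and `leisure` only, so keys that were already in `tags` (`building`, `entrance`) do not appear in the output. Below is a minimal sketch of one way to keep them. It assumes the goal is simply to merge every non-null `sport`/`leisure` value into the existing dictionary; it does not reproduce the asker's exact conditions, and `new_tags` is a name chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "amenity": ["1", "2", "3", "4"],
    "tags": [{"building": "yes"}, {"entrance": "yes"}, {}, {}],
    "sport": [None, "hockey", "football", None],
    "leisure": ["multi", "some", "field", "wake"],
})

# For each row, start from the existing tags dict and add the
# sport/leisure values that are not null (assumed merge rule).
new_tags = [
    {**tags, **{k: v for k, v in (("sport", sport), ("leisure", leisure)) if pd.notna(v)}}
    for tags, sport, leisure in zip(df["tags"], df["sport"], df["leisure"])
]
df["tags"] = new_tags

print(df["tags"].tolist())
```

Building the list with `zip` avoids a row-wise `apply`, and `pd.notna` treats both `None` and `NaN` as missing.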
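A helpful uj5u.com user replied:
I use apply to move all the data into columns, then iterate over the rows to build the tags dictionary from the column data, excluding amenity.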
import pandas as pd

df = pd.DataFrame({"amenity": ["1","2","3","4"], "tags": [{"building":"yes"},{"entrance": "yes"},{},{}], "sport": [None, "hockey", "football", None], "leisure":["multi", "some", "field", "wake"]})

def EmptyList(x):
    # Return the first element, or None if the list is empty
    if len(x) > 0:
        return x[0]
    else:
        return None

# Pull the 'building' and 'entrance' values out of each tags dict into their own columns
df['building'] = df['tags'].apply(lambda x: [v for k, v in x.items() if k == 'building']).apply(EmptyList)
df['entrance'] = df['tags'].apply(lambda x: [v for k, v in x.items() if k == 'entrance']).apply(EmptyList)
df.drop(['tags'], inplace=True, axis=1)
print(df)

tags_dict = {}
columns = df.columns
for key, value in df.iterrows():
    for column in columns:
        # Keep every non-null value except the 'amenity' column
        if pd.notna(value[column]) and column != 'amenity':
            tags_dict[column] = value[column]
    # Store the dict as a string, then reset it for the next row
    df.loc[key, 'tags'] = str(tags_dict)
    tags_dict.clear()
print(df)
Output:
  amenity     sport leisure building entrance  \
0       1      None   multi      yes     None
1       2    hockey    some     None      yes
2       3  football   field     None     None
3       4      None    wake     None     None

                                                tags
0            {'leisure': 'multi', 'building': 'yes'}
1  {'sport': 'hockey', 'leisure': 'some', 'entran...
2          {'sport': 'football', 'leisure': 'field'}
3                                {'leisure': 'wake'}
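The same idea (spread the dictionary keys into columns, then collapse the non-null values back into one dictionary per row) can probably be written without `iterrows`. This is only a sketch under assumptions: `expanded`, `wide` and `rebuilt` are illustrative names, every key found in `tags` becomes a column rather than just `building` and `entrance`, and the result is stored as real dictionaries instead of strings.

```python
import pandas as pd

df = pd.DataFrame({
    "amenity": ["1", "2", "3", "4"],
    "tags": [{"building": "yes"}, {"entrance": "yes"}, {}, {}],
    "sport": [None, "hockey", "football", None],
    "leisure": ["multi", "some", "field", "wake"],
})

# One column per key found in the tags dicts; missing keys become NaN.
expanded = pd.DataFrame(df["tags"].tolist(), index=df.index)

# Put the plain columns and the expanded tag columns side by side
# (amenity is left out, as in the answer above).
wide = pd.concat([df[["sport", "leisure"]], expanded], axis=1)

# Collapse each row back into a dict, dropping null values.
rebuilt = [
    {k: v for k, v in record.items() if pd.notna(v)}
    for record in wide.to_dict("records")
]
df["tags"] = rebuilt

print(df["tags"].tolist())
```

If the string form from the answer above is needed, `str(...)` can be applied to each dictionary afterwards.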
