Given the following DataFrame and list of dictionaries:
import pandas as pd
import numpy as np
df = pd.DataFrame.from_dict([
{'id': '912SAFD', 'key': 3, 'list_index': [0]},
{'id': '812SAFD', 'key': 4, 'list_index': [0, 1]},
{'id': '712SAFD', 'key': 5, 'list_index': [2]}])
designs = [{'designs': [{'color_id': 609090, 'value': 'b', 'lang': ''}]},
{'designs': [{'color_id': 609091, 'value': 'c', 'lang': ''}]},
{'designs': [{'color_id': 609092, 'value': 'd', 'lang': 'fr'}]}]
The DataFrame prints as:
id key list_index
0 912SAFD 3 [0]
1 812SAFD 4 [0, 1]
2 712SAFD 5 [2]
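Is it possible, ideally without explicit loops, to go through the list in each row of 'list_index', use its values as positions into the list of dictionaries, and create new columns from the values found in those dictionaries?
Here is an example of the expected result: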
id key list_index 609090 609091 609092 609092_lang
0 912SAFD 3 [0] b NaN NaN NaN
1 812SAFD 4 [0, 1] b c NaN NaN
2 712SAFD 5 [2] NaN NaN d fr
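If 'lang' is not empty, it should also be added to the DataFrame as a column whose name is the color_id, an underscore, and the key's own name, for example 609092_lang.
Any help would be greatly appreciated.
Answer from a uj5u.com user: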
# this is to get the inner dictionary and make a tidy dataframe from it
designs = [info for design in designs for info in design['designs']]
df_designs = pd.DataFrame(designs)
df = df.explode('list_index').merge(df_designs , left_on='list_index', right_index=True)
df = df.pivot(index=['id', 'key','lang'], columns = 'color_id', values = 'value').reset_index()
print(df)
Output:
>>>
color_id id key lang 609090 609091 609092
0 712SAFD 5 fr NaN NaN d
1 812SAFD 4 b c NaN
2 912SAFD 3 b NaN NaN
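Note that the output above drops list_index and keeps lang as a plain column instead of producing 609092_lang. The following is a minimal sketch of one way to get closer to the expected layout with the same explode/merge/pivot idea. It assumes every 'designs' entry holds exactly one inner dictionary, as in the question; the names flat, long_df, values and langs are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([
    {'id': '912SAFD', 'key': 3, 'list_index': [0]},
    {'id': '812SAFD', 'key': 4, 'list_index': [0, 1]},
    {'id': '712SAFD', 'key': 5, 'list_index': [2]},
])
designs = [
    {'designs': [{'color_id': 609090, 'value': 'b', 'lang': ''}]},
    {'designs': [{'color_id': 609091, 'value': 'c', 'lang': ''}]},
    {'designs': [{'color_id': 609092, 'value': 'd', 'lang': 'fr'}]},
]

# One row per position in `designs` (assumption: a single inner dict each).
flat = pd.DataFrame([d['designs'][0] for d in designs])

# Long format: one row per (original row, referenced design).
long_df = (
    df[['list_index']]
    .explode('list_index')
    .astype({'list_index': 'int64'})   # explode leaves an object column
    .rename_axis('row')
    .reset_index()
    .merge(flat, left_on='list_index', right_index=True)
)

# Wide 'value' columns, one per color_id.
values = long_df.pivot(index='row', columns='color_id', values='value')

# Wide '<color_id>_lang' columns, built only from non-empty 'lang' entries.
langs = long_df[long_df['lang'] != ''].pivot(
    index='row', columns='color_id', values='lang'
)
langs.columns = [f'{c}_lang' for c in langs.columns]

# Attach both blocks to the untouched original frame by row position.
result = df.join(values).join(langs)
print(result)
```

Pivoting on a helper 'row' column rather than on ['id', 'key', 'lang'] keeps one output row per original row, so the original columns, including list_index, can be joined back unchanged.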
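Answer from a uj5u.com user:
First, we need to reshape the designs list to pull out the relevant data and build a mapper from each list position to the values in its dictionary. Use enumerate and dict.setdefault for this: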
designs_dict = {}
for i, des in enumerate(designs):
    # only the first inner dictionary of each entry is used
    color_id = des['designs'][0]['color_id']
    designs_dict.setdefault(i, []).append({color_id : des['designs'][0]['value']})
    # add a '<color_id>_lang' entry only when 'lang' is not empty
    if des['designs'][0]['lang'] != '':
        designs_dict.setdefault(i, []).append({'{}_lang'.format(color_id) : des['designs'][0]['lang']})
Now designs_dict looks like this:
{0: [{609090: 'b'}],
1: [{609091: 'c'}],
2: [{609092: 'd'}, {'609092_lang': 'fr'}]}
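Then:
(i) explode "list_index" and map each resulting index through designs_dict; then explode again to get rid of the lists
(ii) construct a DataFrame from (i); groupby the index and use first to collapse it to one row per original row
(iii) join (ii) to df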
s_from_designs = df['list_index'].explode().map(designs_dict).explode()
df_from_designs = pd.DataFrame(s_from_designs.tolist(), index=s_from_designs.index).groupby(level=0).first()
out = df.join(df_from_designs)
Final output:
id key list_index 609090 609091 609092 609092_lang
0 912SAFD 3 [0] b None None None
1 812SAFD 4 [0, 1] b c None None
2 712SAFD 5 [2] None None d fr
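As a variation on the approach above, the sketch below maps each position to a single flat dictionary, which avoids the second explode and the groupby, and also covers entries whose 'designs' list has more than one inner dictionary. It does use a comprehension over the rows, so it is not loop-free. The names mapper, records and out are illustrative and not taken from the original answer.

```python
import pandas as pd

df = pd.DataFrame([
    {'id': '912SAFD', 'key': 3, 'list_index': [0]},
    {'id': '812SAFD', 'key': 4, 'list_index': [0, 1]},
    {'id': '712SAFD', 'key': 5, 'list_index': [2]},
])
designs = [
    {'designs': [{'color_id': 609090, 'value': 'b', 'lang': ''}]},
    {'designs': [{'color_id': 609091, 'value': 'c', 'lang': ''}]},
    {'designs': [{'color_id': 609092, 'value': 'd', 'lang': 'fr'}]},
]

# Map each position in `designs` to one flat dict of new column values.
mapper = {}
for i, entry in enumerate(designs):
    cols = {}
    for d in entry['designs']:
        cols[d['color_id']] = d['value']
        if d['lang']:  # skip empty strings
            cols[f"{d['color_id']}_lang"] = d['lang']
    mapper[i] = cols

# For every row, merge the dicts referenced by its list_index into one record.
records = [
    {k: v for i in idx_list for k, v in mapper[i].items()}
    for idx_list in df['list_index']
]

# Keys missing from a record become missing values in the resulting frame.
out = df.join(pd.DataFrame(records, index=df.index))
print(out)
```

Building the frame from one dictionary per row leaves the gaps as missing values that pd.isna recognizes, rather than relying on groupby(...).first() to fill them.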
