I have the following DataFrame, in which one column holds a list of single-key dicts:
d = pd.DataFrame([
['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
['Range', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
['Peop', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]]
], columns=['Name', 'Legal Description'])
I want to convert it into a simple, flat DataFrame like this:
d = pd.DataFrame([
['Green', 'STERLING GREEN SO', '01', 'L0038', 'B0008'],
['Apply', 'STERLING GREEN SO', '01', 'L0038', 'B0008'],
['Range', 'STERLING GREEN SO', '01', 'L0038', 'B0008'],
['Peop', 'STERLING GREEN SO', '01', 'L0038', 'B0008']
], columns=['Name', 'Desc', 'Sec', 'Lot', 'Block'])
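A uj5u.com user replied:
IMO, the ideal solution is to act upstream and obtain a properly formatted dictionary or DataFrame in the first place.
The problem with your list of single-key dicts is that you have to merge them. You can use a dict comprehension and convert the result to a Series: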
d2 = d['Legal Description'].apply(lambda c:
    pd.Series({next(iter(x.keys())).strip(':'):
               next(iter(x.values())) for x in c})
)
Then join the result back to the original DataFrame:
d.drop(columns='Legal Description').join(d2)
Output:
    Name               Desc Sec    Lot  Block
0  Green  STERLING GREEN SO  01  L0038  B0008
1  Apply  STERLING GREEN SO  01  L0038  B0008
2  Range  STERLING GREEN SO  01  L0038  B0008
3   Peop  STERLING GREEN SO  01  L0038  B0008
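The merge step described above can also be done with a plain dict comprehension before pandas gets involved, which avoids building one `pd.Series` per row. This is only a sketch, not the answerer's code: the helper name `merge_single_key_dicts` is made up for illustration, the sample is cut down to two rows, and it assumes every key carries a trailing colon as in the question's data.

```python
import pandas as pd

d = pd.DataFrame([
    ['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
    ['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
], columns=['Name', 'Legal Description'])


def merge_single_key_dicts(items):
    """Merge a list of dicts into one dict, dropping the trailing ':' from each key.

    Hypothetical helper, named here only for illustration.
    """
    return {key.rstrip(':'): value for item in items for key, value in item.items()}


# One merged dict per row; reuse the original index so the join lines up.
flat = pd.DataFrame(
    [merge_single_key_dicts(items) for items in d['Legal Description']],
    index=d.index,
)
result = d.drop(columns='Legal Description').join(flat)
print(result)
```

If two dicts in the same list share a key, the later one wins, so this approach only fits data where each key appears once per row.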
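A uj5u.com user replied:
If possible, you should process the data before creating the DataFrame. That is faster than reshaping the DataFrame after it has been created. For example: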
data = [
['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
['Range', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
['Peop', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]]
]
records = []
for name, legal_desc in data:
    # Start each record with the name, then merge in every single-key dict.
    rec = {'Name': name}
    for item in legal_desc:
        rec.update(item)
    records.append(rec)
d = pd.DataFrame(records)
Output:
>>> d
    Name              Desc: Sec:   Lot: Block:
0  Green  STERLING GREEN SO   01  L0038  B0008
1  Apply  STERLING GREEN SO   01  L0038  B0008
2  Range  STERLING GREEN SO   01  L0038  B0008
3   Peop  STERLING GREEN SO   01  L0038  B0008
>>> records
[{'Name': 'Green', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'}, {'Name': 'Apply', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'}, {'Name': 'Range', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'}, {'Name': 'Peop', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'}]
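Note that the column names in this output still end with a colon, whereas the layout requested in the question does not. One possible way to close that gap is to rename the columns afterwards. The sketch below uses a two-row sample of the `records` list and assumes a single trailing colon is the only thing to strip.

```python
import pandas as pd

records = [
    {'Name': 'Green', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
    {'Name': 'Apply', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
]
d = pd.DataFrame(records)

# rename() accepts a callable that is applied to every column label;
# labels without a trailing ':' (such as 'Name') pass through unchanged.
cleaned = d.rename(columns=lambda col: col.rstrip(':'))
print(cleaned.columns.tolist())
```

The same stripping could instead be done on the keys while the records are being built, which keeps the DataFrame clean from the start.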
A uj5u.com user replied:
You can also use the following (here `df` refers to the DataFrame from the question):
df.set_index('Name', inplace=True)
df = df['Legal Description'].explode().apply(pd.Series).groupby(level=0).sum().reset_index()
Output:
    Name              Desc: Sec:   Lot: Block:
0  Apply  STERLING GREEN SO   01  L0038  B0008
1  Green  STERLING GREEN SO   01  L0038  B0008
2   Peop  STERLING GREEN SO   01  L0038  B0008
3  Range  STERLING GREEN SO   01  L0038  B0008
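The explode-and-regroup idea relies on summing object columns that contain NaN, and how that behaves may depend on the pandas version. A variant that sidesteps the question is to take the first non-null value per group with `first()`. This is a sketch of that alternative, not the answerer's code; it uses a two-row sample and assumes the `Name` values are unique, because rows that share a name would be collapsed into one.

```python
import pandas as pd

df = pd.DataFrame([
    ['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
    ['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
], columns=['Name', 'Legal Description'])

# One row per single-key dict, indexed by Name.
exploded = df.set_index('Name')['Legal Description'].explode()

# Expand the dicts into columns; each row has one value and NaN elsewhere.
wide = pd.DataFrame(exploded.tolist(), index=exploded.index)

# first() keeps the first non-null value of each column within a group.
out = wide.groupby(level=0).first().reset_index()
print(out)
```

As in the answer above, `groupby` sorts the groups, so the rows come back ordered by `Name` rather than in their original order.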
