I have a JSON blob that looks like this:
{'status': 'OK',
'data-availability': 'available',
'data': [{'page': 1, 'pages': 1, 'total': 7},
[{'domain_id': '101',
'domain_name': 'Province1',
'domain_url': 'https://province1.com'},
{'domain_id': '102',
'domain_name': 'Province2',
'domain_url': 'https://province2.com'},
{'domain_id': '103',
'domain_name': 'Province3',
'domain_url': 'https://province3.com'},
{'domain_id': '104',
'domain_name': 'Province4',
'domain_url': 'https://province4.com'},
{'domain_id': '105',
'domain_name': 'Province5',
'domain_url': 'https://province5.com'},
{'domain_id': '106',
'domain_name': 'Province6',
'domain_url': 'https://province6.com'},
{'domain_id': '107',
'domain_name': 'Province7',
'domain_url': 'https://province7.com'}]]}
I want to normalize it into a Pandas DataFrame whose columns are domain_id, domain_name, and domain_url.
How can I do this?
A uj5u.com user replied:
Repeatedly appending to a DataFrame is slow. Instead, collect everything in a dictionary and then call .from_dict():
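from collections import defaultdict

import pandas as pd

# Collect each column's values in a list keyed by column name.
result = defaultdict(list)
for entry in data['data'][1]:  # data['data'][1] is the list of domain records
    for key, value in entry.items():
        result[key].append(value)

print(pd.DataFrame.from_dict(result))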
This outputs:
  domain_id domain_name             domain_url
0       101   Province1  https://province1.com
1       102   Province2  https://province2.com
2       103   Province3  https://province3.com
3       104   Province4  https://province4.com
4       105   Province5  https://province5.com
5       106   Province6  https://province6.com
6       107   Province7  https://province7.com
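Because the second element of data['data'] is already a list of flat dictionaries with the same keys, the DataFrame constructor can probably take it directly, without building the intermediate dictionary. A minimal sketch, assuming the blob is already loaded as a Python dict; the name `blob` and the shortened three-record sample are illustrative and not from the original post:

```python
import pandas as pd

# Hypothetical, shortened stand-in for the blob in the question.
blob = {
    'status': 'OK',
    'data-availability': 'available',
    'data': [
        {'page': 1, 'pages': 1, 'total': 3},
        [
            {'domain_id': '101', 'domain_name': 'Province1',
             'domain_url': 'https://province1.com'},
            {'domain_id': '102', 'domain_name': 'Province2',
             'domain_url': 'https://province2.com'},
            {'domain_id': '103', 'domain_name': 'Province3',
             'domain_url': 'https://province3.com'},
        ],
    ],
}

# blob['data'][0] is paging metadata; blob['data'][1] holds the records.
records = blob['data'][1]

# A list of dicts maps directly to rows; the dict keys become column names.
df = pd.DataFrame(records)
print(df)
```

This only works as-is when every record is flat; nested values would end up as dict objects inside a single column.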
A uj5u.com user replied:
This gets the job done:
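import json

import pandas as pd

# `test` is assumed to hold the blob as a valid JSON string.
data = json.loads(test)["data"][-1]
df = pd.DataFrame()
for d in data:
    temp_df = pd.DataFrame([d])  # one-row frame for the current record
    df = pd.concat([df, temp_df], ignore_index=True)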
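This reply passes a variable named test to json.loads, which expects valid JSON text with double-quoted strings. The blob as printed in the question uses single quotes, which is how Python prints a dict, so json.loads would reject that text verbatim. The sketch below is one way to handle both forms and build the frame in a single call; the `payload` sample and the `load_records` helper are hypothetical names introduced for illustration:

```python
import ast
import json

import pandas as pd

# Hypothetical, shortened payload with the same shape as the blob in the question.
payload = {
    "status": "OK",
    "data": [
        {"page": 1, "pages": 1, "total": 2},
        [
            {"domain_id": "101", "domain_name": "Province1",
             "domain_url": "https://province1.com"},
            {"domain_id": "102", "domain_name": "Province2",
             "domain_url": "https://province2.com"},
        ],
    ],
}

json_text = json.dumps(payload)  # valid JSON text (double quotes)
literal_text = repr(payload)     # Python literal (single quotes), as printed in the question


def load_records(text):
    """Return the record list from either JSON text or a Python dict literal."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not valid JSON; fall back to safely evaluating a Python literal.
        parsed = ast.literal_eval(text)
    return parsed["data"][-1]


# Build each frame in one call instead of concatenating row by row.
df = pd.DataFrame(load_records(json_text))
df_from_literal = pd.DataFrame(load_records(literal_text))
print(df.equals(df_from_literal))
```

ast.literal_eval only evaluates literals, so it is a safer fallback than eval for text of this kind.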
A uj5u.com user replied:
You can use pd.json_normalize().
import pandas as pd

raw_data = [{'domain_id': '101',
'domain_name': 'Province1',
'domain_url': 'https://province1.com'},
{'domain_id': '102',
'domain_name': 'Province2',
'domain_url': 'https://province2.com'},
{'domain_id': '103',
'domain_name': 'Province3',
'domain_url': 'https://province3.com'},
{'domain_id': '104',
'domain_name': 'Province4',
'domain_url': 'https://province4.com'},
{'domain_id': '105',
'domain_name': 'Province5',
'domain_url': 'https://province5.com'},
{'domain_id': '106',
'domain_name': 'Province6',
'domain_url': 'https://province6.com'},
{'domain_id': '107',
'domain_name': 'Province7',
'domain_url': 'https://province7.com'}]
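# Store the raw dicts in a single-column DataFrame
df = pd.DataFrame({'raw': raw_data})
# Expand each dict into columns, using its keys as column names
df_json = pd.json_normalize(df['raw'])
# Join the expanded columns to the original frame side by side
df = pd.concat([df, df_json], axis=1)
# display() is available in Jupyter/IPython; use print(df) in a plain script
display(df)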
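pd.json_normalize() also accepts a list of dictionaries directly, so the intermediate 'raw' column is optional when only the three domain columns are wanted. A short sketch with a shortened sample; the nested 'meta' field in the second part is hypothetical and only shows what json_normalize adds over the plain constructor:

```python
import pandas as pd

# Shortened sample of the records from the question.
raw_data = [
    {'domain_id': '101', 'domain_name': 'Province1',
     'domain_url': 'https://province1.com'},
    {'domain_id': '102', 'domain_name': 'Province2',
     'domain_url': 'https://province2.com'},
]

# Each dict becomes one row; its keys become the column names.
df = pd.json_normalize(raw_data)
print(df)

# Hypothetical nested record (the 'meta' field is not in the original data):
# nested keys are flattened into dot-separated column names.
nested = [{'domain_id': '101', 'meta': {'region': 'north'}}]
df_nested = pd.json_normalize(nested)
print(list(df_nested.columns))
```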
