下面是我的代碼,它收到錯誤“將字典與非系列混合可能會導致排序不明確”。這是什么原因,我應該如何解決這個問題?我如何在 python 中可視化這本字典以進行除錯?
import pandas as pd
df = pd.read_json('https://stats.oecd.org/sdmx-json/data/QNA/AUS AUT.GDP B1_GE.CUR VOBARSA.Q/all?startTime=2009-Q2&endTime=2011-Q4')
提前致謝。
uj5u.com熱心網友回復:
下載json檔案并執行:
from pandas.io.json import json_normalize
import json
with open('aaa.json') as data_file:
d= json.load(data_file)
df = json_normalize(d)
這使:
dataSets \
0 [{'action': 'Information', 'series': {'0:0:0:0...
header.id header.test \
0 a0d6c1d0-79b9-4cc2-9e7a-06050c934194 False
header.prepared header.sender.id \
0 2021-11-04T10:42:25.7631355Z OCDE
header.sender.name \
0 Organisation de coop??ration et de d??veloppem...
header.links \
0 [{'href': 'https://stats.oecd.org:443/sdmx-jso...
structure.links \
0 [{'href': 'https://stats.oecd.org/sdmx-json/da...
structure.name structure.description \
0 Comptes nationaux trimestriels Comptes nationaux trimestriels
structure.dimensions.series \
0 [{'keyPosition': 0, 'id': 'LOCATION', 'name': ...
structure.dimensions.observation \
0 [{'id': 'TIME_PERIOD', 'name': 'P??riode', 'va...
structure.attributes.dataSet \
0 []
structure.attributes.series \
0 [{'id': 'TIME_FORMAT', 'name': 'Time Format', ...
structure.attributes.observation \
0 [{'id': 'OBS_STATUS', 'name': 'Observation Sta...
structure.annotations
0 [{'title': 'Copyright OECD - All rights reserv...
請注意,您有嵌套的 json,您必須取消嵌套。使用例如:
def flatten_nested_json_df(df):
df = df.reset_index()
s = (df.applymap(type) == list).all()
list_columns = s[s].index.tolist()
s = (df.applymap(type) == dict).all()
dict_columns = s[s].index.tolist()
while len(list_columns) > 0 or len(dict_columns) > 0:
new_columns = []
for col in dict_columns:
horiz_exploded = pd.json_normalize(df[col]).add_prefix(f'{col}.')
horiz_exploded.index = df.index
df = pd.concat([df, horiz_exploded], axis=1).drop(columns=[col])
new_columns.extend(horiz_exploded.columns) # inplace
for col in list_columns:
print(f"exploding: {col}")
df = df.drop(columns=[col]).join(df[col].explode().to_frame())
new_columns.append(col)
s = (df[new_columns].applymap(type) == list).all()
list_columns = s[s].index.tolist()
s = (df[new_columns].applymap(type) == dict).all()
dict_columns = s[s].index.tolist()
return df
uj5u.com熱心網友回復:
你可以使用 Dframcy。
from dframcy import DframCy
with open('aaa.json') as data_file:
d= json.load(data_file)
dataframe = dframcy.to_dataframe(d, columns=["text","start","","","","",""])
轉載請註明出處,本文鏈接:https://www.uj5u.com/qiye/349101.html
