Converting JSON to a Pandas DataFrame when the dictionary structure is complex

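Below is my code. It fails with the error "Mixing dicts with non-Series may lead to ambiguous ordering." What causes this, and how can I fix it? How can I visualize this dictionary in Python for debugging?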

import pandas as pd
df = pd.read_json('https://stats.oecd.org/sdmx-json/data/QNA/AUS+AUT.GDP+B1_GE.CUR+VOBARSA.Q/all?startTime=2009-Q2&endTime=2011-Q4')

Thanks in advance.

Answer from a uj5u.com user:
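
The error most likely comes from the shape of the top-level JSON object. `pd.read_json` tries to build a DataFrame directly from it, and that object mixes dict values with list values (in the normalized output further down, `dataSets` is a list while `header` and `structure` are dicts). A minimal sketch that should reproduce the same message, using a small made-up dictionary rather than the real OECD payload:

```python
import pandas as pd

# Hypothetical stand-in for the real payload: one dict value and one list value
# at the top level, mirroring 'header' (dict) and 'dataSets' (list).
payload = {
    "header": {"id": "abc", "test": False},
    "dataSets": [{"action": "Information"}],
}

try:
    pd.DataFrame(payload)
    message = None
except ValueError as exc:
    # pandas refuses to guess a row index when dicts and plain lists are mixed
    message = str(exc)

print(message)
```

Loading the file with the `json` module first, as in the steps that follow, avoids this because no DataFrame is built from the raw top-level object.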

Download the JSON file and run:

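import json

import pandas as pd

# 'aaa.json' is the response from the URL in the question, saved to disk
with open('aaa.json') as data_file:
    d = json.load(data_file)

# pd.json_normalize replaces the deprecated pandas.io.json.json_normalize import
df = pd.json_normalize(d)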

This gives:

             dataSets  \
0  [{'action': 'Information', 'series': {'0:0:0:0...   

                              header.id  header.test  \
0  a0d6c1d0-79b9-4cc2-9e7a-06050c934194        False   

                header.prepared header.sender.id  \
0  2021-11-04T10:42:25.7631355Z             OCDE   

                                  header.sender.name  \
0    Organisation de coopération et de développem...   

                                        header.links  \
0  [{'href': 'https://stats.oecd.org:443/sdmx-jso...   

                                     structure.links  \
0  [{'href': 'https://stats.oecd.org/sdmx-json/da...   

                   structure.name           structure.description  \
0  Comptes nationaux trimestriels  Comptes nationaux trimestriels   

                         structure.dimensions.series  \
0  [{'keyPosition': 0, 'id': 'LOCATION', 'name': ...   

                    structure.dimensions.observation  \
0   [{'id': 'TIME_PERIOD', 'name': 'Période', 'va...   

  structure.attributes.dataSet  \
0                           []   

                         structure.attributes.series  \
0  [{'id': 'TIME_FORMAT', 'name': 'Time Format', ...   

                    structure.attributes.observation  \
0  [{'id': 'OBS_STATUS', 'name': 'Observation Sta...   

                               structure.annotations  
0  [{'title': 'Copyright OECD - All rights reserv...
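
The output above shows that several columns still hold lists and dicts. To see the shape of the dictionary for debugging, as the question asks, one option is a small recursive summary of types and sizes. The helper name `describe`, its `max_depth` parameter and the miniature `sample` below are assumptions made for illustration, not part of pandas or of the OECD response:

```python
def describe(obj, name="root", depth=0, max_depth=3):
    """Return indented lines summarising the types and sizes in a nested JSON object."""
    pad = "  " * depth
    if isinstance(obj, dict):
        lines = [f"{pad}{name}: dict with {len(obj)} keys"]
        if depth < max_depth:
            for key, value in obj.items():
                lines += describe(value, str(key), depth + 1, max_depth)
    elif isinstance(obj, list):
        lines = [f"{pad}{name}: list of {len(obj)} items"]
        if obj and depth < max_depth:
            # Only the first element is described; lists are assumed homogeneous.
            lines += describe(obj[0], "[0]", depth + 1, max_depth)
    else:
        lines = [f"{pad}{name}: {type(obj).__name__} = {obj!r}"]
    return lines


# Hypothetical miniature of the payload, only to show the output format.
sample = {
    "header": {"id": "abc", "test": False},
    "dataSets": [
        {"action": "Information",
         "series": {"0:0:0:0": {"observations": {"0": [1.0]}}}}
    ],
}
summary = describe(sample)
print("\n".join(summary))
```

With the real file, calling `describe(d)` on the dictionary loaded by `json.load` would give the same kind of outline; raising `max_depth` shows deeper levels.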

Note that the JSON is nested, so you have to un-nest it. For example, you can use:

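import pandas as pd


def columns_of(df, columns, kind):
    # Names of the columns in which every cell is an instance of `kind`.
    return [c for c in columns
            if df[c].map(lambda v: isinstance(v, kind)).all()]


def flatten_nested_json_df(df):
    """Repeatedly expand dict columns sideways and list columns downwards."""
    df = df.reset_index()
    list_columns = columns_of(df, df.columns, list)
    dict_columns = columns_of(df, df.columns, dict)

    while list_columns or dict_columns:
        new_columns = []

        for col in dict_columns:
            # Expand each dict into one column per key, prefixed with the parent name.
            horiz_exploded = pd.json_normalize(df[col].tolist()).add_prefix(f'{col}.')
            horiz_exploded.index = df.index
            df = pd.concat([df, horiz_exploded], axis=1).drop(columns=[col])
            new_columns.extend(horiz_exploded.columns)

        for col in list_columns:
            print(f"exploding: {col}")
            # Expand each list into one row per element.
            df = df.drop(columns=[col]).join(df[col].explode().to_frame())
            # Reset the index so the next join does not multiply rows
            # on the duplicate index labels that explode() leaves behind.
            df = df.reset_index(drop=True)
            new_columns.append(col)

        # Only the columns created in this pass can still hold lists or dicts.
        list_columns = columns_of(df, new_columns, list)
        dict_columns = columns_of(df, new_columns, dict)
    return df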
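
A generic flattener does not interpret the `series` keys such as `0:0:0:0` that appear in the output. Assuming the payload follows the usual SDMX-JSON layout, where each position in that key indexes into the `values` of the matching entry in `structure.dimensions.series`, a purpose-built conversion might look like the sketch below. The helper name `sdmx_json_to_frame`, the sample values, and the assumption that the first element of each observation is the observed value are all illustrative and should be checked against the real file:

```python
import pandas as pd


def sdmx_json_to_frame(d):
    """Build one row per observation from an SDMX-JSON style dict (assumed layout)."""
    dims = d["structure"]["dimensions"]
    # Order the series dimensions by their position in the series key.
    series_dims = sorted(dims["series"], key=lambda dim: dim.get("keyPosition", 0))
    time_dim = dims["observation"][0]
    rows = []
    for series_key, series in d["dataSets"][0]["series"].items():
        # A key such as '1:0' holds one index per series dimension.
        positions = [int(p) for p in series_key.split(":")]
        labels = {dim["id"]: dim["values"][pos]["id"]
                  for dim, pos in zip(series_dims, positions)}
        for obs_key, obs in series["observations"].items():
            row = dict(labels)
            row[time_dim["id"]] = time_dim["values"][int(obs_key)]["id"]
            row["value"] = obs[0]  # assumed: first entry is the observed value
            rows.append(row)
    return pd.DataFrame(rows)


# Made-up miniature payload with two series dimensions and invented numbers.
sample = {
    "structure": {"dimensions": {
        "series": [
            {"keyPosition": 0, "id": "LOCATION",
             "values": [{"id": "AUS"}, {"id": "AUT"}]},
            {"keyPosition": 1, "id": "SUBJECT", "values": [{"id": "B1_GE"}]},
        ],
        "observation": [
            {"id": "TIME_PERIOD", "values": [{"id": "2009-Q2"}, {"id": "2009-Q3"}]}
        ],
    }},
    "dataSets": [{"series": {
        "0:0": {"observations": {"0": [1.0], "1": [2.0]}},
        "1:0": {"observations": {"0": [3.0]}},
    }}],
}
df = sdmx_json_to_frame(sample)
print(df)
```

This yields one row per country, subject and period, which is usually easier to work with than the fully exploded frame.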

Answer from a uj5u.com user:

You can use DframCy.

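import json

from dframcy import DframCy

with open('aaa.json') as data_file:
    d = json.load(data_file)

# As posted, `dframcy` is undefined: the class is imported but never instantiated.
# DframCy wraps a spaCy pipeline and its to_dataframe method is built around
# spaCy Doc objects, so this call is unverified for a plain dict like `d`.
dataframe = dframcy.to_dataframe(d, columns=["text","start","","","","",""])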
