Unpacking nested JSON into a DataFrame?

Background

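I'm trying to access data from the following API. The response is a huge nested JSON made of nested dictionaries, and I'm trying to make it more readable. The JSON has the same format as shown at this link (on the right-hand side, click "Expand all").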

What I've tried

I searched SO and other sites, and pd.json_normalize seems to be the answer, but I've tried several approaches and it only unpacks one level.

# Attempt 1
import pandas as pd
import requests

url = 'https://api.opennem.org.au/station/'
response = requests.get(url).json()
df2 = pd.json_normalize(response, max_level=0)
print(df2)

# Attempt 2
url = 'https://api.opennem.org.au/station/'
response = requests.get(url).json()
df = pd.json_normalize(response, record_path=['facilities'])
print(df)

Current (incorrect) output

  version  ...                                               data
0  3.11.3  ...  [{'id': 488, 'code': 'ADP', 'name': 'Adelaide ...

[1 rows x 5 columns]

Request for help

Does anyone know how to unpack this large nested JSON into a DataFrame?

Answer:

You can pass json_normalize a nested list as the record path, so that it normalizes both data and facilities:

df2 = pd.json_normalize(response, ['data',['facilities']])

print(df2.head(3))

    id  station_id        code dispatch_type  active  capacity_registered  \
0  689         488     ADPBA1L          LOAD    True                 6.27   
1  690         488     ADPBA1G     GENERATOR    True                 6.27   
2  516         372  ALBANY_WF1     GENERATOR    True                21.60   

  network_region  unit_number  unit_capacity  approved network.code  \
0            SA1          1.0           6.27      True          NEM   
1            SA1          1.0           6.27      True          NEM   
2            WEM          NaN            NaN      True          WEM   

  network.country network.label  \
0              au           NEM   
1              au           NEM   
2              au           WEM   

                                     network.regions  network.timezone  \
0  [{'code': 'NSW1'}, {'code': 'QLD1'}, {'code': ...  Australia/Sydney   
1  [{'code': 'NSW1'}, {'code': 'QLD1'}, {'code': ...  Australia/Sydney   
2                                  [{'code': 'WEM'}]   Australia/Perth   

  network.timezone_database  network.offset  network.interval_size  \
0                      AEST             600                      5   
1                      AEST             600                      5   
2                      AWST             480                     30   

   network.interval_shift  network.has_interconnectors  \
0                       5                        False   
1                       5                        False   
2                       0                        False   

   network.intervals_per_hour        fueltech.code         fueltech.label  \
0                        12.0     battery_charging     Battery (Charging)   
1                        12.0  battery_discharging  Battery (Discharging)   
2                         2.0                 wind                   Wind   

  fueltech.renewable status.code status.label           registered  \
0               True   committed    Committed                  NaN   
1               True   committed    Committed                  NaN   
2               True   operating    Operating  2018-10-12T00:00:00   

                        approved_at  emissions_factor_co2 approved_by  
0                               NaN                   NaN         NaN  
1                               NaN                   NaN         NaN  
2  2020-12-09T15:34:49.465445+00:00                   NaN         NaN
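
To make the nested record path easier to follow without calling the API, here is a minimal sketch on a made-up payload that only mimics the shape of the response (a top-level dict whose data list holds stations, each with a facilities list). The sample values and the use of meta to carry the station code onto each facility row are my own assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical payload: the shape follows the response above, the values are invented.
sample = {
    "version": "x.y.z",
    "data": [
        {
            "id": 1,
            "code": "AAA",
            "facilities": [
                {"id": 10, "code": "AAA1", "network": {"code": "NEM"}},
                {"id": 11, "code": "AAA2", "network": {"code": "NEM"}},
            ],
        },
        {
            "id": 2,
            "code": "BBB",
            "facilities": [
                {"id": 20, "code": "BBB1", "network": {"code": "WEM"}},
            ],
        },
    ],
}

# record_path walks data -> facilities, producing one row per facility.
# meta pulls the parent station's code into a "data.code" column.
facilities = pd.json_normalize(
    sample,
    record_path=["data", "facilities"],
    meta=[["data", "code"]],
)

print(facilities)
```

Nested dicts inside each facility (network here) are flattened into dotted column names such as network.code, which matches the column names in the output above.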

Bonus:

If you also need to turn network.regions into scalar values:

df2 = pd.json_normalize(response, ['data',['facilities']])
df2['network.regions'] = [[y['code'] for y in x] for x in df2['network.regions']]
df2 = df2.explode('network.regions').reset_index(drop=True)
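
One caveat: the list comprehension above assumes every row holds a list of dicts. If some rows in the real data have no regions at all (I have not verified whether that happens), it would raise a TypeError. A defensive sketch on invented data, with region_codes as a hypothetical helper name:

```python
import pandas as pd

# Invented frame; the third row simulates a facility with no regions.
df = pd.DataFrame(
    {
        "code": ["AAA1", "BBB1", "CCC1"],
        "network.regions": [
            [{"code": "NSW1"}, {"code": "QLD1"}],
            [{"code": "WEM"}],
            None,
        ],
    }
)


def region_codes(regions):
    """Return the region codes from a list of dicts, or an empty list otherwise."""
    if isinstance(regions, list):
        return [r["code"] for r in regions]
    return []


df["network.regions"] = df["network.regions"].apply(region_codes)

# explode yields one row per code; an empty list becomes a single NaN row.
exploded = df.explode("network.regions").reset_index(drop=True)

print(exploded)
```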
