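I have a JSON file with nested objects that I am flattening into a pandas dataframe. One column still contains nested JSON objects, and I am finding it hard to flatten.
I have tried many approaches; this is the one that got me furthest.
Any help is much appreciated, thank you.
Unfortunately, I could not find a jsfiddle-like alternative for Python to provide a working example.
I know that the meta argument of json_normalize lets me add columns to my dataframe. However, that approach does not work for the unflattened column, because I have only gotten json_normalize to work in my setup by setting record_path to 'markets', which is the main JSON object in my file. In this setup I therefore cannot point the record path at "marketStats" and add the related columns through the meta argument.
Goal
The goal is to turn one or all of the JSON objects inside marketStats into columns of the dataframe.
Code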
import json
import pandas as pd

with open('Data/20012022.json') as file:
    data = json.load(file)

# Flatten data: one row per entry in the top-level 'markets' list
df0 = pd.json_normalize(
    data,
    record_path=['markets']
)
df0.head(3)
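Screenshot
This is a screenshot of the table as it currently stands; the marketStats column contains the nested JSON.

Data
This is a snippet from the JSON file.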
{
"markets": [
{
"id": 335,
"baseCurrency": "eth",
"quoteCurrency": "btc",
"exchangeName": "Binance",
"exchangeCode": "BINA",
"longName": "BTC-ETH",
"marketName": "btc-eth",
"symbol": "ETHBTC",
"volume": "40624.5823",
"quoteVolume": "3026.13646935",
"btcVolume": "3026.13646935",
"usdVolume": "127009429.050524367",
"currentPrice": 0.074681,
"latestBase": {
"id": 161774475,
"time": 1639576800,
"date": "2021-12-15T14:00:00.000 00:00",
"price": "0.077653",
"lowestPrice": "0.0729",
"bounce": "6.283",
"currentDrop": "-3.8272829124438206",
"crackedAt": "2022-01-07T03:00:00.000Z",
"respectedAt": "2022-01-15T15:00:00.000Z",
"isLowest": false
},
"marketStats": [
{
"algorithm": "original",
"ratio": "50.0",
"medianDrop": "-4.08",
"medianBounce": "5.51",
"hoursToRespected": 106,
"crackedCount": 2,
"respectedCount": 1
},
{
"algorithm": "day_trade",
"ratio": "100.0",
"medianDrop": "-6.12",
"medianBounce": "6.28",
"hoursToRespected": 204,
"crackedCount": 1,
"respectedCount": 1
},
{
"algorithm": "conservative",
"ratio": "100.0",
"medianDrop": "-6.12",
"medianBounce": "8.38",
"hoursToRespected": 204,
"crackedCount": 1,
"respectedCount": 1
},
{
"algorithm": "position",
"ratio": "50.0",
"medianDrop": "-6.12",
"medianBounce": "6.19",
"hoursToRespected": 204,
"crackedCount": 2,
"respectedCount": 1
},
{
"algorithm": "hodloo",
"ratio": "50.0",
"medianDrop": "-3.29",
"medianBounce": "0.0",
"hoursToRespected": 225,
"crackedCount": 4,
"respectedCount": 2
}
]
},
{
"id": 337,
"baseCurrency": "ltc",
"quoteCurrency": "btc",
"exchangeName": "Binance",
"exchangeCode": "BINA",
"longName": "BTC-LTC",
"marketName": "btc-ltc",
"symbol": "LTCBTC",
"volume": "68309.637",
"quoteVolume": "223.79294524",
"btcVolume": "223.79294524",
"usdVolume": "9392773.4219378968",
"currentPrice": 0.003275,
"latestBase": {
"id": 163982984,
"time": 1642374000,
"date": "2022-01-16T23:00:00.000 00:00",
"price": "0.003346",
"lowestPrice": "0.00322",
"bounce": "3.839",
"currentDrop": "-2.1219366407650926",
"crackedAt": "2022-01-18T23:00:00.000Z",
"respectedAt": null,
"isLowest": false
},
"marketStats": [
{
"algorithm": "original",
"ratio": "57.14",
"medianDrop": "-3.28",
"medianBounce": "3.84",
"hoursToRespected": 186,
"crackedCount": 7,
"respectedCount": 4
},
{
"algorithm": "day_trade",
"ratio": "0.0",
"medianDrop": "0.0",
"medianBounce": "5.68",
"hoursToRespected": 0,
"crackedCount": 1,
"respectedCount": 0
},
{
"algorithm": "conservative",
"ratio": "0.0",
"medianDrop": "0.0",
"medianBounce": "5.68",
"hoursToRespected": 0,
"crackedCount": 1,
"respectedCount": 0
},
{
"algorithm": "position",
"ratio": "0.0",
"medianDrop": "0.0",
"medianBounce": "8.16",
"hoursToRespected": 0,
"crackedCount": 1,
"respectedCount": 0
},
{
"algorithm": "hodloo",
"ratio": "75.0",
"medianDrop": "-3.7",
"medianBounce": "0.0",
"hoursToRespected": 35,
"crackedCount": 4,
"respectedCount": 3
}
]
},
{
"id": 339,
"baseCurrency": "bnb",
"quoteCurrency": "btc",
"exchangeName": "Binance",
"exchangeCode": "BINA",
"longName": "BTC-BNB",
"marketName": "btc-bnb",
"symbol": "BNBBTC",
"volume": "154576.177",
"quoteVolume": "1724.66664804",
"btcVolume": "1724.66664804",
"usdVolume": "72385673.4448901928",
"currentPrice": 0.01099,
"latestBase": {
"id": 163753765,
"time": 1642068000,
"date": "2022-01-13T10:00:00.000 00:00",
"price": "0.01093",
"lowestPrice": "0.01093",
"bounce": "3.102",
"currentDrop": "0.5489478499542543",
"crackedAt": null,
"respectedAt": null,
"isLowest": false
},
"marketStats": [
{
"algorithm": "original",
"ratio": "100.0",
"medianDrop": "-7.18",
"medianBounce": "4.34",
"hoursToRespected": 62,
"crackedCount": 2,
"respectedCount": 2
},
{
"algorithm": "day_trade",
"ratio": "100.0",
"medianDrop": "-6.19",
"medianBounce": "4.3",
"hoursToRespected": 63,
"crackedCount": 1,
"respectedCount": 1
},
{
"algorithm": "conservative",
"ratio": "66.67",
"medianDrop": "-3.15",
"medianBounce": "4.05",
"hoursToRespected": 62,
"crackedCount": 3,
"respectedCount": 2
},
{
"algorithm": "position",
"ratio": "100.0",
"medianDrop": "-3.15",
"medianBounce": "4.46",
"hoursToRespected": 60,
"crackedCount": 2,
"respectedCount": 2
},
{
"algorithm": "hodloo",
"ratio": "100.0",
"medianDrop": "-7.46",
"medianBounce": "0.0",
"hoursToRespected": 62,
"crackedCount": 5,
"respectedCount": 5
}
]
}
]
}
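Answer:
Here is one way to select a single object. I am not sure how you intend to combine all of them.
# data is the dict loaded from the JSON file, so build the frame from its 'markets' list
df = pd.DataFrame(data['markets'])
new = df.marketStats.apply(lambda x: pd.Series(x[0]))  # first entry of each list only
combined = pd.concat([df, new], axis=1)
Output: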
id baseCurrency quoteCurrency exchangeName exchangeCode longName marketName ... algorithm ratio medianDrop medianBounce hoursToRespected crackedCount respectedCount
0 335 eth btc Binance BINA BTC-ETH btc-eth ... original 50.0 -4.08 5.51 106 2 1
1 337 ltc btc Binance BINA BTC-LTC btc-ltc ... original 57.14 -3.28 3.84 186 7 4
2 339 bnb btc Binance BINA BTC-BNB btc-bnb ... original 100.0 -7.18 4.34 62 2 2
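Picking `x[0]` relies on "original" always being the first entry in each list. A minimal sketch of selecting an entry by its algorithm name instead is shown here; the `pick_stats` helper and the two-market sample are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Made-up two-market sample in the same shape as the question's file.
data = {
    "markets": [
        {"id": 335, "symbol": "ETHBTC", "marketStats": [
            {"algorithm": "original", "ratio": "50.0", "crackedCount": 2},
            {"algorithm": "hodloo", "ratio": "50.0", "crackedCount": 4},
        ]},
        {"id": 337, "symbol": "LTCBTC", "marketStats": [
            {"algorithm": "original", "ratio": "57.14", "crackedCount": 7},
            {"algorithm": "hodloo", "ratio": "75.0", "crackedCount": 4},
        ]},
    ]
}


def pick_stats(stats, algorithm):
    """Hypothetical helper: return the entry for `algorithm`, or an empty dict."""
    for entry in stats:
        if entry.get("algorithm") == algorithm:
            return entry
    return {}


df = pd.DataFrame(data["markets"])

# One dict per market; a market without the algorithm would become a row of NaN.
picked = pd.DataFrame(
    [pick_stats(stats, "hodloo") for stats in df["marketStats"]],
    index=df.index,
)

# Replace the nested column with the selected entry's fields.
combined = pd.concat([df.drop(columns="marketStats"), picked], axis=1)
print(combined)
```

Matching on the name keeps the result stable even if the order of the entries differs between markets.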
Answer:
You can apply some post-processing to df0 to get what you want. Here you can first apply explode to the 'marketStats' column and then apply(pd.Series) to the result:
df1 = df0.explode('marketStats')['marketStats'].apply(pd.Series)
df1 looks like this:
algorithm ratio medianDrop medianBounce hoursToRespected crackedCount respectedCount
-- ------------ ------- ------------ -------------- ------------------ -------------- ----------------
0 original 50 -4.08 5.51 106 2 1
0 day_trade 100 -6.12 6.28 204 1 1
0 conservative 100 -6.12 8.38 204 1 1
0 position 50 -6.12 6.19 204 2 1
0 hodloo 50 -3.29 0 225 4 2
1 original 57.14 -3.28 3.84 186 7 4
1 day_trade 0 0 5.68 0 1 0
1 conservative 0 0 5.68 0 1 0
1 position 0 0 8.16 0 1 0
1 hodloo 75 -3.7 0 35 4 3
2 original 100 -7.18 4.34 62 2 2
2 day_trade 100 -6.19 4.3 63 1 1
2 conservative 66.67 -3.15 4.05 62 3 2
2 position 100 -3.15 4.46 60 2 2
2 hodloo 100 -7.46 0 62 5 5
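If the aim is one row per market with every algorithm's stats as separate columns, the long table can be reshaped. The following is only a sketch under assumptions: the small `df0` is a made-up stand-in for the question's dataframe, and the `algorithm_stat` column naming is an arbitrary choice.

```python
import pandas as pd

# Made-up stand-in for the question's df0, reduced to two markets.
df0 = pd.DataFrame({
    "id": [335, 337],
    "marketStats": [
        [{"algorithm": "original", "ratio": "50.0", "crackedCount": 2},
         {"algorithm": "hodloo", "ratio": "50.0", "crackedCount": 4}],
        [{"algorithm": "original", "ratio": "57.14", "crackedCount": 7},
         {"algorithm": "hodloo", "ratio": "75.0", "crackedCount": 4}],
    ],
})

# Long format, as in the answer: one row per (market, algorithm).
df1 = df0.explode("marketStats")["marketStats"].apply(pd.Series)

# Wide format: move 'algorithm' from the rows into the column labels.
# This requires each algorithm to appear at most once per market.
wide = df1.set_index("algorithm", append=True).unstack("algorithm")

# Flatten the (stat, algorithm) MultiIndex into names like 'hodloo_ratio'.
wide.columns = [f"{algo}_{stat}" for stat, algo in wide.columns]

# Both frames share df0's original index, so a plain join lines them up.
result = df0.drop(columns="marketStats").join(wide)
print(result)
```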
If you want it combined with all the other columns, you can use join:
df0.join(df1)
I won't post the output of this command because it is rather large.
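As a possible alternative to post-processing, json_normalize can be given the list under 'markets' directly instead of the whole document. In that case record_path can point at 'marketStats' while meta pulls fields from the parent market. This is a sketch on a made-up sample; the choice of `id` and `symbol` as meta fields is an assumption about which parent columns are wanted.

```python
import pandas as pd

# Made-up two-market sample in the same shape as the question's file.
data = {
    "markets": [
        {"id": 335, "symbol": "ETHBTC", "marketStats": [
            {"algorithm": "original", "ratio": "50.0", "crackedCount": 2},
            {"algorithm": "hodloo", "ratio": "50.0", "crackedCount": 4},
        ]},
        {"id": 337, "symbol": "LTCBTC", "marketStats": [
            {"algorithm": "original", "ratio": "57.14", "crackedCount": 7},
            {"algorithm": "hodloo", "ratio": "75.0", "crackedCount": 4},
        ]},
    ]
}

# Normalize the list of markets: each marketStats entry becomes a row,
# and the listed parent fields are repeated alongside it.
stats = pd.json_normalize(
    data["markets"],
    record_path="marketStats",
    meta=["id", "symbol"],
)

# The ratio is stored as a string in the JSON; convert it for arithmetic.
stats["ratio"] = pd.to_numeric(stats["ratio"])
print(stats)
```

The result is in the same long format as df1 above, with the parent fields already attached, so no separate join is needed.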
