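I am trying to use pandas to parse this nested JSON. I am confused about how to extract data from the "amount" and "items" columns. The data has many rows, hundreds of them, and this is one sample: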
{
"_id": "62eaa99b014c9bb30203e48a",
"amount": {
"product": 291000,
"shipping": 75000,
"admin_fee": 4500,
"order_voucher_deduction": 0,
"transaction_voucher_deduction": 0,
"total": 366000,
"paid": 366000
},
"status": 32,
"items": [
{
"_id": "62eaa99b014c9bb30203e48d",
"earning": 80400,
"variants": [
{
"name": "Color",
"value": "Black"
},
{
"name": "Size",
"value": "38"
}
],
"marketplace_price": 65100,
"product_price": 62000,
"reseller_price": 145500,
"product_id": 227991,
"name": "Heels",
"sku_id": 890512,
"internal_markup": 3100,
"weight": 500,
"image": "https://product-asset.s3.ap-southeast-1.amazonaws.com/1659384575578.jpeg",
"quantity": 1,
"supplier_price": 60140
}
I have already tried the following, but it only shows the index:
dfjson=pd.json_normalize(datasetjson)
dfjson.head(3)

## Update

I tried adding pd.DataFrame, and yes, the data does become a DataFrame, but I still don't know how to extract _id, earning, and variants.
uj5u.com user reply:
Try pd.json_normalize(datasetjson, max_level=0)
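To illustrate what max_level=0 changes, here is a minimal sketch on a trimmed-down copy of the sample record. The variable name `order` and the shortened field set are assumptions made for illustration. By default the nested `amount` dictionary is flattened into dotted columns such as `amount.product`; with max_level=0 it stays in a single `amount` column. In both cases `items` remains a column holding a list.

```python
import pandas as pd

# Trimmed-down copy of the sample record (illustrative only).
order = {
    "_id": "62eaa99b014c9bb30203e48a",
    "amount": {"product": 291000, "shipping": 75000, "total": 366000},
    "status": 32,
    "items": [{"_id": "62eaa99b014c9bb30203e48d", "name": "Heels", "quantity": 1}],
}

# Default behaviour: nested dictionaries become dotted column names.
flat = pd.json_normalize(order)

# max_level=0: nested dictionaries are left untouched in one column.
top = pd.json_normalize(order, max_level=0)

print(list(flat.columns))
print(list(top.columns))
```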
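uj5u.com user reply:
I think you are confusing a dictionary with the JSON format.
The line below is the same as your sample, except that your sample was missing the closing ]} at the end. I removed the whitespace, but the content is otherwise identical: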
dfjson = {"_id":"62eaa99b014c9bb30203e48a","amount":{"product":291000,"shipping":75000,"admin_fee":4500,"order_voucher_deduction":0,"transaction_voucher_deduction":0,"total":366000,"paid":366000},"status":32,"items":[{"_id":"62eaa99b014c9bb30203e48d","earning":80400,"variants":[{"name":"Color","value":"Black"},{"name":"Size","value":"38"}],"marketplace_price":65100,"product_price":62000,"reseller_price":145500,"product_id":227991,"name":"Heels","sku_id":890512,"internal_markup":3100,"weight":500,"image":"https://product-asset.s3.ap-southeast-1.amazonaws.com/1659384575578.jpeg","quantity":1,"supplier_price":60140}]}
Now, if you want to call amount:
dfjson['amount']
# Output
{'product': 291000,
'shipping': 75000,
'admin_fee': 4500,
'order_voucher_deduction': 0,
'transaction_voucher_deduction': 0,
'total': 366000,
'paid': 366000}
If you want to access items:
dfjson['items']
# Output
[{'_id': '62eaa99b014c9bb30203e48d',
'earning': 80400,
'variants': [{'name': 'Color', 'value': 'Black'},
{'name': 'Size', 'value': '38'}],
'marketplace_price': 65100,
'product_price': 62000,
'reseller_price': 145500,
'product_id': 227991,
'name': 'Heels',
'sku_id': 890512,
'internal_markup': 3100,
'weight': 500,
'image': 'https://product-asset.s3.ap-southeast-1.amazonaws.com/1659384575578.jpeg',
'quantity': 1,
'supplier_price': 60140}]
To get the items, you can build a list:
list_items = []
for i in dfjson['items']:
    list_items.append(i)
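Building on that list, a list of item dictionaries can be passed straight to pd.DataFrame, after which the fields asked about (_id, earning, variants) are ordinary columns. This is only a sketch: the `items` literal below is a shortened copy of the sample, and the names `items_df` and `subset` are made up for the example.

```python
import pandas as pd

# Shortened copy of the "items" list from the sample (illustrative only).
items = [
    {
        "_id": "62eaa99b014c9bb30203e48d",
        "earning": 80400,
        "variants": [
            {"name": "Color", "value": "Black"},
            {"name": "Size", "value": "38"},
        ],
        "name": "Heels",
        "quantity": 1,
    }
]

# Each dictionary becomes one row; each key becomes one column.
items_df = pd.DataFrame(items)

# Pick only the fields of interest. "variants" is still a list per row here.
subset = items_df[["_id", "earning", "variants"]]
print(subset)
```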
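uj5u.com user reply:
If you want to use a DataFrame, the data must be two-dimensional. Your data is a mix of lists and dictionaries, so you first need to split it up according to how it is organized and then convert the pieces into a DataFrame.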
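One way to do that splitting, assuming the real data is a list of order records shaped like the sample, is to let pd.json_normalize produce one row per item (record_path) while carrying selected order-level fields along (meta). The two orders below are invented for illustration, and meta_prefix="order." is used because both the order and its items have an `_id` field, which would otherwise clash.

```python
import pandas as pd

# Two made-up orders shaped like the sample (values are illustrative).
orders = [
    {"_id": "order-1", "amount": {"total": 366000, "paid": 366000}, "status": 32,
     "items": [{"_id": "item-1", "name": "Heels", "quantity": 1}]},
    {"_id": "order-2", "amount": {"total": 120000, "paid": 120000}, "status": 32,
     "items": [{"_id": "item-2", "name": "Boots", "quantity": 2},
               {"_id": "item-3", "name": "Socks", "quantity": 3}]},
]

# One row per item; order-level fields are repeated on each of its items.
items_df = pd.json_normalize(
    orders,
    record_path="items",
    meta=["_id", "status", ["amount", "total"], ["amount", "paid"]],
    meta_prefix="order.",  # avoids the clash between order _id and item _id
)

print(items_df)
```

The result has the item columns (_id, name, quantity) plus order._id, order.status, order.amount.total and order.amount.paid.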
uj5u.com user reply:
Given:
data = {
'_id': '62eaa99b014c9bb30203e48a',
'amount': {'admin_fee': 4500,
'order_voucher_deduction': 0,
'paid': 366000,
'product': 291000,
'shipping': 75000,
'total': 366000,
'transaction_voucher_deduction': 0},
'items': [{'_id': '62eaa99b014c9bb30203e48d',
'earning': 80400,
'image': 'https://product-asset.s3.ap-southeast-1.amazonaws.com/1659384575578.jpeg',
'internal_markup': 3100,
'marketplace_price': 65100,
'name': 'Heels',
'product_id': 227991,
'product_price': 62000,
'quantity': 1,
'reseller_price': 145500,
'sku_id': 890512,
'supplier_price': 60140,
'variants': [{'name': 'Color', 'value': 'Black'},
{'name': 'Size', 'value': '38'}],
'weight': 500}],
'status': 32
}
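Doing:
import pandas as pd
# One row per item; keep the order-level 'amount' dict as a metadata column.
df = pd.json_normalize(data, ['items'], ['amount'])
# Expand the 'amount' dict into one column per key.
df = df.join(df.amount.apply(pd.Series))
# Turn the first row's variants into columns (Color, Size); this only covers a single row.
df = df.join(df.variants.apply(pd.DataFrame)[0].set_index('name').T.reset_index(drop=True))
df = df.drop(['amount', 'variants'], axis=1)
print(df)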
Output:
_id earning marketplace_price product_price reseller_price product_id name sku_id internal_markup weight image quantity supplier_price product shipping admin_fee order_voucher_deduction transaction_voucher_deduction total paid Color Size
0 62eaa99b014c9bb30203e48d 80400 65100 62000 145500 227991 Heels 890512 3100 500 https://product-asset.s3.ap-southeast-1.amazon... 1 60140 291000 75000 4500 0 0 366000 366000 Black 38
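There may be a better way to do this, but the sample provided is not even a valid JSON object, so I can't be sure what the real data actually looks like.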
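For data with many rows, one possible variation is to turn each item's variants list into a small dictionary and build the extra columns from those, so that every row gets its own Color and Size. This is a sketch under the assumption that the real data is a list of orders shaped like the sample; the two orders and the names `variant_dicts` and `variant_cols` are invented here. It also assumes no variant name collides with an existing column, since join would raise on overlapping names.

```python
import pandas as pd

# Two made-up orders shaped like the sample (values are illustrative).
orders = [
    {"_id": "order-1", "amount": {"total": 366000},
     "items": [{"_id": "item-1", "name": "Heels",
                "variants": [{"name": "Color", "value": "Black"},
                             {"name": "Size", "value": "38"}]}]},
    {"_id": "order-2", "amount": {"total": 120000},
     "items": [{"_id": "item-2", "name": "Boots",
                "variants": [{"name": "Color", "value": "Brown"},
                             {"name": "Size", "value": "40"}]}]},
]

# One row per item, with the order total carried along as "amount.total".
df = pd.json_normalize(orders, record_path="items", meta=[["amount", "total"]])

# For each row, map variant name -> variant value, e.g. {"Color": "Black", "Size": "38"}.
variant_dicts = [{v["name"]: v["value"] for v in variants} for variants in df["variants"]]

# One column per variant name, aligned with df's index.
variant_cols = pd.DataFrame(variant_dicts, index=df.index)

df = df.drop(columns="variants").join(variant_cols)
print(df)
```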
