How to extract fields from multi-line JSON with non-fixed keys and merge them into a single JSON

Suppose I have a large JSON file containing about 30 million entries like these:

{"id":3,"price":"231","type":"Y","location":"NY"}
{"id":4,"price":"321","type":"N","city":"BR"}
{"id":5,"price":"354","type":"Y","city":"XE","location":"CP"}
--snip--
{"id":30373779,"price":"121","type":"N","city":"SR","location":"IU"}
{"id":30373780,"price":"432","type":"Y","location":"TB"}
{"id":30373780,"price":"562","type":"N","city":"CQ"}

How can I extract only location and city and combine them into a single JSON, like this, in Python:

{
    "orders":{
        3:{
            "location":"NY"
        },
        4:{
            "city":"BR"
        },
        5:{
            "city":"XE",
            "location":"CP"
        },
        30373779:{
            "city":"SR",
            "location":"IU"
        },
        30373780:{
            "location":"TB"
        },
        30373780:{
            "city":"CQ"
        }
    }
}

PS: Pretty-printed (beautified) output is not required.

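A uj5u.com user replied:

Assuming your input file is actually in JSON Lines format, you can read each line, extract the city and location keys from the parsed dictionary, and add them to a new dictionary: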

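import json
from collections import defaultdict

# Maps each id to a dict holding only the keys we want to keep.
orders = {'orders': defaultdict(dict)}
with open('orders.txt', 'r') as f:
    for line in f:
        o = json.loads(line)
        order_id = o['id']  # avoid shadowing the built-in id()
        if 'location' in o:
            orders['orders'][order_id]['location'] = o['location']
        if 'city' in o:
            orders['orders'][order_id]['city'] = o['city']

# Serialize as JSON; integer ids become string keys.
print(json.dumps(orders, indent=4))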

Output for the sample data (note that it contains two entries with id 30373780, so their values are merged into a single dictionary):

{
    "orders": {
        "3": {
            "location": "NY"
        },
        "4": {
            "city": "BR"
        },
        "5": {
            "location": "CP",
            "city": "XE"
        },
        "30373779": {
            "location": "IU",
            "city": "SR"
        },
        "30373780": {
            "location": "TB",
            "city": "CQ"
        }
    }
}
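
The same idea can be sketched as a small reusable function. This is only an illustration under assumptions: the name extract_fields and its keys parameter are hypothetical and not part of the original answer, and the input is assumed to be an iterable of JSON Lines strings (an open file object works too).

```python
import json
from collections import defaultdict


def extract_fields(lines, keys=("location", "city")):
    """Collect the chosen keys per id from an iterable of JSON Lines strings.

    Hypothetical helper for illustration; entries sharing an id are merged.
    """
    orders = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for key in keys:
            if key in record:
                orders[record["id"]][key] = record[key]
    return {"orders": dict(orders)}


sample = [
    '{"id":3,"price":"231","type":"Y","location":"NY"}',
    '{"id":30373780,"price":"432","type":"Y","location":"TB"}',
    '{"id":30373780,"price":"562","type":"N","city":"CQ"}',
]
result = extract_fields(sample)
# json.dumps turns the integer ids into string keys, as JSON requires.
print(json.dumps(result))
```

Note that this approach still holds every extracted entry in memory, which is what makes merging duplicate ids possible.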

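A uj5u.com user replied:

As you said, your file is very large, so you may not want to keep all entries in memory. Here is a way to consume the source file line by line and write the output immediately: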

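import json

with open(r"in.jsonp") as i_f, open(r"out.json", "w") as o_f:
    o_f.write('{"orders":{')
    for i in i_f:
        i_obj = json.loads(i)
        # The id is written as a bare (unquoted) key, as in the question.
        o_f.write(f'{i_obj["id"]}:')
        o_obj = {}
        # Copy only the keys of interest; missing or empty values are skipped.
        if location := i_obj.get("location"):
            o_obj["location"] = location
        if city := i_obj.get("city"):
            o_obj["city"] = city
        json.dump(o_obj, o_f)
        o_f.write(",")  # note: also leaves a trailing comma after the last entry
    o_f.write('}}')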

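It produces a semi-valid JSON object in the same format you gave in the question: the ids are written as unquoted keys, a trailing comma follows the last entry, and entries that share an id are not merged, so a strict JSON parser will reject the result.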
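
If strictly valid JSON is needed while still streaming, one possible variant quotes the ids and writes the separating comma before every entry except the first. This is a sketch rather than the answerer's code: the function name stream_orders and its parameters are assumptions, and unlike the code above it keeps a key whenever it is present, even if its value is empty. Duplicate ids are still written as duplicate keys, and Python's json.loads keeps only the last one when reading the result back.

```python
import io
import json


def stream_orders(src, dst, keys=("location", "city")):
    """Read JSON Lines from src and write {"orders": {...}} to dst entry by entry.

    Hypothetical helper for illustration; nothing is accumulated in memory.
    """
    dst.write('{"orders":{')
    first = True
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        entry = {k: record[k] for k in keys if k in record}
        if not first:
            dst.write(",")  # separator goes before every entry but the first
        first = False
        # JSON object keys must be strings, so the id is written quoted.
        dst.write(json.dumps(str(record["id"])) + ":" + json.dumps(entry))
    dst.write("}}")


# In-memory demonstration; real use would pass two open file objects.
src = io.StringIO(
    '{"id":3,"price":"231","type":"Y","location":"NY"}\n'
    '{"id":4,"price":"321","type":"N","city":"BR"}\n'
)
dst = io.StringIO()
stream_orders(src, dst)
parsed = json.loads(dst.getvalue())
print(parsed)
```

With files, the call would look like stream_orders(open("in.jsonp"), open("out.json", "w")), ideally inside a with statement so both files are closed.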
