If I have a heavy JSON file containing 30 million entries like these:
{"id":3,"price":"231","type":"Y","location":"NY"}
{"id":4,"price":"321","type":"N","city":"BR"}
{"id":5,"price":"354","type":"Y","city":"XE","location":"CP"}
--snip--
{"id":30373779,"price":"121","type":"N","city":"SR","location":"IU"}
{"id":30373780,"price":"432","type":"Y","location":"TB"}
{"id":30373780,"price":"562","type":"N","city":"CQ"}
How can I extract only the location and city and parse them into a single JSON, in Python, like this:
{
"orders":{
3:{
"location":"NY"
},
4:{
"city":"BR"
},
5:{
"city":"XE",
"location":"CP"
},
30373779:{
"city":"SR",
"location":"IU"
},
30373780:{
"location":"TB"
},
30373780:{
"city":"CQ"
}
}
}
PS: Beautified syntax is not required.
Answer:
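Assuming your input file is actually in JSON Lines format, you can read it line by line, pull the city and location keys out of each parsed dict, and add them to a new dict: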
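import json
from collections import defaultdict

orders = {'orders': defaultdict(dict)}
with open('orders.txt', 'r') as f:
    for line in f:
        o = json.loads(line)
        order_id = o['id']  # renamed to avoid shadowing the built-in id()
        if 'location' in o:
            orders['orders'][order_id]['location'] = o['location']
        if 'city' in o:
            orders['orders'][order_id]['city'] = o['city']

# Serialize with json.dumps; a plain print(orders) would show the defaultdict repr instead
print(json.dumps(orders, indent=2))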
Output for the sample data (note that it contains two entries with the id 30373780, so their values are merged into a single dict):
{
"orders": {
"3": {
"location": "NY"
},
"4": {
"city": "BR"
},
"5": {
"location": "CP",
"city": "XE"
},
"30373779": {
"location": "IU",
"city": "SR"
},
"30373780": {
"location": "TB",
"city": "CQ"
}
}
}
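A note on the keys: in the dictionary built above the ids are Python ints, but JSON object keys are always strings, which is why the serialized output shows "3" rather than 3. A minimal sketch of that conversion, using the two sample lines that share an id (the `lines` list is only a stand-in for the file, not part of the original answer):

```python
import json
from collections import defaultdict

# Stand-in for the input file: two sample lines that share one id.
lines = [
    '{"id":30373780,"price":"432","type":"Y","location":"TB"}',
    '{"id":30373780,"price":"562","type":"N","city":"CQ"}',
]

orders = {"orders": defaultdict(dict)}
for line in lines:
    record = json.loads(line)
    for key in ("location", "city"):
        if key in record:
            orders["orders"][record["id"]][key] = record[key]

# In memory the key is an int ...
int_keys = list(orders["orders"])
# ... but json.dumps writes it as a string, so a round trip yields str keys.
round_trip = json.loads(json.dumps(orders))
str_keys = list(round_trip["orders"])
```

If the ids need to be ints again after loading the result, they would have to be converted back explicitly, for example with `int(key)`.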
Answer:
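Since, as you said, your file is very large, you may not want to keep all the entries in memory. Here is a way to consume the source file line by line and write the output immediately: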
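import json

with open(r"in.jsonp") as i_f, open(r"out.json", "w") as o_f:
    o_f.write('{"orders":{')
    for i in i_f:
        i_obj = json.loads(i)
        # The id is written as a bare (unquoted) key, as in the question
        o_f.write(f'{i_obj["id"]}:')
        o_obj = {}
        # .get() with := skips keys that are missing or have an empty value
        if location := i_obj.get("location"):
            o_obj["location"] = location
        if city := i_obj.get("city"):
            o_obj["city"] = city
        json.dump(o_obj, o_f)
        o_f.write(",")  # note: this leaves a trailing comma after the last entry
    o_f.write('}}')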
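It produces a semi-valid JSON object in the same format you gave in your question: the ids are unquoted keys and the last entry is followed by a trailing comma, so a strict JSON parser will reject the result.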
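If strictly valid JSON is needed while still streaming, one possible variation is to quote the ids and write the comma before every entry except the first. The following is only a sketch under those assumptions; the function name `stream_orders` and the in-memory `sample` list are hypothetical and not part of the original answer:

```python
import io
import json

def stream_orders(lines, out):
    """Write {"orders": {...}} to `out`, handling one input line at a time."""
    out.write('{"orders":{')
    first = True
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        entry = {k: record[k] for k in ("location", "city") if k in record}
        if not first:
            out.write(",")  # separator goes between entries, never after the last
        first = False
        # json.dumps(str(...)) produces a properly quoted JSON key
        out.write(f'{json.dumps(str(record["id"]))}:{json.dumps(entry)}')
    out.write("}}")

# Stand-in for the input file, taken from the sample data.
sample = [
    '{"id":3,"price":"231","type":"Y","location":"NY"}',
    '{"id":4,"price":"321","type":"N","city":"BR"}',
]
buf = io.StringIO()
stream_orders(sample, buf)
result = json.loads(buf.getvalue())
```

With real files the same function could be called as `stream_orders(i_f, o_f)` inside the `with open(...)` block. Note that a repeated id is written as a repeated key here; Python's `json.loads` keeps only the last occurrence, so entries are not merged the way they are in the in-memory approach.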
