I am trying to add data from a CSV file to a key in a JSON file while keeping the original structure intact. The JSON file looks like this:
{
    "inputDocuments": {
        "gcsDocuments": {
            "documents": [
                {
                    "gcsUri": "gs://test/.PDF",
                    "mimeType": "application/pdf"
                }
            ]
        }
    },
    "documentOutputConfig": {
        "gcsOutputConfig": {
            "gcsUri": "gs://test"
        }
    },
    "skipHumanReview": false
}
The CSV file I am trying to load has the following structure:

Note that mimeType is not included in the CSV file.
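I already have code that does this, but it is somewhat manual. I am looking for a simpler approach that only needs a CSV file of values and adds that data to the JSON structure. The expected result should look like this: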
{
    "inputDocuments": {
        "gcsDocuments": {
            "documents": [
                {
                    "gcsUri": "gs://sampleinvoices/Handwritten/1.pdf",
                    "mimeType": "application/pdf"
                },
                {
                    "gcsUri": "gs://sampleinvoices/Handwritten/2.pdf",
                    "mimeType": "application/pdf"
                }
            ]
        }
    },
    "documentOutputConfig": {
        "gcsOutputConfig": {
            "gcsUri": "gs://test"
        }
    },
    "skipHumanReview": false
}
The code I am currently using is somewhat manual and looks like this:
import json

# Function to append a new entry to the JSON file
def write_json(new_data, filename='keyvalue.json'):
    # 'r+' opens the file for both reading and writing
    with open(filename, 'r+') as file:
        # Load the existing data into a dict.
        file_data = json.load(file)
        # Append new_data to the documents list.
        file_data["inputDocuments"]["gcsDocuments"]["documents"].append(new_data)
        # Move the file position back to the start before rewriting.
        file.seek(0)
        # Convert back to JSON and write it out.
        json.dump(file_data, file, indent=4)
        # Drop any leftover bytes from the previous contents.
        file.truncate()

# Python object to be appended
y = {
    "gcsUri": "gs://test/.PDF",
    "mimeType": "application/pdf"
}

write_json(y)
Answer from a uj5u.com user:
I would suggest something like this:
import pandas as pd
import json
from pathlib import Path

df_csv = pd.read_csv("your_data.csv")
json_file = Path("your_data.json")
json_data = json.loads(json_file.read_text())

documents = [
    {
        "gcsUri": cell,
        "mimeType": "application/pdf"
    }
    for cell in df_csv["column_name"]
]

json_data["inputDocuments"]["gcsDocuments"]["documents"] = documents
json_file.write_text(json.dumps(json_data))
You should probably split this into separate functions, but it should convey the general idea.
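For illustration, here is one way the same idea might be split into functions, using only the standard library instead of pandas. This is a rough sketch under some assumptions: the CSV header name "gcsUri" is a guess (the post does not show the CSV layout), the function names are made up for this example, and every document is assumed to be a PDF.

```python
import copy
import csv
import json
from pathlib import Path


def read_uris(csv_path, column="gcsUri"):
    """Return the non-empty values of one CSV column.

    The default column name "gcsUri" is an assumption; use your real header.
    """
    uris = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            value = (row.get(column) or "").strip()
            if value:
                uris.append(value)
    return uris


def build_documents(uris, mime_type="application/pdf"):
    """Turn each URI into a document entry with a fixed mimeType."""
    return [{"gcsUri": uri, "mimeType": mime_type} for uri in uris]


def fill_template(template, documents):
    """Return a copy of the template with its documents list replaced."""
    result = copy.deepcopy(template)
    result["inputDocuments"]["gcsDocuments"]["documents"] = documents
    return result


def csv_to_request(csv_path, template_path, output_path):
    """Read the CSV and the JSON template, then write the filled JSON."""
    template = json.loads(Path(template_path).read_text(encoding="utf-8"))
    documents = build_documents(read_uris(csv_path))
    request = fill_template(template, documents)
    Path(output_path).write_text(json.dumps(request, indent=4), encoding="utf-8")
    return request
```

Calling csv_to_request("your_data.csv", "your_data.json", "request.json") would then write the filled structure to a new file and leave the template untouched. Like the pandas version above, this replaces the documents list rather than appending to it, so the placeholder entry in the template does not carry over.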
