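I am trying to create a JSON file using Python code. The file is created successfully with English text, but it does not come out correctly with Marathi. Please look at the code: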
import os
import json
jsonFilePath = "E:/file/"
captchaImgLocation = "E:/file/captchaimg/"
path_to_tesseract = r"C:/Program Files/Tesseract-OCR/tesseract.exe"
image_path = r"E:/file/captchaimg/captcha.png"
x = {
    "FName": "प्रवीण",
}
# convert into JSON:
y = json.dumps(x, ensure_ascii=False).encode('utf8')
# the result is a JSON string:
print(y.decode())
completeName = os.path.join(jsonFilePath, "searchResult_Unicode.json")
print(str(completeName))
file1 = open(completeName, "w")
file1.write(str(y))
file1.close()
Output on console:
{"FName": "प्रवीण"}
The file created inside the folder looks like this:
b'{"FName": "\xe0\xa4\xaa\xe0\xa5\x8d\xe0\xa4\xb0\xe0\xa4\xb5\xe0\xa5\x80\xe0\xa4\xa3"}'
There are no runtime or compile-time errors, but the JSON is created in the format shown above. Please suggest a solution.
Reply from a uj5u.com user:
You have encoded the JSON string, so you must either open the file in binary mode or decode the JSON before writing it to the file:
file1 = open(completeName, "wb")
file1.write(y)
or
file1 = open(completeName, "w")
file1.write(y.decode('utf-8'))
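Doing
file1 = open(completeName, "w")
file1.write(str(y))
writes the string representation of the bytes object (including the b'...' prefix and the \x escapes) to the file, which is always the wrong thing to do.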
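To make the difference concrete, here is a minimal sketch comparing the two strings. The variable names (`data`, `encoded`, `as_repr`, `as_text`) are illustrative and not taken from the question:

```python
import json

data = {"FName": "प्रवीण"}
encoded = json.dumps(data, ensure_ascii=False).encode("utf8")  # a bytes object

as_repr = str(encoded)            # repr of the bytes object, not JSON
as_text = encoded.decode("utf8")  # the actual JSON text

print(as_repr.startswith("b'"))     # True: the b'...' wrapper ends up in the file
print(json.loads(as_text) == data)  # True: the decoded text is valid JSON
```

Only booleans are printed here so the sketch does not depend on the console being able to display Devanagari.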
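Reply from a uj5u.com user:
Do you want your JSON to be human-readable? That is usually bad practice, because you never know which encoding will be used to read it.
You can use the json module to write and read your JSON file without worrying about encoding: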
import json
json_path = "test.json"
x = {"FName": "प्रवीण"}
with open(json_path, "w") as outfile:
    json.dump(x, outfile, indent=4)
with open(json_path, "r") as infile:
    print(json.load(infile))
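The reason this works regardless of the platform's default file encoding is that `json.dump` and `json.dumps` use `ensure_ascii=True` by default, so every non-ASCII character is written as a `\uXXXX` escape. A small sketch of that behaviour (the names `data` and `text` are illustrative; `str.isascii` assumes Python 3.7 or later):

```python
import json

data = {"FName": "प्रवीण"}
text = json.dumps(data)  # ensure_ascii=True is the default

print(text.isascii())            # True: only ASCII characters are emitted
print(json.loads(text) == data)  # True: the escapes decode back to the original
```

The trade-off is that the file is no longer readable by eye, since the Marathi text appears only as escape sequences.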
Reply from a uj5u.com user:
Open the file with the encoding you need, then json.dump to it:
import os
import json
data = {"FName": "प्रवीण"}
# Writing human-readable. Note some text viewers on Windows require UTF-8 w/ BOM
# to *display* correctly. It's not a problem with writing, but you can use
# encoding='utf-8-sig' to hint to those programs that the file is UTF-8 if
# you see that issue. MUST use encoding='utf8' to read it back correctly.
with open('out.json', 'w', encoding='utf8') as f:
    json.dump(data, f, ensure_ascii=False)
# Writing non-human-readable for non-ASCII, but others will have few
# problems reading it back into Python because all common encodings are ASCII-compatible.
# Using the default encoding this will work. I'm being explicit about encoding
# because it is good practice.
with open('out2.json', 'w', encoding='ascii') as f:
    json.dump(data, f, ensure_ascii=True)  # True is the default anyway
# reading either one is the same
with open('out.json', encoding='utf8') as f:
    data2 = json.load(f)
with open('out2.json', encoding='utf8') as f:  # UTF-8 is ASCII-compatible
    data3 = json.load(f)
# Round-tripping test
print(data == data2, data2)
print(data == data3, data3)
Output:
True {'FName': 'प्रवीण'}
True {'FName': 'प्रवीण'}
out.json (UTF-8 encoded):
{"FName": "प्रवीण"}
out2.json (ASCII encoded):
{"FName": "\u092a\u094d\u0930\u0935\u0940\u0923"}
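If a Windows text viewer does need the BOM hint mentioned in the comments above, a sketch along these lines may help. It assumes the file is both written and read with `encoding='utf-8-sig'`; the file name `out_bom.json` and the temporary directory are hypothetical, chosen only so the example cleans up after itself:

```python
import json
import os
import tempfile

data = {"FName": "प्रवीण"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out_bom.json")  # hypothetical file name

    # 'utf-8-sig' prepends the UTF-8 byte order mark when writing
    with open(path, "w", encoding="utf-8-sig") as f:
        json.dump(data, f, ensure_ascii=False)

    # Inspect the raw bytes to confirm the BOM is present
    with open(path, "rb") as f:
        raw = f.read()

    # 'utf-8-sig' strips the BOM again when reading
    with open(path, encoding="utf-8-sig") as f:
        restored = json.load(f)

print(raw.startswith(b"\xef\xbb\xbf"))  # True
print(restored == data)                 # True
```

Note that a file written this way should be read back with `utf-8-sig` as well, because the json module rejects text that still begins with the BOM character.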
