Python code that creates JSON with Marathi text produces unreadable JSON

I am trying to create a JSON file using Python code. The file is created correctly with English text, but not with Marathi text. Please look at the code:

import os
import json

jsonFilePath = "E:/file/"
captchaImgLocation = "E:/file/captchaimg/"

path_to_tesseract = r"C:/Program Files/Tesseract-OCR/tesseract.exe"
image_path = r"E:/file/captchaimg/captcha.png"


x = {
    "FName": "प्रवीण",
}

# convert into JSON:
y = json.dumps(x, ensure_ascii=False).encode('utf8')

# the result is a JSON string:
print(y.decode())

completeName = os.path.join(jsonFilePath, "searchResult_Unicode.json")
print(str(completeName))
file1 = open(completeName, "w")
file1.write(str(y))
file1.close()

Output on the console:
{"FName": "प्रवीण"}

The file created in the folder looks like this:
b'{"FName": "\xe0\xa4\xaa\xe0\xa5\x8d\xe0\xa4\xb0\xe0\xa4\xb5\xe0\xa5\x80\xe0\xa4\xa3"}'

There are no runtime or compile-time errors, but the JSON is written in the format shown above. Please suggest a solution.

Answer from a uj5u.com user:

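You have encoded the JSON string into bytes, so you must either open the file in binary mode or decode the bytes back to text before writing them to the file: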

file1 = open(completeName, "wb")
file1.write(y)

or

file1 = open(completeName, "w")
file1.write(y.decode('utf-8'))

Doing

file1 = open(completeName, "w")
file1.write(str(y))

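writes the string representation of the bytes object (including the b'...' wrapper and the \x escapes) to the file, which is always the wrong thing to do.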
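
As a minimal sketch of the difference (the variable names are illustrative and not from the question; the sample value is the one recoverable from the escaped output shown in this thread), compare `str()` on a bytes object with `decode()`:

```python
import json

data = {"FName": "प्रवीण"}

# json.dumps returns str; .encode turns it into a bytes object.
encoded = json.dumps(data, ensure_ascii=False).encode("utf-8")

# str() on bytes returns the repr of the object, e.g. b'{"FName": "\xe0..."}'
as_repr = str(encoded)

# decode() turns the bytes back into the actual JSON text.
as_text = encoded.decode("utf-8")

print(as_repr.startswith("b'"))      # the b'...' wrapper is part of the string
print(json.loads(as_text) == data)   # the decoded text is valid JSON

try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False
print(repr_is_valid_json)
```

The string produced by `str()` is not JSON at all, which is why the file in the question cannot be read back.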

Answer from a uj5u.com user:

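Do you need your JSON to be human-readable? That is often bad practice, because you never know which encoding a reader will use.
You can use the json module to write and read your JSON file without worrying about the encoding, because by default it escapes non-ASCII characters as \uXXXX sequences, so the file contains only ASCII: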

import json

json_path = "test.json"
x = {"FName": "प्रवीण"}

with open(json_path, "w") as outfile:
    json.dump(x, outfile, indent=4)

with open(json_path, "r") as infile:
    print(json.load(infile))
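
To illustrate why the default settings sidestep the encoding question, here is a small sketch, assuming Python 3.7 or later for `str.isascii()`. It checks that the default `json.dumps` output is pure ASCII and still round-trips:

```python
import json

x = {"FName": "प्रवीण"}

# ensure_ascii defaults to True, so non-ASCII characters become \uXXXX escapes.
text = json.dumps(x, indent=4)

is_ascii = text.isascii()          # every character is in the ASCII range
round_trip = json.loads(text) == x # the escapes decode back to the original

print(is_ascii)
print(round_trip)
```

The trade-off is that the file is no longer readable by a person who wants to see the Marathi text directly.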

Answer from a uj5u.com user:

Open the file with the encoding you need, then json.dump to it:

import os
import json

data = {"FName": "प्रवीण"}

# Writing human-readable.  Note some text viewers on Windows require UTF-8 w/ BOM
# to *display* correctly.  It's not a problem with writing, but you can use
# encoding='utf-8-sig' to hint to those programs that the file is UTF-8 if
# you see that issue.  MUST use encoding='utf8' to read it back correctly.
with open('out.json', 'w', encoding='utf8') as f:
    json.dump(data, f, ensure_ascii=False)

# Writing non-human-readable for non-ASCII, but others will have few
# problems reading it back into Python because all common encodings are ASCII-compatible.
# Using the default encoding this will work.  I'm being explicit about encoding
# because it is good practice.
with open('out2.json', 'w', encoding='ascii') as f:
    json.dump(data, f, ensure_ascii=True) # True is the default anyway

# reading either one is the same
with open('out.json', encoding='utf8') as f:
    data2 = json.load(f)

with open('out2.json', encoding='utf8') as f:  # UTF-8 is ASCII-compatible
    data3 = json.load(f)

# Round-tripping test
print(data == data2, data2)
print(data == data3, data3)

Output:

True {'FName': 'प्रवीण'}
True {'FName': 'प्रवीण'}

out.json (UTF-8 encoded):

{"FName": "प्रवीण"}

out2.json (ASCII encoded):

{"FName": "\u092a\u094d\u0930\u0935\u0940\u0923"}
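
The code comments above mention `utf-8-sig` for viewers that need a byte order mark. As a hedged sketch of that variant (the file name `out_bom.json` and the temporary directory are assumptions for illustration), the file can be written and read back like this:

```python
import json
import os
import tempfile

data = {"FName": "प्रवीण"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out_bom.json")  # hypothetical file name

    # 'utf-8-sig' writes a UTF-8 byte order mark before the JSON text.
    with open(path, "w", encoding="utf-8-sig") as f:
        json.dump(data, f, ensure_ascii=False)

    # Inspect the raw bytes to confirm the BOM is present.
    with open(path, "rb") as f:
        raw = f.read()

    # Reading with 'utf-8-sig' strips the BOM again before parsing.
    with open(path, encoding="utf-8-sig") as f:
        loaded = json.load(f)

print(raw[:3] == b"\xef\xbb\xbf")
print(loaded == data)
```

Reading with `utf-8-sig` also works for UTF-8 files without a BOM, so it is a reasonable choice when the origin of a file is uncertain.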
