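First of all, I am new to Python, but I have developed in other programming languages (mainly C, PHP, and Java). I can't get Python to do what I want: create a correct JSON string. My problem is not creating the JSON string itself but creating its content. Let me explain. I have this code: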
import json
import spacy
...
def eng_pos(textstr):
    x = english()  # A class which I developed to get the infinitive form of each verb
    data = {}
    nlp = spacy.load("en_core_web_trf")
    doc = nlp(textstr)
    for token in doc:
        data[token.text] = token.pos_
        if token.pos_ == "VERB":
            data['Tense'] = token.morph.get("Tense")
            data['Infinitive'] = x.infinitive(token.text)
        print(token.text + " " + token.pos_)
    json_data = json.dumps(data)
    print(json_data)
    return json_data
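It basically builds JSON data from a dictionary containing the part of speech (POS) of each word, and for each verb it gives me the tense and the infinitive form. It also prints each token and its POS. When it is done, it dumps everything into a JSON string, prints it on the screen, and returns it. So far so good, because it gives me valid JSON, but not with the correct content.
For information, I am using this paragraph as textstr, as an example: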
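"IMAGINE, IF YOU will, a toy boat that might fit in the palm of your hand. At mid-ship, add a squat spool of sewing thread lying on its side. Scale up about a thousand-fold and the result is the 150-metre long Nexans Aurora. The thread in question is kilometres of high-voltage power line, ready to be deployed from aft across the sea floor."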
Which gives me this JSON:
{"IMAGINE": "VERB", "Tense": ["Past"], "Infinitive": "deploy", ",": "PUNCT", "IF": "SCONJ", "YOU": "PRON", "will": "AUX", "a": "DET", "toy": "NOUN", "boat": "NOUN", "that": "SCONJ", "might": "AUX", "fit": "VERB", "in": "ADP", "the": "DET", "palm": "NOUN", "of": "ADP", "your": "PRON", "hand": "NOUN", ".": "PUNCT", "At": "ADP", "mid": "NOUN", "-": "PUNCT", "ship": "NOUN", "add": "VERB", "squat": "ADJ", "spool": "NOUN", "sewing": "NOUN", "thread": "NOUN", "lying": "VERB", "on": "ADP", "its": "PRON", "side": "NOUN", "Scale": "VERB", "up": "ADP", "about": "ADP", "thousand": "ADV", "fold": "ADV", "and": "CCONJ", "result": "NOUN", "is": "AUX", "150": "NUM", "metre": "NOUN", "long": "ADJ", "Nexans": "PROPN", "Aurora": "PROPN", "The": "DET", "question": "NOUN", "kilometres": "NOUN", "high": "ADJ", "voltage": "NOUN", "power": "NOUN", "line": "NOUN", "ready": "ADJ", "to": "PART", "be": "AUX", "deployed": "VERB", "from": "ADP", "aft": "NOUN", "across": "ADP", "sea": "NOUN", "floor": "NOUN"}
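If you look closely at the JSON string, you will notice that the tense and infinitive are given only for the last verb, "deployed", in the last sentence of the paragraph (and not for every verb in this short paragraph, which is what I wanted). Why? That is my question. Why is only the last verb taken into account while the others are ignored? I think it has to do with my Python code, because everything else is correct. I have been stuck for two days and I can't see where the problem is, so please help if you can.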
Answer from a uj5u.com user:
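That is because you are writing to the dictionary with the fixed keys Tense and Infinitive, so every time a verb is processed the previous values under those keys are overwritten.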
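To see the overwrite in isolation, here is a minimal sketch that needs no spaCy. The (word, tense) pairs are made-up stand-ins for tokens, not real model output:

```python
# Stand-in (word, tense) pairs; hypothetical values, not real spaCy output.
verbs = [("fit", "Pres"), ("add", "Pres"), ("deployed", "Past")]

data = {}
for word, tense in verbs:
    data[word] = "VERB"      # a new key for every distinct word
    data["Tense"] = [tense]  # the same key every time, so the old value is replaced

print(data)
# {'fit': 'VERB', 'Tense': ['Past'], 'add': 'VERB', 'deployed': 'VERB'}
```

The "Tense" key keeps the position of its first insertion but holds the value from the last verb, which matches the shape of the JSON in the question.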
You most likely want to store a nested dictionary per word that holds not only the POS but also Tense and Infinitive:
data[token.text] = {"pos": token.pos_}
if token.pos_ == "VERB":
    data[token.text]['Tense'] = token.morph.get("Tense")
    data[token.text]['Infinitive'] = x.infinitive(token.text)
This will produce something like:
{
  ...
  "deployed": {
    "pos": "VERB",
    "Tense": ["Past"],
    "Infinitive": "deploy"
  },
  "floor": {
    "pos": "NOUN"
  },
  ...
}
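Keep in mind, though, that this still overwrites the data for repeated words. Since the result for the same word should always be the same, that is probably fine.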
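If repeated words do need separate entries (a tagger can label the same surface form differently depending on context), one option is to collect a list with one record per token instead of a dict keyed by the word. A minimal sketch, assuming stand-in tuples in place of spaCy tokens; `tokens_to_records` and the sample values are hypothetical:

```python
import json

def tokens_to_records(tokens):
    """Build one dict per token so repeated words are all kept."""
    records = []
    for text, pos, tense in tokens:
        record = {"text": text, "pos": pos}
        if pos == "VERB":
            record["Tense"] = tense  # only verbs carry a tense entry
        records.append(record)
    return records

# Hypothetical (text, pos, tense) tuples standing in for spaCy tokens.
sample = [
    ("add", "VERB", ["Pres"]),
    ("thread", "NOUN", []),
    ("thread", "NOUN", []),
    ("deployed", "VERB", ["Past"]),
]

records = tokens_to_records(sample)
print(json.dumps(records, indent=2))
```

In the real function the tuple fields would come from `token.text`, `token.pos_` and `token.morph.get("Tense")`, and the infinitive could be added to each verb record the same way.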
