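I have a .txt file and want to return the count of each word that appears in it. I got the code working, but now I want to refine it to return only words that are 5 or more characters long. I added the "len" function to the for statement, but it still returns all the words. Any help would be appreciated.
I would also like to know whether I can sort by count, returning the words with the highest counts first.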
import string
import os

os.chdir('mydirectory')  # Changes directory.
speech = open("obamaspeech.txt", "r")  # Opens file.
emptyDict = dict()  # Creates dictionary.
for line in speech:
    line = line.strip()  # Removes leading and trailing whitespace.
    line = line.lower()  # Converts to lowercase.
    line = line.translate(line.maketrans("", "", string.punctuation))  # Removes punctuation.
    words = line.split(" ")  # Splits the line into words.
    for word in words:
        if len(word) >= 5 in emptyDict:
            emptyDict[word] = emptyDict[word] + 1
        else:
            emptyDict[word] = 1
for key in list(emptyDict.keys()):
    print(key, ":", emptyDict[key])
Reply from a uj5u.com user:
I think you need to test the word length separately:
for word in words:
    if len(word) >= 5:
        if word in emptyDict:
            emptyDict[word] = emptyDict[word] + 1
        else:
            emptyDict[word] = 1
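For context, the original condition does not filter anything because Python chains comparison operators: `len(word) >= 5 in emptyDict` is evaluated as `len(word) >= 5 and 5 in emptyDict`, and the integer 5 is never a key, so the else branch runs every time. Below is a minimal sketch of the corrected loop that also sorts by count, which covers the second part of the question. The function name `count_long_words`, the `min_len` parameter, and the sample lines are made up for illustration and are not from the original post.

```python
import string


def count_long_words(lines, min_len=5):
    """Count words of at least min_len characters (hypothetical helper)."""
    counts = {}
    strip_punct = str.maketrans("", "", string.punctuation)
    for line in lines:
        # Lowercase, drop punctuation, then split on any whitespace.
        for word in line.lower().translate(strip_punct).split():
            if len(word) >= min_len:  # length test on its own
                counts[word] = counts.get(word, 0) + 1
    return counts


# Made-up sample input standing in for the lines of the speech file.
sample = ["Hello world, hello there!", "The world is round."]
counts = count_long_words(sample)

# Chained comparison: evaluated as (5 >= 5) and (5 in counts).
chained = 5 >= 5 in counts

# Sort the (word, count) pairs by count, highest first.
ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
for word, n in ranked:
    print(word, ":", n)
```

Using `split()` with no argument, rather than `split(" ")`, also avoids empty strings when a line contains consecutive spaces.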
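Reply from a uj5u.com user:
The other answer shows you how to modify your code to get the desired effect. On the other hand, here is an alternative implementation. Note that counting the words and sorting them by frequency becomes easier with the help of a list comprehension and the Counter object from the collections module.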
import os
import string
from collections import Counter

os.chdir('mydirectory')
with open("obamaspeech.txt", "r") as speech:
    full_speech = speech.read().lower().translate(str.maketrans("", "", string.punctuation))
words = full_speech.split()
count = Counter([w for w in words if len(w) >= 5])
for w, k in count.most_common():
    print(f"{w}: {k} time(s)")
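If only the top few words are wanted, `most_common` accepts an optional number of entries to return. Here is a small sketch on an in-memory string, so it runs without the speech file; the sample text and the name `top_two` are assumptions for illustration.

```python
import string
from collections import Counter

# Made-up text standing in for the contents of the speech file.
text = "Change will not come if we wait. Change comes when people choose change."

cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
count = Counter(w for w in cleaned.split() if len(w) >= 5)

# most_common(n) returns the n entries with the highest counts, highest first.
top_two = count.most_common(2)
for w, k in top_two:
    print(f"{w}: {k} time(s)")
```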
