Original GitHub repository: https://github.com/Yixiaohan/show-me-the-code
Problem: given an English plain-text file, count how many times each word occurs in it.
Code:
import collections

# Open the text file
with open('words.txt', 'rt') as f:
    # Remove commas and periods, then split the text into words on spaces
    words = f.read().replace(',', '').replace('.', '').split(' ')

print(words)  # the words in the text
# Count the words with a Counter object
counters = collections.Counter(words)
# Print the result
print(counters)
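The `split(' ')` call above cuts only at single space characters, so a file that contains line breaks or double spaces would produce entries such as `''` or two words joined by a newline. The sketch below illustrates the difference on a made-up sample string (an assumption for illustration, not the contents of `words.txt`); calling `split()` with no argument is one possible alternative.

```python
# Hypothetical sample with a double space and a line break (not taken from words.txt)
sample = "so  fast\nthat they"

# split(' ') cuts only at single spaces: an empty string appears and the newline stays
by_space = sample.split(' ')

# split() with no argument cuts at any run of whitespace
by_whitespace = sample.split()

print(by_space)       # ['so', '', 'fast\nthat', 'they']
print(by_whitespace)  # ['so', 'fast', 'that', 'they']
```

For the single-line sample text used in this post, both calls should give the same word list.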
Original text:
Today, a lot of people’s life pace is so fast that they feel tired about life. Some young people are lost when they talk about future, because they are stuck in their work. Most Chinese people lack of passion about their life, but life needs passion, or we are just walking dead. The way to find passion is to see the beauty of life, such as spending more time with families. Even some people live far away, they still need to contact with their friends and families. Talking with them can always bring happy memories and forget annoyance. The other effective way to gain passion is to travel. When people are in vacation, they often feel easy and regain the power. After appreciating beautiful scenery, they will broaden their vision and have the passion to fight again.
After splitting into words:
['Today', 'a', 'lot', 'of', 'people’s', 'life', 'pace', 'is', 'so', 'fast', 'that', 'they', 'feel', 'tired', 'about', 'life', 'Some', 'young', 'people', 'are', 'lost', 'when', 'they', 'talk', 'about', 'future', 'because', 'they', 'are', 'stuck', 'in', 'their', 'work', 'Most', 'Chinese', 'people', 'lack', 'of', 'passion', 'about', 'their', 'life', 'but', 'life', 'needs', 'passion', 'or', 'we', 'are', 'just', 'walking', 'dead', 'The', 'way', 'to', 'find', 'passion', 'is', 'to', 'see', 'the', 'beauty', 'of', 'life', 'such', 'as', 'spending', 'more', 'time', 'with', 'families', 'Even', 'some', 'people', 'live', 'far', 'away', 'they', 'still', 'need', 'to', 'contact', 'with', 'their', 'friends', 'and', 'families', 'Talking', 'with', 'them', 'can', 'always', 'bring', 'happy', 'memories', 'and', 'forget', 'annoyance', 'The', 'other', 'effective', 'way', 'to', 'gain', 'passion', 'is', 'to', 'travel', 'When', 'people', 'are', 'in', 'vacation', 'they', 'often', 'feel', 'easy', 'and', 'regain', 'the', 'power', 'After', 'appreciating', 'beautiful', 'scenery', 'they', 'will', 'broaden', 'their', 'vision', 'and', 'have', 'the', 'passion', 'to', 'fight', 'again']
Final output:
Counter({'they': 6, 'to': 6, 'life': 5, 'passion': 5, 'people': 4, 'are': 4, 'their': 4, 'and': 4, 'of': 3, 'is': 3, 'about': 3, 'the': 3, 'with': 3, 'feel': 2, 'in': 2, 'The': 2, 'way': 2, 'families': 2, 'Today': 1, 'a': 1, 'lot': 1, 'people’s': 1, 'pace': 1, 'so': 1, 'fast': 1, 'that': 1, 'tired': 1, 'Some': 1, 'young': 1, 'lost': 1, 'when': 1, 'talk': 1, 'future': 1, 'because': 1, 'stuck': 1, 'work': 1, 'Most': 1, 'Chinese': 1, 'lack': 1, 'but': 1, 'needs': 1, 'or': 1, 'we': 1, 'just': 1, 'walking': 1, 'dead': 1, 'find': 1, 'see': 1, 'beauty': 1, 'such': 1, 'as': 1, 'spending': 1, 'more': 1, 'time': 1, 'Even': 1, 'some': 1, 'live': 1, 'far': 1, 'away': 1, 'still': 1, 'need': 1, 'contact': 1, 'friends': 1, 'Talking': 1, 'them': 1, 'can': 1, 'always': 1, 'bring': 1, 'happy': 1, 'memories': 1, 'forget': 1, 'annoyance': 1, 'other': 1, 'effective': 1, 'gain': 1, 'travel': 1, 'When': 1, 'vacation': 1, 'often': 1, 'easy': 1, 'regain': 1, 'power': 1, 'After': 1, 'appreciating': 1, 'beautiful': 1, 'scenery': 1, 'will': 1, 'broaden': 1, 'vision': 1, 'have': 1, 'fight': 1, 'again': 1})
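In the result above, `'The'` and `'the'` are counted as different words because the comparison is case-sensitive. If they should be merged, one option is to lowercase every word before counting. The following is only a minimal sketch on one sentence from the sample text; the helper name `count_words_ignore_case` is hypothetical and not part of the original solution.

```python
import collections
import string


def count_words_ignore_case(text):
    """Count words after lowercasing and stripping ASCII punctuation from both ends."""
    # Hypothetical helper for illustration; string.punctuation covers ASCII marks only
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return collections.Counter(w for w in words if w)


sample = "The way to find passion is to see the beauty of life."
counts = count_words_ignore_case(sample)
print(counts['the'])  # 2: 'The' and 'the' are now counted together
print(counts['to'])   # 2
```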
Another approach found online:
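import collections
import re

file_name = 'words.txt'  # input file
c = collections.Counter()
with open(file_name, 'r') as f:
    # Match runs of letters and apostrophes, so words such as "don't" stay whole
    c.update(re.findall(r'\b[a-zA-Z\']+\b', f.read()))
    # Alternative: letters only, without apostrophes
    # c.update(re.findall(r'\b[a-zA-Z]+\b', f.read()))

# Write one "word,count" line per word, most frequent first
with open("WordCount.txt", 'w') as wf:
    for word, count in c.most_common():
        wf.write(word + ',' + str(count) + '\n')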
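To see what the regular expression above extracts, it can be run on a short string. The sketch below uses two made-up variants of a phrase from the sample text. The character class contains only the straight apostrophe `'`, so a typographic apostrophe `’` (as in the sample's `people’s`) splits the word into two matches.

```python
import re

# Equivalent to the pattern used in the snippet above
pattern = r"\b[a-zA-Z']+\b"

# Illustrative inputs: the same phrase with a straight and a typographic apostrophe
straight = re.findall(pattern, "Today, a lot of people's life")
curly = re.findall(pattern, "Today, a lot of people’s life")

print(straight)  # ['Today', 'a', 'lot', 'of', "people's", 'life']
print(curly)     # ['Today', 'a', 'lot', 'of', 'people', 's', 'life']
```

On the sample text, this means the regex-based count may differ slightly from the `split(' ')` version, which keeps `people’s` as a single word.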
