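Installing the jieba library and wordcloud
Open the Windows command prompt (CMD): press the Windows key + R, type cmd, then press Enter
Normally: type pip install jieba and press Enter, then wait for the download to finish. It is best to do this on a stable network connection
If that does not work, try the Tsinghua mirror: pip install -i https://pypi.tuna.tsinghua.edu.cn/simple jieba
wordcloud is installed the same way: pip install -i https://pypi.tuna.tsinghua.edu.cn/simple wordcloud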
Once the output shows Successfully installed ..., the installation has succeeded.
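To double-check from Python itself, a minimal sketch along these lines reports whether the current interpreter can find each package. The helper name check_installed is made up for this illustration; importlib.util.find_spec comes from the standard library.

```python
import importlib.util


def check_installed(names):
    """Return {module_name: True/False} depending on whether it can be found."""
    # find_spec() locates a top-level module without importing it
    # and returns None when the module is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_installed(["jieba", "wordcloud"]).items():
        print(name, "OK" if ok else "missing, try: pip install " + name)
```

If a package is reported missing even though pip printed Successfully installed, pip may belong to a different Python installation; running python -m pip install jieba ties the install to the interpreter you actually use.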
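Using the jieba library
Use jieba to analyze an article, novel, report, or similar text: segment it into words, count how often each word occurs, and sort the words by frequency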
Code (for Chinese text only):
# -*- coding: utf-8 -*-
"""
Created on Wed Apr 22 15:40:16 2020

@author: ASUS
"""
# Word-frequency counting with jieba
import jieba

# Read the file; the encoding must match how the file was saved (GBK here)
with open(r'C:\Users\ASUS\Desktop\創意策劃書.txt', "r", encoding='gbk') as f:
    txt = f.read()
words = jieba.lcut(txt)  # lcut() returns the segmentation result as a list
counts = {}
for word in words:
    if len(word) == 1:  # skip punctuation and other single-character tokens
        continue
    counts[word] = counts.get(word, 0) + 1
items = list(counts.items())  # convert the dict to a list of (word, count) pairs
items.sort(key=lambda x: x[1], reverse=True)  # sort by frequency, descending
n = int(input("Number of words to show: "))  # how many of the top words to print
for word, count in items[:n]:  # slicing stays safe when n exceeds the number of distinct words
    print("{0:<10}{1:>5}".format(word, count))
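The count-then-sort steps can also be written with collections.Counter from the standard library. This is only an alternative sketch, not a replacement for the script above; the function name top_words and the sample list are made up for illustration, and in practice the list would come from jieba.lcut(txt).

```python
from collections import Counter


def top_words(words, n):
    """Return the n most frequent words, ignoring single-character tokens."""
    counts = Counter(w for w in words if len(w) > 1)
    # most_common() sorts by count, descending; ties keep first-seen order
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample; a real run would use words = jieba.lcut(txt)
    sample = ["詞頻", "統計", "的", "詞頻", "排序", "詞頻", "統計"]
    for word, count in top_words(sample, 2):
        print("{0:<10}{1:>5}".format(word, count))
```

Counter replaces the manual dict bookkeeping, and most_common(n) simply returns fewer items when the text has fewer than n distinct words.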

(For English text, a few adjustments are needed):
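def getText():
    with open('hamlet.txt') as f:  # location of the file
        txt = f.read()  # read the contents so the string methods below work
    txt = txt.lower()  # convert all letters to lowercase
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':
        txt = txt.replace(ch, " ")  # replace special characters with spaces
    return txt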
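English words are already separated by spaces, so after this cleanup str.split() can stand in for jieba.lcut(). The sketch below shows the whole English pipeline under that assumption. The names normalize and count_english_words are made up here, and string.punctuation is used as the punctuation list, which also strips apostrophes (so "don't" becomes "don t"), unlike the hand-written list in getText().

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase the text and turn ASCII punctuation into spaces."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.lower().translate(table)


def count_english_words(text):
    """Split the cleaned text on whitespace and count each word."""
    return Counter(normalize(text).split())


if __name__ == "__main__":
    sample = "To be, or not to be: that is the question."
    for word, count in count_english_words(sample).most_common(3):
        print("{0:<10}{1:>5}".format(word, count))
```

For a real file, pass the result of reading hamlet.txt to count_english_words() instead of the sample string.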
Fun with word clouds
Making a word cloud image
import jieba
import wordcloud
import matplotlib.pyplot as plt

# The encoding must match how the file was saved: this file is GBK, other files may need encoding="utf-8"
with open(r"C:\Users\ASUS\Desktop\創意策劃書.txt", "r", encoding="gbk") as f:
    t = f.read()
ls = jieba.lcut(t)  # segment the text into a list of words

txt = " ".join(ls)  # WordCloud expects words separated by spaces
w = wordcloud.WordCloud(
    width=4800, height=2700,
    background_color="black",
    font_path="msyh.ttc",  # font file; swap in any downloaded font you like, as long as it contains Chinese glyphs
)
myword = w.generate(txt)
plt.imshow(myword)
plt.axis("off")
plt.show()
w.to_file("詞頻.png")  # save the word cloud as an image file
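On the encoding question: which value works depends on how the text file was saved, not on the computer. One way to avoid guessing is to read the raw bytes and try a short list of encodings in order. This is just a sketch; the function name decode_with_fallback and the default encoding list are assumptions made for illustration.

```python
def decode_with_fallback(data, encodings=("utf-8", "gbk")):
    """Try each encoding in order; return (text, encoding_that_worked)."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    raise ValueError("none of the encodings could decode the data: %r" % (encodings,))


if __name__ == "__main__":
    raw = "词频统计".encode("gbk")  # stand-in for open(path, "rb").read()
    text, used = decode_with_fallback(raw)
    print(used, text)
```

utf-8 is tried first because it is strict and rejects most GBK-encoded Chinese text, whereas trying gbk first could quietly decode a UTF-8 file into garbled characters.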

What gets counted here does not matter much; the code is the part worth reading closely
