Installing jieba and wordcloud

Open the Windows Command Prompt (press Win+R, type cmd, then press Enter).

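In most cases, type pip install jieba and press Enter, then wait for the download to finish. It is best to do this on a stable network connection.

If that fails, try the Tsinghua mirror instead: pip install -i https://pypi.tuna.tsinghua.edu.cn/simple jieba

wordcloud is installed the same way: pip install -i https://pypi.tuna.tsinghua.edu.cn/simple wordcloud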

When the output ends with Successfully installed ..., the installation has succeeded.
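If the pip output has already scrolled away, you can also check from Python whether the two packages are importable. This is a minimal sketch using only the standard library; the helper name is_installed is made up for illustration and is not part of jieba or wordcloud.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be found by this Python interpreter."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the two packages installed above
for name in ("jieba", "wordcloud"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a package shows up as missing even though pip reported success, pip and the Python you are running probably belong to different installations.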

Using jieba

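Use jieba to segment an article, novel, report, or similar text into words, count how often each word appears, and sort the words by frequency.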

Code (works for Chinese text only):

# -*- coding: utf-8 -*-
"""
Created on Wed Apr 22 15:40:16 2020

@author: ASUS
"""
# Word-frequency statistics with jieba
import jieba

# Read the file; the encoding must match the one the file was saved with
with open(r'C:\Users\ASUS\Desktop\創意策劃書.txt', "r", encoding='gbk') as f:
    txt = f.read()
words = jieba.lcut(txt)  # lcut() returns the segmentation result as a list
counts = {}
for word in words:
    if len(word) == 1:  # skip punctuation and other single-character tokens
        continue
    counts[word] = counts.get(word, 0) + 1
items = list(counts.items())  # convert the dict to a list of (word, count) pairs
items.sort(key=lambda x: x[1], reverse=True)  # sort by frequency, descending
n = int(input("Number of words: "))  # how many of the top words to print
for word, count in items[:n]:  # slicing avoids an IndexError if n is too large
    print("{0:<10}{1:>5}".format(word, count))
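The counting and sorting steps can also be written with collections.Counter from the standard library. The sketch below is one possible alternative, not the only way to do it; top_words is a hypothetical helper, and the sample list is made-up data standing in for the output of jieba.lcut(txt).

```python
from collections import Counter


def top_words(words, n):
    """Return the n most frequent words, skipping single-character tokens."""
    counts = Counter(w for w in words if len(w) > 1)
    return counts.most_common(n)


# Made-up token list standing in for the result of jieba.lcut(txt)
sample = ["創意", "策劃", "的", "創意", "，", "活動", "策劃", "創意"]
for word, count in top_words(sample, 2):
    print("{0:<10}{1:>5}".format(word, count))
```

most_common(n) already returns the pairs sorted by count in descending order, and it simply returns fewer pairs when n is larger than the number of distinct words.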

[Figure: jieba word segmentation]

For English text, a few adjustments are needed:

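def getText():
    # Path to the text file; read its contents into a string
    with open('hamlet.txt', 'r') as f:
        txt = f.read()
    txt = txt.lower()  # convert all letters to lowercase
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':
        txt = txt.replace(ch, " ")  # replace special characters with spaces
    return txt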
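English words are already separated by spaces, so after this cleanup the text can be split with str.split() and counted without jieba. The sketch below is a rough illustration under that assumption. The names normalize and count_words are hypothetical, and it uses string.punctuation, which also strips the apostrophe (the character list in getText() above leaves it in).

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase the text and replace ASCII punctuation with spaces."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.lower().translate(table)


def count_words(text, n):
    """Split on whitespace and return the n most common words."""
    return Counter(normalize(text).split()).most_common(n)


# Short sample string used in place of the contents of hamlet.txt
sample = "To be, or not to be: that is the question."
print(count_words(sample, 2))
```

For a real file you would pass the string read from hamlet.txt to count_words instead of the sample sentence.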

Fun with word clouds

Make a word cloud image:

import jieba
import wordcloud
import matplotlib.pyplot as plt

# Some machines need encoding="utf-8"; this file only opened with encoding="gbk",
# most likely because it was saved in that encoding
with open(r"C:\Users\ASUS\Desktop\創意策劃書.txt", "r", encoding="gbk") as f:
    t = f.read()
ls = jieba.lcut(t)

txt = " ".join(ls)  # WordCloud expects words separated by spaces
w = wordcloud.WordCloud(
    width=4800, height=2700,
    background_color="black",
    font_path="msyh.ttc"  # font file; swap in any downloaded font that supports Chinese
)
myword = w.generate(txt)
plt.imshow(myword)
plt.axis("off")
plt.show()
w.to_file("詞頻.png")  # save the image to a file

[Figure: word cloud image]

You can ignore the content being counted; the code itself is worth reading carefully.
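About the encoding argument in the scripts above: whether "utf-8" or "gbk" works depends on how the text file was saved, not on the computer itself. If you are not sure, one option is to try several encodings in turn. The sketch below assumes the file is in one of the listed encodings; read_text is a hypothetical helper, not part of jieba or wordcloud.

```python
def read_text(path, encodings=("utf-8", "gbk")):
    """Try each encoding in order and return the first successful decode."""
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    raise ValueError(f"could not decode {path} with any of {encodings}")
```

It is safer to try "utf-8" first, because "gbk" will accept many byte sequences that are not really GBK and silently return garbled text. With this helper, t = read_text(r"C:\Users\ASUS\Desktop\創意策劃書.txt") could replace the open() and read() lines.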

