I'm trying to count the number of words on different websites, but I get "TypeError: 'str' object does not support item assignment". Here is my code:
import requests
from bs4 import BeautifulSoup as BS

URL = "https://www.coach.com/shop/women/handbags/view-all"
headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)'}
page = requests.get(URL, headers=headers)
html_content = page.text
soup = BS(html_content, "lxml")
content = {}
try:
    text_counter = 0
    x = soup.find_all("h2")
    for y in x:
        title_length = len(y.get_text().split())
        text_counter += title_length
        content = y.findNext('p').get_text()
        content_length = len(y.findNext('p').get_text().split())
        text_counter += content_length
    t = soup.find_all("h3")
    for q in t:
        title_length = len(q.get_text().split())
        text_counter += title_length
        content = q.findNext('p').get_text()
        content_length = len(q.findNext('p').get_text().split())
        text_counter += content_length
    content["n_words"] = text_counter
except:
    content["n_words"] = ""
Full traceback:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-34168cb72272> in <module>
     26         text_counter += content_length
---> 27     content["n_words"] = text_counter
     28 except:

TypeError: 'str' object does not support item assignment

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-2-34168cb72272> in <module>
     27     content["n_words"] = text_counter
     28 except:
---> 29     content["n_words"] = ""

TypeError: 'str' object does not support item assignment
Reply from a uj5u.com user:
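You simply have two variables with the same name:
content = {}
content = q.findNext('p').get_text()
Once either loop has run, content refers to a str rather than the dictionary, so content["n_words"] = text_counter is an item assignment on a str. The identical assignment in the except block then fails for the same reason, which is why the traceback shows two TypeErrors.
Just rename the global dictionary to something else, such as dcontent or word_counter: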
dcontent = {}  # <-- d stands for dictionary
try:
    text_counter = 0
    t = soup.find_all("h2")
    # ... same
    t = soup.find_all("h3")
    for q in t:
        title_length = len(q.get_text().split())
        text_counter += title_length
        content = q.findNext('p')  # <-- here the content from the soup (a Tag)
        if content.get_text() != '':
            # split the tag's text, not the Tag object itself
            content_length = len(content.get_text().split())
            text_counter += content_length
    dcontent["n_words"] = text_counter  # <-- here update the dictionary
except Exception as e:
    print(e)
    dcontent["n_words"] = ""
print(dcontent)
# {'n_words': 52}
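The name collision can also be reproduced without any network access or HTML parsing. The following is only a minimal sketch: the function names count_with_collision and count_with_separate_names and the sample strings are made up for illustration and do not appear in the original post.

```python
def count_with_collision(paragraphs):
    """Reuses the name `content` for both the result dict and the loop text."""
    content = {}                      # intended to hold the result
    text_counter = 0
    for paragraph in paragraphs:
        content = paragraph           # rebinds `content` to a str
        text_counter += len(content.split())
    try:
        content["n_words"] = text_counter   # item assignment on a str
    except TypeError as error:
        return str(error)
    return content


def count_with_separate_names(paragraphs):
    """Same logic, but the dict keeps its own name (`dcontent`)."""
    dcontent = {}
    text_counter = 0
    for paragraph in paragraphs:
        content = paragraph
        text_counter += len(content.split())
    dcontent["n_words"] = text_counter
    return dcontent


# Hypothetical sample data standing in for the scraped paragraphs.
paragraphs = ["soft leather bag", "with gold hardware"]
collision_result = count_with_collision(paragraphs)
fixed_result = count_with_separate_names(paragraphs)
print(collision_result)
print(fixed_result)
```

Note that with an empty list the first function would not fail, because the loop body never rebinds content. This is presumably why the error only shows up on pages that actually contain the headings being searched for.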
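Remarks:
- Use tag.get_text() != '' to check whether a tag contains a string, rather than tag.string is not None as I said in the comments.
- Always apply such a filter in this situation, which means in the h2 case as well.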
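To illustrate the two remarks, here is a hedged sketch that applies the same filter to both h2 and h3 headings. The helper count_words, the SAMPLE_HTML snippet, and the extra "is not None" guard are assumptions added for this example and are not part of the original answer. It uses the built-in "html.parser" so that lxml is not required, and find_next, the current spelling of findNext.

```python
from bs4 import BeautifulSoup

# Made-up markup: a paragraph with nested tags, an empty paragraph,
# and a heading that has no following paragraph at all.
SAMPLE_HTML = """
<h2>Tabby Shoulder Bag</h2>
<p>Soft <b>polished</b> leather</p>
<h3>Willow Tote</h3>
<p></p>
<h3>Last One</h3>
"""


def count_words(soup, heading_names=("h2", "h3")):
    """Count words in each heading and in the first <p> that follows it."""
    total = 0
    for heading in soup.find_all(list(heading_names)):
        total += len(heading.get_text().split())
        paragraph = heading.find_next("p")
        # Guard against a missing <p> (None) and against an empty one.
        if paragraph is not None and paragraph.get_text() != "":
            total += len(paragraph.get_text().split())
    return total


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
first_p = soup.find("p")
string_value = first_p.string      # None: the <p> mixes text and a <b> child
text_value = first_p.get_text()    # text of the <p> and all its descendants
result = {"n_words": count_words(soup)}
print(string_value, repr(text_value), result)
```

The first paragraph shows why get_text() is the safer check: .string is None for a tag with several children even though the tag clearly contains text. One caveat of this sketch is that find_next returns the next p anywhere later in the document, so a heading without its own paragraph may pick up a paragraph that belongs to a later heading.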
