Saving scraped content to a dictionary in Python

I am trying to count the number of words on different websites, but I get "TypeError: 'str' object does not support item assignment". Here is my code:

import requests
from bs4 import BeautifulSoup as BS
URL = "https://www.coach.com/shop/women/handbags/view-all"
headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)'}
page = requests.get(URL, headers = headers)
html_content= page.text
soup = BS(html_content, "lxml")
content = {}
try:
    text_counter = 0
    x = soup.find_all("h2")
    for y in x:
        title_length = len(y.get_text().split())
        text_counter += title_length
        content = y.findNext('p').get_text()
        content_length = len(y.findNext('p').get_text().split())
        text_counter += content_length

    t = soup.find_all("h3")
    for q in t:
        title_length = len(q.get_text().split())
        text_counter += title_length
        content = q.findNext('p').get_text()
        content_length = len(q.findNext('p').get_text().split())
        text_counter += content_length
    content["n_words"] = text_counter
except:
    content["n_words"] = ""

Full traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-34168cb72272> in <module>
     26         text_counter += content_length
---> 27     content["n_words"] = text_counter
     28 except:

TypeError: 'str' object does not support item assignment

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-2-34168cb72272> in <module>
     27     content["n_words"] = text_counter
     28 except:
---> 29     content["n_words"] = ""

TypeError: 'str' object does not support item assignment

Answer:

You simply have two variables with the same name:

content = {}
content = q.findNext('p')
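
To see why this raises the error, here is a minimal sketch that reproduces it without any network access. The string value is a made-up stand-in for the paragraph text returned by the soup, not data from the real page.

```python
content = {}                      # content starts out as a dictionary
content = "some paragraph text"   # ...then the name is rebound to a str (hypothetical stand-in for get_text())

try:
    content["n_words"] = 3        # item assignment on a str is not allowed
except TypeError as exc:
    message = str(exc)
    print(message)                # 'str' object does not support item assignment
```

The second TypeError in the traceback has the same cause: the except branch runs the same kind of assignment on the same str, so it fails again.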

Just rename the global dictionary to something else, e.g. dcontent or word_counter:

dcontent = {}  # <-- d stands for dictionary
try:
    text_counter = 0

    t = soup.find_all("h2")
    # ... same

    t = soup.find_all("h3")
    for q in t:
        title_length = len(q.get_text().split())
        text_counter += title_length
        content = q.findNext('p')  # <-- here the content from the soup (a Tag, or None)
        if content is not None and content.get_text() != '':
            # a Tag has no split() method, so split its text instead
            content_length = len(content.get_text().split())
            text_counter += content_length
    dcontent["n_words"] = text_counter  # <-- here update the dictionary
except Exception as e:
    print(e)
    dcontent["n_words"] = ""

print(dcontent)
#{'n_words': 52}
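
For anyone who wants to try the idea without hitting the real site, the following is a rough offline sketch. The HTML sample and the helper name count_words are assumptions made up for illustration, and "html.parser" is used in place of "lxml" only so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration; it is not taken from the real page.
SAMPLE_HTML = """
<h2>Leather handbags</h2>
<p>Soft pebbled leather in three colours.</p>
<h3>Care</h3>
<p></p>
<h3>No paragraph follows</h3>
"""


def count_words(soup, heading_tags=("h2", "h3")):
    """Count the words in each heading plus the first <p> found after it."""
    total = 0
    for heading in soup.find_all(list(heading_tags)):
        total += len(heading.get_text().split())
        paragraph = heading.find_next("p")  # None when no <p> follows
        if paragraph is not None and paragraph.get_text() != "":
            total += len(paragraph.get_text().split())
    return total


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
result = {"n_words": count_words(soup)}  # the dictionary keeps its own name
print(result)
```

Handling h2 and h3 in one loop avoids duplicating the counting code, and keeping the dictionary under a separate name (result) means the loop variables can never overwrite it.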

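Remarks:

Use tag.get_text() != '' to check whether a tag contains a string, rather than tag.string is not None as I said in the comments.
Always apply such a filter in this situation, which means in the h2 case as well.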
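
The difference between the two checks can be seen on a few small cases. This is a sketch under assumptions: the three paragraphs and their id values are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical snippets chosen to show where the two checks disagree.
html = "<p id='plain'>Hi</p><p id='nested'>Hello <b>world</b></p><p id='empty'></p>"
soup = BeautifulSoup(html, "html.parser")

checks = {}
for p in soup.find_all("p"):
    # (result of the .string check, result of the get_text() check)
    checks[p["id"]] = (p.string is not None, p.get_text() != "")
    print(p["id"], checks[p["id"]])
```

The nested paragraph is where they differ: .string is None because the tag has more than one child, even though the paragraph clearly contains text, while get_text() returns the combined text of all children.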

Tags: Python, string, dictionary
