Web scraping Amazon review percentages with bs4

I am trying to get the percentage of reviews for each star rating on an Amazon product page.

This is the output I want:

Awesome Feedback: 72%
Good Feedback: 15%
Regular Feedback: 7%
Bad Feedback: 3%
Awful Feedback: 4%

This is the output I get so far:

Awesome Feedback: 72%
Traceback (most recent call last):
  File "c:\Users\Nana\Desktop\stuff\Python\Web Scraping\Amazon Smart Buyer\amazonR.py", line 34, in <module>
    bot()
  File "c:\Users\Nana\Desktop\stuff\Python\Web Scraping\Amazon Smart Buyer\amazonR.py", line 14, in __init__
    self.r()
  File "c:\Users\Nana\Desktop\stuff\Python\Web Scraping\Amazon Smart Buyer\amazonR.py", line 26, in r
    print(f'Good Feedback: {self.pd[1]}')
IndexError: list index out of range

As you can see, I managed to get the Awesome Feedback line, but the others fail. The problem is that each percentage ends up in its own separate list, as you can see here:

['72%'],  ['15%'], ['7%'], ['3%'], ['4%']

I'm stuck. If there is a way to access every item the for loop produces and merge them into a single list, please share it. Here is my code:

from bs4 import BeautifulSoup
from selenium import webdriver




class bot:
    def __init__(self):
        self.path = 'C:/Users/Nana/Desktop/stuff/Python/Web Scraping/chromedriver.exe'
        self.browser = webdriver.Chrome(self.path)
        self.browser.get('https://www.amazon.com/מקלדת-מוארת-בצבעי-ועכבר-לגיימינג/dp/B016Y2BVKA/ref=sr_1_1_sspa?dchild=1&keywords=keyboard&qid=1633809059&sr=8-1-spons&psc=1&smid=A3TJEO884AOUB3&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUFCRTg0S1dWNjRTQUMmZW5jcnlwdGVkSWQ9QTA2NTEwNzgzNFdKSVA5NEpQODRQJmVuY3J5cHRlZEFkSWQ9QTAwMjcwNDExUFJOUjA4U0pEWDlRJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ==')
        self.r()


    def r(self):
        self.soup = BeautifulSoup(self.browser.page_source, 'lxml')
        self.div5 = self.soup.find('div', id = 'reviewsMedley')
        self.tbody = self.div5.find('tbody')
        self.trs = self.tbody.find_all('tr')
        for self.tr in self.trs:
            self.precents = self.tr.find('td', class_ = 'a-text-right a-nowrap')
            self.pd = [self.precents.text.strip()]
            print(f'Awesome Feedback: {self.pd[0]}')
            print(f'Good Feedback: {self.pd[1]}')
            print(f'Regular Feedback: {self.pd[2]}')
            print(f'Bad Feedback: {self.pd[3]}')
            print(f'Awful Feedback: {self.pd[4]}')



        
bot()

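Reply from a uj5u.com user:

Each tr element you are iterating over contains a single //td[@class='a-text-right a-nowrap'] element, so self.pd never holds more than one item.
So, instead of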

def r(self):
    self.soup = BeautifulSoup(self.browser.page_source, 'lxml')
    self.div5 = self.soup.find('div', id = 'reviewsMedley')
    self.tbody = self.div5.find('tbody')
    self.trs = self.tbody.find_all('tr')
    for self.tr in self.trs:
        self.precents = self.tr.find('td', class_ = 'a-text-right a-nowrap')
        self.pd = [self.precents.text.strip()]
        print(f'Awesome Feedback: {self.pd[0]}')
        print(f'Good Feedback: {self.pd[1]}')
        print(f'Regular Feedback: {self.pd[2]}')
        print(f'Bad Feedback: {self.pd[3]}')
        print(f'Awful Feedback: {self.pd[4]}')

Try this:

def r(self):
    self.soup = BeautifulSoup(self.browser.page_source, 'lxml')
    self.div5 = self.soup.find('div', id = 'reviewsMedley')
    # Fetch every percentage cell in one call instead of one row at a time.
    self.precentages = self.div5.find_all('td', class_ = 'a-text-right a-nowrap')
    # Build the list once so that it holds all five values before printing.
    self.pd = [td.text.strip() for td in self.precentages]
    print(f'Awesome Feedback: {self.pd[0]}')
    print(f'Good Feedback: {self.pd[1]}')
    print(f'Regular Feedback: {self.pd[2]}')
    print(f'Bad Feedback: {self.pd[3]}')
    print(f'Awful Feedback: {self.pd[4]}')
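
The cause of the IndexError can be reproduced without Selenium. The sketch below assumes a hard-coded list, cell_texts (a hypothetical name), standing in for the text of the scraped cells. It shows that rebuilding the list inside the loop leaves only one item per pass, while building it once keeps all five.

```python
# Stand-in for the text of each scraped <td>; the values come from the question.
cell_texts = ['72%', '15%', '7%', '3%', '4%']

# What the original loop does: a fresh one-element list on every pass,
# so only index 0 ever exists and pd[1] raises IndexError.
for text in cell_texts:
    pd = [text.strip()]
    assert len(pd) == 1

# Building the list once, outside any per-row logic, gives all five indices.
pd = [text.strip() for text in cell_texts]
print(pd)
```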

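Reply from a uj5u.com user:

There are two options: (1) define pd as an empty list outside the loop, append the result of each iteration to it, and print after the loop; or (2) do the following:

Example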

def r(self):
    self.soup = BeautifulSoup(self.browser.page_source, 'lxml')
    self.pd = [x.text.strip() for x in self.soup.select('div#reviewsMedley tr td.a-text-right.a-nowrap')]
    print(f'Awesome Feedback: {self.pd[0]}')
    print(f'Good Feedback: {self.pd[1]}')
    print(f'Regular Feedback: {self.pd[2]}')
    print(f'Bad Feedback: {self.pd[3]}')
    print(f'Awful Feedback: {self.pd[4]}')

Output:

Awesome Feedback: 72%
Good Feedback: 15%
Regular Feedback: 7%
Bad Feedback: 3%
Awful Feedback: 4%
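
The first option (an empty list defined outside the loop) could look roughly like the sketch below. It is an illustration under assumptions, not Amazon's real markup: SAMPLE_HTML is a simplified, hypothetical stand-in for the review histogram, the helper collect_percentages and the LABELS list are made-up names, and the built-in html.parser replaces lxml so that the example runs without the browser.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for Amazon's review histogram.
SAMPLE_HTML = """
<div id="reviewsMedley">
  <table><tbody>
    <tr><td class="a-text-right a-nowrap">72%</td></tr>
    <tr><td class="a-text-right a-nowrap">15%</td></tr>
    <tr><td class="a-text-right a-nowrap">7%</td></tr>
    <tr><td class="a-text-right a-nowrap">3%</td></tr>
    <tr><td class="a-text-right a-nowrap">4%</td></tr>
  </tbody></table>
</div>
"""

# Labels in the same order as the rows (five stars down to one star).
LABELS = ['Awesome', 'Good', 'Regular', 'Bad', 'Awful']


def collect_percentages(html):
    """Return the percentage text of every histogram row as one list."""
    soup = BeautifulSoup(html, 'html.parser')
    medley = soup.find('div', id='reviewsMedley')
    percentages = []  # defined once, outside the loop
    for tr in medley.find_all('tr'):
        td = tr.find('td', class_='a-text-right a-nowrap')
        if td is not None:  # skip rows that have no percentage cell
            percentages.append(td.text.strip())
    return percentages


percentages = collect_percentages(SAMPLE_HTML)

# zip() pairs each label with its value and stops at the shorter sequence,
# so a page with fewer rows does not raise IndexError.
for label, value in zip(LABELS, percentages):
    print(f'{label} Feedback: {value}')
```

In the question's class, the same loop would take self.browser.page_source in place of SAMPLE_HTML.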
