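I am trying to extract all of the Google reviews for a restaurant. The restaurant has more than 900 reviews, but my script only extracts 50 of them. I am not sure where I went wrong. Any help resolving this would be much appreciated. Here is my code: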
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
import time
driver = webdriver.Chrome()
base_url = 'https://www.google.com/search?tbs=lf:1,lf_ui:9&tbm=lcl&sxsrf=AOaemvJFjYToqQmQGGnZUovsXC1CObNK1g:1633336974491&q=10 famous restaurants in Dunedin&rflfq=1&num=10&sa=X&ved=2ahUKEwiTsqaxrrDzAhXe4zgGHZPODcoQjGp6BAgKEGo&biw=1280&bih=557&dpr=2#lrd=0xa82eac0dc8bdbb4b:0x4fc9070ad0f2ac70,1,,,&rlfi=hd:;si:5749134142351780976,l,CiAxMCBmYW1vdXMgcmVzdGF1cmFudHMgaW4gRHVuZWRpbiJDUjEvZ2VvL3R5cGUvZXN0YWJsaXNobWVudF9wb2kvcG9wdWxhcl93aXRoX3RvdXJpc3Rz2gENCgcI5Q8QChgFEgIIFkiDlJ7y7YCAgAhaMhAAEAEQAhgCGAQiIDEwIGZhbW91cyByZXN0YXVyYW50cyBpbiBkdW5lZGluKgQIAxACkgESaXRhbGlhbl9yZXN0YXVyYW50mgEkQ2hkRFNVaE5NRzluUzBWSlEwRm5TVU56ZW5WaFVsOUJSUkFCqgEMEAEqCCIEZm9vZCgA,y,2qOYUvKQ1C8;mv:[[-45.8349553,170.6616387],[-45.9156414,170.4803685]]'
driver.get(base_url)
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//div[./span[text()='Newest']]"))).click()
title = driver.find_element_by_xpath("//div[@class='P5Bobd']").text
address = driver.find_element_by_xpath("//div[@class='T6pBCe']").text
overall_rating = driver.find_element_by_xpath("//div[@class='review-score-container']//span[@class='Aq14fc']").text
total_reviews_text =driver.find_element_by_xpath("//div[@class='review-score-container']//div//div//span//span[@class='z5jxId']").text
num_reviews = int (total_reviews_text.split()[0])
all_reviews = WebDriverWait(driver, 3).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.gws-localreviews__google-review')))
total_reviews = len(all_reviews)
while total_reviews < num_reviews:
    driver.execute_script('arguments[0].scrollIntoView(true);', all_reviews[-1])
    WebDriverWait(driver, 5, 0.25).until_not(EC.presence_of_element_located((By.CSS_SELECTOR, 'div[class$="activityIndicator"]')))
    #all_reviews = driver.find_elements_by_css_selector('div.gws-localreviews__google-review')
    all_reviews = WebDriverWait(driver, 3).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.gws-localreviews__google-review')))
    print(total_reviews)
    total_reviews += 1
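A uj5u.com user replied:
The Selenium expected condition presence_of_all_elements_located does not actually wait until every element matching the locator is present.
It only waits until at least one element matching the locator is present, then returns whatever matches exist at that moment.
So instead of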
num_reviews = int (total_reviews_text.split()[0])
all_reviews = WebDriverWait(driver, 3).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.gws-localreviews__google-review')))
total_reviews = len(all_reviews)
try this:
num_reviews = int (total_reviews_text.split()[0])
WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.gws-localreviews__google-review')))
time.sleep(2)
all_reviews = driver.find_elements(By.CSS_SELECTOR, 'div.gws-localreviews__google-review')
total_reviews = len(all_reviews)
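You will probably run into the same problem the second time you use presence_of_all_elements_located, inside the loop.
In general, do not rely on presence_of_all_elements_located to return the complete set: it returns as soon as the first match is found, so elements that load later can be missing from the result.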
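One way to wait for more than "at least one" element is to pass WebDriverWait a custom condition that compares the current match count with the count seen before scrolling. The following is only a minimal sketch: the helper name count_greater_than is hypothetical (it is not part of Selenium), and it assumes the driver exposes find_elements(by, value) as Selenium's WebDriver does.

```python
def count_greater_than(locator, previous_count):
    """Build a wait condition that succeeds once more than `previous_count`
    elements match `locator`, a (by, value) tuple.

    Hypothetical helper for illustration; not a Selenium built-in.
    """
    def _condition(driver):
        elements = driver.find_elements(*locator)
        # A falsy return value makes WebDriverWait poll again;
        # a non-empty list ends the wait and is returned by until().
        return elements if len(elements) > previous_count else False
    return _condition
```

Inside the scrolling loop it could be used roughly like this: all_reviews = WebDriverWait(driver, 10).until(count_greater_than((By.CSS_SELECTOR, 'div.gws-localreviews__google-review'), len(all_reviews))). If no new reviews appear before the timeout, until() raises TimeoutException, which can serve as the signal to stop scrolling.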
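A uj5u.com user replied:
If the reviews are lazy-loaded as you scroll, you may need to add code that scrolls down and waits until all of the reviews have appeared.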
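The scroll-and-wait idea might be sketched as below. This is an illustration under assumptions, not a tested scraper: the function name load_all_reviews and the parameters pause and max_idle_rounds are made up for this example, and the fixed sleep is a crude stand-in for a proper wait. It keeps scrolling the last loaded review into view and gives up once the count stops growing for a few rounds, so it cannot loop forever when the page shows fewer reviews than the advertised total.

```python
import time


def load_all_reviews(driver, locator, expected_total, pause=1.0, max_idle_rounds=5):
    """Scroll the last loaded review into view until `expected_total` elements
    match `locator`, or until the count stops growing for `max_idle_rounds`
    consecutive rounds. Returns the list of elements found last."""
    reviews = driver.find_elements(*locator)
    idle_rounds = 0
    while reviews and len(reviews) < expected_total and idle_rounds < max_idle_rounds:
        # Scrolling to the last review triggers loading of the next batch.
        driver.execute_script("arguments[0].scrollIntoView(true);", reviews[-1])
        time.sleep(pause)  # crude wait for the next batch to render
        refreshed = driver.find_elements(*locator)
        if len(refreshed) > len(reviews):
            idle_rounds = 0   # progress was made, reset the counter
        else:
            idle_rounds += 1  # nothing new appeared this round
        reviews = refreshed
    return reviews
```

With the names from the question, a call might look like all_reviews = load_all_reviews(driver, (By.CSS_SELECTOR, 'div.gws-localreviews__google-review'), num_reviews).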
A uj5u.com user replied:
Try the following:
With it I was able to extract more than 50 reviewer names.
i = 0
try:
    while True:
        time.sleep(.5)
        # Re-query on every pass so newly loaded reviews are included.
        reviews = driver.find_elements(By.XPATH, "//div[@id='reviewSort']//div[contains(@class,'google-review')]")
        driver.execute_script("arguments[0].scrollIntoView(true);", reviews[i])
        print(f"{i + 1}:{reviews[i].find_element(By.XPATH, './div/div/div/a').text}")
        i += 1
except Exception as e:
    # The loop ends here, typically with an IndexError once no more reviews load.
    print(e)
Tags: python, python-3.x, selenium, selenium-webdriver, webdriverwait
