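I have been trying to scrape hotel reviews from the web, but the page URL does not change when I move between pages, so I turned to Selenium's webdriver to get around that. However, I can't even get it running in Google Colab. Any quick help would be much appreciated. Thanks!
Code: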
from selenium import webdriver
import requests
from bs4 import BeautifulSoup
import pandas as pd
# install chromium, its driver, and selenium
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!pip install selenium
# set options to be headless, ..
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# open it, go to a website, and get results
wd = webdriver.Chrome('chromedriver',options=options)
driver = webdriver.chrome()
driver.get("https://www.goibibo.com/hotels/highland-park-hotel-in-trivandrum-1383427384655815037/?hquery={"ci":"20211209","co":"20211210","r":"1-2-0","ibp":"v15"}&hmd=766931490eb7863d2f38f56c6185a1308de782c89dfeeea59d262b827ca15441bf50472cbfdc1ee84aeed8af756809a2e89cfd6eaea0fa308c1ca839e8c313d016ac0f5948658353cf30f1cd83050fd8e6adb2e55f2a5470cadeb0c28b7becc92ac44d81966b82408effde826d40fbff47525e09b5f145e321fe6d104e12933c066323798e33a911e0cbed7312fc1634f8f92fe502c8602556c9a02f34c047d04ff1400c995799156776c1a04e218d6486493edad5b0f7e51a5ea25f5f1cb4f5ed497ee9368137f6ec73b3b1166ee7c1a885920b90c98542e0270b4fa9004005cfe87a4d1efeaedc8e33a848f73345f09bec19153e8bf625cc7f9216e692a1bcc313e7f13a7fc091328b1fb43598bd236994fdc988ab35e70cf3a5d1856c0b0fa9794b23a1a958a5937ac6d258d121a75b7ce9fc70b9a820af43a8e9a3f279be65b5c6fbfff2ba20bfb0f3e3ee425f0b930bf671c50878a540c6a9003b197622b6ab22ae39e07b5174cb12bebbcd2a132bb8570e01b9e253c1bd83cb292de97a&cc=IN&reviewType=gi&vcid=3877384277955108166&srpFilters={"type":["Hotel"]}")
Error:

A uj5u.com user replied:
When you run the command:
!pip install selenium
by default it installs the latest Selenium release, which was 4.1.0 at the time of this answer.
Initially, this line of code:
wd = webdriver.Chrome('chromedriver',options=options)
starts ChromeDriver, which Selenium uses to launch a Google Chrome browsing context.
However, the following line of code:
driver = webdriver.chrome()
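is the problem: chrome (lowercase) is a module, not a class, so it is not callable. It is the same module that appears in imports such as: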
from selenium.webdriver.chrome.options import Options
Hence you see the error:
'module' object is not callable
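The same kind of error can be reproduced without Selenium. The sketch below uses the standard-library math module purely as a stand-in (an assumption for illustration; the code in the question involves selenium.webdriver.chrome instead), and the variable name message is likewise hypothetical:

```python
import math

# math is a module, just as webdriver.chrome is a module.
# Calling a module raises TypeError because modules are not callable.
try:
    math()
except TypeError as exc:
    message = str(exc)

# The message states that a 'module' object is not callable.
print(message)
```

The class that actually starts the browser is webdriver.Chrome, with a capital C; webdriver.chrome is only the package that contains it.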
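Solution
The first line starts the ChromeDriver / Chrome combination, but it emits two DeprecationWarnings:
- DeprecationWarning: executable_path has been deprecated, please pass in a Service object
- find_element_by_* commands are deprecated in Selenium
For now you can ignore the DeprecationWarnings, but you do need to delete this line of code: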
driver = webdriver.chrome()
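If you would rather avoid the first DeprecationWarning than ignore it, one possible way is to pass a Service object, as the warning suggests. This is a minimal sketch, assuming Selenium 4 and a chromedriver binary reachable as 'chromedriver' (as set up by the apt and cp commands in the question); the function name make_driver and its driver_path parameter are hypothetical and do not come from the original post:

```python
def make_driver(driver_path="chromedriver"):
    """Start headless Chrome through Selenium 4's Service-based constructor."""
    # Imported inside the function so this cell can be defined even
    # before `!pip install selenium` has run in the notebook.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Same headless options as in the question.
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")

    # A Service object replaces the deprecated positional driver path
    # used in webdriver.Chrome('chromedriver', options=options).
    service = Service(driver_path)
    return webdriver.Chrome(service=service, options=options)
```

You would then call driver = make_driver() once and use that single driver for driver.get(...), rather than creating a second object with webdriver.chrome(). For the second warning, the usual Selenium 4 replacement for the find_element_by_* helpers is driver.find_element(By.CSS_SELECTOR, "...") with By imported from selenium.webdriver.common.by.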
