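I want to scrape data from the MCQs on this page, but my code raises an error. I also want to move on to the next page.
How can I keep going through the next pages and scrape all of the MCQ data? If there is a workable solution, please let me know.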
import time
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()
# options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1920x1080")
options.add_argument("--disable-extensions")
chrome_driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=options
)

def supplyvan_scraper():
    with chrome_driver as driver:
        driver.implicitly_wait(15)
        URL = 'http://www.tulsithakur.com/bankingquiztwo.php'
        driver.get(URL)
        time.sleep(3)
        title = driver.find_element_by_xpath("//span[@id='quest']//text()")
        option_1 = driver.find_element_by_xpath("//span[@id='onee']//text()")
        option_2 = driver.find_element_by_xpath("//span[@id='two']//text()")
        option_3 = driver.find_element_by_xpath("//span[@id='three']//text()")
        option_4 = driver.find_element_by_xpath("//span[@id='four']//text()")
        print(title,option_1,option_2,option_3,option_4)

supplyvan_scraper()
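Answer from a uj5u.com community member:
When the page loads, the MCQ question and option fields contain no text. If you simply click the Next button, the page does fetch data, but every field (question and answers) shows undefined.
You can check it like this -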
driver.find_element(By.XPATH, '//*[@id="next"]').click()
title = driver.find_element(By.XPATH, "//span[@id='quest']").text
option_1 = driver.find_element(By.XPATH, "//span[@id='onee']").text
option_2 = driver.find_element(By.XPATH, "//span[@id='two']").text
option_3 = driver.find_element(By.XPATH, "//span[@id='three']").text
option_4 = driver.find_element(By.XPATH, "//span[@id='four']").text
print(title, option_1, option_2, option_3, option_4)
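A small sketch of the same check, in case it helps to detect the undefined state in code rather than by eye. It reads each field through `.text` instead of an XPath ending in `//text()`, which Selenium rejects because `find_element` has to resolve to an element, not a text node. The helper names `read_fields` and `looks_loaded` are hypothetical, and the assumption that the placeholder is the literal text "undefined" comes only from the description above.

```python
# By.XPATH in Selenium is the plain string "xpath". It is used directly here so
# the sketch needs no imports; with Selenium installed, By.XPATH works the same.
XPATH = "xpath"

# Span ids taken from the snippet above.
FIELD_IDS = {
    "question": "quest",
    "option_1": "onee",
    "option_2": "two",
    "option_3": "three",
    "option_4": "four",
}


def read_fields(driver):
    """Return the visible text of the question and its four options."""
    return {
        name: driver.find_element(XPATH, f"//span[@id='{span_id}']").text
        for name, span_id in FIELD_IDS.items()
    }


def looks_loaded(fields):
    """True when no field is empty or shows the 'undefined' placeholder.

    Assumption: the page renders the literal word 'undefined' in that state.
    """
    return all(
        value.strip() and value.strip().lower() != "undefined"
        for value in fields.values()
    )
```

With a live driver, `looks_loaded(read_fields(driver))` returning False after a click on Next would match the behaviour described above.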
If you want to scrape the data from every page by clicking the Next button, you can try this -
try:
    while True:
        driver.find_element(By.XPATH, '//*[@id="next"]').click()
        title = driver.find_element(By.XPATH, "//span[@id='quest']").text
        option_1 = driver.find_element(By.XPATH, "//span[@id='onee']").text
        option_2 = driver.find_element(By.XPATH, "//span[@id='two']").text
        option_3 = driver.find_element(By.XPATH, "//span[@id='three']").text
        option_4 = driver.find_element(By.XPATH, "//span[@id='four']").text
        print(title, option_1, option_2, option_3, option_4)
except Exception as e:
    print(e)
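The loop above only ends when Selenium raises an exception. A bounded variant might look like the sketch below. It rests on an unverified assumption: once a quiz set is exhausted, clicking Next leaves the last question on screen, so a repeated row signals the end. The names `read_row` and `scrape_until_repeat`, the `max_questions` cap, and the fixed `pause` are all hypothetical. Unlike the loop above, it reads the question already on screen before clicking Next.

```python
import time

XPATH = "xpath"  # same value as Selenium's By.XPATH
NEXT_XPATH = '//*[@id="next"]'
QUESTION_XPATH = "//span[@id='quest']"
OPTION_XPATHS = [f"//span[@id='{i}']" for i in ("onee", "two", "three", "four")]


def read_row(driver):
    """Return (question, option_1, ..., option_4) as a tuple of strings."""
    xpaths = [QUESTION_XPATH, *OPTION_XPATHS]
    return tuple(driver.find_element(XPATH, xp).text for xp in xpaths)


def scrape_until_repeat(driver, max_questions=500, pause=0.5):
    """Collect rows until the page repeats itself or the cap is reached."""
    rows = []
    for _ in range(max_questions):
        row = read_row(driver)
        if rows and row == rows[-1]:
            break  # assumed end of the set: Next no longer changes the page
        rows.append(row)
        driver.find_element(XPATH, NEXT_XPATH).click()
        time.sleep(pause)  # crude wait for the page to update after the click
    return rows
```

On a real page an explicit wait on the question text may be more reliable than the fixed pause; the pause is kept here only to keep the sketch short.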
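If you first click an entry in the left sidebar (Available Quiz Sets), the undefined problem goes away.
So the ideal steps are -
- Click a quiz set option (left sidebar)
- Scrape the questions and click the Next button
The quiz set option button -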
driver.find_element(By.XPATH, '//*[@id="features-wrapper"]/div[1]/div/div[1]/section/div/ul/form[1]/div/li/input')
The form index in this XPath changes for each option. Your page has 70 options, so you can loop over each one and scrape the data.
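The loop over the options could be sketched as follows. It assumes the form indices run from `form[1]` to `form[70]` in the XPath shown above, which has not been checked against the live page. `quiz_set_xpath`, `scrape_all_sets`, and the `scrape_one_set` callback are hypothetical names; the callback stands for whatever per-set scraping routine you use.

```python
import time

XPATH = "xpath"  # same value as Selenium's By.XPATH

# XPath of the quiz set button from above, with the form index left open.
SET_XPATH = (
    '//*[@id="features-wrapper"]/div[1]/div/div[1]/section'
    '/div/ul/form[{index}]/div/li/input'
)


def quiz_set_xpath(index):
    """Build the XPath for the quiz set button at a 1-based form index."""
    return SET_XPATH.format(index=index)


def scrape_all_sets(driver, scrape_one_set, set_count=70, pause=0.5):
    """Click each quiz set in turn and collect what scrape_one_set returns."""
    results = {}
    for index in range(1, set_count + 1):
        driver.find_element(XPATH, quiz_set_xpath(index)).click()
        time.sleep(pause)  # give the page a moment to load the chosen set
        results[index] = scrape_one_set(driver)
    return results
```

Passing the per-set routine in as a callback keeps the set selection separate from the question scraping, so either part can be changed if the page layout turns out to differ.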
