I have the line of code below, which scrapes/prints the 250 stock symbols from page 1.
print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
Then I have this line of code, which clicks the Next button and takes me to page 2.
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="button-1157"]'))).click()
I then scrape/print the next 250 stock symbols from page 2, and keep going through all the pages by repeating these two lines.
print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
Can someone tell me how to write a loop so that I don't have to list these two lines for all 60 pages?
Full code
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
import pandas as pd
import requests
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
ser = Service("./chromedriver.exe")
# Pass the options built above; otherwise they are never applied
browser = driver = webdriver.Chrome(service=ser, options=options)
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
"source": """
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
})
"""
})
driver.execute_cdp_cmd("Network.enable", {})
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36'})
wait = WebDriverWait(driver, 30)
driver.get("https://stockrover.com")
wait.until(EC.visibility_of_element_located((By.XPATH, "/html/body/div[1]/div/section[2]/div/ul/li[2]"))).click()
user = driver.find_element(By.NAME, "username")
password = driver.find_element(By.NAME, "password")
user.clear()
user.send_keys("vibajajo64")
password.clear()
password.send_keys("vincer64")
driver.find_element(By.NAME, "Sign In").click()
wait = WebDriverWait(driver, 30)
print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="button-1157"]'))).click()
print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="button-1157"]'))).click()
print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
Answer from a uj5u.com user:
You can try the following.
As already suggested, you can use a for loop to iterate over the pages.
# Get the number of pages (18 in this case)
pages = driver.find_element(By.XPATH, "//div[contains(@id,'tbtext')][2]").text.split()
num_pages = int(pages[1])
# Iterate over that number of pages
for i in range(num_pages):
    # Print the stock symbols on the current page
    print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
    # Click Next on every page except the last, so the last page is still printed
    if i < num_pages - 1:
        wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="button-1157"]'))).click()
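The `pages[1]` lookup above assumes the paging label splits into exactly the expected words (for example `of 18`). As a slightly more defensive sketch, the count could be pulled out with a regular expression instead. `parse_page_count` is a hypothetical helper name, and the sample label formats are assumptions rather than something taken from the site:

```python
import re


def parse_page_count(label: str) -> int:
    """Return the last integer found in a paging label such as 'of 18'."""
    # Drop thousands separators so '1,200' is read as one number
    numbers = re.findall(r"\d+", label.replace(",", ""))
    if not numbers:
        raise ValueError(f"no page count found in {label!r}")
    # In labels like 'Page 1 of 18' the total is the last number
    return int(numbers[-1])
```

It would be called with the same element text as before, e.g. `num_pages = parse_page_count(driver.find_element(By.XPATH, "//div[contains(@id,'tbtext')][2]").text)`.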
Alternatively, keep extracting the details and clicking the Next button until the Next button is disabled.
try:
    while True:
        # Print the stock symbols
        print([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
        # Click on next page button
        wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="button-1157"]'))).click()
# Catch Exception rather than everything, so Ctrl+C still stops the script
except Exception:
    print("Next button disabled")
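The loop above stops only when waiting for or clicking the button raises an exception. If the site leaves a disabled button visible and simply ignores the click, the loop would keep re-reading the last page. A rough sketch of the same idea with an explicit stop condition and a page cap follows. `collect_pages` and its three callable parameters are hypothetical names introduced for illustration, not part of Selenium:

```python
from typing import Callable, List


def collect_pages(
    read_page: Callable[[], List[str]],
    next_disabled: Callable[[], bool],
    click_next: Callable[[], None],
    max_pages: int = 100,
) -> List[str]:
    """Read rows page by page until Next is disabled or max_pages is reached."""
    rows: List[str] = []
    for _ in range(max_pages):
        # Always read the current page first, so the last page is included
        rows.extend(read_page())
        if next_disabled():
            break
        click_next()
    return rows
```

With Selenium, `read_page` could wrap the list comprehension used above, `click_next` could wrap the `wait.until(...).click()` call, and `next_disabled` could inspect `driver.find_element(By.ID, "button-1157").get_attribute("class")` for whatever marker the site uses on a disabled button. Which class name that is would need to be checked in the browser's developer tools.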
Update: to store all the stocks in a single list.
stocks_list = []
try:
    while True:
        # Add this page's stock symbols to the list
        stocks_list.extend([my_elem.text for my_elem in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id^='gridview-1070-record']")))])
        # Click on next page button
        wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="button-1157"]'))).click()
# Catch Exception rather than everything, so Ctrl+C still stops the script
except Exception:
    print("Next button disabled")
print(stocks_list)  # Prints entire list of stocks
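Once the symbols are in `stocks_list`, they can be de-duplicated and saved, which also guards against a page being read twice. This is a minimal sketch using only the standard library. `save_symbols` and the `symbol` column header are hypothetical, and it assumes each list entry holds one row's text:

```python
import csv
from typing import Iterable, List, TextIO


def save_symbols(symbols: Iterable[str], out: TextIO) -> List[str]:
    """Write unique, non-empty symbols to out as a one-column CSV, keeping order."""
    cleaned = (s.strip() for s in symbols)
    # dict.fromkeys removes duplicates while preserving first-seen order
    unique = list(dict.fromkeys(s for s in cleaned if s))
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["symbol"])
    writer.writerows([s] for s in unique)
    return unique
```

To write a file, open it with `open("stocks.csv", "w", newline="")` and pass the file object together with `stocks_list`. The file name here is only an example.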
