Python Selenium: iterate over a table of links, clicking each link

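I know this question has been asked before, but I am still struggling to get it to work.

The web page has a table of links, and I want to iterate over it, clicking each link in turn.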


This is my code so far:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")

try:
    # wait until the table container is present in the DOM
    element = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CLASS_NAME, "table-scroll")))

    # one entry per table row
    table = element.find_elements(By.XPATH, "//table//tbody/tr")

    for row in table[1:]:
        print(row.get_attribute('innerHTML'))
        # link.click()

finally:
    driver.close()

Sample output:

            <td>FOUR</td>
            <td><a href="/factsheets/4IMPRINT-GROUP/GB0006640972-GBP/?id=GB0006640972GBP&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">4imprint Group plc</a></td>
            <td>Media &amp; Publishing</td>
        

            <td>888</td>
            <td><a href="/factsheets/888-HOLDINGS/GI000A0F6407-GBP/?id=GI000A0F6407GBP&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">888 Holdings</a></td>
            <td>Hotels &amp; Entertainment Services</td>
        

            <td>ASL</td>
            <td><a href="/factsheets/ABERFORTH-SMALLER-COMPANIES-TRUST/GB0000066554-GBP/?id=GB0000066554GBP&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">Aberforth Smaller Companies Trust</a></td>
            <td>Collective Investments</td>

How do I click an href and then move on to the next one?

Many thanks.

Edit: I went with this solution (Prophet's solution with a few small tweaks):

from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
from selenium.webdriver.common.action_chains import ActionChains


driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")
actions = ActionChains(driver)
# close the cookies banner
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "ensCloseBanner"))).click()
# wait for the first link in the table
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
# extra wait so that all the links are loaded
time.sleep(1)
# get the total number of links
links = driver.find_elements(By.XPATH, '//table//tbody/tr/td/a')

for index in range(len(links)):
    try:
        # get the links again after returning to the initial page
        links = driver.find_elements(By.XPATH, '//table//tbody/tr/td/a')
        # scroll to the n-th link; it may be outside the initially visible area
        actions.move_to_element(links[index]).perform()
        links[index].click()
        # scrape the data on the new page, then go back with the following command
        driver.execute_script("window.history.go(-1)")  # alternatively: driver.back()
        WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
        time.sleep(2)
    except StaleElementReferenceException:
        # skip this link if its reference went stale
        pass
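
Note that the `except ... pass` in the loop above silently skips any link whose reference went stale. A bounded retry is one possible alternative. What follows is only a sketch of that idea, written so it runs without a browser: `retry`, `flaky_click` and the stand-in `StaleElementError` class are hypothetical names, not part of Selenium. In real code the exception would be `StaleElementReferenceException` and the action would re-locate the link and click it.

```python
import time


class StaleElementError(Exception):
    """Stand-in for selenium.common.exceptions.StaleElementReferenceException."""


def retry(action, retry_on, attempts=3, delay=0.0):
    """Call action() until it succeeds or the attempts run out.

    The last exception is re-raised, so a failure is never swallowed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay)


# Demo: an action that fails twice with a stale reference, then succeeds.
calls = {"count": 0}


def flaky_click():
    calls["count"] += 1
    if calls["count"] < 3:
        raise StaleElementError("link went stale")
    return "clicked"


result = retry(flaky_click, retry_on=StaleElementError, attempts=3)
print(result, "after", calls["count"], "attempts")
```

Because the final failure is re-raised, a link that never becomes clickable shows up as an error instead of being dropped from the results.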

Answer:

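To do what you want here, you first need to close the cookie banner at the bottom of the page.
Then you can iterate over the links in the table.
Because clicking a link opens a new page, after scraping the data there you have to go back to the main page and get the next link. You cannot simply collect all the link elements into a list and then iterate over that list: once you navigate to another page, every element reference Selenium obtained on the initial page becomes stale.
Your code could look something like this: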

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
import time


driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")
actions = ActionChains(driver)
# close the cookies banner
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "ensCloseBanner"))).click()
# wait for the first link in the table
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
# extra wait so that all the links are loaded
time.sleep(1)
# get the total number of links
links = driver.find_elements(By.XPATH, '//table//tbody/tr/td/a')
for index in range(len(links)):
    # get the links again after returning to the initial page
    links = driver.find_elements(By.XPATH, '//table//tbody/tr/td/a')
    # scroll to the n-th link; it may be outside the initially visible area
    actions.move_to_element(links[index]).perform()
    links[index].click()
    # scrape the data on the new page, then go back with the following command
    driver.execute_script("window.history.go(-1)")  # alternatively: driver.back()
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
    time.sleep(1)
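
A different technique from the click-and-go-back loop: element references go stale after navigation, but plain URL strings do not, so the hrefs could be read once up front and each page opened directly. The sketch below shows only the URL-building part, using the standard library on two rows of the sample output from the question. `HrefCollector`, `collect_urls` and `sample_rows` are hypothetical names. With Selenium the hrefs would presumably come from `get_attribute('href')` on each link and each URL would be opened with `driver.get(url)`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://www.fidelity.co.uk/shares/ftse-350/"


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_urls(rows_html, base_url=BASE_URL):
    """Return absolute URLs for all links found in the given row fragments."""
    parser = HrefCollector()
    for fragment in rows_html:
        parser.feed(fragment)
    parser.close()
    # The hrefs in the table are site-relative, so resolve them against the page URL.
    return [urljoin(base_url, href) for href in parser.hrefs]


# Two rows copied from the sample output in the question.
sample_rows = [
    '<td>FOUR</td>'
    '<td><a href="/factsheets/4IMPRINT-GROUP/GB0006640972-GBP/?id=GB0006640972GBP'
    '&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">'
    '4imprint Group plc</a></td>',
    '<td>888</td>'
    '<td><a href="/factsheets/888-HOLDINGS/GI000A0F6407-GBP/?id=GI000A0F6407GBP'
    '&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">'
    '888 Holdings</a></td>',
]

urls = collect_urls(sample_rows)
for url in urls:
    print(url)
```

The trade-off is that opening a URL directly skips whatever the click itself would have triggered on the page, so this only fits when the target page is all that matters.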

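Answer:

You basically have to do the following:

1. Click the cookie button, if it is present.
2. Get all the links on the page.
3. Iterate over the list of links: scroll to each link, click it, then navigate back to the original page.

Code: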

# driver_path is the path to your chromedriver executable
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
wait = WebDriverWait(driver, 30)

driver.get("https://www.fidelity.co.uk/shares/ftse-350/")

# close the cookie banner if it shows up
try:
    wait.until(EC.element_to_be_clickable((By.ID, "ensCloseBanner"))).click()
    print('Clicked the cookies button')
except Exception:
    print('Could not click the cookies button')

driver.execute_script("window.scrollTo(0, 750)")

try:
    all_links = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//table//tbody/tr/td/a")))
    print("We have got to deal with", len(all_links), 'links')
    for j in range(len(all_links)):
        # re-locate the links on every pass; the old references are stale after navigating
        links = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//table//tbody/tr/td/a")))
        driver.execute_script("arguments[0].scrollIntoView(true);", links[j])
        time.sleep(1)
        links[j].click()
        # here write the code to scrape something once the click is performed
        time.sleep(1)
        driver.execute_script("window.history.go(-1)")
        print(j + 1)
except Exception:
    print('Bot could not execute all the links properly')

Imports:

import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

PS: To handle stale element references, you have to locate the list of web elements again inside the loop.
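
To illustrate why re-locating inside the loop matters, here is a small browser-free simulation. Everything in it is a stand-in written for this sketch (`FakePage`, `FakeLink`, `StaleError`, `click_each_link`); none of it is Selenium API. It only models the one behaviour that matters: a navigation invalidates every reference obtained before it.

```python
class StaleError(Exception):
    """Stand-in for Selenium's StaleElementReferenceException."""


class FakeLink:
    """A link reference that is only valid for the page load it came from."""

    def __init__(self, page, name, generation):
        self.page = page
        self.name = name
        self.generation = generation

    def click(self):
        if self.generation != self.page.generation:
            raise StaleError(self.name)
        self.page.clicked.append(self.name)
        # Clicking navigates away and back, which invalidates older references.
        self.page.generation += 1


class FakePage:
    """A page whose links must be looked up again after every navigation."""

    def __init__(self, names):
        self.names = list(names)
        self.generation = 0
        self.clicked = []

    def find_links(self):
        return [FakeLink(self, name, self.generation) for name in self.names]


def click_each_link(page):
    """Click every link, looking the list up again before each click."""
    total = len(page.find_links())
    for index in range(total):
        links = page.find_links()  # fresh references for this pass
        links[index].click()
    return page.clicked


names = ["4imprint Group plc", "888 Holdings", "Aberforth Smaller Companies Trust"]

# Re-locating on every pass reaches all three links.
fresh_result = click_each_link(FakePage(names))

# Reusing one cached list fails on the second click.
cached_page = FakePage(names)
cached_links = cached_page.find_links()
cached_links[0].click()
try:
    cached_links[1].click()
    cached_failed = False
except StaleError:
    cached_failed = True

print(fresh_result)
print("cached list failed:", cached_failed)
```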
