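I am using Selenium to extract data from the following page.
Page URL: www2.miami-dadeclerk.com/cef/CitationSearch.aspx
Search by Folio with the number 0131230371470, then click the first result.
I use the following code to extract some of the information: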
templist = []
status = driver.find_element_by_xpath('.//*[@id="lblCitationHeader"]').text
total_due = driver.find_element_by_xpath('.//*[@id="lblCitationHeader"]').text
issue_dept = driver.find_element_by_xpath('.//*[@id="form1"]/div[4]/div[9]/div/div/div[2]/table/tbody/tr[5]/td[2]').text
lien_placed = driver.find_element_by_xpath('.//*[@id="lblLienPlaced"]').text
Table_dict = {
'Status': status,
'Total Due': total_due,
'Issuing Department': issue_dept,
'Lien_Placed': lien_placed
}
templist.append(Table_dict)
df = pd.DataFrame(templist)
The result is as follows:
Status Total Due Issuing Department Lien_Placed
0 Citation No.: 2010 - S001916 Issue Date: 1/ ... Citation No.: 2010 - S001916 Issue Date: 1/ ... 05 ANIMAL SERVICES DEPARTMENT (305) 629-7387
Here, the entire text of lblCitationHeader ends up under both Status and Total Due.
To separate the two values, I copied their XPaths:
Status: //*[@id="lblCitationHeader"]/text()[3]
Total Due: //*[@id="lblCitationHeader"]/text()[4]
When I use these in the code:
status = driver.find_element_by_xpath('.//*[@id="lblCitationHeader"]/text()[3]').text
I get the following error:
Message: invalid selector: Unable to locate an element with the xpath expression .//*[@id="lblCitationHeader"]/text()[3]"] because of the following error:
SyntaxError: Failed to execute 'evaluate' on 'Document': The string './/*[@id="lblCitationHeader"]/text()[3]"]' is not a valid XPath expression.
(Session info: chrome=96.0.4664.110)
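I know that XPath in Selenium is meant to locate elements, not text. However, I cannot find the element that holds this text so that I can return it.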

The data I want to extract is:
Status, Total Due, Issuing Department, Lien Placed
Answer from a uj5u.com user:
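For the current record, the STATUS, TOTAL DUE and ISSUING DEPT fields each hold a value. To extract them, you need to induce WebDriverWait for visibility_of_element_located(), and you can use the following locator strategies:
Code block: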
driver.get("https://www2.miami-dadeclerk.com/cef/CitationSearch.aspx")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.LINK_TEXT, "Folio"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#txtFolioNumber"))).send_keys("0131230371470")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#btnFolioSearch"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "table tbody>tr>td>a>span"))).click()
status = driver.execute_script('return arguments[0].childNodes[5].textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span#lblCitationHeader")))).strip()
total_due = driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span#lblCitationHeader")))).strip()
issue_dept = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[contains(., 'Issuing Department:')]//following::td[1]/span"))).text
print(f"{status}--{total_due}--{issue_dept}")
Console output:
* DEPARTMENT CLOSED *--$0.00--05 ANIMAL SERVICES DEPARTMENT (305) 629-7387
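Some background on why the question's attempt failed: the expression quoted in the error message ends with a stray `"]`, which is what triggers the SyntaxError. Even without it, Selenium's element lookups can only return element nodes, so an XPath ending in `/text()[3]` is rejected as an invalid selector. That is why the code above reads the text nodes through `execute_script`. The fixed index `childNodes[5]` depends on the exact markup of the span, so a slightly more defensive variant is sketched below: it collects every non-empty direct text node in one call and leaves the choice to Python. `direct_text_nodes` is a hypothetical helper name, and the assumption that the values are direct text-node children of `span#lblCitationHeader` comes only from the output above, not from inspecting the live page.

```python
# JavaScript run in the browser: return the trimmed, non-empty text of every
# text node that is a direct child of the element passed as arguments[0].
_TEXT_NODES_JS = """
return Array.from(arguments[0].childNodes)
    .filter(function (n) { return n.nodeType === Node.TEXT_NODE; })
    .map(function (n) { return n.textContent.trim(); })
    .filter(function (t) { return t.length > 0; });
"""


def direct_text_nodes(driver, element):
    """Return the non-empty direct text nodes of `element` as a list of str.

    Hypothetical helper: `driver` is any object that offers Selenium's
    `execute_script(script, *args)` method, and `element` is a WebElement.
    """
    return list(driver.execute_script(_TEXT_NODES_JS, element))
```

With the header element located through WebDriverWait as in the code block above, `direct_text_nodes(driver, header)` should return the values in document order. Note that whitespace-only nodes are dropped, so the positions in this list need not match the `text()[3]` and `text()[4]` indices from the question.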
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
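To get back to the one-row table the question was building, the extracted strings can be collected into a dict that uses the question's column names. The sketch below is only an illustration: `build_row` is a hypothetical helper, the sample values are the ones printed in the console output above, and `Lien_Placed` is left empty because the answer does not extract it.

```python
def build_row(status, total_due, issue_dept, lien_placed=""):
    """Return one table row keyed by the column names used in the question."""
    values = {
        "Status": status,
        "Total Due": total_due,
        "Issuing Department": issue_dept,
        "Lien_Placed": lien_placed,
    }
    # Collapse runs of whitespace (including newlines) and trim each value.
    return {column: " ".join(str(value).split()) for column, value in values.items()}


# Sample values taken from the console output shown in the answer.
rows = [
    build_row(
        "* DEPARTMENT CLOSED *",
        "$0.00",
        "05 ANIMAL SERVICES DEPARTMENT (305) 629-7387",
    )
]
print(rows[0])
```

A list of such dicts can be passed to `pd.DataFrame(rows)`, the same way the question passes `templist`.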
