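I can't solve the following problem. I'm trying to collect data from this web page: https://localhelp.healthcare.gov/#/results?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM
My approach is to use the Selenium Chrome driver to collect data for every healthcare agent on the page, but I don't know how to iterate over each record and append its data to the lists I've created. So far I can collect the data for one record; the problem is my loop. How do I identify each record as an agent and add it to my DataFrame for output? Here is my code: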
from selenium import webdriver # connect python with webbrowser-chrome
import time
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome('C:/Users/picka/Documents/chromedriver.exe')
driver.maximize_window()
url = 'https://localhelp.healthcare.gov/#/results?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM'
name = []
phone = []
email = []
def go_to_network():
    driver.get(url)
    for agent in driver.find_elements_by_xpath('class.qa-flh-results-list'):
        get_name = (WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.qa-flh-resource-name"))).text)
        get_phone = (WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.qa-flh-resource-phone"))).text)
        get_email = (WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.ds-u-overflow--hidden.ds-u-truncate.ds-u-display--inline-block"))).text)
        name.append(get_name)
        phone.append(get_phone)
        email.append(get_email)
go_to_network()
record_output = {'Agent Name': name, 'Phone': phone, 'Email': email}
df = pd.DataFrame(record_output)
df.to_csv(r'C:\Users\picka\Documents\Dev\Reports\Agent-data.csv', header=True, index=False)
print(df)
Answer from a uj5u.com user:
To extract and print all the agent names, phone numbers, and emails with Selenium, you can use a list comprehension, inducing WebDriverWait for visibility_of_all_elements_located(), with the following locator strategy:
Code block:
driver.get('https://localhelp.healthcare.gov/#/results?q=UTAH&lat=0&lng=0&city=&state=UT&zip_code=&mp=FFM')
get_name = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.qa-flh-resource-name")))]
get_phone = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.qa-flh-resource-phone")))]
get_email = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.ds-u-overflow--hidden.ds-u-truncate.ds-u-display--inline-block")))]
for i,j,k in zip(get_name, get_phone, get_email):
    print(f"{i}'s' phone number is {j} and email is {k}")
driver.quit()
Console output:
Wesley Elton's' phone number is (801) 404 - 2424 and email is [email protected]
Raquel Bell's' phone number is (801) 842 - 2870 and email is [email protected]
Brandon Berglund's' phone number is (801) 981 - 9414 and email is [email protected]
Steven Cochran's' phone number is (801) 800 - 8360 and email is [email protected]
victoria dang's' phone number is (801) 462 - 5190 and email is [email protected]
Dan Jessop's' phone number is (435) 232 - 8833 and email is [email protected]
Billy Gerdts's' phone number is (801) 280 - 1162 and email is [email protected]
Michael Saldana's' phone number is (801) 879 - 1032 and email is [email protected]
Brandon Johnson's' phone number is (435) 249 - 0725 and email is [email protected]
Matthew Selph's' phone number is (801) 918 - 3945 and email is [email protected]
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
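The question also asks how to treat each record as one agent and get the result into tabular output. The answer above builds three independent lists and pairs them with zip(), which stops at the shortest list and can pair the wrong values if some agent has no email or phone link. A minimal sketch of a per-record alternative follows, assuming each agent sits inside its own container element (a "card") whose selector is not confirmed here. The names extract_record and records_to_csv are illustrative, not part of Selenium, and the standard csv module stands in for pandas so the sketch runs without third-party packages.

```python
import csv
import io

# Selenium's By.CSS_SELECTOR is the plain string "css selector"; it is spelled
# out here so the helpers can be exercised without a browser.
CSS = "css selector"

# Field selectors taken from the answer above.
FIELD_SELECTORS = {
    "Agent Name": "span.qa-flh-resource-name",
    "Phone": "a.qa-flh-resource-phone",
    "Email": "a.ds-u-overflow--hidden.ds-u-truncate.ds-u-display--inline-block",
}


def extract_record(card):
    """Read every field relative to one result card; a missing field becomes ''.

    `card` is assumed to behave like a Selenium WebElement: it must offer
    find_elements(by, value) returning a (possibly empty) list of elements
    that each have a .text attribute.
    """
    record = {}
    for field, selector in FIELD_SELECTORS.items():
        matches = card.find_elements(CSS, selector)  # empty list when absent
        record[field] = matches[0].text.strip() if matches else ""
    return record


def records_to_csv(records):
    """Serialize the records to CSV text, one row per agent."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(FIELD_SELECTORS), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

With a live driver, the cards would come from driver.find_elements(By.CSS_SELECTOR, ...) using whatever selector matches a single result entry on the page, and records = [extract_record(card) for card in cards] gives one dictionary per agent. Passing that list to pd.DataFrame(records) should yield the DataFrame the question asks for, with the same column names as the original record_output.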
Tags: python-3.x selenium loops list-comprehension webdriverwait