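I have some simple Selenium scraping code that returns all the search results, but when I run the for loop it raises an error. Message: invalid argument: 'url' must be a string
(Session info: chrome=93.0.4577.82)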
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_path = r'C:\Windows\chromedriver.exe'
driver = webdriver.Chrome(chrome_path)
driver.get("https://www.youtube.com/results?search_query=python course")
user_data = driver.find_elements_by_xpath('//*[@id="video-title"] ')
links = []
for i in user_data:
    links.append(i.get_attribute('href'))
print(links)
wait = WebDriverWait(driver, 10)
for x in links:
    driver.get(x)
    v_id = x.strip('https://www.youtube.com/watch?v=')
    # //*[@id="video-title"]/yt-formatted-string
    v_title = wait.until(EC.presence_of_element_located(
        (By.CSS_SELECTOR, "h1.title yt-formatted-string"))).text
I would appreciate some help. How can I avoid this error? Thanks.
Answer from a uj5u.com community member:
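You are trying to collect "user_data" with
user_data = driver.find_elements_by_xpath('//*[@id="video-title"]')
immediately after opening the YouTube URL with
driver.get("https://www.youtube.com/results?search_query=python course")
At that point the results have not finished loading, so "user_data" can be empty or incomplete, and get_attribute('href') returns None for any element that has no href yet. That is why, when you iterate over "links" with
for x in links:
an individual "x" can be a "NoneType" object rather than a string. To fix this, add a wait/delay between
driver.get("https://www.youtube.com/results?search_query=python course")
and
user_data = driver.find_elements_by_xpath('//*[@id="video-title"]')
The simplest approach is to add a fixed delay there, like this: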
import time  # needed for time.sleep
driver.get("https://www.youtube.com/results?search_query=python course")
time.sleep(8)
user_data = driver.find_elements_by_xpath('//*[@id="video-title"]')
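A fixed sleep always waits the full duration, whether the page is ready sooner or still not ready afterwards. The idea of polling for a condition instead can be sketched in plain Python. This is only an illustration: `wait_until` and `results_loaded` are hypothetical names and not part of Selenium, and the "page" here is simulated by a counter.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Stand-in for "the search results have loaded": truthy from the third check on.
calls = {"n": 0}


def results_loaded():
    calls["n"] += 1
    return ["video-1", "video-2"] if calls["n"] >= 3 else []


print(wait_until(results_loaded, timeout=2.0, poll=0.01))
```

The call returns as soon as the condition holds, so the total wait is only as long as it needs to be.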
However, the recommended approach is to use an explicit wait driven by expected conditions, like this:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_path = r'C:\Windows\chromedriver.exe'
driver = webdriver.Chrome(chrome_path)
wait = WebDriverWait(driver, 20)
driver.get("https://www.youtube.com/results?search_query=python course")
# Wait until the first result title is visible before collecting the elements.
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="video-title"]')))
# Add a short pause so that all the videos load.
time.sleep(0.5)
user_data = driver.find_elements_by_xpath('//*[@id="video-title"]')
links = []
for i in user_data:
    links.append(i.get_attribute('href'))
print(links)
for x in links:
    driver.get(x)
    v_id = x.strip('https://www.youtube.com/watch?v=')
    # //*[@id="video-title"]/yt-formatted-string
    v_title = wait.until(EC.visibility_of_element_located(
        (By.CSS_SELECTOR, "h1.title yt-formatted-string"))).text
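One detail in the loop is worth a hedged note: str.strip treats its argument as a set of characters to remove from both ends, not as a prefix, so x.strip('https://www.youtube.com/watch?v=') can also remove characters that belong to the video ID. Below is a minimal sketch of one alternative, assuming the links have the usual watch?v=<id> form. The helper name `extract_video_id` and the sample URL are hypothetical and not part of the original code.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


# Hypothetical sample URL, used only for illustration.
sample = "https://www.youtube.com/watch?v=show_me_how"

print(extract_video_id(sample))                          # show_me_how
print(sample.strip("https://www.youtube.com/watch?v="))  # _me_
```

In the second call, strip() also removes the leading "show" and the trailing "how", because every one of those letters appears somewhere in the argument string.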
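Also, you should use visibility_of_element_located rather than presence_of_element_located. presence_of_element_located only waits until the element first exists in the DOM, and at that point its content, such as its text, may not be ready yet.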
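If some entries in links still come back as None after waiting (get_attribute('href') returns None for an element that has no href attribute), passing them to driver.get reproduces the 'url' must be a string error. Here is a minimal sketch of a guard, assuming it is acceptable to skip such entries. `keep_valid_links` is a hypothetical helper name, and the sample list is made up for illustration.

```python
def keep_valid_links(hrefs):
    """Keep only non-empty string hrefs, preserving order and dropping duplicates."""
    seen = set()
    valid = []
    for href in hrefs:
        if isinstance(href, str) and href and href not in seen:
            seen.add(href)
            valid.append(href)
    return valid


# Made-up sample: the None entries stand in for elements without an href.
raw = [
    "https://www.youtube.com/watch?v=abc",
    None,
    "",
    "https://www.youtube.com/watch?v=abc",
    None,
]
print(keep_valid_links(raw))
```

In the scraping loop this could be used as for x in keep_valid_links(links): so that driver.get(x) only ever receives a string.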
Please credit the source when reposting. Link to this article: https://www.uj5u.com/shujuku/327087.html