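I am scraping an anime website for a project, but I get an error when I try to read the src. The src is nested inside a source tag. The screenshot and code are below.
Sample screenshot
Code: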
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import re
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
#launch url
url = "https://bestdubbedanime.com/Demon-Slayer-Kimetsu-no-Yaiba/26"
# create a new Firefox session
driver = webdriver.Firefox()
# driver.implicitly_wait(30)
driver.get(url)
# python_button = driver.find_element_by_class_name('playostki') #FHSU
# python_button.click() #click fhsu link
soup1 = BeautifulSoup(driver.page_source, 'html.parser')
video = soup1.find('video', id='my_video_1_html5_api')
# video = driver.find_element_by_id('my_video_1_html5_api')
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".playostki"))).click()
# quit() must actually be called (with parentheses) to end the session and close the browser
driver.quit()
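Answer from a uj5u.com user:
You are not getting the src attribute because it only appears after the video has been clicked. You have to click the video first and then read the "src" attribute from the element.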
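To illustrate why parsing the page before the click finds nothing, here is a minimal sketch that uses only the standard library. The two HTML snapshots are hypothetical stand-ins (the real page's markup may well differ), and `SourceFinder` and `find_sources` are names made up for this example; only the id `my_video_1_html5_api` and the class `playostki` come from the question.

```python
from html.parser import HTMLParser

# Hypothetical snapshots of the player markup; the real page's HTML may differ.
BEFORE_CLICK = '<div class="playostki"><img src="poster.jpg"></div>'
AFTER_CLICK = (
    '<video id="my_video_1_html5_api">'
    '<source src="https://example.com/video.mp4" type="video/mp4">'
    '</video>'
)


class SourceFinder(HTMLParser):
    """Collect the src of every <source> nested in the target <video>."""

    def __init__(self, video_id):
        super().__init__()
        self.video_id = video_id
        self.inside_video = False
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video" and attrs.get("id") == self.video_id:
            self.inside_video = True
        elif tag == "source" and self.inside_video and "src" in attrs:
            self.sources.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "video":
            self.inside_video = False


def find_sources(html, video_id="my_video_1_html5_api"):
    """Return the src values found in the given HTML snapshot."""
    finder = SourceFinder(video_id)
    finder.feed(html)
    finder.close()
    return finder.sources


print(find_sources(BEFORE_CLICK))  # []
print(find_sources(AFTER_CLICK))   # ['https://example.com/video.mp4']
```

With a live browser the same ordering applies: click first, then look for the source element, which is what the Selenium lines that follow do.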
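# Reuses the imports and the driver created in the question's code.
driver.maximize_window()
driver.get("https://bestdubbedanime.com/Demon-Slayer-Kimetsu-no-Yaiba/26")
# Click the player overlay first; the src only shows up after that click.
WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='playostki']//img"))).click()
# Wait for the <source> inside the video element, then read its src attribute.
print(WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#my_video_1_html5_api > source"))).get_attribute("src"))
driver.quit()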
Output:
https://bestdubbedanime.com/xz/api/v.php?u=eVcxb0ZCUEMraFd1Vi9pM2xqWUhtbXZMWjZ0Mlpoc1U0Tmhqc2VFcVViQUc3VUVhR0pZV1EvaW1nY1duaXBMeXYvUUY4RG5ab3p4MEtEMUFHRmVaN0taVG9sY3ZVcTRoeDZoVHhWLzdiYjQ5UStNN2FYSjJBSWNKL0t5S1hLNGEyVlZqV1BYQ2MwaCsyNWcvak1Db01EMnNtWGwwTTBBVld4MkNER0V3eGNCRXJ0cEY4RHFPclhwbTJpWFBPSmJI
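The `WebDriverWait(...).until(...)` calls in the answer are explicit waits: the condition is polled until it returns something truthy or the timeout runs out. The sketch below shows that polling idea in plain Python. It is a simplified stand-in, not Selenium's actual implementation, and `wait_until`, `WaitTimeout`, and `source_src` are hypothetical names used only for illustration.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for a page whose <source> element only appears on the third check.
attempts = {"count": 0}


def source_src():
    attempts["count"] += 1
    return "https://example.com/video.mp4" if attempts["count"] >= 3 else None


print(wait_until(source_src, timeout=2.0, poll=0.01))
```

Returning the truthy value itself, rather than just `True`, is what lets the answer chain `.click()` or `.get_attribute("src")` directly onto the result of the wait.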
