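See the image below for reference. I want to scrape the disease name, the URL associated with each disease, and the disease's thumbnail image. I cannot iterate over the div [js-content-images] tag!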
import requests
from bs4 import BeautifulSoup
URL = "https://dermnetnz.org/image-library"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
job_elements = soup.find("div", class_="flex [ js-sticky-container ]")
job2 = job_elements.find_all("div", class_="imageList__group")
for job_element in job2:
    print(job_element)
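Answer from a uj5u.com user:
You don't need bs4 or selenium to scrape this page. If you open the Network tab in your browser's developer tools, you will find a JSON URL. Send a request to that URL and capture the JSON response.
![Beautiful Soup | Unable to iterate over div [js-content-images] class tags](https://img.uj5u.com/2022/04/10/9dd043354e784344bad562ae76584cdb.png)
![Beautiful Soup | Unable to iterate over div [js-content-images] class tags](https://img.uj5u.com/2022/04/10/f643e517951443869c81b67e56626261.png)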
https://dermnetnz.org/image-library/imagesJson
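Code:
import requests

# Request the JSON endpoint found in the Network tab
res = requests.get("https://dermnetnz.org/image-library/imagesJson")
result = res.json()
for r in result:
    print("Diseases Name : " + r['name'])
    print("Image : " + r['thumbnail'])
    # r['url'] is site-relative, so prefix it with the site root
    print("Url : " + "https://dermnetnz.org" + r['url'])
Output: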
Diseases Name : Roseola images
Image : https://dermnetnz.org/assets/Uploads/roseola-001__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/roseola-images/?stage=Live
Diseases Name : Dermatomyositis images
Image : https://dermnetnz.org/assets/Uploads/dermatomyositis-eyelids-4__FocusFillWzE1MCwxMTAsIngiLDhd.jpg
Url : https://dermnetnz.org/topics/dermatomyositis-images/?stage=Live
Diseases Name : Solar keratosis affecting the face images
Image : https://dermnetnz.org/assets/Uploads/248__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-face-images/?stage=Live
Diseases Name : Actinic keratosis affecting the face images
Image : https://dermnetnz.org/assets/Uploads/248__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-face-images/?stage=Live
Diseases Name : Solar keratosis affecting the hand images
Image : https://dermnetnz.org/assets/Uploads/393__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-affecting-the-hand-images/?stage=Live
Diseases Name : Solar keratosis affecting the legs and feet images
Image : https://dermnetnz.org/assets/Uploads/478__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-leg-and-foot-images/?stage=Live
Diseases Name : Solar keratosis affecting the scalp images
Image : https://dermnetnz.org/assets/Uploads/418__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-scalp-images/?stage=Live
Diseases Name : Solar keratosis on the nose images
Image : https://dermnetnz.org/assets/Uploads/sks-nose3-s__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-on-the-nose-images/?stage=Live
Diseases Name : Solar keratosis treated with imiquimod images
Image : https://dermnetnz.org/assets/Uploads/3723__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/actinic-keratosis-imiquimod-images/?stage=Live
Diseases Name : Autoimmune alopecia images
Image : https://dermnetnz.org/assets/Uploads/1323__FocusFillWzE1MCwxMTAsInkiLDIzXQ.jpg
Url : https://dermnetnz.org/topics/alopecia-areata-images/?stage=Live
Diseases Name : Hypomelanotic malignant melanoma images
Image : https://dermnetnz.org/assets/Uploads/12a-amelanotic-melanoma__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/amelanotic-melanoma-images/?stage=Live
Diseases Name : Epiloia images
Image : https://dermnetnz.org/assets/Uploads/angiofibromas-19-s__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/tuberous-sclerosis-images/?stage=Live
Diseases Name : Perleche images
Image : https://dermnetnz.org/assets/Uploads/perleche13-s__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/angular-cheilitis-images/?stage=Live
Diseases Name : Besnier prurigo images
Image : https://dermnetnz.org/assets/Uploads/atopic26-s__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/atopic-dermatitis-images/?stage=Live
Diseases Name : Atopic eczema images
Image : https://dermnetnz.org/assets/Uploads/atopic26-s__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/atopic-dermatitis-images/?stage=Live
Diseases Name : Atypical melanocytic naevus
Image : https://dermnetnz.org/assets/Uploads/604__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/topics/atypical-naevus-images/?stage=Live
Diseases Name : Bacteria images
Image : https://dermnetnz.org/assets/Uploads/syph6-s-2__FocusFillWzE1MCwxMTAsInkiLDFd.jpg
Url : https://dermnetnz.org/image-catalogue/bacterial-skin-infection-images/?stage=Live
... and so on
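If you want to keep the three fields rather than print them, the records can be collected into rows and written out as CSV. The following is a minimal sketch, assuming each record carries the `name`, `thumbnail` and `url` keys used in the answer above; `normalize`, `to_csv` and the hand-written `sample` record are hypothetical names introduced only for illustration.

```python
import csv
import io
from urllib.parse import urljoin

BASE_URL = "https://dermnetnz.org"  # site root, as used in the answer above


def normalize(records, base_url=BASE_URL):
    """Return one dict per record, with links resolved to absolute URLs."""
    rows = []
    for record in records:
        rows.append({
            "name": record["name"],
            # urljoin leaves absolute URLs untouched and resolves relative ones
            "thumbnail": urljoin(base_url, record["thumbnail"]),
            "url": urljoin(base_url, record["url"]),
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text (hypothetical helper for illustration)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "thumbnail", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hand-written sample shaped like the first record in the output above;
# in real use, pass the list returned by res.json() instead.
sample = [{
    "name": "Roseola images",
    "thumbnail": "https://dermnetnz.org/assets/Uploads/roseola-001__FocusFillWzE1MCwxMTAsInkiLDFd.jpg",
    "url": "/topics/roseola-images/?stage=Live",
}]

rows = normalize(sample)
print(to_csv(rows))
```

Using `urljoin` instead of string concatenation means the code behaves the same whether a field arrives as a relative path or as a full URL.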
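Answer from a uj5u.com user:
You can't find them because those elements are loaded via JavaScript. This is a dynamic website. You can see this by blocking JavaScript execution: the images will be missing from the page.
You have two options: reverse-engineer what the JavaScript does, or render the JavaScript with a browser engine.
For the second option there is Selenium, installed with pip install selenium. See the Selenium installation instructions for your system, because you also need to install a driver such as Geckodriver or ChromeDriver.
You may then need to adjust the following code slightly to make it work for you, but it finds the first element you want, and it is simple: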
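To illustrate the point, here is a small sketch using only the standard library's `html.parser`, so it runs without extra packages. The `STATIC_HTML` string is a hypothetical stand-in for what a plain HTTP request might return before any script runs, not the site's actual markup, and `ClassCounter` and `count_class` are names made up for this example.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in the HTML text carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical static response: the container exists, but the items
# would only be inserted later by the script.
STATIC_HTML = """
<div class="flex [ js-sticky-container ]">
  <div class="imageList__group"></div>
  <script src="/hypothetical/app.js"></script>
</div>
"""

print(count_class(STATIC_HTML, "imageList__group"))        # the container
print(count_class(STATIC_HTML, "imageList__group__item"))  # the items
```

A parser that only sees the static response can find the container but none of the items, which is the symptom described in the question.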
# setting up
from selenium import webdriver
from selenium.webdriver.common.by import By

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)

# your own application
driver.get('https://dermnetnz.org/image-library')
# Selenium 4 locator API; the find_element_by_* helpers were removed in Selenium 4.3
element = driver.find_element(By.CLASS_NAME, 'imageList__group__item')
img_element = element.find_element(By.TAG_NAME, 'img')
# here is the link:
print(element.get_attribute('href'))
# here is the text:
print(element.text)
# here is the img source:
print(img_element.get_attribute('src'))
# close the browser when done
driver.quit()
Want to find more than one? It is as simple as using elements = driver.find_elements(By.CLASS_NAME, 'imageList__group__item') instead of element = driver.find_element(By.CLASS_NAME, 'imageList__group__item'), looping over them, and finding the img_element of each one.
