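All images downloaded by my image scraper have the same file size of 130 KB, are corrupted, and cannot be opened in an image viewer.
I really don't know what the problem is.
Any advice on this would be appreciated.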
import requests
import parsel
import os
import time

url = 'https://movie-screencaps.com/movie-directory/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'}
response = requests.get(url, headers=headers)
selector = parsel.Selector(response.text)
movie_list = selector.xpath('//div[@]/ul/li')
for li in movie_list:
    movie_name = li.xpath('.//a/text()').get().strip()
    movie_url = li.xpath('.//a/@href').get()
    print(movie_name, movie_url)
    # dir = f'download/{movie_name}'
    dir = f'{movie_name}'
    if not os.path.exists(dir):
        os.makedirs(dir)
    page_response = requests.get(movie_url, headers=headers)
    page_selector = parsel.Selector(page_response.text)
    page_text = page_selector.xpath('//div[@]/text()').get()
    last_page = int(page_text.split(' ')[-1])
    for page in range(1, last_page + 1):
        page_url = f'{movie_url}/page/{page}'
        print(f'===== Downloading from page {page} =====')
        image_response = requests.get(url=page_url, headers=headers)
        image_selector = parsel.Selector(image_response.text)
        images_url_list = image_selector.xpath('//div[@align="center"]/a/@href').getall()
        for image_url in images_url_list:
            image_data = requests.get(url=page_url, headers=headers).content
            # print(image_data)
            file_name = image_url.split('/')[-1]
            with open(f'{dir}/{file_name}', mode='wb') as f:
                f.write(image_data)
            print(file_name)
        time.sleep(2)
A uj5u.com user replied:
I tested your code; you just made one small mistake.
Change:
image_data = requests.get(url=page_url, headers=headers).content
to:
image_data = requests.get(url=image_url, headers=headers).content
Tested and working :)
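This also explains the symptoms: with page_url in that call, every file written is the HTML of the same listing page, so the files share one size and no image viewer can open them. If you want to confirm that for files already on disk, a minimal sketch is to look at the leading bytes. The helper name looks_like_image is made up for illustration, and it only recognises JPEG, PNG, and GIF signatures.

```python
def looks_like_image(data: bytes) -> bool:
    """Return True if data starts with a JPEG, PNG, or GIF file signature."""
    signatures = (
        b"\xff\xd8\xff",       # JPEG
        b"\x89PNG\r\n\x1a\n",  # PNG
        b"GIF87a",             # GIF (1987 version)
        b"GIF89a",             # GIF (1989 version)
    )
    # bytes.startswith accepts a tuple and matches any of its entries
    return data.startswith(signatures)
```

For example, reading a saved file with open(path, 'rb').read() and passing the bytes to this function should return False for anything that is really an HTML page.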
A uj5u.com user replied:
The problem is a typo: for each image_url you are fetching page_url instead of image_url:
...
        for image_url in images_url_list:
            image_data = requests.get(url=page_url, headers=headers).content
            file_name = image_url.split('/')[-1]
...
It should be:
...
        for image_url in images_url_list:
            # The typo was here: fetch image_url, not page_url
            image_data = requests.get(url=image_url, headers=headers).content
            file_name = image_url.split('/')[-1]
...
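To catch this kind of mix-up earlier, one option is to move the download into a small function that refuses to save anything that is not an image. This is only a sketch under assumptions, not part of the original code. The name save_image and its get parameter are hypothetical. get is expected to be requests.get or anything with the same interface, and the server is assumed to send a Content-Type header starting with image/ for real images.

```python
import os


def save_image(image_url, folder, get, headers=None):
    """Download one image into folder and return the path it was saved to.

    get is any callable with the same interface as requests.get
    (hypothetical parameter, passed in so the function is easy to test).
    """
    response = get(image_url, headers=headers, timeout=30)
    # Raise on 4xx/5xx responses instead of saving an error page
    response.raise_for_status()

    # Assumption: real images are served with an image/* Content-Type
    content_type = response.headers.get("Content-Type", "")
    if not content_type.startswith("image/"):
        raise ValueError(f"{image_url} returned {content_type!r}, not an image")

    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, image_url.split("/")[-1])
    with open(path, mode="wb") as f:
        f.write(response.content)
    return path
```

Inside the loop it could be called as save_image(image_url, dir, requests.get, headers). Passing page_url by mistake would then most likely raise a ValueError, because the listing page is served as HTML rather than as an image.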
