Python爬蟲入門教程13：高質量電腦桌面壁紙爬取-有解無憂

前言??

本文的文字及圖片來源于網路,僅供學習、交流使用,不具有任何商業用途,如有問題請及時聯系我們以作處理，

前文內容??

Python爬蟲入門教程01：豆瓣Top電影爬取

Python爬蟲入門教程02：小說爬取

Python爬蟲入門教程03：二手房資料爬取

Python爬蟲入門教程04：招聘資訊爬取

Python爬蟲入門教程05：B站視頻彈幕的爬取

Python爬蟲入門教程06：爬取資料后的詞云圖制作

Python爬蟲入門教程07：騰訊視頻彈幕爬取

Python爬蟲入門教程08：爬取csdn文章保存成PDF

Python爬蟲入門教程09：多執行緒爬取表情包圖片

Python爬蟲入門教程10：彼岸壁紙爬取

Python爬蟲入門教程11：新版王者榮耀皮膚圖片的爬取

Python爬蟲入門教程12：英雄聯盟皮膚圖片的爬取

PS：如有需要 Python學習資料 以及 解答 的小伙伴可以加點擊下方鏈接自行獲取
python免費學習資料以及群交流解答點擊即可加入

基本開發環境??

Python 3.6
Pycharm

相關模塊的使用??

import requests
import re
import os

安裝Python并添加到環境變數，pip安裝需要的相關模塊即可，

一、??明確需求

在這里插入圖片描述
如圖所示爬取里面的高清壁紙

二、??網頁資料分析

在這里插入圖片描述
點擊下載原圖，會自動給你下載壁紙圖片，

所以只需要獲取這個鏈接就可以了爬取壁紙圖片了，

回傳串列的可以發現，網頁是瀑布流加載方式，當你往下滑才會有資料出現，所以可以在下滑網頁的前，先打開開發者工具，當下滑網頁的時候新加載出來的資料會出現，

在這里插入圖片描述

通過對比可以知道，這個資料包中包含了，壁紙圖片下載的地址，

需要注意的就是這個資料鏈接是post請求，并不是get請求
在這里插入圖片描述

需要提交的data引數，就是對應的頁碼，

三、??代碼實作

1、獲取圖片ID

    for page in range(1, 11):
        url = 'https://wallpaper.wispx.cn/cat/%E5%8A%A8%E6%BC%AB'
        headers = {
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
            'x-requested-with': 'XMLHttpRequest',
        }
        data = https://www.cnblogs.com/Qqun821460695/p/{'page': page
        }
        response = requests.post(url=url, headers=headers)
        result = re.findall('detail(.*?)target=', response.text)
        for index in result:
            image_id = index.replace('\\', '').replace('" ', '')
            page_url = f'https://wallpaper.wispx.cn/detail{image_id}'

2、獲取壁紙url地址，并保存

def main(page_url):
    html_data = https://www.cnblogs.com/Qqun821460695/p/get_response(page_url).text
    image_url = re.findall('<a  href="https://www.cnblogs.com/Qqun821460695/p/(.*?)">', html_data)[0]
    image_title = re.findall('<title>(.*?)</title>', html_data)[0].split(' - ')[0]
    image_content = get_response(image_url).content
    path = 'images\\'
    if not os.path.exists(path):
        os.makedirs(path)
    with open(path + image_title + '.jpg', mode='wb') as f:
        f.write(image_content)
        print('正在保存：', image_title)

`需要注意的點：`

請求頭里面要防盜鏈，不然就下載不了，

def get_response(html_url):
    header = {
        'referer': 'https://wallpaper.wispx.cn/detail/1206',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    }
    resp = requests.get(url=html_url, headers=header)
    return resp

四、??實作效果

在這里插入圖片描述

轉載請註明出處，本文鏈接：https://www.uj5u.com/houduan/255471.html

標籤：Python

上一篇：Python爬蟲入門教程12：英雄聯盟皮膚圖片的爬取

下一篇：450. 洗掉二叉搜索樹中的節點