Preface
Today's "victim" is 貓耳FM (MissEvan), an audio website.
URL: https://www.missevan.com/sound/m/110


Topics covered:
- requests
- time
- re
- concurrent.futures
Development environment:
- Version: Anaconda 5.2.0 (Python 3.6.5)
- Editor: PyCharm
Import the modules
import time
import requests
import concurrent.futures
import re
Each step is implemented as its own function
Send the request
def get_html(url):
    # Fetch the page and return the full response object
    response = requests.get(url)
    return response
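A network request can fail or hang, and a bare `requests.get(url)` has no timeout or retry. The following is a minimal sketch of one way to wrap any fetch function with retries. The names `fetch_with_retry` and `flaky_fetch`, and the parameters `retries` and `delay`, are hypothetical and not part of the original script; the stand-in function simulates failures so the example runs without network access.

```python
import time


def fetch_with_retry(fetch, url, retries=3, delay=0.0):
    """Call fetch(url), retrying up to `retries` times on any exception.

    `fetch` is any callable that takes a URL (hypothetical helper).
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # deliberately broad for this sketch
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # pause before the next attempt
    raise last_error


# Stand-in for a flaky network call: fails twice, then succeeds.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise ConnectionError('temporary failure')
    return 'response for ' + url


result = fetch_with_retry(flaky_fetch, 'https://example.com/page')
print(result, '| attempts:', len(calls))
```

In the real script, one option would be to call `fetch_with_retry(get_html, url, delay=1)` and to pass `timeout=10` to `requests.get` so that a stalled connection raises instead of blocking forever.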
First parse
def parse(response):
    # Extract every sound id from the links on the list page
    mp3_ids = re.findall('<a target="_player" href="/sound/(.*?)" title=".*?">', response.text)
    return mp3_ids
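To see what the first parse step extracts, the pattern can be tried on a small HTML fragment. This is a sketch under assumptions: the fragment below is made up to match the shape the pattern expects (relative `/sound/<id>` links), and the real page markup may differ.

```python
import re

# Hypothetical HTML fragment shaped like the links the pattern targets;
# the titles and the second id are invented for illustration.
sample_html = '''
<a target="_player" href="/sound/3922170" title="Track A">Track A</a>
<a target="_player" href="/sound/3922171" title="Track B">Track B</a>
<a href="/about">About</a>
'''

pattern = r'<a target="_player" href="/sound/(.*?)" title=".*?">'
mp3_ids = re.findall(pattern, sample_html)
print(mp3_ids)  # ['3922170', '3922171']
```

Because the pattern has a single capture group, `re.findall` returns only the captured ids rather than the whole matched tag.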
Second parse
def parse_2(response):
    # The detail endpoint returns JSON; pull out the title and the audio URL
    json_data = response.json()
    title = json_data['info']['sound']['soundstr']
    soundurl = json_data['info']['sound']['soundurl']
    return title, soundurl
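The second parse step indexes three levels deep into the JSON, so a response with a missing key raises `KeyError`. The sketch below shows the same lookup on a hypothetical payload, with a defensive variant that returns `None` instead of raising. The function name `extract_sound` and the sample values are assumptions for illustration only.

```python
# Hypothetical payload with the same nesting that parse_2() indexes into.
sample_json = {
    'info': {
        'sound': {
            'soundstr': 'Example title',
            'soundurl': 'https://example.com/audio/example.mp3',
        }
    }
}


def extract_sound(json_data):
    """Return (title, soundurl), or None if the expected keys are missing."""
    try:
        sound = json_data['info']['sound']
        return sound['soundstr'], sound['soundurl']
    except (KeyError, TypeError):
        return None


print(extract_sound(sample_json))
print(extract_sound({'info': {}}))  # missing 'sound' key
```

A caller can then skip entries for which `extract_sound` returns `None` rather than letting one bad record stop the whole page.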
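Save the data
def save(title, mp3_data):
    # The mp3 folder must already exist in the working directory;
    # the backslash separator assumes Windows
    with open('mp3\\' + title + '.mp3', mode='wb') as f:
        f.write(mp3_data)
    print(title, 'download complete!')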
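The `save` function relies on an existing `mp3` folder and a Windows path separator. As a sketch of a more portable variant, `pathlib` can build the path and create the folder on demand. The name `save_to` and its `folder` parameter are hypothetical, and the demo writes placeholder bytes to a temporary directory rather than real audio.

```python
import tempfile
from pathlib import Path


def save_to(folder, title, mp3_data):
    """Write mp3_data to <folder>/<title>.mp3, creating the folder if needed."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / (title + '.mp3')
    target.write_bytes(mp3_data)
    return target


# Demo with a temporary directory and placeholder bytes instead of real audio.
with tempfile.TemporaryDirectory() as tmp:
    path = save_to(Path(tmp) / 'mp3', 'demo', b'\x00\x01\x02')
    saved_name = path.name
    saved_size = path.stat().st_size
    print(saved_name, saved_size)  # demo.mp3 3
```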
Sanitize the title
def change_title(title):
    # Replace characters that are not allowed in Windows file names
    new_title = re.sub(r'[\\/|:?<>"*]', '_', title)
    return new_title
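A quick check of the substitution on made-up titles helps confirm which characters get replaced. This sketch assumes the intent is to replace every character that Windows forbids in file names, including the backslash; `sanitize` and `ILLEGAL_CHARS` are hypothetical names.

```python
import re

# Assumption: the goal is to replace every character Windows forbids in
# file names, namely \ / : * ? " < > and |
ILLEGAL_CHARS = r'[\\/|:?<>"*]'


def sanitize(title):
    """Hypothetical helper mirroring change_title() from the article."""
    return re.sub(ILLEGAL_CHARS, '_', title)


print(sanitize('Part 1/2: "Intro"?'))  # Part 1_2_ _Intro__
print(sanitize('a\\b|c*d'))            # a_b_c_d
```

Note that inside a raw string the backslash must be written as `\\` within the character class; a lone `\/` only matches the forward slash.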
Main function, which chains the steps above together
def run(url):
    # 1. Send the request
    response = get_html(url)
    # 2. Parse out the sound ids
    mp3_ids = parse(response)
    for mp3_id in mp3_ids:
        # 3. Request the detail endpoint, built by URL concatenation, e.g.
        #    https://www.missevan.com/sound/getsound?soundid=3922170
        mp3_url = 'https://www.missevan.com/sound/getsound?soundid=' + mp3_id
        resp_2 = get_html(mp3_url)
        # 4. Parse the audio URL and the audio title
        title, soundurl = parse_2(resp_2)
        # Sanitize the title
        title = change_title(title)
        # 5. Request the audio URL to get the binary audio data (content)
        mp3_data = get_html(soundurl).content
        # 6. Save to local disk
        save(title, mp3_data)
Pagination
start_time = time.time()
for page in range(1, 5):
    print(f'---------- Crawling page {page} -------------')
    run(f'https://www.missevan.com/sound/m?id=110&p={page}')
print('Total time:', time.time() - start_time)

Multithreading
if __name__ == '__main__':
    start_time = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=1000) as executor:
        for page in range(1, 5):
            url = f'https://www.missevan.com/sound/m?id=110&p={page}'
            executor.submit(run, url)
    # Leaving the with block waits for all submitted tasks to finish
    print('Total time:', time.time() - start_time)

The multithreaded version finished about one minute faster.
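One caveat with `executor.submit`: an exception raised inside a worker is stored on the returned future and stays silent unless `future.result()` is called. The sketch below keeps the futures and collects successes and failures separately. `fake_run` is a hypothetical stand-in that sleeps instead of downloading, and the URLs are placeholders, so the example runs without network access.

```python
import concurrent.futures
import time


def fake_run(url):
    """Stand-in for run(url): sleeps briefly instead of downloading."""
    time.sleep(0.2)
    if url.endswith('p=3'):
        raise RuntimeError('simulated failure')
    return url


urls = [f'https://example.com/list?p={page}' for page in range(1, 5)]
done, failed = [], []

start_time = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # Map each future back to its URL so failures can be reported.
    futures = {executor.submit(fake_run, url): url for url in urls}
    for future in concurrent.futures.as_completed(futures):
        try:
            done.append(future.result())
        except Exception as exc:
            failed.append((futures[future], str(exc)))
elapsed = time.time() - start_time

print(len(done), 'succeeded,', len(failed), 'failed, in', round(elapsed, 2), 's')
```

With four tasks, `max_workers=4` is enough; a larger limit does not add speed because the pool only starts as many threads as there are pending tasks.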

