Downloading music with Python: even beginners can write a scraper

**
Introduction: use the BeautifulSoup and requests modules to fetch and parse the page,
then save the music (note: the audio is standard quality).
**
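Installing the modules: open cmd and enter

pip install bs4
pip install requests
pip install fake_useragent
pip install lxml

bs4 installs BeautifulSoup, requests sends the HTTP requests, fake_useragent generates a random User-Agent string for the request headers, and lxml is the parser that the code below passes to BeautifulSoup.

(If installation fails, upgrade pip or open cmd as administrator.)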

The site we scrape here is NetEase Cloud Music (網易云音樂).

https://music.163.com/artist?id=4292 // the page to scrape
http://music.163.com/song/media/outer/url?id= // outer link for playing a song; the song id is appended to it

First, get the page source.

https://music.163.com/#/artist?id=4292

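At first we requested the page with this link directly, but the href attributes in the response are all empty (#), so this is not the real link to the song list. The part after # is a URL fragment, which the client does not send to the server, so the request only returns the outer page.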

However, searching the page source turns up a different page link.

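Next we find this link: https://music.163.com/song?id=1407551413. It looks almost the same as the original form, differing only by the '/#', yet the content returned is different: NetEase Cloud Music hides the real page behind the '/#' address.
Looking carefully through the real page, you can find each song's id and name.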
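To see why the '/#' address returns a different page, it helps to look at how the two URLs split. The following is a minimal sketch that uses only the standard library's urllib.parse; nothing is sent over the network, and the variable names are just for illustration.

```python
from urllib.parse import urlsplit

# Address as shown in the browser's address bar
browser_url = 'https://music.163.com/#/artist?id=4292'
# Address that the scraper requests
real_url = 'https://music.163.com/artist?id=4292'

browser_parts = urlsplit(browser_url)
real_parts = urlsplit(real_url)

# Everything after '#' is the fragment; an HTTP client does not send it,
# so the server only sees a request for the path '/'.
print(browser_parts.path, browser_parts.query, browser_parts.fragment)

# Without '/#', the artist path and the id are part of the request itself.
print(real_parts.path, real_parts.query, real_parts.fragment)
```

In the first URL the artist path and id end up entirely in the fragment, which is why requesting it cannot return the song list.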

Next comes the implementation.

import requests
from fake_useragent import UserAgent
from bs4 import BeautifulSoup
import time
import os

def createFile(file_path):
    # Create the folder if it does not exist yet
    if not os.path.exists(file_path):
        os.makedirs(file_path)
    # Switch the working directory to the folder created above
    os.chdir(file_path)

def get_html():
    url = 'https://music.163.com/artist?id=4292'
    headers = {
        'User-Agent': UserAgent().random  # a random browser-like User-Agent
    }
    response = requests.get(url, headers=headers)
    res = BeautifulSoup(response.text, 'lxml')
    # The element with class f-hide contains one <a> per song
    id_lists = res.find(class_='f-hide').find_all('a')
    return id_lists

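def download(names, hrefs):
    # After getting the song id, request the audio and save it
    # Headers are needed here as well, otherwise fake data is returned
    headers = {
        'User-Agent': UserAgent().random
    }
    # NetEase Cloud Music outer-link address; songs can be downloaded for free through it
    url = 'http://music.163.com/song/media/outer/url?id='
    # .content returns the response body as bytes
    response = requests.get(url + hrefs, headers=headers).content
    # Save the downloaded song into the D:\music folder
    with open('D:\\music\\{}.mp3'.format(names), 'wb') as f:
        f.write(response)
    print('Downloading {}'.format(names))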

if __name__ == '__main__':
    createFile('D:\\music')
    for id_url in get_html():
        # Store the song name and id in variables
        names = id_url.text
        hrefs = id_url['href'][9:]  # drop the leading '/song?id='
        download(names, hrefs)
        time.sleep(1)  # sleep one second to avoid requesting too often
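
The main block assumes every href starts with '/song?id=' and uses the song title directly as a file name. As a hedged sketch, the two helpers below read the id from the query string instead of a fixed slice, and replace characters that Windows does not allow in file names. The names song_id_from_href and safe_filename are hypothetical and not part of the original script.

```python
import re
from urllib.parse import urlsplit, parse_qs


def song_id_from_href(href):
    """Return the value of the id parameter in an href like '/song?id=123'."""
    query = parse_qs(urlsplit(href).query)
    ids = query.get('id')
    return ids[0] if ids else None


def safe_filename(name):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', name).strip()
    return cleaned or 'untitled'


print(song_id_from_href('/song?id=1407551413'))
print(safe_filename('A/B: live?'))
```

In the main loop, hrefs = song_id_from_href(id_url['href']) and names = safe_filename(id_url.text) could stand in for the slice and the raw title.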

To download other songs, replace the link in url; remember to delete the '/#' to get the real link. The same code also works for playlists and albums.
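
Removing the '/#' by hand is easy to forget when pasting an address from the browser. A small helper can do it; this is only a sketch, and to_request_url is a hypothetical name, not something from the original script.

```python
def to_request_url(browser_url):
    """Drop the first '/#' so the path after it becomes part of the request."""
    return browser_url.replace('/#', '', 1)


# Artist page from this article, as copied from the address bar
print(to_request_url('https://music.163.com/#/artist?id=4292'))
# An address without '/#' passes through unchanged
print(to_request_url('https://music.163.com/artist?id=4292'))
```

get_html could then take the page address as a parameter and pass it through this helper before calling requests.get.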

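Finally, thanks for reading. I'm a beginner: a first-year university student who has been teaching myself Python for 3 months, and this is my first blog post. If anything here is off, please point it out.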

Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/165064.html
