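Introduction
There are plenty of job sites these days, such as Lagou (拉勾網), Boss Zhipin (Boss直聘), Zhaopin (智聯招聘), and 51job (前程無憂). With so many of them, how do you pick out the postings that suit you, or find out what a job actually requires?
Target site
URL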
https://search.51job.com/list/010000%252c020000%252c030200%252c040000,000000,0000,00,9,99,python,2,{}.html
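A quick note on reading that URL: `%252c` is a comma that has been percent-encoded twice (`%25` is `%`, so `%252c` decodes to `%2c` and then to `,`), and the trailing `{}` is where the page number goes. Here is a minimal sketch using only the standard library; `URL_TEMPLATE` and `build_url` are names made up for illustration and do not appear in the original code.

```python
from urllib.parse import unquote

URL_TEMPLATE = (
    'https://search.51job.com/list/'
    '010000%252c020000%252c030200%252c040000,'
    '000000,0000,00,9,99,python,2,{}.html'
)


def build_url(page):
    """Fill the page number into the listing URL (hypothetical helper)."""
    return URL_TEMPLATE.format(page)


# The first path segment holds four area codes joined by double-encoded commas.
area_segment = URL_TEMPLATE.split('/list/')[1].split(',')[0]
area_codes = unquote(unquote(area_segment)).split(',')

print(area_codes)
print(build_url(3))
```

Decoding twice recovers the four area codes, which makes it easier to swap in other regions later.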
Partial crawler code
Import the tools
import requests
import parsel
import re
import json
import time
Request the pages and scrape the data
for page in range(1, 11):
    # Pages 1 to 10 of the search results
    url = 'https://search.51job.com/list/010000%252c020000%252c030200%252c040000,000000,0000,00,9,99,python,2,{}.html'.format(page)
    params = {
        'lang': 'c',
        'postchannel': '0000',
        'workyear': '99',
        'cotype': '99',
        'degreefrom': '99',
        'jobterm': '99',
        'companysize': '99',
        'ord_field': '0',
        'dibiaoid': '0',
        'line': '',
        'welfare': '',
    }
    # requests expects one dict entry per cookie (name -> value); putting the
    # whole browser string under a single 'Cookie' key would send a cookie
    # literally named "Cookie".
    cookies = {
        'guid': 'b672753be2ff4b5c3694a1ff805e8c1b',
        '51job': 'cenglish%3D0%26%7C%26',
        'nsearch': 'jobarea%3D%26%7C%26ord_field%3D%26%7C%26recentSearch0%3D%26%7C%26recentSearch1%3D%26%7C%26recentSearch2%3D%26%7C%26recentSearch3%3D%26%7C%26recentSearch4%3D%26%7C%26collapse_expansion%3D',
        'search': 'jobarea%7E%60190200%7C%21ord_field%7E%600%7C%21recentSearch0%7E%60190200%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FApython%A1%FB%A1%FA2%A1%FB%A1%FA1%7C%21',
    }
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Host': 'search.51job.com',
        'Referer': 'https://search.51job.com/list/190200,000000,0000,00,9,99,python,2,1.html?lang=c&postchannel=0000&workyear=99&cotype=99&degreefrom=99&jobterm=99&companysize=99&ord_field=0&dibiaoid=0&line=&welfare=',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36',
    }
    response = requests.get(url=url, params=params, headers=headers, cookies=cookies)
    # Let requests guess the text encoding from the page content
    response.encoding = response.apparent_encoding
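The excerpt stops once the page has been fetched. The `re` and `json` imports hint that the job data may be pulled out of JSON embedded in the page, but the original does not show that step, so treat the following as a sketch of the general technique only. It runs on a made-up HTML string; the variable name `window.__DATA__`, the field names, and the sample values are all hypothetical and are not taken from 51job.

```python
import json
import re

# Made-up page source; the variable and field names are hypothetical.
sample_html = '''
<html><body>
<script type="text/javascript">
window.__DATA__ = {"jobs": [{"job_name": "Python Developer", "salary": "10-15k"}, {"job_name": "Data Engineer", "salary": "15-20k"}]}
</script>
</body></html>
'''


def extract_embedded_json(html, var_name):
    """Return the JSON object assigned to `var_name` in a script tag, or None."""
    # Capture from the first "{" up to the "}" that sits right before </script>.
    pattern = re.escape(var_name) + r'\s*=\s*(\{.*?\})\s*</script>'
    match = re.search(pattern, html, flags=re.S)
    if match is None:
        return None
    return json.loads(match.group(1))


data = extract_embedded_json(sample_html, 'window.__DATA__')
for job in data['jobs']:
    print(job['job_name'], job['salary'])
```

On a real page you would pass `response.text` instead of `sample_html`, after checking in the browser's "view source" what the embedded variable is actually called.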
Running the code gives the following results
TXT format
CSV format
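The saving step is not part of the excerpt, so here is one way the two output formats could be written with the standard library. The field names, sample rows, and file names are invented for the example.

```python
import csv
import os
import tempfile

# Made-up rows standing in for scraped postings.
rows = [
    {'title': 'Python Developer', 'company': 'Example Co', 'salary': '10-15k'},
    {'title': 'Data Engineer', 'company': 'Sample Ltd', 'salary': '15-20k'},
]
fieldnames = ['title', 'company', 'salary']

out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, 'jobs.csv')
txt_path = os.path.join(out_dir, 'jobs.txt')

# utf-8-sig adds a BOM so that Excel detects the encoding of Chinese text;
# newline='' stops the csv module from writing blank lines on Windows.
with open(csv_path, 'w', encoding='utf-8-sig', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Plain text: one posting per line, fields separated by tabs.
with open(txt_path, 'w', encoding='utf-8') as f:
    for row in rows:
        f.write('\t'.join(row[name] for name in fieldnames) + '\n')

print(csv_path)
print(txt_path)
```

The CSV version is the easier one to sort and filter in a spreadsheet, while the TXT version is convenient as input for text processing such as a word cloud.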
We can also use a word cloud to show what the job postings ask for
Partial word cloud code
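import jieba
import wordcloud
import imageio
import re

# Image whose shape is used as the mask for the word cloud
py = imageio.imread("python.png")

with open('python招聘資訊.txt', encoding='utf-8') as f:
    re_txt = f.read()

# result = re.findall(r'[a-zA-Z]+', re_txt)
# txt = ' '.join(result)

# Split the text into words with jieba
txt_list = jieba.lcut(re_txt)
string = ' '.join(txt_list)

# The excerpt does not show how `wc` is created; the settings below are
# assumptions. font_path must point to a font that contains Chinese glyphs.
wc = wordcloud.WordCloud(mask=py, background_color='white', font_path='msyh.ttc')

# Feed the text to the word cloud
wc.generate(string)
# Save the word cloud image
wc.to_file(r'python招聘資訊.png')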
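A word cloud sizes each word by how often it appears, so counting the words directly gives the numbers behind the picture. The sketch below reuses the regular expression from the commented-out lines to keep only English terms, which is where most technology names show up. The sample text is made up and stands in for the scraped TXT file.

```python
import re
from collections import Counter

# Made-up text standing in for the contents of the scraped TXT file.
sample_text = (
    '熟悉Python和Django,了解MySQL;'
    '熟练使用Python,有Linux经验;'
    '掌握python爬虫,熟悉MySQL和Redis'
)

# Same pattern as the commented-out lines: runs of ASCII letters only.
words = re.findall(r'[a-zA-Z]+', sample_text)

# Lowercase so that "Python" and "python" are counted together.
counts = Counter(word.lower() for word in words)

for word, n in counts.most_common(3):
    print(word, n)
```

To try it on real data, read the TXT file into `sample_text` the same way the word cloud code does.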
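Judging by the word cloud, there is quite a lot of demand.
If there is a site you would like to see scraped next, leave it in the comments (nothing too hard, please; I'm still a beginner).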
