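I'm trying to use BeautifulSoup to extract data from a job site. I've been able to extract all the data I need except the displayed salary.
The page is https://mx.indeed.com/jobs?q=operador&l=Ciudad de México
My problem is that the salary sits in a <span> with no class name or title.
The sample HTML looks like this: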
<div class="heading6 tapItem-gutter metadataContainer"><div class="metadata salary-snippet-container"><div aria-label="$12,000 al mes" class="salary-snippet"><span>$12,000 al mes</span></div></div></div>
I tried:
salary = card.find("div", {"class" : "salary-snippet"}).find("span").text
But I get the following error:
AttributeError: 'NoneType' object has no attribute 'find'
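Can anyone explain how I can fix this?
A uj5u.com user replied:
What happens?
The sample looks perfect, but if you take a closer look, not every card contains a salary element. For those cards, find("div", {"class" : "salary-snippet"}) returns None, and calling .find("span") on None raises the AttributeError.
How to fix it?
Just check whether the element exists before calling .text on it: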
salary = card.select_one('div.salary-snippet').text if card.select_one('div.salary-snippet') else None
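The inline conditional above runs select_one twice for every field. As a minimal sketch, the lookup could be wrapped in a small helper. Note that text_or_none is a hypothetical name and not part of BeautifulSoup, and the two-card markup is made up for illustration, loosely based on the sample from the question. The built-in html.parser is used here only so the snippet has no extra dependency.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption): the first card has a salary, the second does not.
html = """
<div id="cards">
  <a><div class="salary-snippet"><span>$12,000 al mes</span></div></a>
  <a><span class="companyName">Acme</span></a>
</div>
"""

def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = parent.select_one(selector)
    return element.get_text(strip=True) if element is not None else None

soup = BeautifulSoup(html, "html.parser")
salaries = [text_or_none(card, "div.salary-snippet") for card in soup.select("#cards a")]
print(salaries)  # ['$12,000 al mes', None]
```

With a helper like this, each field in the loop becomes a single call, for example salary = text_or_none(card, 'div.salary-snippet').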
Example
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}
# Pass the headers defined above so the request carries the browser-like User-Agent
r = requests.get('https://mx.indeed.com/trabajo?q=operador&l=Ciudad de México&vjk=970d586d3023d4d0', headers=headers)
soup = BeautifulSoup(r.content, 'lxml')

data = []
for card in soup.select('#mosaic-provider-jobcards a'):
    # Each field falls back to None when the card does not contain that element
    companyName = card.select_one('span.companyName').text if card.select_one('span.companyName') else None
    companyLocation = card.select_one('div.companyLocation').text if card.select_one('div.companyLocation') else None
    salary = card.select_one('div.salary-snippet').text if card.select_one('div.salary-snippet') else None
    data.append({
        'companyName': companyName,
        'companyLocation': companyLocation,
        'salary': salary
    })

data
Only want to append jobs that have a salary?
data = []
for card in soup.select('#mosaic-provider-jobcards a'):
    companyName = card.select_one('span.companyName').text if card.select_one('span.companyName') else None
    companyLocation = card.select_one('div.companyLocation').text if card.select_one('div.companyLocation') else None
    salary = card.select_one('div.salary-snippet').text if card.select_one('div.salary-snippet') else None
    if salary:
        data.append({
            'companyName': companyName,
            'companyLocation': companyLocation,
            'salary': salary
        })

data
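If the full list has already been collected, the same filtering can also be done afterwards instead of inside the loop. The sketch below assumes rows shaped like the dictionaries built in the loops above. The company names are made up for illustration.

```python
# Hypothetical rows shaped like the dictionaries built in the scraping loop.
data = [
    {'companyName': 'Acme', 'companyLocation': 'Ciudad de México', 'salary': '$12,000 al mes'},
    {'companyName': 'Globex', 'companyLocation': 'Ciudad de México', 'salary': None},
]

# Keep only the rows whose salary is neither None nor an empty string.
with_salary = [row for row in data if row['salary']]
print(len(with_salary))  # 1
```

Filtering afterwards keeps the complete result available, which may be useful if the jobs without a salary are needed later.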
