I am trying to extract the price and the number of students from a Udemy course page. I am on Windows, using Python 3.8 and BeautifulSoup in a conda environment.
Here is my code:
import requests
from bs4 import BeautifulSoup

url = 'https://www.udemy.com/course/business-analysis-conduct-a-strategy-analysis/'
html = requests.get(url).content
bs = BeautifulSoup(html, 'lxml')
searchingprice = bs.find('div', {'class':'price-text--price-part--2npPm udlite-clp-discount-price udlite-heading-xxl','data-purpose':'course-price-text'})
searchingstudents = bs.find('div', {'class':'','data-purpose':'enrollment'})
print(searchingprice)
print(searchingstudents)
I only get the information about the students, not the price. What am I doing wrong?
None
<div class="" data-purpose="enrollment">
13,490 students
</div>
Here is a screenshot of the page:
[screenshot not preserved]

Thanks!
uj5u.com user reply:
from bs4 import BeautifulSoup

html = """<div
data-purpose="price-text-container"><div
data-purpose="course-price-text">
<span >Current price</span>
<span><span>$14.99</span></span></div>
<div data-purpose="original-price-container">
<div data-purpose="course-old-price-text"><span >Original Price</span>
<span><s><span>$99.99</span></s></span></div></div>
<div
data-purpose="discount-percentage"><span >Discount</span><span>85% off</span>
</div></div>"""
soup = BeautifulSoup(html, 'lxml')
# the fragment above carries no class attributes, so locate the outer
# container by its data-purpose attribute, then collect every span inside it
lst = soup.find('div', attrs={'data-purpose': 'price-text-container'}).find_all('span')
# list comprehension to find the span text that starts with $ and keep the first element
print([span.text for span in lst if span.text.startswith('$')][0]) # -> '$14.99'
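The snippet above yields the price as a string such as '$14.99', and the enrollment div in the question holds text like '13,490 students'. If numbers are needed for further processing, a conversion step along these lines may help. This is only a sketch: `to_number` is a hypothetical helper name, and it assumes US-style formatting (comma as thousands separator, dot as decimal point), so a string like '€16,99' would not be parsed correctly.

```python
import re

def to_number(text):
    """Return the first number found in text as an int or float, or None."""
    # Assumption: commas separate thousands and a dot marks the decimals.
    match = re.search(r'\d[\d,]*(?:\.\d+)?', text)
    if match is None:
        return None
    digits = match.group().replace(',', '')
    return float(digits) if '.' in digits else int(digits)

print(to_number('$14.99'))
print(to_number('\n13,490 students\n'))
```

The function returns None when no digits are present, which mirrors how `bs.find` returns None for a missing tag, so the caller can check for it instead of catching an exception.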
uj5u.com user reply:
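The price is not in the page source; it is fetched with JavaScript, so we have to make the same request ourselves. This code continues from your own code, with bs already loaded.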
# get the id of the course
course_id = bs.body.attrs['data-clp-course-id']
# build the request; feel free to drop any fields you do not need
link = f'https://www.udemy.com/api-2.0/pricing/?course_ids={course_id}&fields[pricing_result]=price,discount_price,list_price,price_detail,price_serve_tracking_id'
# fetch the data
res = requests.get(link).json()
print(res)
>>> {'courses': {'1596446': {'_class': 'pricing_result', 'price_serve_tracking_id': 'rbNYz3yCSiS2G1J62gtSzg', 'price': {'amount': 16.99, 'currency': 'EUR', 'price_string': '€16.99', 'currency_symbol': '€'}, 'list_price': {'amount': 119.99, 'currency': 'EUR', 'price_string': '€119.99', 'currency_symbol': '€'}, 'discount_price': {'amount': 17.0, 'currency': 'EUR', 'price_string': '€17', 'currency_symbol': '€'}, 'price_detail': {'amount': 119.99, 'currency': 'EUR', 'price_string': '€119.99', 'currency_symbol': '€'}}}, 'bundles': {}}
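To pull individual values out of that response, something like the following could work. It is a sketch built only on the structure printed above: the `res` dictionary here is a trimmed copy of that output, and `summarize_pricing` is a hypothetical helper name, not part of any Udemy client. The course ids appear as string keys in the response, hence the `str()` conversion.

```python
# Trimmed copy of the response printed above (only the fields used here).
res = {
    'courses': {
        '1596446': {
            'price': {'amount': 16.99, 'currency': 'EUR', 'price_string': '€16.99'},
            'list_price': {'amount': 119.99, 'currency': 'EUR', 'price_string': '€119.99'},
        }
    },
    'bundles': {},
}

def summarize_pricing(res, course_id):
    """Return (current price, list price) strings, or None if the course is absent."""
    course = res.get('courses', {}).get(str(course_id))
    if course is None:
        return None
    return course['price']['price_string'], course['list_price']['price_string']

print(summarize_pricing(res, 1596446))
```

Passing the `course_id` read from `data-clp-course-id` works the same way, since it is already a string.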
