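I want to get information from booking.com (hotel names, prices, and so on), but when I fetch the site from Python and parse it with BeautifulSoup, I cannot find that information.
Here is what I did: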
from bs4 import BeautifulSoup
import requests
url="https://www.booking.com/searchresults.en-gb.html?label=gen173nr-1DCAEoggI46AdIM1gEaGKIAQGYAQm4AQfIAQzYAQPoAQGIAgGoAgO4AtDrhJMGwAIB0gIkZjQzNmY0MTQtMjY3OS00NGE0LTkwOWEtNGQ3YzQ0OTY1Mjc42AIE4AIB&lang=en-gb&sid=b9d75b447deb2624c8cfaadad9969120&sb=1&sb_lp=1&src=index&src_elem=sb&error_url=https://www.booking.com/index.en-gb.html?label=gen173nr-1DCAEoggI46AdIM1gEaGKIAQGYAQm4AQfIAQzYAQPoAQGIAgGoAgO4AtDrhJMGwAIB0gIkZjQzNmY0MTQtMjY3OS00NGE0LTkwOWEtNGQ3YzQ0OTY1Mjc42AIE4AIB;sid=b9d75b447deb2624c8cfaadad9969120;sb_price_type=total&;&ss=Hong Kong&is_ski_area=0&ssne=Hong Kong&ssne_untouched=Hong Kong&dest_id=-1353149&dest_type=city&checkin_year=2022&checkin_month=4&checkin_monthday=25&checkout_year=2022&checkout_month=4&checkout_monthday=30&group_adults=2&group_children=0&no_rooms=1&b_h4u_keep_filters=&from_sf=1"
response = requests.get(url)
print(response.status_code)  # 200 means the request succeeded
soup = BeautifulSoup(response.content, 'html.parser')
print(soup)
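After printing the soup, I can only see information such as review scores, and find() does not return anything about hotel names. Can you tell me what I did wrong and how to do it correctly? Thank you very much!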
A uj5u.com user replied:
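You just need to inspect the HTML of the page that is returned in the soup. For example, if you inspect a hotel title in your browser, you will notice that the first 10 hotel results are rendered in tags that carry a card class.
Once you know the tag and its attributes, you can use find_all to collect the information. See the following modified version of your code: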
from bs4 import BeautifulSoup
import requests
url="https://www.booking.com/searchresults.en-gb.html?label=gen173nr-1DCAEoggI46AdIM1gEaGKIAQGYAQm4AQfIAQzYAQPoAQGIAgGoAgO4AtDrhJMGwAIB0gIkZjQzNmY0MTQtMjY3OS00NGE0LTkwOWEtNGQ3YzQ0OTY1Mjc42AIE4AIB&lang=en-gb&sid=b9d75b447deb2624c8cfaadad9969120&sb=1&sb_lp=1&src=index&src_elem=sb&error_url=https://www.booking.com/index.en-gb.html?label=gen173nr-1DCAEoggI46AdIM1gEaGKIAQGYAQm4AQfIAQzYAQPoAQGIAgGoAgO4AtDrhJMGwAIB0gIkZjQzNmY0MTQtMjY3OS00NGE0LTkwOWEtNGQ3YzQ0OTY1Mjc42AIE4AIB;sid=b9d75b447deb2624c8cfaadad9969120;sb_price_type=total&;&ss=Hong Kong&is_ski_area=0&ssne=Hong Kong&ssne_untouched=Hong Kong&dest_id=-1353149&dest_type=city&checkin_year=2022&checkin_month=4&checkin_monthday=25&checkout_year=2022&checkout_month=4&checkout_monthday=30&group_adults=2&group_children=0&no_rooms=1&b_h4u_keep_filters=&from_sf=1"
response = requests.get(url)
print(response.status_code)  # 200 means the request succeeded
soup = BeautifulSoup(response.content, 'html.parser')
# Select every <span> whose class is bui-card__title and whose itemprop is "name"
hotels = soup.find_all("span", {"class": "bui-card__title", "itemprop": "name"})
for hotel in hotels:
    # decode_contents() returns the tag's inner HTML as a string
    print(hotel.decode_contents().strip())
Running this should print the hotel names, one per line.
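To try the selector without depending on the live site, here is a minimal offline sketch. The sample markup, the sr_item and bui-review-score__badge class names, the hotel names, and the extract_hotel_names helper are all hypothetical and only imitate the structure described in the answer; the real page may differ or change over time.

```python
from bs4 import BeautifulSoup

# Hypothetical, hand-written markup that imitates the structure described in
# the answer. Only bui-card__title and itemprop="name" come from the answer;
# every other class name and all hotel names here are made up.
SAMPLE_HTML = """
<div class="sr_item">
  <span class="bui-card__title" itemprop="name">
    Example Hotel One
  </span>
  <span class="bui-review-score__badge">8.7</span>
</div>
<div class="sr_item">
  <span class="bui-card__title" itemprop="name">Example Hotel Two</span>
  <span class="bui-review-score__badge">9.1</span>
</div>
"""


def extract_hotel_names(html):
    """Return the stripped text of every span marked as a hotel title."""
    soup = BeautifulSoup(html, "html.parser")
    titles = soup.find_all(
        "span", {"class": "bui-card__title", "itemprop": "name"}
    )
    return [title.get_text(strip=True) for title in titles]


names = extract_hotel_names(SAMPLE_HTML)
if names:
    for name in names:
        print(name)
else:
    # An empty list means no tag matched both attributes, so the markup
    # that was actually received should be inspected again.
    print("No matching spans found")
```

If the same function returns an empty list for response.content from the live request, the page served to the script probably uses different tags or attributes than the ones seen in the browser, and the selector needs to be adjusted to match.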

