from bs4 import BeautifulSoup
import requests
url = 'https://www.apple.com/kr/search/youtube?src=globalnav'
response = requests.get(url)
html = response.text
soup = BeautifulSoup(html, 'html.parser')
links = soup.select(".rf-serp-productname-list")
print(links)
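I want to collect the links of all the apps shown in the search results. When I search for a keyword, I expected links = soup.select(".rf-serp-productname-list") to work, but the list of links is empty.
What should I do?
Reply from a uj5u.com user:
Take a look at this code; I think it is what you want: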
import re
import requests
from bs4 import BeautifulSoup

pages = set()  # every relative link seen so far

def get_links(page_url):
    global pages
    pattern = re.compile("^(/)")  # match hrefs that start with "/"
    # "your_URL" is a placeholder for the site's base URL
    html = requests.get(f"your_URL{page_url}").text  # f-strings require Python 3.6+
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all("a", href=pattern):
        if "href" in link.attrs:
            if link.attrs["href"] not in pages:
                new_page = link.attrs["href"]
                print(new_page)
                pages.add(new_page)
                get_links(new_page)  # recurse into the newly found page

get_links("")
Source: https://gist.github.com/AO8/f721b6736c8a4805e99e377e72d3edbf
You can change this part:
for link in soup.find_all("a", href=pattern):
    # do something
to check for the keyword, I think.
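The keyword check suggested above could look roughly like the sketch below. It is only an illustration: the SAMPLE_HTML markup, the placeholder app ids and the helper name links_matching are assumptions made up for the example, not something taken from the Apple page or the gist.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched page (ids are placeholders).
SAMPLE_HTML = """
<ul>
  <li><a href="/kr/app/youtube/id1">YouTube</a></li>
  <li><a href="/kr/app/instagram/id2">Instagram</a></li>
  <li><a href="/kr/app/youtube-music/id3">YouTube Music</a></li>
  <li><a>No href here</a></li>
</ul>
"""


def links_matching(html, keyword):
    """Return hrefs of <a> tags whose href or text contains keyword (case-insensitive)."""
    soup = BeautifulSoup(html, "html.parser")
    needle = keyword.lower()
    matches = []
    # href=True skips anchors that have no href attribute at all
    for link in soup.find_all("a", href=True):
        href = link["href"]
        text = link.get_text(strip=True)
        if needle in href.lower() or needle in text.lower():
            matches.append(href)
    return matches


youtube_links = links_matching(SAMPLE_HTML, "youtube")
print(youtube_links)
```

In the crawler, the same test would go inside the for loop, before the link is added to pages.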
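Reply from a uj5u.com user:
You are cooking a soup, so taste it first and check whether everything you expect is actually in it.
The ResultSet of your selection is empty because the structure of the response differs somewhat from what you see in the developer tools.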
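One way to "taste the soup" is to count how many elements each candidate selector matches in the HTML that was actually returned. The sketch below runs on a small inline sample instead of the live page; SAMPLE_HTML is a hypothetical stand-in for response.text and count_matches is a made-up helper name, so treat it as an illustration rather than the page's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text: the class seen in the browser's
# developer tools is absent, while the links carry a different class.
SAMPLE_HTML = """
<div class="results">
  <a class="icon" href="https://apps.apple.com/kr/app/youtube/id544007664">YouTube</a>
  <a class="icon" href="https://apps.apple.com/kr/app/instagram/id389801252">Instagram</a>
</div>
"""


def count_matches(html, selectors):
    """Map each CSS selector to the number of elements it matches in html."""
    soup = BeautifulSoup(html, "html.parser")
    return {selector: len(soup.select(selector)) for selector in selectors}


counts = count_matches(SAMPLE_HTML, [".rf-serp-productname-list", "a.icon"])
print(counts)
```

A count of 0 for a selector means the element is not in the raw response, so a different selector has to be chosen by inspecting the response text itself rather than the rendered page.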
To get the list of links, use a more specific selector:
links = [a.get('href') for a in soup.select('a.icon')]
Output:
['https://apps.apple.com/kr/app/youtube/id544007664', 'https://apps.apple.com/kr/app/쿠팡플레이/id1536885649', 'https://apps.apple.com/kr/app/youtube-music/id1017492454', 'https://apps.apple.com/kr/app/instagram/id389801252', 'https://apps.apple.com/kr/app/youtube-kids/id936971630', 'https://apps.apple.com/kr/app/youtube-studio/id888530356', 'https://apps.apple.com/kr/app/google-chrome/id535886823', 'https://apps.apple.com/kr/app/tiktok-틱톡/id1235601864', 'https://apps.apple.com/kr/app/google/id284815942']
