BeautifulSoup not returning links

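For my Python bootcamp, I am trying to build a log of the articles on this site and return the one with the most upvotes. The rest of the code works, but I can't get it to return the href correctly. I get "None". I have tried everything I know... Can anyone offer any guidance?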

from bs4 import BeautifulSoup
import requests


response = requests.get("https://news.ycombinator.com/")
yc_web_page = response.text


soup = BeautifulSoup(yc_web_page, "html.parser")
articles = soup.find_all(name="span", class_="titleline")

article_texts = []
article_links = []

for article_tag in articles:

    article_text = article_tag.get_text()
    article_texts.append(article_text)

    article_link = article_tag.get("href")
    article_links.append(article_link)



article_upvotes = [int(score.getText().split()[0]) for score in soup.find_all(name="span", class_="score")]


largest_number = max(article_upvotes)
largest_index = article_upvotes.index(largest_number)

print(article_texts[largest_index])
print(article_links[largest_index])
print(article_upvotes[largest_index])

I tried changing "href" to the "a" tag, but it returned the same value, "None".

A uj5u.com user replied:

Try this (the href is on the <a> tag inside the span, not on the span itself):


...

    article_link = article_tag.a.get("href")    # <--- put .a here

...

from bs4 import BeautifulSoup
import requests


response = requests.get("https://news.ycombinator.com/")
yc_web_page = response.text


soup = BeautifulSoup(yc_web_page, "html.parser")
articles = soup.find_all(name="span", class_="titleline")

article_texts = []
article_links = []

for article_tag in articles:

    article_text = article_tag.get_text()
    article_texts.append(article_text)

    article_link = article_tag.a.get("href")   # <--- put .a here
    article_links.append(article_link)


article_upvotes = [
    int(score.getText().split()[0])
    for score in soup.find_all(name="span", class_="score")
]


largest_number = max(article_upvotes)
largest_index = article_upvotes.index(largest_number)

print(article_texts[largest_index])
print(article_links[largest_index])
print(article_upvotes[largest_index])

Prints:

Fred Brooks has died (twitter.com/stevebellovin)
https://twitter.com/stevebellovin/status/1593414068634734592
1368
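
Why the extra .a matters can be seen in a small offline sketch. The markup below is a hypothetical, simplified stand-in for one Hacker News title cell (the example.com URLs and the story text are made up, and the live page may differ), so treat it as an illustration rather than the site's exact HTML:

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup modeled on one title cell (not the live page).
html = (
    '<span class="titleline">'
    '<a href="https://example.com/story">Example story</a>'
    '<span class="sitebit comhead"> (<a href="from?site=example.com">'
    '<span class="sitestr">example.com</span></a>)</span>'
    '</span>'
)

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span", class_="titleline")

# The <span> only carries a class attribute, so asking it for href gives None.
span_href = span.get("href")

# span.a is the first <a> descendant (same as span.find("a")); it holds the link.
link_tag = span.a
link_href = link_tag.get("href") if link_tag is not None else None
link_text = link_tag.get_text() if link_tag is not None else None

print(span_href)
print(link_href)
print(link_text)
```

The None check is there because span.a is itself None when a span contains no <a> tag, and calling .get on None raises AttributeError.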

A uj5u.com user replied:

Here is a shorter approach:

import requests
from bs4 import BeautifulSoup

url = "https://news.ycombinator.com/"

soup = BeautifulSoup(requests.get(url).text, "lxml")

all_scores = [
    [
        int(x.getText().replace(" points", "")),
        x["id"].replace("score_", ""),
    ]
    for x in soup.find_all("span", class_="score")
]

votes, tr_id = sorted(all_scores, key=lambda x: x[0], reverse=True)[0]

table_row = soup.find("tr", id=tr_id)
text = table_row.select_one("span a").getText()
link = table_row.select_one("span a")["href"]

print(f"{text}\n{link}\n{votes} votes")

Output:

Fred Brooks has died
https://twitter.com/stevebellovin/status/1593414068634734592
1377 votes
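
One reason to look the row up by id, as this answer does, rather than by list position: some rows (job posts, for example) may have no score span, in which case the list of scores is shorter than the list of titles and the indexes no longer line up. The sketch below uses hypothetical, trimmed-down markup (the ids, titles, and URLs are invented) to show the difference; it is an illustration under those assumptions, not a copy of the live page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first row is a job post with no score span.
html = """
<table>
  <tr class="athing" id="101"><td><span class="titleline"><a href="https://example.com/job">Example job post</a></span></td></tr>
  <tr><td class="subtext"><span class="age">1 hour ago</span></td></tr>
  <tr class="athing" id="102"><td><span class="titleline"><a href="https://example.com/story">Example story</a></span></td></tr>
  <tr><td class="subtext"><span class="score" id="score_102">42 points</span></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
titles = soup.find_all("span", class_="titleline")
scores = soup.find_all("span", class_="score")

# Position-based lookup: the only score is at index 0, but titles[0] is the job post.
by_index = titles[0].a.get_text()

# Id-based lookup: strip the "score_" prefix and find the row with the same id.
top = max(scores, key=lambda s: int(s.get_text().split()[0]))
row = soup.find("tr", id=top["id"].replace("score_", ""))
by_id = row.select_one("span.titleline a").get_text()

print(by_index)
print(by_id)
```

Here the position-based lookup returns the job post while the id-based lookup returns the story that actually owns the score.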

Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/536686.html

Tags: Python, parsing, BeautifulSoup, HTML parsing
