from bs4 import BeautifulSoup
import requests
url = 'https://www.mediacorp.sg/en/your-mediacorp/our-artistes/tca/male-artistes/ayden-sng-12357686'
artiste_name = 'celeb-name'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
txt = soup.find_all('h1', attrs={'class':artiste_name})
print(txt)
With the code above, I get this output:
[<h1 class="celeb-name">Ayden Sng</h1>]
What do I need to change in my code so that the output is only "Ayden Sng"?
A uj5u.com user replied:
Iterate over each entry in the txt list and extract its text attribute:
txt = [element.text for element in txt] # ['Ayden Sng']
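The reason this works: find_all returns a list-like collection of Tag objects, so printing it shows the markup, while .text on each Tag gives only the string inside. A minimal sketch that can be run without touching the network is below. The inline HTML is a made-up stand-in for the real page (the actual markup may differ), and it uses Python's built-in 'html.parser' rather than 'lxml' so no extra parser is required.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
html = '<div><h1 class="celeb-name">Ayden Sng</h1></div>'

soup = BeautifulSoup(html, 'html.parser')
tags = soup.find_all('h1', attrs={'class': 'celeb-name'})

# Each entry in tags is a Tag object; .text returns the text inside it.
names = [tag.text for tag in tags]
print(names)
```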
A uj5u.com user replied:
from bs4 import BeautifulSoup
import requests
url = 'https://www.mediacorp.sg/en/your-mediacorp/our-artistes/tca/male-artistes/ayden-sng-12357686'
artiste_name = 'celeb-name'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
txt = soup.find_all('h1', attrs={'class':artiste_name})
print(txt[0].text)
If there are multiple results, you can use the following code:
from bs4 import BeautifulSoup
import requests
url = 'https://www.mediacorp.sg/en/your-mediacorp/our-artistes/tca/male-artistes/ayden-sng-12357686'
artiste_name = 'celeb-name'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
txt = soup.find_all('h1', attrs={'class':artiste_name})
for i in txt:
    print(i.text)
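If only the first match is needed, note that indexing txt[0] raises an IndexError when nothing matches. One possible alternative, sketched here under assumptions, is to use find (which returns None when there is no match) together with get_text(strip=True) to trim surrounding whitespace. The helper name first_heading_text and the sample HTML strings are hypothetical, introduced only for illustration.

```python
from bs4 import BeautifulSoup

def first_heading_text(html, css_class):
    """Return the stripped text of the first matching <h1>, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('h1', class_=css_class)  # None if nothing matches
    return tag.get_text(strip=True) if tag is not None else None

# Hypothetical sample markup standing in for the real page.
sample = '<h1 class="celeb-name"> Ayden Sng </h1>'
print(first_heading_text(sample, 'celeb-name'))
print(first_heading_text('<p>no heading</p>', 'celeb-name'))
```

With the original script, the same function could be called as first_heading_text(page.text, artiste_name).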
Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/341162.html
