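Hi everyone. I have written a Python program that retrieves the title of a web page. It works correctly, but for some pages it also picks up unwanted text. How can I avoid this?
Here is my program: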
# import the modules
import requests
from bs4 import BeautifulSoup
# target url
url = 'https://atlasobscura.com'
# send the request
reqs = requests.get(url)
# parse the response with BeautifulSoup
soup = BeautifulSoup(reqs.text, 'html.parser')
# display the title
print("Title of the website is : ")
for title in soup.find_all('title'):
    title_data = title.get_text().lower().strip()
    print(title_data)
This is my output:
atlas obscura - curious and wondrous travel destinations
aoc-full-screen
aoc-heart-solid
aoc-compass
aoc-flipboard
aoc-globe
aoc-pocket
aoc-share
aoc-cancel
aoc-video
aoc-building
aoc-clock
aoc-clipboard
aoc-help
aoc-arrow-right
aoc-arrow-left
aoc-ticket
aoc-place-entry
aoc-facebook
aoc-instagram
aoc-reddit
aoc-rss
aoc-twitter
aoc-accommodation
aoc-activity-level
aoc-add-a-photo
aoc-add-box
aoc-add-shape
aoc-arrow-forward
aoc-been-here
aoc-chat-bubbles
aoc-close
aoc-expand-more
aoc-expand-less
aoc-forum-flag
aoc-group-size
aoc-heart-outline
aoc-heart-solid
aoc-home
aoc-important
aoc-knife-fork
aoc-library-books
aoc-link
aoc-list-circle-bullets
aoc-list
aoc-location-add
aoc-location
aoc-mail
aoc-map
aoc-menu
aoc-more-horizontal
aoc-my-location
aoc-near-me
aoc-notifications-alert
aoc-notifications-mentions
aoc-notifications-muted
aoc-notifications-tracking
aoc-open-in-new
aoc-pencil
aoc-person
aoc-pinned
aoc-plane-takeoff
aoc-plane
aoc-print
aoc-reply
aoc-search
aoc-shuffle
aoc-star
aoc-subject
aoc-trip-style
aoc-unpinned
aoc-send
aoc-phone
aoc-apps
aoc-lock
aoc-verified
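Instead of all that, I want to get only this line:
"atlas obscura - curious and wondrous travel destinations"
Any ideas would be appreciated. Every other website works fine; only a few sites show this problem.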
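Answer from a uj5u.com user:
The problem is that find_all('title') returns every occurrence of a "title" tag in the page, not only the page title. Beautiful Soup has a title attribute that does exactly what you want. Here is your modified code: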
# importing the modules
import requests
from bs4 import BeautifulSoup
# target url
url = 'https://atlasobscura.com'
# making requests instance
reqs = requests.get(url)
# using the BeautifulSoup module
soup = BeautifulSoup(reqs.text, 'html.parser')
title_data = soup.title.text.lower()
# displaying the title
print("Title of the website is : ")
print(title_data)
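To see why the two approaches differ without hitting the network, here is a minimal sketch that runs both on a small inline page. The sample HTML is hypothetical: it assumes the extra "aoc-..." lines come from title tags nested inside inline SVG icons, which the original post does not confirm. The helper names page_title and all_titles are also made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page (an assumption, not the real site markup):
# one real <title> in <head>, plus <title> tags nested inside SVG icons.
SAMPLE_HTML = """
<html>
  <head><title>Atlas Obscura - Curious and Wondrous Travel Destinations</title></head>
  <body>
    <svg><symbol id="a"><title>aoc-full-screen</title></symbol></svg>
    <svg><symbol id="b"><title>aoc-compass</title></symbol></svg>
  </body>
</html>
"""


def all_titles(html):
    """Return the text of every <title> tag, as the original loop does."""
    soup = BeautifulSoup(html, "html.parser")
    return [t.get_text().strip().lower() for t in soup.find_all("title")]


def page_title(html):
    """Return the text of the first <title> tag, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.title  # first <title> in document order, or None
    if tag is None:
        return None
    return tag.get_text().strip().lower()


if __name__ == "__main__":
    print(all_titles(SAMPLE_HTML))   # includes the icon titles
    print(page_title(SAMPLE_HTML))   # only the page title
```

Note that soup.title is None when a page has no title tag, so calling .text on it directly would raise an AttributeError; the None check above guards against that. If a page might place another title tag before the one in the head, soup.select_one("head > title") is a stricter alternative.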
