How do I scrape the name/text of hyperlinks with Python?

I want to extract the names of the links from this URL: https://www.ccexpert.us/ccda/best-practices-for-hierarchical-layers.html, but I am stuck on the next step. Here is my code so far:

import requests as re
from bs4 import BeautifulSoup

URL = "https://www.ccexpert.us/ccda/best-practices-for-hierarchical-layers.html"
page = re.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find(class_="post altr")

for result in results:
    print(result)

I still don't know how to proceed from here. Any help would be much appreciated. Thank you.

Answer from a uj5u.com community member:

This code gets the text of every link on the page:

import requests
from bs4 import BeautifulSoup

URL = "https://www.ccexpert.us/ccda/best-practices-for-hierarchical-layers.html"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")

# find_all('a') returns every <a> (anchor) tag on the page
results = soup.find_all('a')

for result in results:
    # .text is the visible link text; strip() trims surrounding whitespace
    print(result.text.strip())

Output:

CCDA
port channels
RPVST
Dynamic Trunking Protocol
VTP transparent mode
Layer 3 load balancing
user ports
enable PortFast
the core layer
link redundancy
access layer switches
Gateway Load Balancing Protocol
core switches
distribution switches
redundant paths
campus core
Large Building LANs
LAN Design Types and Models
Shutting Down a BGP Neighbor
Core Layer Functionality - Network Design
Distribution Layer Functionality
Characterizing Types of Traffic Flow for New Network Applications
DHCP Starvation and Spoofing Attacks
How to Start an Ecommerce Business
Reply
About
Contact
Advertise
Privacy Policy
Resources

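Calling find_all('a') works because hyperlinks in HTML are created with the <a> tag. I believe what you want is the text that is hyperlinked, but if you want the link targets themselves, you can do this: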

import requests
from bs4 import BeautifulSoup

URL = "https://www.ccexpert.us/ccda/best-practices-for-hierarchical-layers.html"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")

# href=True keeps only <a> tags that actually have an href attribute
for a in soup.find_all('a', href=True):
    print(a['href'])

Output:

/
/reviews/traffic-xtractor.html



/ccda/
/routing-switching/using-routed-ports-and-portchannels-with-mls.html
/root-bridge/rapid-pervlan-spanning-tree-protocol.html
/network-security-2/dynamic-trunking-protocol-dtp.html
/root-bridge/vtp-modes.html
/root-bridge/configuring-etherchannel-load-balancing.html
/routing-switching-2/switch-security-best-practices-for-unused-and-user-ports.html
/global-configuration/enabling-bpdu-guard.html
/network-design/core-layer-functionality.html
/network-design/designing-link-redundancy.html
/network-design/access-layer-functionality.html
/root-bridge/gateway-load-balancing-protocol.html
/switching/collapsed-core.html
/switching/distribution-layer-switches.html
/switching/backbonefast-redundant-backbone-paths.html
/network-design/campus-core-design-considerations.html
/ccda/largebuilding-lans.html
/ccda/lan-design-types-and-models.html
/cisco-internetworks-2/shutting-down-a-bgp-neighbor.html
/network-design/core-layer-functionality.html
/network-design/distribution-layer-functionality.html
/network-design-2/characterizing-types-of-traffic-flow-for-new-network-applications.html
/snrs-3/dhcp-starvation-and-spoofing-attacks.html
/ecommerce.html
/about/
/contact/
/advertise-with-us/
/privacy-policy/
/resources/

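This prints only the href attribute of each <a> tag. As the output shows, the values are paths relative to the site root rather than full URLs.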
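If you want the link text and the link target together, the two approaches can be combined. The following is only a minimal sketch, not the one right way to do it. The function name extract_links, its container_selector parameter, and the SAMPLE_HTML snippet are all made up for illustration, and the sample markup merely assumes that the article body sits in an element with the classes "post altr", as the question's code suggests. It runs on the inline sample so it can be tried offline; for the real page you would pass page.content instead.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
SAMPLE_HTML = """
<div class="post altr">
  <p>Use <a href="/root-bridge/vtp-modes.html">VTP transparent mode</a> and
  <a href="/ccda/">CCDA</a>.</p>
</div>
<div class="footer"><a href="/about/">About</a><a>no href</a></div>
"""

BASE_URL = "https://www.ccexpert.us/ccda/best-practices-for-hierarchical-layers.html"


def extract_links(html, base_url, container_selector=None):
    """Return (text, absolute_url) pairs for every <a> tag that has an href.

    If container_selector (a CSS selector) is given, only links inside the
    first matching element are returned; no match yields an empty list.
    """
    soup = BeautifulSoup(html, "html.parser")
    scope = soup
    if container_selector is not None:
        scope = soup.select_one(container_selector)
        if scope is None:
            return []
    return [
        # urljoin turns a relative href such as "/ccda/" into a full URL
        (a.get_text(strip=True), urljoin(base_url, a["href"]))
        for a in scope.find_all("a", href=True)
    ]


# Assumption: ".post.altr" matches the element the question was targeting.
links = extract_links(SAMPLE_HTML, BASE_URL, container_selector=".post.altr")
for text, url in links:
    print(f"{text} -> {url}")
```

Leaving container_selector as None returns every link on the page, which matches the behaviour of the two snippets above. Note also that soup.find(...) returns a single tag, so looping over its result (as in the question) walks that tag's children rather than a list of matches.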

