
Traversing the document tree
1. Direct children: the .contents and .children attributes
.contents
A Tag's .contents attribute returns the tag's direct children as a list.
#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p name="dromouse"><b>The Dormouse's story</b></p>
<p>Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" id="link2">Lacie</a> and
<a href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p>...</p>
"""

# Create a Beautiful Soup object, using the lxml parser
soup = BeautifulSoup(html, "lxml")

# .contents returns a list of the direct children
print(soup.head.contents)
# A list can be indexed, so this prints the first child
print(soup.head.contents[0])
Output
[<title>The Dormouse's story</title>]
<title>The Dormouse's story</title>
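One detail the output above does not show is that .contents holds text nodes as well as tags. The following is a minimal sketch under two assumptions that are not from the original example: it uses Python's built-in "html.parser" instead of lxml, and the one-line snippet is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet, not part of the article's sample document.
snippet = "<p>Hello <b>world</b>!</p>"
soup = BeautifulSoup(snippet, "html.parser")

p_children = soup.p.contents
print(len(p_children))  # number of direct children of <p>
for node in p_children:
    # Each direct child is either a Tag or a NavigableString (a text node).
    print(type(node).__name__, repr(str(node)))
```

Here the text before and after the <b> tag each count as a separate child, so code that indexes into .contents should not assume every entry is a tag.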
.children
.children does not return a list. It returns an iterator, and we can loop over it to get all of the direct children.
#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p name="dromouse"><b>The Dormouse's story</b></p>
<p>Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" id="link2">Lacie</a> and
<a href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p>...</p>
"""

# Create a Beautiful Soup object, using the lxml parser
soup = BeautifulSoup(html, "lxml")

# Printing .children shows an iterator object, not a list
print(soup.head.children)

# Loop over the iterator to get every direct child
for child in soup.head.children:
    print(child)
Output
<list_iterator object at 0x008FF950>
<title>The Dormouse's story</title>
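Because .children is an iterator, it can only be consumed once per access. The sketch below illustrates that behaviour and shows that turning it into a list gives the same nodes as .contents. It assumes the built-in "html.parser" rather than lxml, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

snippet = "<head><title>The Dormouse's story</title></head>"
soup = BeautifulSoup(snippet, "html.parser")

children = soup.head.children
first_pass = list(children)   # consumes the iterator
second_pass = list(children)  # the same iterator has nothing left to yield

print(first_pass == soup.head.contents)
print(second_pass)
```

If the children are needed more than once, either read soup.head.children again each time or work with soup.head.contents directly.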
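2. All descendants: the .descendants attribute
The .contents and .children attributes described above cover only a Tag's direct children. The .descendants attribute walks recursively through all of a Tag's descendants. As with .children, we need to loop over it to get the contents.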
#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p name="dromouse"><b>The Dormouse's story</b></p>
<p>Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" id="link2">Lacie</a> and
<a href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p>...</p>
"""

# Create a Beautiful Soup object, using the lxml parser
soup = BeautifulSoup(html, "lxml")

# Printing .descendants shows a generator object
print(soup.head.descendants)

# Loop over the generator to get every descendant
for child in soup.head.descendants:
    print(child)
Output
<generator object descendants at 0x00519AB0>
<title>The Dormouse's story</title>
The Dormouse's story
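To make the difference between direct children and descendants more concrete, here is a small sketch on a nested snippet. The snippet and variable names are made up for illustration, and it assumes the built-in "html.parser" rather than lxml.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical nested snippet, not part of the article's sample document.
snippet = "<div><p>one <b>two</b></p><p>three</p></div>"
soup = BeautifulSoup(snippet, "html.parser")

direct = list(soup.div.children)     # only the two <p> tags
nested = list(soup.div.descendants)  # every tag and text node, depth first

print(len(direct), len(nested))
for node in nested:
    print(repr(str(node)))

# Keep only the tags and skip the text nodes
tag_names = [node.name for node in nested if isinstance(node, Tag)]
print(tag_names)
```

The descendants come out in document order: each tag is followed by everything inside it before the next sibling appears.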
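3. Node content: the .string attribute
If a Tag has exactly one child and that child is a NavigableString, .string returns that child. If a Tag has exactly one child and that child is another Tag, .string also works, and it returns the same result as the .string of that only child. If a Tag has more than one child, .string returns None.
Put simply: if a tag contains no other tags, .string returns the text inside it; if a tag contains exactly one tag and nothing else, .string returns the text inside that inner tag. For example: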
#!/usr/bin/python3
# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p name="dromouse"><b>The Dormouse's story</b></p>
<p>Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" id="link2">Lacie</a> and
<a href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p>...</p>
"""

# Create a Beautiful Soup object, using the lxml parser
soup = BeautifulSoup(html, "lxml")

# <head> has a single child, <title>, so .string passes through to it
print(soup.head.string)
# <title> contains only text, so .string returns that text
print(soup.head.title.string)
Output
The Dormouse's story
The Dormouse's story
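The example above covers the cases where .string finds text. The sketch below adds two cases that are easy to trip over: a tag with several children, where .string is None, and a tag whose only child is an HTML comment, where .string is a Comment object. The snippet, the id values "one" and "many", and the variable names are assumptions made for illustration, and the parser is the built-in "html.parser" rather than lxml.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

# Hypothetical snippet; only the link1 comment mirrors the sample document.
snippet = (
    '<p id="one"><b>only child</b></p>'
    '<p id="many"><b>first</b><i>second</i></p>'
    '<a id="link1"><!-- Elsie --></a>'
)
soup = BeautifulSoup(snippet, "html.parser")

single = soup.find(id="one").string     # passes through the single <b> child
several = soup.find(id="many").string   # two children, so there is no single string
comment = soup.find(id="link1").string  # the only child is a comment node

print(single)
print(several)
print(type(comment).__name__, repr(str(comment)))

# When a tag has several children, .strings yields each text node in turn
joined = "".join(soup.find(id="many").strings)
print(joined)
```

Checking isinstance(value, Comment) before using a .string result is one way to avoid treating comment text as page content.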
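The text and images in this article come from the internet together with the author's own ideas. They are for study and discussion only and serve no commercial purpose. Copyright belongs to the original authors; if there is any problem, please contact us promptly so that we can deal with it.
Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/141417.html
Tags: Python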
