Is it possible to use Beautiful Soup to test whether a div is a (not necessarily direct) child of another div?
For example:
<div class='a'>
  <div class='aa'>
    <div class='aaa'>
      <div class='aaaa'>
      </div>
    </div>
  </div>
  <div class='ab'>
    <div class='aba'>
      <div class='abaa'>
      </div>
    </div>
  </div>
</div>
Now I want to test whether the div with class aaaa and the div with class abaa are (not necessarily direct) children of the div with class aa.
import bs4

with open('test.html', 'r') as i_file:
    soup = bs4.BeautifulSoup(i_file.read(), 'lxml')

div0 = soup.find('div', {'class': 'aa'})
div1 = soup.find('div', {'class': 'aaaa'})
div2 = soup.find('div', {'class': 'abaa'})
print(div1 in div0)  # must return True, but returns False
print(div2 in div0)  # must return False
How can I do this?
(Of course, the real HTML is more complex, with many more nested divs.)
A uj5u.com user replied:
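Try collecting all of the parent's descendant elements with find_all and checking whether any of them has the desired class. (find_all_next is not limited to descendants: it also returns elements that come later in the document but lie outside the parent, so it can report false positives.)
from bs4 import BeautifulSoup

# text is assumed to hold the HTML shown in the question
soup = BeautifulSoup(text, "html.parser")

def is_child(element, parent_class, child_class):
    parent = soup.find("div", attrs={"class": parent_class})
    if parent is None:
        return False
    return any(
        # .get() avoids a KeyError for elements without a class attribute
        child_class in i.get("class", [])
        for i in parent.find_all(element)
    )

print(is_child("div", "aa", "aaa"))  # True
print(is_child("div", "abaa", "aa"))  # False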
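The same class-based check can probably also be written as a CSS descendant selector, since a space in a selector matches at any depth. This is only a sketch: has_descendant is a hypothetical helper name, the HTML string is a compact copy of the markup from the question, and it assumes the classes are simple names that need no escaping in a selector.

```python
from bs4 import BeautifulSoup

# Compact copy of the nesting from the question.
HTML = """
<div class='a'>
  <div class='aa'><div class='aaa'><div class='aaaa'></div></div></div>
  <div class='ab'><div class='aba'><div class='abaa'></div></div></div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")


def has_descendant(soup, parent_class, child_class):
    """Hypothetical helper: is a div.child_class anywhere inside a div.parent_class?"""
    # "div.x div.y" uses the descendant combinator (a space), so depth does not matter.
    return soup.select_one(f"div.{parent_class} div.{child_class}") is not None


print(has_descendant(soup, "aa", "aaaa"))  # True
print(has_descendant(soup, "aa", "abaa"))  # False
```

Like the function above, this works on class names rather than on specific element objects, so it answers "is there such a div inside such a div", not "is this exact div inside that exact div".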
A uj5u.com user replied:
You can use the find_parent method in Beautiful Soup.
import bs4

with open("test.html", "r") as i_file:
    soup = bs4.BeautifulSoup(i_file.read(), "lxml")

div0 = soup.find("div", {"class": "aa"})
div1 = soup.find("div", {"class": "aaaa"})
div2 = soup.find("div", {"class": "abaa"})
print(div1.find_parent(div0.name, attrs=div0.attrs) is not None)  # Returns True
print(div2.find_parent(div0.name, attrs=div0.attrs) is not None)  # Returns False
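One thing to keep in mind with find_parent is that it matches by tag name and attributes, so any ancestor that merely looks like div0 would also count. If you need to know whether the ancestor is that exact element, a stricter variant is to walk .parents and compare by identity. A minimal sketch, where is_descendant is a hypothetical helper name and the HTML string is a compact copy of the markup from the question:

```python
from bs4 import BeautifulSoup

# Compact copy of the nesting from the question.
HTML = """
<div class='a'>
  <div class='aa'><div class='aaa'><div class='aaaa'></div></div></div>
  <div class='ab'><div class='aba'><div class='abaa'></div></div></div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")


def is_descendant(node, ancestor):
    """Hypothetical helper: True if `ancestor` is this exact object somewhere above `node`."""
    # .parents yields every ancestor of node, from its direct parent up to the soup object.
    return any(parent is ancestor for parent in node.parents)


div0 = soup.find("div", class_="aa")
div1 = soup.find("div", class_="aaaa")
div2 = soup.find("div", class_="abaa")

print(is_descendant(div1, div0))  # True
print(is_descendant(div2, div0))  # False
```

Walking upward from the child only visits as many nodes as the child is deep, which may matter on large pages.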
A uj5u.com user replied:
OK, I think I found a way. You have to get all the descendant divs of the parent div with find_all:
import bs4

with open('test.html', 'r') as i_file:
    soup = bs4.BeautifulSoup(i_file.read(), 'lxml')

div0 = soup.find('div', {'class': 'aa'})
div1 = soup.find('div', {'class': 'aaaa'})
div2 = soup.find('div', {'class': 'abaa'})
children = div0.find_all('div')
print(div1 in children)
print(div2 in children)
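A caveat with testing membership in the find_all result: the in operator compares with ==, and Beautiful Soup treats two tags as equal when they represent the same markup. On a page with look-alike divs this can give a false positive. The sketch below uses made-up markup (the classes p and x are hypothetical, not from the question) to show the difference between == and an identity check with is:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two look-alike divs, only the first one is inside div.p.
HTML = "<div class='p'><div class='x'></div></div><div class='x'></div>"
soup = BeautifulSoup(HTML, "html.parser")

parent = soup.find("div", class_="p")
inside, outside = soup.find_all("div", class_="x")  # document order
children = parent.find_all("div")

print(outside in children)                  # True: same markup, but not a descendant
print(any(c is outside for c in children))  # False: identity check
print(any(c is inside for c in children))   # True
```

For the HTML in the question every div has a distinct class, so the simple in test above is fine there; the identity check is only needed when identical-looking elements can occur.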
