<span class="sim-posted">
<span class="jobs-status covid-icon clearfix">
<i class="covid-home-icon"></i>Work from Home
</span>
<span>Posted few days ago</span>
</span>
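I want to grab the last span tag, the one with the text "Posted few days ago". I have code, but it only grabs the first span inside the element with the class: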
date_published=job.find('span',class_='sim-posted').span.text
A uj5u.com user replied:
Try this. Inside the span you have already reached, it finds the other span, the one that has no class:
date_published=job.find('span',class_='sim-posted').find("span", {"class": False}).text
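A minimal, self-contained sketch of the same idea, for checking it against the markup from the question. Here `soup` stands in for the `job` element from the original code (an assumption, since the surrounding scraping loop is not shown), and the `None` check is an added precaution rather than part of the answer.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = '''
<span class="sim-posted">
<span class="jobs-status covid-icon clearfix">
<i class="covid-home-icon"></i>Work from Home
</span>
<span>Posted few days ago</span>
</span>
'''

soup = BeautifulSoup(html, "html.parser")  # stand-in for `job`

# Locate the outer span by class, then the nested span that has no class attribute
outer = soup.find("span", class_="sim-posted")
inner = outer.find("span", {"class": False})

# Guard against a missing tag instead of raising AttributeError
date_published = inner.text.strip() if inner is not None else None
print(date_published)
```

Passing `False` as the attribute value asks Beautiful Soup for tags where that attribute is absent, which is why the `jobs-status` span is skipped.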
A uj5u.com user replied:
To grab the last SPAN tag with the text Posted few days ago using Selenium, you can use any of the following locator strategies:
Using CSS with last-child: span.sim-posted span:last-child
Using CSS with last-of-type: span.sim-posted span:last-of-type
Using CSS with nth-child(): span.sim-posted span:nth-child(2)
Using CSS with nth-of-type(): span.sim-posted span:nth-of-type(2)
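The four selectors above are standard CSS, so they can be sanity-checked without launching a browser. The sketch below runs them through Beautiful Soup's `select_one` as a stand-in for Selenium (an assumption made only to keep the example self-contained); in Selenium the same strings would be passed to `driver.find_element(By.CSS_SELECTOR, selector)`.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = '''
<span class="sim-posted">
<span class="jobs-status covid-icon clearfix">
<i class="covid-home-icon"></i>Work from Home
</span>
<span>Posted few days ago</span>
</span>
'''

soup = BeautifulSoup(html, "html.parser")

selectors = [
    "span.sim-posted span:last-child",
    "span.sim-posted span:last-of-type",
    "span.sim-posted span:nth-child(2)",
    "span.sim-posted span:nth-of-type(2)",
]

# Each selector should resolve to the same nested span
results = {sel: soup.select_one(sel).text.strip() for sel in selectors}
for sel, text in results.items():
    print(f"{sel} -> {text}")
```

Note that the nth-child(2) and nth-of-type(2) variants depend on the target being exactly the second child, while the last-child and last-of-type variants keep working if more spans are added before it.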
A uj5u.com user replied:
If it is always the last <span>, you can use the CSS selector last-of-type:
soup.select_one('span.sim-posted span:last-of-type').text
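Example
from bs4 import BeautifulSoup

html = '''
<span class="sim-posted">
<span class="jobs-status covid-icon clearfix">
<i class="covid-home-icon"></i>Work from Home
</span>
<span>Posted few days ago</span>
</span>
'''
soup = BeautifulSoup(html, "html.parser")
# The class attributes must be present for span.sim-posted to match
print(soup.select_one('span.sim-posted span:last-of-type').text)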
Output
Posted few days ago
Alternative
You can also use the :-soup-contains CSS pseudo-class selector to target a node by its text. It relies on the SoupSieve integration that was added in Beautiful Soup 4.7.0.
soup.select_one('span.sim-posted span:-soup-contains("Posted")').text
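For completeness, here is a runnable sketch of the text-based selector against the markup from the question. It assumes a reasonably recent SoupSieve is installed; as far as I know, older releases spelled this pseudo-class `:contains`, so treat the exact version requirement as something to verify. The `None` check is an added precaution.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = '''
<span class="sim-posted">
<span class="jobs-status covid-icon clearfix">
<i class="covid-home-icon"></i>Work from Home
</span>
<span>Posted few days ago</span>
</span>
'''

soup = BeautifulSoup(html, "html.parser")

# Match the nested span whose text contains "Posted", wherever it sits
match = soup.select_one('span.sim-posted span:-soup-contains("Posted")')
posted_text = match.text.strip() if match is not None else None
print(posted_text)
```

Unlike the positional selectors, this one does not depend on the span being last, but it does depend on the visible wording staying the same.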
