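I am trying to scrape the following line and extract the value 7.7872. How can I make this work?
<span class='pos'><span class='arr_ud arrow_u5'> </span> 7.7872</span>
I tried the code below, but it returns some whitespace strings that I cannot get rid of: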
for a in soupUSD.find_all("span", attrs={"class": "pos"})[0]:
    print(a)
I get the following result:
<span class='arr_ud arrow_u5'> </span> 7.7872
Is there a way to get only the text 7.7872?
Reply from a uj5u.com user:
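from bs4 import BeautifulSoup
spam = "<span class='pos'><span class='arr_ud arrow_u5'> </span> 7.7872</span>"
soup = BeautifulSoup(spam, 'html.parser')
# Find the outer span by its class
span = soup.find('span', {'class': 'pos'})
# stripped_strings yields only the non-empty strings, with whitespace stripped
print(''.join(span.stripped_strings))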
Output
7.7872
Reply from a uj5u.com user:
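Because there are other tags at the same level as your target string, the .string attribute will not detect the string in this case. Instead, you can loop over the tag's contents, pick out the strings (instances of NavigableString), and convert them to str.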
from bs4 import BeautifulSoup, NavigableString
spam = "<span class='pos'><span class='arr_ud arrow_u5'> </span> 7.7872</span>"
soup = BeautifulSoup(spam, 'lxml')
span = soup.find('span', class_='pos')
# Keep only the direct children that are text nodes, then strip and join them
nr = ''.join([str(string).strip() for string in span.contents if isinstance(string, NavigableString)])
print(nr)
# 7.7872
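To make the point about .string concrete, here is a minimal sketch using the same sample markup. It assumes the built-in 'html.parser' backend instead of lxml, and the variable names (outer, string_attr, direct_text, value) are only illustrative. It shows that .string is None on the outer span because that span has more than one child, and that filtering the direct children for NavigableString still recovers the number.

```python
from bs4 import BeautifulSoup, NavigableString

html = "<span class='pos'><span class='arr_ud arrow_u5'> </span> 7.7872</span>"
soup = BeautifulSoup(html, "html.parser")
outer = soup.find("span", class_="pos")

# .string is None here: the outer span has two children (a tag and a text node)
string_attr = outer.string

# Keep only the text nodes that are direct children of the outer span
direct_text = [
    str(node).strip()
    for node in outer.children
    if isinstance(node, NavigableString)
]

# Convert the joined text to a number (assumes the text is a plain decimal)
value = float("".join(direct_text))
print(string_attr, value)
```

Because only direct children are inspected, the whitespace inside the inner arrow span never enters the result, which is the difference from iterating over every string in the subtree.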
Reply from a uj5u.com user:
Using a core Python library (ElementTree)
import xml.etree.ElementTree as ET
# The DTD's internal subset declares the nbsp entity so the XML parser accepts it
dtd = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY nbsp ' '>
]>'''
html = '''<span class='pos'><span class='arr_ud arrow_u5'> </span> 7.7872</span> '''
root = ET.fromstring(dtd + html)
# The text that follows the inner span's closing tag is stored in its .tail
print(list(root)[0].tail)
Output
7.7872
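As a rough illustration of why .tail is the right attribute here, the sketch below parses the same sample markup without the DTD (which is possible only because this snippet contains no entities and happens to be well-formed XML) and inspects where ElementTree stores each piece of text. The names inner and value are illustrative, and the float conversion assumes the text is a plain decimal.

```python
import xml.etree.ElementTree as ET

html = "<span class='pos'><span class='arr_ud arrow_u5'> </span> 7.7872</span>"
root = ET.fromstring(html)
inner = root[0]  # the nested arrow span

# ElementTree splits text around child elements:
#   root.text  -> text before the first child (none here)
#   inner.text -> text inside the inner span (a single space)
#   inner.tail -> text after the inner span's closing tag
print(repr(root.text), repr(inner.text), repr(inner.tail))

# Strip the surrounding whitespace before converting to a number
value = float(inner.tail.strip())
print(value)
```

Real pages are rarely well-formed XML, so this approach is likely to be practical only for small, clean fragments like the one in the question.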
