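My initial string consists of <span>, some content in between, and </span></span>. I want to remove this part from my string (including the span, the content inside it, and the /span). How should I do that?
The part of the string that needs to be removed:
<span class="_5mfr"><span class="_6qdm" style='height: 16px; width: 16px; font-size: 16px; background-image: url("https://static.xx.fbcdn.net/images/emoji.php/v9/t81/1/16/") 14 variable strings </span> </span>
I want to remove the whole block shown above.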
uj5u.com community member replied:
import re
txt = 'Iam a good boy <span>some blahblahblah </span></span> and my name is john'
print(re.sub(r'<span>.*</span></span> ', '', txt))
Prints:
Iam a good boy and my name is john
For the updated question:
import re
txt = """<span class="_5mfr"><span class="_6qdm" style='height: 16px; width: 16px; font-size: 16px; background-image: url("https://static.xx.fbcdn.net/images/emoji.php/v9/t81/1/16/") 14 variable strings </span> </span>"""
print(re.sub(r'<span [^<>]*?</span>\s*</span>', '', txt))
# prints: <span class="_5mfr">
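One thing to watch with the first pattern: `.*` is greedy, so if the string holds more than one span block it removes everything from the first `<span>` to the last closing tag. Here is a minimal sketch of a non-greedy variant. The helper name `remove_span_blocks` and the sample string are made up for illustration, and it assumes the spans are not nested.

```python
import re

# Hypothetical helper: removes each <span ...>...</span> block, contents included.
# Assumes spans are not nested; nested markup is better handled by a parser.
SPAN_BLOCK = re.compile(r'<span\b[^>]*>.*?</span>', re.DOTALL)

def remove_span_blocks(text):
    return SPAN_BLOCK.sub('', text)

sample = 'a <span>one</span> b <span class="x">two</span> c'
print(remove_span_blocks(sample))
```

The `re.DOTALL` flag lets `.` match newlines too, in case a span's content spans several lines.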
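uj5u.com community member replied:
Use BeautifulSoup:
from bs4 import BeautifulSoup
string = 'Iam a good boy <span>some blahblahblah </span></span> and my name is john'
soup = BeautifulSoup(string, 'html.parser')
# Replace every <span> element, contents included, with an empty string
for x in soup.find_all('span'):
    x.replace_with('')
# soup.string is None when the soup has more than one child node, so use get_text()
print(soup.get_text())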
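If installing BeautifulSoup is not an option, the same idea can be sketched with the standard library's `html.parser`. This is a different technique from the answer above, not a drop-in replacement. The class name `SpanStripper` and the function `strip_spans` are hypothetical names chosen for this example. It keeps only the text outside every `<span>`, and it also drops any other tags while keeping their text.

```python
from html.parser import HTMLParser

class SpanStripper(HTMLParser):
    """Collects the text that sits outside every <span> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0    # number of <span> elements currently open
        self.parts = []   # text found outside all spans

    def handle_starttag(self, tag, attrs):
        if tag == 'span':
            self.depth += 1

    def handle_endtag(self, tag):
        # Ignore stray closing tags that have no matching opener
        if tag == 'span' and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.parts.append(data)

def strip_spans(html):
    parser = SpanStripper()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)

print(strip_spans('Hello <span class="a"><span>x</span> y</span> world'))
```

Tracking the nesting depth is what lets this handle nested spans, which a single regex handles poorly.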
uj5u.com community member replied:
You can replace everything the regex matches, as shown below:
import re
regex = r"(<span.*?>)|(</span>)"
test_str = """<span class="_5mfr"><span class="_6qdm" style='height: 16px; width: 16px; font-size: 16px; background-image: url("static.xx.fbcdn.net/images/emoji.php/v9/t81/1/16/...")'>? Dasamoolam Damu (Troll Malayalam)?? ??????<span class="_5mfr" > <span class="_6qdm" style='height: 16px; width: 16px; font-size: 16px; background-image: url(static.xx.fbcdn.net/images/emoji.php/v9/td7/1/16/...")'></span></span></span></span>"""
# Note: this removes only the span tags themselves; the text between them is kept
print(re.sub(regex, '', test_str))
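Since the question asks to drop the content inside the span as well, it may help to see the two behaviours side by side. A small sketch, using a made-up sample string and assuming the spans are not nested:

```python
import re

sample = 'keep <span class="a">inner text</span> this'

# Tag-only pattern (same idea as the answer above): the inner text survives
tags_only = re.sub(r'</?span\b[^>]*>', '', sample)

# Block pattern: the tags and everything between them are removed
whole_block = re.sub(r'<span\b[^>]*>.*?</span>', '', sample, flags=re.DOTALL)

print(tags_only)
print(whole_block)
```

The first call leaves `inner text` in place, while the second removes it together with the tags.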
