I want to extract the title attribute of the img tag inside a table cell. How can I do this?
<tr class="Even"><td><a href="https://bbs.csdn.net/genome/1178?genome_assembly_id=170873" target="_blank">Streptomyces scabiei 87.22</a></td><td><a href="https://bbs.csdn.net/genome/1178?genome_assembly_id=170873" target="_blank">87.22</a></td><td><a href="https://bbs.csdn.net/biosample/SAMEA2272773 " target="_blank">SAMEA2272773 </a></td><td><a href="https://bbs.csdn.net/bioproject/PRJEA40749" target="_blank">PRJEA40749</a></td><td><a href="https://bbs.csdn.net/assembly/GCA_000091305.1 " target="_blank">GCA_000091305.1 </a></td><td><img src='https://img.uj5u.com/2020/10/03/147541030422301.gif' alt='Complete Genome' title='Complete Genome'/></td><td>10.1487</td><td>71.50</td><td><table class="projects_replicons" id="proks_replicons_170873"><tr ><td><b>chromosome</b>:<a href="https://bbs.csdn.net/nuccore/NC_013929.1">NC_013929.1</a>/<a href="https://bbs.csdn.net/nuccore/FN554889.1">FN554889.1</a></td></tr></table></td>
uj5u.com helpful user replied:
I'm using Python 3.
uj5u.com helpful user replied:
Parse it with a module, for example bs4.
uj5u.com helpful user replied:
from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    # Called for each start tag, together with that tag's attributes
    def handle_starttag(self, tag, attrs):
        if(tag=="img"):
            for i in attrs:
                if(i[0]=="title"):
                    print(i[1])


html_code = '''
<img href="https://bbs.csdn.net/topics/google" title="id"> google.com</img>
<A Href="https://bbs.csdn.net/topics/pythonclub"> PythonClub </a>
<A HREF = "sina"> Sina </a>
'''
parser=MyHTMLParser()
parser.feed(html_code)
# Wrote one for you!
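A variant of the same idea, as a rough sketch: it collects the titles in a list instead of printing them, and is fed the img cell from the original question. The class name `TitleCollector` and the variable names are made up for illustration. It assumes that `HTMLParser` routes a self-closing tag such as `<img .../>` through `handle_starttag` by default, which is the standard library's documented behavior.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the title attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples, one tuple per attribute
        if tag == "img":
            title = dict(attrs).get("title")
            if title is not None:
                self.titles.append(title)


# The img cell taken from the table row in the question
cell = (
    "<td><img src='https://img.uj5u.com/2020/10/03/147541030422301.gif' "
    "alt='Complete Genome' title='Complete Genome'/></td>"
)

collector = TitleCollector()
collector.feed(cell)
titles = collector.titles
print(titles)
```

Returning a list makes it easier to reuse the result, for example to pair each title with the other cells of its row.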
uj5u.com helpful user replied:
Thank you so much! One question: in if(i[0]=="title"), does the i[0] mean that attrs comes back as a list?
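uj5u.com helpful user replied:
It returns all of the tag's attributes as a list of tuples. Each tuple is one attribute in the form (key, value), so i[0] is the attribute name and i[1] is its value.
uj5u.com helpful user replied: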
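import re

# s is the HTML string to search (here, the img cell from the question)
s = "<td><img src='https://img.uj5u.com/2020/10/03/147541030422301.gif' alt='Complete Genome' title='Complete Genome'/></td>"
# Capture only the text between the quotes of the title attribute
l = re.findall(r"""<img[^>]*?\btitle=['"]([^'"]*)['"]""", s)
print(l)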
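For comparison, the bs4 suggestion made earlier in the thread might look roughly like the following. This is only a sketch: it assumes the third-party beautifulsoup4 package is installed, and the shortened `row` string is an abbreviated copy of the row from the question, wrapped in a `<table>` for illustration.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Abbreviated copy of the table row from the question (assumption: wrapped in <table>)
row = (
    '<table><tr class="Even">'
    '<td><a href="https://bbs.csdn.net/bioproject/PRJEA40749" '
    'target="_blank">PRJEA40749</a></td>'
    "<td><img src='https://img.uj5u.com/2020/10/03/147541030422301.gif' "
    "alt='Complete Genome' title='Complete Genome'/></td>"
    "<td>10.1487</td>"
    "</tr></table>"
)

# Use Python's built-in parser backend so no extra parser library is needed
soup = BeautifulSoup(row, "html.parser")

# title=True keeps only the <img> tags that actually carry a title attribute
titles = [img["title"] for img in soup.find_all("img", title=True)]
print(titles)
```

Compared with the regular expression, a parser is less sensitive to attribute order and quoting style, which tends to matter when the page markup changes.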
