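I am new to web scraping and I am using BeautifulSoup for it.
My problem is that when I build a list of the content I need from another list that contains some tags, the second list has missing values.
This is the list I am getting the values from:
From this list I want to create a list containing the reviews, so I use: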
names = []
for item in basic_info:
    for i in item:
        names.append(i.find_all("p", attrs={"class": "review-body"}))
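The problem is that the output looks like this:
[Image: output of the code above]
So instead of getting the values one after another, I get them at every other position in the list: the first entry is empty, the second has data, the third is empty, the fourth has data, and so on.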
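Answer from a uj5u.com user:
Note: It would be better to provide all relevant information as text rather than as images.
Assuming you want to extract several pieces of information per review, select all the review containers and iterate over them to scrape and store structured data: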
data = []
for e in soup.select('div.consumer-review-container'):
    data.append({
        'review-title': e.h3.text,
        'review-type': e.select_one('div.review-type').text,
        'review-section': e.select_one('p.review-body').text
    })
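If a container lacks one of these elements, `select_one` returns `None` and accessing `.text` raises an `AttributeError`. The following is a minimal sketch of a more defensive variant, assuming the same class names as in the snippet above. The helper `text_or_none` and the sample HTML are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the second container has no review-type element.
html = """
<div class="consumer-review-container">
  <h3>Excellent Car</h3>
  <div class="review-type"><strong>Owns this car</strong></div>
  <p class="review-body">Excellent car!</p>
</div>
<div class="consumer-review-container">
  <h3>Best Purchase</h3>
  <p class="review-body">Test drive one soon!</p>
</div>
"""


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


soup = BeautifulSoup(html, "html.parser")

# One dict per review container; missing fields become None instead of raising.
safe_data = [
    {
        "review-title": text_or_none(e, "h3"),
        "review-type": text_or_none(e, "div.review-type"),
        "review-section": text_or_none(e, "p.review-body"),
    }
    for e in soup.select("div.consumer-review-container")
]
print(safe_data)
```

With this approach a missing field shows up as `None` in the result, which makes gaps easy to spot or filter out later.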
Example
sample = '''
<div class="consumer-review-container">
    <h3>Excellent Car</h3>
    <div>
        <div>June 8, 2021</div>
        <div>By ValC from Fairfield, CT</div>
        <div class="review-type"><strong>Owns this car</strong></div>
    </div>
    <div>
        <p class="review-body">I love that BMW makes a mid size suv that is part electric now. We purchased the x3 edrive. I do mostly local driving during the week so only need to fill up with gas once a month to every six weeks. Excellent car!</p>
    </div>
</div>
<div class="consumer-review-container">
    <h3>Best Purchase for the Value and Cost</h3>
    <div>
        <div>June 7, 2021</div>
        <div>By Brandon from Peachtree City from Peachtree City, GA</div>
        <div class="review-type"><strong>Owns this car</strong></div>
    </div>
    <div>
        <p class="review-body">The BMW X3 is not a crossover that should be ignored. It is more than I expected coming from someone who has owned a 3 Series BMW for the last 6 years. Now I ask myself, why didn't I opt for the X3 much sooner, especially since it is more spacious with better features than my 3 series. Plus the price was only about $3,000 more for plenty more space and amenities. You owe it to yourself to test drive one soon!</p>
    </div>
</div>
'''
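from bs4 import BeautifulSoup

# Name the parser explicitly to avoid the "no parser was explicitly specified" warning
soup = BeautifulSoup(sample, 'html.parser')

data = []
# Each container holds exactly one review, so collect its fields together
for e in soup.select('div.consumer-review-container'):
    data.append({
        'review-title': e.h3.text,
        'review-type': e.select_one('div.review-type').text,
        'review-section': e.select_one('p.review-body').text
    })
print(data)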
Output
[{'review-title': 'Excellent Car',
'review-type': 'Owns this car',
'review-section': 'I love that BMW makes a mid size suv that is part electric now. We purchased the x3 edrive. I do mostly local driving during the week so only need to fill up with gas once a month to every six weeks. Excellent car!'},
{'review-title': 'Best Purchase for the Value and Cost',
'review-type': 'Owns this car',
'review-section': "The BMW X3 is not a crossover that should be ignored. It is more than I expected coming from someone who has owned a 3 Series BMW for the last 6 years. Now I ask myself, why didn't I opt for the X3 much sooner, especially since it is more spacious with better features than my 3 series. Plus the price was only about $3,000 more for plenty more space and amenities. You owe it to yourself to test drive one soon!"}]
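If only the review texts are needed, as in the original question, one option is to select the `p.review-body` elements directly. Calling `find_all` on every child of a container returns one list per child, and that list is empty for any child with no matching descendant. This may be what produces the alternating empty entries, although that is an assumption because the asker's data was only shown as an image. A sketch with hypothetical sample HTML:

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical sample with two review containers.
html = """
<div class="consumer-review-container">
  <h3>Excellent Car</h3>
  <div><p class="review-body">Excellent car!</p></div>
</div>
<div class="consumer-review-container">
  <h3>Best Purchase</h3>
  <div><p class="review-body">Test drive one soon!</p></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
first = soup.select_one("div.consumer-review-container")

# One find_all result per child tag: the h3 has no match, the inner div has one.
per_child = [
    child.find_all("p", attrs={"class": "review-body"})
    for child in first.children
    if isinstance(child, Tag)  # skip whitespace-only text nodes
]
print([len(found) for found in per_child])

# Selecting the paragraphs directly gives a flat list with no empty entries.
names = [p.get_text(strip=True) for p in soup.select("p.review-body")]
print(names)
```

The flat list keeps only the review bodies, so the link to each title is lost. The dictionary approach above is preferable when several fields per review are needed.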
