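I scraped some information from a website, and some of the values do not exist, so the scraper returns null. In that case, is there a way to output a default value for the different fields? A sample script is below.
Script file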
import scrapy


class UfcscraperSpider(scrapy.Spider):
    name = 'ufcscraper'
    start_urls = ['http://ufcstats.com/statistics/fighters?char=a']

    def parse(self, response):
        # Skip the first two table rows, as in the original slice
        for user_info in response.css(".b-statistics__table-row")[2:]:
            result = {
                "fname": user_info.css("td:nth-child(1) a::text").get(),
                "lname": user_info.css("td:nth-child(2) a::text").get(),
                "nname": user_info.css("td:nth-child(3) a::text").get(),
                "height": user_info.css("td:nth-child(4)::text").get().strip(),
                "weight": user_info.css("td:nth-child(5)::text").get().strip(),
                "reach": user_info.css("td:nth-child(6)::text").get().strip(),
                "stance": user_info.css("td:nth-child(7)::text").get().strip(),
                "win": user_info.css("td:nth-child(8)::text").get().strip(),
                "lose": user_info.css("td:nth-child(9)::text").get().strip(),
                "draw": user_info.css("td:nth-child(10)::text").get().strip()
            }
            yield result
For example, in the first row the nname field is null and stance is "", an empty string. How can I set default values for cases like these?
Sample result
[
{"fname": "Tom", "lname": "Aaron", "nname": null, "height": "--", "weight": "155 lbs.", "reach": "--", "stance": "", "win": "5", "lose": "3", "draw": "0"},
{"fname": "Danny", "lname": "Abbadi", "nname": "The Assassin", "height": "5' 11\"", "weight": "155 lbs.", "reach": "--", "stance": "Orthodox", "win": "4", "lose": "6", "draw": "0"},
]
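Answer from a uj5u.com user:
You can put logic in the function to replace any "" values, or you can loop over the results and, whenever you encounter "", replace it with whatever default value you want.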
data = [
{"fname": "Tom", "lname": "Aaron", "nname": "", "height": "--", "weight": "155 lbs.", "reach": "--", "stance": "", "win": "5", "lose": "3", "draw": "0"},
{"fname": "Danny", "lname": "Abbadi", "nname": "The Assassin", "height": "5' 11\"", "weight": "155 lbs.", "reach": "--", "stance": "Orthodox", "win": "4", "lose": "6", "draw": "0"},
]
# Replace every empty-string value with a default
for idx, each in enumerate(data):
    for k, v in each.items():
        if v == '':
            data[idx][k] = 'DEFAULT'
Output:
print(data)
[
{'fname': 'Tom', 'lname': 'Aaron', 'nname': 'DEFAULT', 'height': '--', 'weight': '155 lbs.', 'reach': '--', 'stance': 'DEFAULT', 'win': '5', 'lose': '3', 'draw': '0'},
{'fname': 'Danny', 'lname': 'Abbadi', 'nname': 'The Assassin', 'height': '5\' 11"', 'weight': '155 lbs.', 'reach': '--', 'stance': 'Orthodox', 'win': '4', 'lose': '6', 'draw': '0'}
]
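The loop above only catches empty strings, while the question also has null (None in Python) values and asks for different defaults per field. A minimal sketch of one way to cover both cases follows. The names clean, apply_defaults, FIELD_DEFAULTS and FALLBACK, and the default values themselves, are assumptions made for illustration and do not come from the original post.

```python
from typing import Dict, Optional

# Hypothetical per-field defaults; the keys and values are illustrative only.
FIELD_DEFAULTS: Dict[str, str] = {"nname": "No nickname", "stance": "Unknown"}
FALLBACK = "N/A"  # assumed default for any field not listed above


def clean(value: Optional[str], default: str = FALLBACK) -> str:
    """Return the stripped text, or `default` when it is None or blank."""
    if value is None:
        return default
    stripped = value.strip()
    return stripped if stripped else default


def apply_defaults(record: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Build a new record with the per-field defaults filled in."""
    return {
        key: clean(value, FIELD_DEFAULTS.get(key, FALLBACK))
        for key, value in record.items()
    }


row = {"fname": "Tom", "nname": None, "height": "--", "stance": ""}
print(apply_defaults(row))
# {'fname': 'Tom', 'nname': 'No nickname', 'height': '--', 'stance': 'Unknown'}
```

The same helper could be used at scrape time, for example clean(user_info.css("td:nth-child(7)::text").get(), "Unknown"), which also avoids calling .strip() on None. Scrapy selectors additionally accept .get(default=""), but that only covers the missing-node case, not an empty string. Placeholder values such as "--" are left untouched in this sketch.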
