I am trying to scrape the Amazon website for a project I am working on.
So far I have built this workflow:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

# Note: executable_path is accepted by Selenium 3 and early 4.x releases;
# newer releases expect a Service object instead.
driver = webdriver.Chrome(executable_path=r"C:\Users\chromedriver.exe")
driver.maximize_window()
wait = WebDriverWait(driver, 20)
driver.get('https://www.amazon.it/s?k=shoes&__mk_it_IT=ÅMÅŽÕÑ&crid=3B00FY4A5NJBZ&sprefix=shoes'
           ',aps,122&ref=nb_sb_noss')
# Each search result card is matched by its full class attribute
products = driver.find_elements(By.CSS_SELECTOR, 'div[class = "sg-col-4-of-12 s-result-item s-asin sg-col-4-of-16 sg-col '
                                                 's-widget-spacing-small sg-col-4-of-20"]')
for product in products:
    name = product.find_element(By.CSS_SELECTOR, 'span[class = "a-size-base-plus a-color-base a-text-normal"]').text
    brand = product.find_element(By.CSS_SELECTOR, 'span[class = "a-size-base-plus a-color-base"]').text
    price = product.find_element(By.CSS_SELECTOR, 'span[class = "a-price-whole"]').text
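With this workflow in place, I would like to filter the results by price (for example, keep everything under 100 euros) and "save" the output in a list or group so that I can join it with the results of another loop.
Thanks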
Reply from a uj5u.com user:
Appending a dict for each product to a list is one way to save the data for post-processing:
data = []
for product in products:
    data.append({
        'name': product.find_element(By.CSS_SELECTOR, 'span[class = "a-size-base-plus a-color-base a-text-normal"]').text,
        'brand': product.find_element(By.CSS_SELECTOR, 'span[class = "a-size-base-plus a-color-base"]').text,
        'price': product.find_element(By.CSS_SELECTOR, 'span[class = "a-price-whole"]').text
    })
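The list can also be filtered directly in plain Python before any further processing. The following is only a minimal sketch: `parse_price` and `keep_under` are hypothetical helper names, the sample rows are made up, and the price format (`.` for thousands, `,` for decimals) is an assumption about what the page returns.

```python
def parse_price(text):
    """Turn a scraped price such as '54,99' or '1.299,00' into a float.

    Assumes Italian formatting: '.' groups thousands, ',' marks decimals.
    Returns None when the text is empty or not a number.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


def keep_under(items, limit):
    """Return copies of the dicts whose parsed price is below `limit`."""
    kept = []
    for item in items:
        price = parse_price(item["price"])
        if price is not None and price < limit:
            # Copy the dict so the original scraped data stays untouched
            kept.append({**item, "price": price})
    return kept


# Made-up sample rows standing in for the scraped `data` list
sample = [
    {"name": "Sample shoe A", "brand": "BrandX", "price": "54,99"},
    {"name": "Sample shoe B", "brand": "BrandY", "price": "1.299,00"},
    {"name": "Sample shoe C", "brand": "BrandZ", "price": ""},
]
cheap = keep_under(sample, 100)
print(cheap)
```

In the scraper, `keep_under(data, 100)` would replace the call on `sample`. Rows with an empty or unparsable price are skipped rather than raising an error.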
You can then use this list of dicts to create a DataFrame for filtering and saving:
import pandas as pd
df = pd.DataFrame(data) #create dataframe
df['price'] = df['price'].str.replace(',','.').astype(float) #convert strings to float
df[df['price'] < 100].to_excel('test.xlsx', index=False) #filter dataframe and save to excel
Output

| | name | brand | price |
|---|---|---|---|
| 0 | Graceful-Get Connected, Sneaker Donna | Skechers | 40 |
| 1 | Court Royale 2, Scarpe Uomo | Nike | 54.99 |
| 2 | Og 85 Gold'n Gurl, Scarpe da Ginnastica Donna | Skechers | 54.93 |
| 3 | Smash, Scarpe da Ginnastica | Puma | 28.43 |
| 4 | Wearallday, Scarpe da corsa Donna, NULL, NULL | Nike | 55.48 |
| 5 | Uomo Claudio A, Scarpe Stringate Basse Derby | Geox | 46.16 |
| 6 | Court Graffik, Scarpa da Skate Bambini e Ragazzi | DC Shoes | 19.92 |
| 7 | Hiking, Kids Shedir Mid Hiking Shoes WP-Scarpe da Ginnastica Unisex - Bambini e Ragazzi | CMP | 33.68 |
| 8 | Court Graffik, Scarpe da Ginnastica Basse Uomo | DC Shoes | 50.98 |
| ... |
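The question also asks how to join these results with those of another loop. One possible approach, sketched here under assumptions, is to build one DataFrame per loop and stack them with `pd.concat` before filtering. The frame names `shoes` and `boots` and all sample rows are made up for illustration; `errors="coerce"` is used so that an empty price becomes NaN instead of raising.

```python
import pandas as pd

# Made-up rows standing in for the results of two separate scraping loops
shoes = pd.DataFrame([
    {"name": "Sample shoe A", "brand": "BrandX", "price": "40,00"},
    {"name": "Sample shoe B", "brand": "BrandY", "price": "129,50"},
])
boots = pd.DataFrame([
    {"name": "Sample boot C", "brand": "BrandZ", "price": "89,90"},
    {"name": "Sample boot D", "brand": "BrandX", "price": ""},
])

# Stack both result sets into one frame with a fresh 0..n-1 index
combined = pd.concat([shoes, boots], ignore_index=True)

# Same comma-to-dot conversion as in the answer, but tolerant of bad values
as_text = combined["price"].str.replace(",", ".", regex=False)
combined["price"] = pd.to_numeric(as_text, errors="coerce")

# NaN compares as False, so rows without a usable price drop out here
under_100 = combined[combined["price"] < 100].reset_index(drop=True)
print(under_100)
```

The `to_excel` call from the answer can then be applied to `under_100` in the same way as before.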
