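I am trying to scrape multiple tables from a single web page, but I cannot save them all to a .csv file. Only the last table gets saved. The code is below; please advise.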
import time
from selenium import webdriver
import pandas as pd
base_url = 'https://uk.insight.com/en_GB/shop/product/2W1F2EA#ABU/HEWLETT-PACKARD-(HP-INC)/2W1F2EA#ABU/HP-ProBook-440-G8--14"--Core-i7-1165G7--16-GB-RAM--1-TB-SSD--UK/'
print('Opening Chrome Browser Automatically in 5 secs')
time.sleep(5)
options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(options=options)
driver.get(base_url)
df = pd.read_html(driver.page_source)
df2 = df[4:]
for table in df2:
    df = pd.DataFrame(table)
    df.to_csv('table.csv',index=False)
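I don't know how to save all of the DataFrames to a single .csv. As described above, only the last df is saved.
Answer from a uj5u.com user:
According to the pandas .to_csv() documentation, you can use the mode argument to append data instead of overwriting. The default is "w".
To append data, switch the mode to "a":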
df.to_csv('table.csv', mode='a', index=False)
One thing to note: the column names are appended as well unless you set header=False.
Here is a quick reproducible example.
import uuid
import pandas as pd
dataframe = pd.DataFrame({
    "person_id": [str(uuid.uuid4())[:7] for _ in range(6)],
    "hours_worked": [38.5, 41.25, "35.0", 27.75, 22.25, -20.5],
    "wage_per_hour": [15.1, 15, 21.30, 17.5, 19.50, 25.50],
})
dataframe2 = pd.DataFrame({
    "person_id2": [str(uuid.uuid4())[:7] for _ in range(6)],
    "hours_worked2": [38.5, 41.25, "35.0", 27.75, 22.25, -20.5],
    "wage_per_hour2": [15.1, 15, 21.30, 17.5, 19.50, 25.50],
})
dataframe.to_csv('TEST.csv', mode='w', index=False)
dataframe2.to_csv('TEST.csv', mode='a', index = False, header=False)
print(pd.read_csv('TEST.csv'))
Output
    person_id  hours_worked  wage_per_hour
0     1aa66bc         38.50           15.1
1     b7abe05         41.25           15.0
2     15e1779         35.00           21.3
3     3c117d7         27.75           17.5
4     2e6494e         22.25           19.5
5     2a25e45        -20.50           25.5
6     b17d084         38.50           15.1
7     6ca361e         41.25           15.0
8     2cd18e4         35.00           21.3
9     9d120ff         27.75           17.5
10    a0b20d9         22.25           19.5
11    bf9a98d        -20.50           25.5
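Applied to the loop in the question, the same idea might look like the sketch below. The helper name save_tables and the sample tables are made up for illustration and stand in for the list returned by pd.read_html. The first table is written with mode="w" and a header, and every later table is appended with mode="a" and header=False.

```python
import os
import tempfile

import pandas as pd


def save_tables(tables, path):
    """Write every DataFrame in `tables` to a single CSV file at `path`.

    Hypothetical helper: the first table creates (or overwrites) the file
    and writes the header; the rest are appended without a header.
    """
    for i, table in enumerate(tables):
        first = i == 0
        table.to_csv(
            path,
            mode="w" if first else "a",  # overwrite once, then append
            header=first,                # column names only on the first write
            index=False,
        )


# Stand-in data (assumed); in the question this would be df[4:].
tables = [
    pd.DataFrame({"spec": ["CPU", "RAM"], "value": ["Core i7", "16 GB"]}),
    pd.DataFrame({"spec": ["Storage"], "value": ["1 TB SSD"]}),
]

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "table.csv")
    save_tables(tables, out)
    combined = pd.read_csv(out)

print(combined)
```

In the original script this would be called as save_tables(df2, 'table.csv'). Note that the appended rows only line up under the first header if the tables share the same columns; if the scraped tables have different layouts, writing each one to its own file may be the safer choice.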
