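I am trying to scrape a web page with Python and write the result to a CSV file, but what gets written does not match the CSV layout I want.
Current output: [image]
How can I get this expected result? [image]
Thanks.
Here is my script: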
import urllib.request as req
import bs4
import csv
import pandas as pd
import re
from datetime import date, timedelta

def daterange(start_date, end_date):
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)

start_date = date(2021, 12, 10)
end_date = date(2021, 12, 15)
url="https://hkgoldprice.com/history/"
with open('gprice.csv','w',newline="") as f1:
    for single_date in daterange(start_date, end_date):
        udate = single_date.strftime("%Y/%m/%d")
        urld = url + single_date.strftime("%Y/%m/%d")
        writer=csv.writer(f1,delimiter = '\t',lineterminator='\n',)
        writer.writerows(udate)
        print(udate)
        with req.urlopen(urld) as response:
            data=response.read().decode("utf-8")
            root=bs4.BeautifulSoup(data, "html.parser")
            prices=root.find_all("div",class_="gp")
            gshops=root.find_all("div",class_="gshop")
            gpdate=root.find_all("div",class_="gp_date")
            for price in prices:
                print(price.text)
                row = price
                writer.writerows(row)
Answer from a uj5u.com user:
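The first problem is that you call "writerows", which treats every item of its argument as a separate row. When you pass it a string such as "2021/12/23", the string is iterated character by character, giving ['2', '0', '2', '1', '/', '1', '2', '/', '2', '3'], and each character is written on its own line. The prices have the same problem. So use "writerow" instead and collect the data for one row in a list, so that csv writes it as a single line.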
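To see the difference without touching the website, here is a minimal sketch that writes to an in-memory buffer instead of a file. The price value "17520" is a made-up placeholder, not real data from the site.

```python
import csv
import io

# writerows() iterates over its argument; a string yields single characters,
# so every character becomes its own row.
split_buf = io.StringIO()
csv.writer(split_buf, lineterminator="\n").writerows("2021/12/10")
split_output = split_buf.getvalue()

# writerow() writes exactly one row, with one field per list item.
single_buf = io.StringIO()
csv.writer(single_buf, lineterminator="\n").writerow(["2021/12/10", "17520"])
single_output = single_buf.getvalue()

print(repr(split_output))   # ten one-character lines
print(repr(single_output))  # one line with two comma-separated fields
```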
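The second problem is that .text in BeautifulSoup returns all of the text inside the element, including whitespace and newlines, which makes the CSV output hard to predict. So I remove all whitespace, and the "#" character as well, before writing the value.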
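The cleanup step can be tried on its own. This is only a sketch: the sample string is a hypothetical stand-in for what .text might return, and clean_cell is a helper name introduced here for illustration.

```python
# Hypothetical example of text taken from an element: newlines, indentation, a '#'.
raw_text = "\n    # 17,520 \n   "

def clean_cell(text):
    """Remove '#' and every whitespace character (spaces, tabs, newlines)."""
    # str.split() with no argument splits on any run of whitespace,
    # so joining the pieces with "" drops all of it.
    return "".join(text.replace("#", "").split())

cleaned = clean_cell(raw_text)
print(repr(cleaned))
```

Note that this also removes spaces inside the value, which is fine for a price but may not be what you want for shop names.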
Here is the revised code:
with open('gprice.csv','w',newline="") as f1:
    for single_date in daterange(start_date, end_date):
        udate = single_date.strftime("%Y/%m/%d")
        urld = url + single_date.strftime("%Y/%m/%d")
        # we write one row at a time, so the default csv writer settings are enough
        writer=csv.writer(f1)
        # start an empty list for this date's row
        row_list = []
        # first column: the date
        row_list.append(udate)
        with req.urlopen(urld) as response:
            data=response.read().decode("utf-8")
            root=bs4.BeautifulSoup(data, "html.parser")
            prices=root.find_all("div",class_="gp")
            gshops=root.find_all("div",class_="gshop")
            gpdate=root.find_all("div",class_="gp_date")
            for price in prices:
                # get the inner text and delete '#'
                row = price.text.replace('#', '')
                # delete all whitespace and append the price
                row_list.append("".join(row.split()))
        # one row per date, so use "writerow" instead of "writerows"
        writer.writerow(row_list)
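One more detail that may be worth checking: daterange from the question stops before end_date, so with the dates above the last page requested is for 2021/12/14, not 2021/12/15. A small sketch that only builds the URLs, with no network access (base_url and urls are names introduced here for illustration):

```python
from datetime import date, timedelta

def daterange(start_date, end_date):
    """Yield each date from start_date up to, but not including, end_date."""
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(n)

base_url = "https://hkgoldprice.com/history/"
urls = [
    base_url + d.strftime("%Y/%m/%d")
    for d in daterange(date(2021, 12, 10), date(2021, 12, 15))
]

for u in urls:
    print(u)
```

If the 15th should be included, passing end_date + timedelta(1) as the second argument is one way to do it.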
