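I'm working on a coding task in which one of the application's requirements is the ability to delete a row of interest from a CSV file. When I try to delete the row identified by a key (the name), it not only deletes that row but also adds multiple copies of the first row to my CSV file. I can't figure out why these duplicate rows are being added. Any help is appreciated.
For reference: attractions is the list of dictionaries that the CSV file is copied into.
The delete function is as follows: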
name = entername()
with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
                attractions.remove(row)
with open('boston.csv', 'w') as csv_write:
    writer = csv.writer(csv_write)
    writer.writerows(attractions)
My CSV file looked like this before:
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green
But the result is:
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green
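Reply from a uj5u.com user:
I ran your code and it appears to work.
I modified it to:
- hard-code the name (for debugging)
- print a message when a row is removed
- not overwrite the input file (very useful while debugging)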
import csv
name = 'Harvard University'
attractions = []
with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
                print(f'{field} matches {name}, removing {row}')
                attractions.remove(row)
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(attractions)
When I run it, I see this debug print message:
Harvard University matches Harvard University, removing ['harvard', 'Harvard University', 'university', 'https://www.harvard.edu/', '42.373032', '-71.116661', 'green']
Here is my output.csv:
Short Name,Name,Category,URL,Lat,Lon,Color
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green
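When I change name to name = 'Tourism', which is valid for your logic (even if it's not what you want or intend), it still does what you'd expect and removes the two rows that have Tourism in the Category field: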
...
name = 'Tourism'
attractions = []
...
Tourism matches Tourism, removing ['science', 'Museum of Science', 'Tourism', 'https://www.mos.org/', '42.36932', '-71.07151', 'green']
Tourism matches Tourism, removing ['children', "Boston Children's Museum", 'Tourism', 'https://bostonchildrensmuseum.org/', '42.3531', '-71.04998', 'green']
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
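All that said, I'd recommend not appending a row and then removing it when a certain condition is met. Instead, I prefer to append only if the skip condition is not met: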
for row in reader:
    skip_row = False
    for field in row:
        if field == name:
            print(f'{field} matches {name}, skipping {row}')
            skip_row = True
            break  # stop searching fields
    if not skip_row:
        attractions.append(row)
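For a row that is a list of strings, the inner loop and flag above test whether any field equals name, which is what Python's `in` operator does. A minimal sketch of that equivalence, using in-memory sample rows taken from the question; the helper name `keep_rows_without` is made up for illustration and is not part of the original code:

```python
import csv
import io

def keep_rows_without(rows, name):
    """Return the rows in which no field is exactly equal to `name`."""
    # `name in row` is True when any field of the row equals `name`,
    # the same check as `for field in row: if field == name`.
    return [row for row in rows if name not in row]

# In-memory stand-in for boston.csv (hypothetical sample, three lines only).
sample = io.StringIO(
    "Short Name,Name,Category,URL,Lat,Lon,Color\n"
    "harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green\n"
    "science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green\n"
)
attractions = keep_rows_without(csv.reader(sample), "Tourism")
for row in attractions:
    print(row[0])
```

With this sample the header and the harvard row are kept and the science row is dropped, because only the science row has a field equal to 'Tourism'.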
And if you only care about the Name field, you can shorten that and make it more direct:
name_idx = 1  # fields are 0-based, so your 2nd field is index 1
for row in reader:
    if row[name_idx] == name:
        print(f'Found {name}, skipping {row}')
        continue  # skip rest of this loop (the append), start with next row
    attractions.append(row)
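If you would rather address the column by its header text than by position, the standard library's csv.DictReader and csv.DictWriter can do the same filtering. The following is only a sketch under a few assumptions: the function name `remove_by_name` is hypothetical, the file is assumed to have a header row containing a Name column (as in the question), and the demo writes to a temporary directory instead of touching the real boston.csv.

```python
import csv
import os
import tempfile

def remove_by_name(src_path, dst_path, name, column="Name"):
    """Copy src_path to dst_path, dropping rows whose `column` equals `name`.

    Returns the number of rows dropped.
    """
    dropped = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)  # the first row is read as the header
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[column] == name:
                dropped += 1
                continue  # skip the write for a matching row
            writer.writerow(row)
    return dropped

# Demo on a throwaway file holding two data rows from the question.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "boston.csv")
dst_path = os.path.join(workdir, "output.csv")
with open(src_path, "w", newline="") as f:
    f.write(
        "Short Name,Name,Category,URL,Lat,Lon,Color\n"
        "harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green\n"
        "mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green\n"
    )

dropped = remove_by_name(src_path, dst_path, "Harvard University")
with open(dst_path, newline="") as f:
    remaining = list(csv.reader(f))
print(dropped, [row[0] for row in remaining])
```

Writing to a separate output file, as in the debugging version earlier in this answer, keeps the input intact while you check the result.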
Reply from a uj5u.com user:
There is a pure-Python library, convtools, which generates code under the hood and provides many data-processing primitives:
from convtools import conversion as c
from convtools.contrib.tables import Table
name = entername()
table = Table.from_csv("boston.csv")  # pass header=True if it's there
columns = table.columns
table.filter(
    c.not_(
        c.or_(*(c.col(column_name) == name for column_name in columns))
        if len(columns) > 1
        else c.col(columns[0]) == name
    )
).into_csv("boston_output.csv")
