Deleting a row from a CSV file adds extra rows

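I am working on a coding assignment in which one of the application's requirements is the ability to delete a row of interest from a CSV file. When I try to delete the row identified by a key (the name), it not only deletes that row but also adds multiple copies of the first row to my CSV file. I can't figure out why these duplicate rows are being added. Any help is appreciated.

FYI: attractions is the list of dictionaries that the CSV file is copied into.

The delete function is as follows: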

name = entername()

with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
               attractions.remove(row)

with open('boston.csv', 'w') as csv_write:
    writer = csv.writer(csv_write)
    writer.writerows(attractions)

Before the deletion, my CSV file looked like this:

Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

But the result is:

Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

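A uj5u.com user replied:

I ran your code, and it appears to work.

I modified it to:

hard-code the name (for debugging)
print a message when a row is deleted
not overwrite the input file (very useful while debugging)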

import csv

name = 'Harvard University'

attractions = []
with open('boston.csv', 'r', newline='') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
                print(f'{field} matches {name}, removing {row}')
                attractions.remove(row)

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(attractions)

When I run it, I see this debug message:

Harvard University matches Harvard University, removing ['harvard', 'Harvard University', 'university', 'https://www.harvard.edu/', '42.373032', '-71.116661', 'green']

This is my output.csv:

Short Name,Name,Category,URL,Lat,Lon,Color
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

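When I change name to name = 'Tourism', which is valid for your logic (even if it is not what you intended), it still behaves as you would expect, deleting the two rows that have Tourism in the Category field: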

...
name = 'Tourism'

attractions = []
...

Tourism matches Tourism, removing ['science', 'Museum of Science', 'Tourism', 'https://www.mos.org/', '42.36932', '-71.07151', 'green']
Tourism matches Tourism, removing ['children', "Boston Children's Museum", 'Tourism', 'https://bostonchildrensmuseum.org/', '42.3531', '-71.04998', 'green']

Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green

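That said, I suggest not appending a row and then removing it when a condition is met. Instead, I prefer to append a row only if the skip condition is not met: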

for row in reader:
    skip_row = False
    for field in row:
        if field == name:
            print(f'{field} matches {name}, skipping {row}')
            skip_row = True
            break  # stop searching fields

    if not skip_row:
        attractions.append(row)
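
Because csv.reader yields each row as a list of strings, the inner field loop above can also be expressed as a membership test. The following is a minimal sketch under the same exact-match semantics; the helper name filter_rows and the sample rows are hypothetical and only serve as illustration.

```python
def filter_rows(rows, name):
    """Return the rows in which no field equals `name` exactly."""
    # `name not in row` compares `name` against every field of the row,
    # which is what the explicit inner loop with skip_row does.
    return [row for row in rows if name not in row]


# Hypothetical sample data, shortened to three columns for readability.
rows = [
    ["Short Name", "Name", "Category"],
    ["harvard", "Harvard University", "university"],
    ["science", "Museum of Science", "Tourism"],
]
kept = filter_rows(rows, "Tourism")
```

Like the loop it replaces, this matches on any field, so a name that happens to equal a category would also drop rows.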

And if you only care about the Name field, you can shorten this and make it more direct:

name_idx = 1  # fields are 0-based, so your 2nd field is index 1
for row in reader:
    if row[name_idx] == name:
        print(f'Found {name}, skipping {row}')
        continue  # skip rest of this loop (the append), start with next row

    attractions.append(row)
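
Putting the pieces together, the read, filter, and write steps might be wrapped in a single function. This is only a sketch: the function name delete_rows_by_name and its parameters are hypothetical, and it assumes the first line of the file is a header that should always be kept. It builds its list locally on each call rather than appending to a shared attractions list, which is a design choice here and not a diagnosis of the original problem.

```python
import csv


def delete_rows_by_name(src_path, dst_path, name, name_idx=1):
    """Copy src_path to dst_path, skipping rows whose Name field equals `name`.

    Returns the number of rows that were skipped.
    """
    kept = []  # created fresh on every call
    skipped = 0
    with open(src_path, "r", newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)  # assumption: first line is the header
        if header is not None:
            kept.append(header)
        for row in reader:
            # Guard against short or empty rows before indexing.
            if len(row) > name_idx and row[name_idx] == name:
                skipped += 1
                continue
            kept.append(row)

    with open(dst_path, "w", newline="") as f:
        csv.writer(f).writerows(kept)
    return skipped
```

While debugging, pointing dst_path at a separate file such as output.csv keeps the input intact, as in the modified script above.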

A uj5u.com user replied:

There is a pure-Python library, convtools, which generates code under the hood and provides many data-processing primitives:

from convtools import conversion as c
from convtools.contrib.tables import Table

name = entername()

table = Table.from_csv("boston.csv")  # pass header=True if it's there
columns = table.columns
table.filter(
    c.not_(
        c.or_(*(c.col(column_name) == name for column_name in columns))
        if len(columns) > 1
        else c.col(columns[0]) == name
    )
).into_csv("boston_output.csv")
