How do I merge CSV files so that rows sharing a unique identifier end up on the same row of the output?

I am using Python to merge 4 header-less CSV files into a single output file.

Each CSV has a unique number in its first column, as shown in the 2 example CSV files below:

1.csv

1,Ringo,Beatles
2,John,Beatles
3,Mick,Rolling Stones
4,Keith,Rolling Stones
5,Rivers,Weezer

2.csv

1,TSLA,XNAS,1.0,USD
2,AAPL,XNAS,1.0,USD
3,SPY,ARCX,1.0,USD
4,BP LN,XLON,1.0,GBP
5,ESUD,XCME,1.0,USD

I generated the output from these CSVs with the following code.

import os
import csv

filenames = ['1.csv', '2.csv', '3.csv', '4.csv']
with open('output_file', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            outfile.write(infile.read())

This works and writes a single file. The data ends up as follows:

1,Ringo,Beatles
2,John,Beatles
3,Mick,Rolling Stones
4,Keith,Rolling Stones
5,Rivers,Weezer
1,TSLA,XNAS,1.0,USD
2,AAPL,XNAS,1.0,USD
3,SPY,ARCX,1.0,USD
4,BP LN,XLON,1.0,GBP
5,ESUD,XCME,1.0,USD1,5,-600,1043.22,-625932.00
3,5,200,304.89,60978.00
5,4,6,3015.25,904575.005,4,-1,2,3009.50
5,4,1,1,3011.75
4,3,1,1000,308.37
4,3,1,200,309.15
1,3,1,100,309.0125

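Is there a way to use the number in the first column as a "unique" key to link the data, so that the three results starting with "1" are collected and added to the same row?

For example, these all share the "unique" number "1":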

1,Ringo,Beatles
1,TSLA,XNAS,1.0,USD
1,3,1,100,309.0125

The resulting row would be:

(1) Ringo,Beatles,TSLA,XNAS,1.0,USD,3,1,100,309.0125

Answer:

You can use a dictionary to hold all the data as

{
1: [1, "Ringo", "Beatles", 1, "TSLA", "XNAS", 1.0, "USD", 1, 3, 1, 100, 309.0125], 
2: [2, ...],
3: [3, ...],
...
}

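and then write everything to a new file.

So first create an empty dictionary, i.e. new_rows = {}

Next, read a row from a file, take its ID and check whether it already exists in the dictionary. If it does not, create the entry as a list containing only the ID: new_rows[key] = [key]

Then append the remaining values from the row to that list: new_rows[key] += values

Repeat this for every row in every file.

Afterwards you can use this dictionary to write all the rows to the new file.

I use io only to simulate files in memory; with real files you should use open()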

text1 = '''1,Ringo,Beatles
2,John,Beatles
3,Mick,Rolling Stones
4,Keith,Rolling Stones
5,Rivers,Weezer'''

text2 = '''1,TSLA,XNAS,1.0,USD
2,AAPL,XNAS,1.0,USD
3,SPY,ARCX,1.0,USD
4,BP LN,XLON,1.0,GBP
5,ESUD,XCME,1.0,USD'''

import os
import csv
import io

new_rows = {} # dict

filenames = [text1, text2]
#filenames = ['1.csv', '2.csv', '3.csv', '4.csv']

for fname in filenames:
    #with open(fname) as infile:
    with io.StringIO(fname) as infile:

        reader = csv.reader(infile)
        for row in reader:

            key = row[0]      # ID
            values = row[1:]  # rest
            
            # create key if not exists
            if key not in new_rows:
                new_rows[key] = [key]
                
            new_rows[key] += values  # add two lists (extend the existing row)
            
            # OR

            #if key not in new_rows:
            #    new_rows[key] = [key] + values  # first time this ID is seen
            #else:
            #    new_rows[key] += values         # ID already present - add two lists

# --- write it  ---

with open('output_file', 'w', newline='') as outfile:  # newline='' avoids blank lines on Windows
    writer = csv.writer(outfile)
    all_rows = new_rows.values()
    writer.writerows(all_rows)   # `writerows` with `s` to write list with many rows.
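
With real files instead of in-memory strings, the same idea might look like the sketch below. The function name merge_csv_by_id and its parameters are hypothetical and not part of the original answer; the sketch assumes every input is a header-less CSV whose first column is the ID.

```python
import csv


def merge_csv_by_id(filenames, output_path):
    """Merge header-less CSV files, joining rows that share the same first column."""
    merged = {}  # ID -> [ID, values from file 1, values from file 2, ...]

    for fname in filenames:
        # newline='' is what the csv module documentation recommends for file objects
        with open(fname, newline='') as infile:
            for row in csv.reader(infile):
                if not row:  # skip blank lines
                    continue
                key, values = row[0], row[1:]
                # create [key] on first sight, then append the remaining columns
                merged.setdefault(key, [key]).extend(values)

    with open(output_path, 'w', newline='') as outfile:
        csv.writer(outfile).writerows(merged.values())

    return merged
```

Calling merge_csv_by_id(['1.csv', '2.csv', '3.csv', '4.csv'], 'output_file') would then produce one line per ID. Note that if a file contains several rows with the same ID, all of their values are appended to that ID's line.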

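By the way:

In older versions of Python (before 3.7), dict is not guaranteed to keep insertion order, so the new rows may be written in a different order. In that case, either sort the list of rows before saving or use collections.OrderedDict()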
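
If the output has to be ordered by ID regardless of the Python version, or of the order in which IDs first appear in the files, the rows can be sorted before writing. This is a minimal sketch that assumes every ID is an integer string; the sample data and the name sorted_rows are illustrative only.

```python
import csv
import io

# Illustrative data in the same shape as new_rows above (keys are strings from csv.reader)
new_rows = {
    '10': ['10', 'x'],
    '2': ['2', 'John', 'Beatles', 'AAPL'],
    '1': ['1', 'Ringo', 'Beatles', 'TSLA'],
}

# Sort numerically by ID; a plain string sort would put '10' before '2'
sorted_rows = [new_rows[key] for key in sorted(new_rows, key=int)]

buffer = io.StringIO()  # stands in for open('output_file', 'w', newline='')
csv.writer(buffer).writerows(sorted_rows)
print(buffer.getvalue())
```

Using key=int matters here because csv.reader returns every field as a string.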
