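I am trying to pivot a CSV file while keeping a column that should not be pivoted (dont_pivot). I have managed to pivot my two columns, but I am struggling to keep the dont_pivot column. I then want to write the result to a CSV file rather than to a CSV string (so this differs from this example: Pivot a CSV string using python without using pandas or any similar library).
In a second step, I need to extract the number located between the two underscores in the dont_pivot column. That is not a problem in itself; it just means that the extracted values are not unique.
The requirement is to use only the standard library.
Input: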
dont_pivot,key,value
a_9_bc,x,1
a_9_bc,y,2
a_9_bc,z,3
a_9_bc,p,4
a_9_bc,q,5
b_9_bc,x,11
b_9_bc,y,21
b_9_bc,z,31
b_9_bc,p,41
b_9_bc,q,51
Desired output:
dont_pivot_num,x,y,z,p,q
a_9_bc,1,2,3,4,5
b_9_bc,11,21,31,41,51
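I am happy to extract the 9 and the a/b in a second step rather than running the regular expressions inside my pivot code. The final result would then be:
dont_pivot_letter,dont_pivot_num,x,y,z,p,q
a,9,1,2,3,4,5
b,9,11,21,31,41,51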
Current output (as a string, but I want a CSV file, not a string):
x,y,z,p,q
1,2,3,4,5
11,21,31,41,51
My code:
import csv
import re

with open("myfile.csv", "r") as f:
    content = csv.reader(f)
    next(content)  # skip the header row

    #### dont_pivot_num ####
    dont_pivot_num = []
    dont_pivot_char = []
    lines = []
    for row in content:
        dont_pivot_num.append(re.search(r"(\d)", row[0]).group(1))  # Can be an extra step once I have my desired csv format
        dont_pivot_char.append(re.search(r"\b(\w)", row[0]).group(1))  # Can be an extra step once I have my desired csv format
        lines.append(",".join(row[1:]))

lines = [l.replace(" ", "") for l in lines]

#### Pivot csv file ####
cols = ["x", "y", "z", "p", "q"]
csvdata = {k: [] for k in cols}
tempcols = list(cols)
for line in lines:
    key, value = line.split(",")
    try:
        csvdata[key].append(value)
        tempcols.remove(key)
    except ValueError:
        for c in tempcols:  # now tempcols has only "missing" attributes
            csvdata[c].append("")
        tempcols = [c for c in cols if c != key]
for c in tempcols:
    csvdata[c].append("")

# Instead of doing this, I'd like to combine dont_pivot_num with csvdata and write individual rows to a csv file
csvfile = ""
csvfile += ",".join(csvdata.keys()) + "\n"
# print(csvfile)
for row in zip(*csvdata.values()):
    csvfile += ",".join(row) + "\n"
print(csvfile)
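uj5u.com user reply:
I simply created a separate list containing the unique values from the dont_pivot column, used regular expressions to extract the required values, and added them to the dictionary before writing everything to a CSV file.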
import csv
import re

with open("myfile.csv", "r") as f:
    content = csv.reader(f)
    next(content)  # skip the header row

    #### dont_pivot_num ####
    lines = []
    dont_pivot = []
    for row in content:
        dont_pivot.append(row[0])
        lines.append(",".join(row[1:]))

lines = [l.replace(" ", "") for l in lines]

# dict.fromkeys removes duplicates while keeping first-seen order
dont_pivot_unique = list(dict.fromkeys(dont_pivot))
dont_pivot_num = []
dont_pivot_letter = []
for a in dont_pivot_unique:
    dont_pivot_num.append(re.search(r"(\d)", a).group(1))
    dont_pivot_letter.append(re.search(r"\b(\w)", a).group())

#### Pivot csv file ####
cols = ["x", "y", "z", "p", "q"]
csvdata = {k: [] for k in cols}
tempcols = list(cols)
for line in lines:
    key, value = line.split(",")
    try:
        csvdata[key].append(value)
        tempcols.remove(key)
    except ValueError:
        for c in tempcols:  # now tempcols has only "missing" attributes
            csvdata[c].append("")
        tempcols = [c for c in cols if c != key]
for c in tempcols:
    csvdata[c].append("")

# Put the identifying columns first so the header matches the desired output
csvdata = {
    "dont_pivot_letter": dont_pivot_letter,
    "dont_pivot_num": dont_pivot_num,
    **csvdata,
}
print(csvdata)

# newline="" stops the csv module from writing blank lines on Windows
with open("csvfile_out.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(csvdata.keys())
    w.writerows(zip(*csvdata.values()))
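One thing to watch with the two patterns used above: `(\d)` captures a single digit and `\b(\w)` captures only the first character, so a value such as `b_12_bc` would yield `1` rather than `12`. Below is a minimal sketch of a stricter extraction, assuming every dont_pivot value has the shape `<prefix>_<digits>_<rest>`. The helper name `split_dont_pivot` and the pattern are illustrative assumptions, not part of the answer above.

```python
import re

# Assumed shape of a dont_pivot value: <prefix>_<digits>_<rest>, e.g. "a_9_bc".
PATTERN = re.compile(r"^([^_]+)_(\d+)_")


def split_dont_pivot(value):
    """Hypothetical helper: return (prefix, number) as strings.

    Raises ValueError when the value does not have the assumed shape,
    instead of failing later with an AttributeError on a None match.
    """
    match = PATTERN.match(value)
    if match is None:
        raise ValueError(f"unexpected dont_pivot value: {value!r}")
    return match.group(1), match.group(2)


print(split_dont_pivot("a_9_bc"))
print(split_dont_pivot("b_12_bc"))
```

If the real data uses a different layout, only the pattern needs to change; the rest of the pivot code can keep calling the helper once per unique dont_pivot value.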
uj5u.com user reply:
You can use a defaultdict to build up the values for each entry:
from collections import defaultdict
import csv

entries = defaultdict(list)  # dont_pivot value -> its values, in input order
keys = {}                    # dict used as an insertion-ordered set of keys

with open('myfile.csv') as f_input:
    csv_input = csv.reader(f_input)
    header = next(csv_input)  # skip the header row

    for row in csv_input:
        entries[row[0]].append(row[2])
        keys[row[1]] = None

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['dont_pivot_num', *keys.keys()])

    for key, values in entries.items():
        csv_output.writerow([key, *values])
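Note that the defaultdict approach above relies on every dont_pivot group listing the same keys in the same order, because values are placed by position. If a group could be missing a key or list its keys in a different order, a possible variation is to store the values per key and let `csv.DictWriter` fill the gaps through its `restval` parameter. This is only a sketch under that assumption: the function names `pivot_rows` and `write_pivot` and the `id_column` parameter are hypothetical, and it expects exactly three fields per row. It uses `io.StringIO` in place of real files so it can be run as is.

```python
import csv
import io


def pivot_rows(reader):
    """Group (id, key, value) rows by id.

    Returns the pivot keys in first-seen order and a mapping
    id -> {key: value}. Assumes every row has exactly three fields.
    """
    keys = {}      # dict used as an insertion-ordered set
    entries = {}   # id -> {key: value}
    for row_id, key, value in reader:
        keys[key] = None
        entries.setdefault(row_id, {})[key] = value
    return list(keys), entries


def write_pivot(entries, keys, f_output, id_column="dont_pivot"):
    """Write one output row per id; missing keys become empty cells."""
    writer = csv.DictWriter(f_output, fieldnames=[id_column, *keys], restval="")
    writer.writeheader()
    for row_id, values in entries.items():
        writer.writerow({id_column: row_id, **values})


# Sample input where group "a" has no z value and group "b" is out of order.
sample = (
    "dont_pivot,key,value\n"
    "a_9_bc,x,1\n"
    "a_9_bc,y,2\n"
    "b_9_bc,y,21\n"
    "b_9_bc,x,11\n"
    "b_9_bc,z,31\n"
)
reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
keys, entries = pivot_rows(reader)

out = io.StringIO()
write_pivot(entries, keys, out)
print(out.getvalue())
```

With real files, the input would come from `open('myfile.csv', newline='')` and the output from `open('output.csv', 'w', newline='')`, as in the answer above.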
Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/365290.html
