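I have a large CSV file, roughly 992 rows by 992 columns. For example, the file looks like this:

I need to create an output file that essentially contains the headers, like this:

I also tried using csv.reader and csv.DictReader, but I am stuck on dropping the NA columns, putting the names of the remaining columns into one list (or column), and putting the corresponding values into another list. I am not good with pandas at all and don't know where to start with it.
This is what I tried: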
import csv

def csv_reader():
    with open("/Users/svadali/Downloads/test_1.csv") as csv_infile, open("/Users/svadali/Downloads/result_file.txt", "w") as outfile:
        reader = csv.reader(csv_infile, delimiter=',')
        file_writer = csv.writer(outfile, delimiter="\t")
        file_writer.writerow(["SPC", "SPCs_within_0.2_phylo_distance", "Phylo_Distances"])
        for row in reader:
            for column in reader:
                print("this is row", row)
                print("this is column", column)
                if column == 'NA':
                    print("this non NA", column)
                    print("this is supposed to be non NA row", row)
                    break
I also tried transposing the data, but that did not produce the result I need either.
Answer:
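You can extract the names from the header, zip them with the distances in each row, filter out the pairs whose distance is invalid ('NA'), and then zip the remaining pairs again to produce the names and the distances in separate columns: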
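import csv

with open("test_1.csv") as infile, open("result_file.txt", "w") as outfile:
    reader = csv.reader(infile, delimiter=',')
    writer = csv.writer(outfile, delimiter="\t")
    writer.writerow(["SPC", "SPCs_within_0.2_phylo_distance", "Phylo_Distances"])
    # Skip the first header cell (above the row names); the rest are the names.
    _, *names = next(reader)
    for name, *distances in reader:
        writer.writerow((
            name,
            # Pair names with distances, drop the 'NA' pairs, then transpose so
            # the names and the distances are each joined into one cell.
            *map(
                ','.join,
                zip(*((n, d) for n, d in zip(names, distances) if d != 'NA'))
            )
        ))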
Demo: https://replit.com/@blhsing/OutrageousInvolvedProtools
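To make the zip, filter, zip step easier to follow, here is a minimal sketch that runs the same idea on a tiny in-memory matrix. The helper names `summarize_row` and `convert` and the sample data are made up for illustration and are not part of the original answer. The sketch also assumes you want two empty cells when every distance in a row is 'NA'; the one-expression version above would write only the row name in that case, because zipping an empty sequence yields nothing.

```python
import csv
import io


def summarize_row(names, distances, missing="NA"):
    """Return (joined_names, joined_distances) for the non-missing entries."""
    # Pair each column name with its distance and drop the missing ones.
    kept = [(n, d) for n, d in zip(names, distances) if d != missing]
    if not kept:
        # Assumption: keep the row three cells wide when nothing survives.
        return "", ""
    # Transpose the surviving pairs back into a names tuple and a distances tuple.
    kept_names, kept_distances = zip(*kept)
    return ",".join(kept_names), ",".join(kept_distances)


def convert(infile, outfile):
    """Read a comma-separated distance matrix and write a tab-separated summary."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, delimiter="\t", lineterminator="\n")
    writer.writerow(["SPC", "SPCs_within_0.2_phylo_distance", "Phylo_Distances"])
    # The first header cell sits above the row names, so it is skipped.
    _, *names = next(reader)
    for name, *distances in reader:
        writer.writerow([name, *summarize_row(names, distances)])


# Hypothetical sample matrix, for illustration only.
sample = io.StringIO(
    ",A,B,C\n"
    "A,NA,0.1,NA\n"
    "B,0.1,NA,0.15\n"
    "C,NA,NA,NA\n"
)
result = io.StringIO()
convert(sample, result)
print(result.getvalue())
```

With real files, `convert` could be called with the two file objects opened as in the answer; the in-memory `io.StringIO` objects are only there so the sketch runs without any files on disk.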
Tags: Python, CSV
