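*Updated to include more rows of the input file
I have a .csv file with a header followed by data, as shown below (only the first few rows are shown here):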
gene_name VarXCRep.1 VarX1Rep.1 VarX2Rep.1 VarXCRep.2 VarX3Rep.2 VarX1Rep.2 VarX2Rep.2 VarXCRep.3 VarX3Rep.3 VarX1Rep.3 VarX2Rep.3
1 Soltu.DM.01G000010 360.7000522 395.2279977 323.2595994 361.5910696 327.7380499 386.8290979 336.3997167 333.0843759 317.4954424 377.756613 396.666783
2 Soltu.DM.01G000020 91.12422371 69.30538348 77.36127164 135.060696 61.85252412 110.6099 68.21624475 108.7053612 55.31681029 56.52040232 36.14709293
3 Soltu.DM.01G000030 439.1681337 183.5656103 232.0838149 579.546161 220.9018719 179.6646995 179.2348391 291.2746216 222.4196747 266.8621527 208.321404
4 Soltu.DM.01G000040 268.3102142 185.4387288 192.0217278 301.5640936 130.9345641 237.108515 203.9799475 236.921941 92.19468382 198.1791322 38.04957151
5 Soltu.DM.01G000050 341.7158389 479.5183289 504.229717 322.2876925 528.5579334 390.4957244 470.1570594 342.8399852 554.3205365 424.9761896 634.4766049
6 Soltu.DM.01G000060 468.2772607 839.1570756 759.7982036 514.516937 886.0173261 572.6048416 579.8380803 549.1014398 1011.836655 598.8300854 1077.754113
7 Soltu.DM.01G000070 2.531228436 0 5.525805117 1.429213714 8.032795341 1.83331326 5.350293706 0 4.609734191 0 7.609914302
8 Soltu.DM.01G000090 84.79615262 54.3204357 75.97982036 98.61574626 102.0165008 83.11020113 84.26712586 108.7053612 98.53306833 80.13019064 93.2214502
9 Soltu.DM.01G000100 67.07755356 73.05162042 12.43306151 118.6247383 6.426236273 77.61026135 36.11448251 97.55609336 8.643251608 67.25212429 15.2198286
10 Soltu.DM.01G000110 1.265614218 0 1.381451279 2.143820571 0 1.22220884 4.012720279 0 2.304867095 0.715448131 0.951239288
11 Soltu.DM.01G000120 821.3836276 451.4215518 846.8296342 820.3686718 737.4106123 497.4389979 835.9833915 798.5663071 752.5391067 704.7164087 532.6940011
12 Soltu.DM.01G000130 2.531228436 3.746236945 5.525805117 2.143820571 0.803279534 0.61110442 2.00636014 1.393658477 1.728650322 2.146344392 10.46363217
13 Soltu.DM.01G000140 93.65545214 127.3720561 102.2273947 105.7618148 104.4263394 108.7765868 115.7001014 98.94975183 108.9049703 110.8944603 126.5148253
14 Soltu.DM.01G000150 112.6396654 84.29033126 91.17578444 86.46742969 154.2296705 99.61002047 111.0185944 115.6736536 111.7860541 115.187149 163.6131575
15 Soltu.DM.01G000160 644.197637 573.1742525 222.413656 760.3416958 178.3280566 761.4361074 594.551388 1053.605808 222.4196747 585.2365709 303.4453328
16 Soltu.DM.01G000170 751.7748456 841.0301941 910.3763931 773.9192261 835.4107154 820.7132361 1148.975573 804.140941 849.3435247 710.4399938 946.4830913
17 Soltu.DM.01G000190 6.328071091 1.873118472 5.525805117 6.431461713 8.836074875 5.49993978 8.694227272 11.14926781 4.609734191 7.869929438 0.951239288
18 Soltu.DM.01G000200 88.59299527 73.05162042 66.30966141 74.31911313 63.45908319 78.83247019 74.23532517 86.40682554 59.35032771 59.38219485 44.70824652
19 Soltu.DM.01G000210 108.8428228 112.3871083 85.64997932 111.4786697 73.0984376 123.4430928 113.6937412 143.5468231 67.41736254 77.26839812 86.56277518
20 Soltu.DM.01G000220 5.062456873 86.16344973 93.938687 20.72359885 507.6726655 30.555221 24.74510839 6.968292383 551.4394526 54.37405793 920.7996305
This is how the file appears in a Bash shell:
gene_name,VarXCRep.1,VarX1Rep.1,VarX2Rep.1,VarXCRep.2,VarX3Rep.2,VarX1Rep.2,VarX2Rep.2,VarXCRep.3,VarX3Rep.3,VarX1Rep.3,VarX2Rep.3
Soltu.DM.01G000010,360.7000522,395.2279977,323.2595994,361.5910696,327.7380499,386.8290979,336.3997167,333.0843759,317.4954424,377.756613,396.666783
Soltu.DM.01G000020,91.12422371,69.30538348,77.36127164,135.060696,61.85252412,110.6099,68.21624475,108.7053612,55.31681029,56.52040232,36.14709293
Soltu.DM.01G000030,439.1681337,183.5656103,232.0838149,579.546161,220.9018719,179.6646995,179.2348391,291.2746216,222.4196747,266.8621527,208.321404
Soltu.DM.01G000040,268.3102142,185.4387288,192.0217278,301.5640936,130.9345641,237.108515,203.9799475,236.921941,92.19468382,198.1791322,38.04957151
Soltu.DM.01G000050,341.7158389,479.5183289,504.229717,322.2876925,528.5579334,390.4957244,470.1570594,342.8399852,554.3205365,424.9761896,634.4766049
Soltu.DM.01G000060,468.2772607,839.1570756,759.7982036,514.516937,886.0173261,572.6048416,579.8380803,549.1014398,1011.836655,598.8300854,1077.754113
Soltu.DM.01G000070,2.531228436,0,5.525805117,1.429213714,8.032795341,1.83331326,5.350293706,0,4.609734191,0,7.609914302
Soltu.DM.01G000090,84.79615262,54.3204357,75.97982036,98.61574626,102.0165008,83.11020113,84.26712586,108.7053612,98.53306833,80.13019064,93.2214502
Soltu.DM.01G000100,67.07755356,73.05162042,12.43306151,118.6247383,6.426236273,77.61026135,36.11448251,97.55609336,8.643251608,67.25212429,15.2198286
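I was asked to delete several columns and their associated data, which I did successfully with the code below. I was then asked to rearrange the data so that the header shows the control (VarXC) replicates 1, 2 and 3 followed by the experiment 1 (VarX1) replicates, all in adjacent columns. The code below does that as well: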
empty_list = []
for ln in open("FinalXVartest.csv").readlines():
    col = ln.split(",")
    del col[3]
    del col[4]
    del col[5]
    del col[6]
    del col[7]
    col.append(col.pop(2))
    col.append(col.pop(3))
    col.append(col.pop(4))
    empty_list += col
    empty_list += '\n'
file_out = open("Xtest_2Var.csv", "w")
file_out.write(','.join(empty_list))
file_out.close()
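When I put all of this together, the output looks like this: [image: the final output]
I am not sure how I ended up with that empty space on the left. Could someone help me remove it so that every row shifts one cell to the left?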
Answer from a uj5u.com user:
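You should change the code slightly to make it work as intended. The problem is that you build one flat list and add the end-of-line character \n to it as an element. So when you write that list to the file with
file_out.write(','.join(empty_list))
a comma is placed on both sides of every newline: one at the end of each line and one at the start of the next, which is the empty first cell you are seeing. I build a list of lists instead and append \n after each join, which avoids the problem: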
empty_list = []
for ln in open("files/FinalXVartest.csv").readlines():
    col = ln.split(",")
    # Drop the unwanted columns (the last del also removes the field holding "\n")
    del col[3]
    del col[4]
    del col[5]
    del col[6]
    del col[7]
    # Move the VarX1 replicates to the end, after the controls
    col.append(col.pop(2))
    col.append(col.pop(3))
    col.append(col.pop(4))
    empty_list.append(col)  # one inner list per row
file_out = open("files/Xtest_2Var.csv", "w")
for item in empty_list:
    file_out.write(','.join(item) + '\n')
file_out.close()
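To see why the flat list produces the empty first cell, here is a minimal sketch that compares the two ways of joining. The two-row `rows` sample is a shortened, hypothetical stand-in for the processed rows, not the full file.

```python
# Hypothetical shortened sample standing in for the already-processed rows.
rows = [
    ["gene_name", "VarXCRep.1", "VarX1Rep.1"],
    ["Soltu.DM.01G000010", "360.7000522", "395.2279977"],
]

# Flat list with "\n" stored as an element, as in the question's code.
flat = []
for row in rows:
    flat += row
    flat += "\n"  # += extends the list with the single character "\n"
buggy = ",".join(flat)  # the comma separator also surrounds every "\n"

# One join per row, with the newline appended afterwards.
fixed = "".join(",".join(row) + "\n" for row in rows)

print(repr(buggy))
print(repr(fixed))
```

In `buggy`, every line after the first starts with a comma, which a spreadsheet shows as an empty first cell; `fixed` has no stray commas.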
That said, it would be better to use the csv library, which is designed for reading and writing CSV files.
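As a rough sketch of what that could look like with the standard-library csv module: the helper name `select_columns` and the `KEEP` tuple are illustrative assumptions, and the demo reads from an in-memory string holding the header and first data row from the question rather than from the real file.

```python
import csv
import io

# Column order requested in the question: controls first, then VarX1 replicates.
KEEP = ("gene_name", "VarXCRep.1", "VarXCRep.2", "VarXCRep.3",
        "VarX1Rep.1", "VarX1Rep.2", "VarX1Rep.3")

def select_columns(src, dst, keep=KEEP):
    """Copy only the `keep` columns from src to dst, in that order."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    idx = [header.index(name) for name in keep]  # look columns up by name
    writer.writerow(keep)
    for row in reader:
        writer.writerow([row[i] for i in idx])

# Demo on the header and first data row shown in the question.
sample = io.StringIO(
    "gene_name,VarXCRep.1,VarX1Rep.1,VarX2Rep.1,VarXCRep.2,VarX3Rep.2,"
    "VarX1Rep.2,VarX2Rep.2,VarXCRep.3,VarX3Rep.3,VarX1Rep.3,VarX2Rep.3\n"
    "Soltu.DM.01G000010,360.7000522,395.2279977,323.2595994,361.5910696,"
    "327.7380499,386.8290979,336.3997167,333.0843759,317.4954424,"
    "377.756613,396.666783\n"
)
out = io.StringIO()
select_columns(sample, out)
print(out.getvalue())
```

With real files, the same function can be called on handles opened with `open(path, newline="")`, as the csv documentation recommends. Selecting columns by name avoids the index shifting that makes a chain of `del col[n]` statements hard to follow.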
Answer from a uj5u.com user:
Using pandas:
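import pandas as pd
import re

df = pd.read_csv('FinalXVartest.csv', index_col='gene_name')
# Sort the column names by (replicate, variant), e.g. ('1', 'C') for VarXCRep.1
parsed = sorted([(re.match(r'VarX(.)Rep\.(\d)', k).groups()[::-1], k) for k in df.columns])
# Keep only the control (C) and experiment 1 columns
cols = [k for (i, j), k in parsed if j in {'1', 'C'}]
# Write only the selected columns
df[cols].to_csv('Xtest_2Var.csv')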
>>> df[cols]
                    VarX1Rep.1  VarXCRep.1  VarX1Rep.2  VarXCRep.2  VarX1Rep.3  VarXCRep.3
gene_name
Soltu.DM.01G000010  395.227998  360.700052  386.829098  361.591070  377.756613  333.084376
Soltu.DM.01G000020   69.305383   91.124224  110.609900  135.060696   56.520402  108.705361
Soltu.DM.01G000030  183.565610  439.168134  179.664700  579.546161  266.862153  291.274622
Soltu.DM.01G000040  185.438729  268.310214  237.108515  301.564094  198.179132  236.921941
Soltu.DM.01G000050  479.518329  341.715839  390.495724  322.287692  424.976190  342.839985
Soltu.DM.01G000060  839.157076  468.277261  572.604842  514.516937  598.830085  549.101440
Soltu.DM.01G000070    0.000000    2.531228    1.833313    1.429214    0.000000    0.000000
Soltu.DM.01G000090   54.320436   84.796153   83.110201   98.615746   80.130191  108.705361
Soltu.DM.01G000100   73.051620   67.077554   77.610261  118.624738   67.252124   97.556093
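The output above interleaves the VarX1 and VarXC columns by replicate. If the layout described in the question is wanted instead (all controls first, then the VarX1 replicates), one possible variation is sketched below. The names `controls`, `treated` and `ordered` are illustrative, and the data is the header and first row from the question, read from a string rather than the real file.

```python
import io
import pandas as pd

# Header and first data row from the question, as an in-memory stand-in.
csv_text = (
    "gene_name,VarXCRep.1,VarX1Rep.1,VarX2Rep.1,VarXCRep.2,VarX3Rep.2,"
    "VarX1Rep.2,VarX2Rep.2,VarXCRep.3,VarX3Rep.3,VarX1Rep.3,VarX2Rep.3\n"
    "Soltu.DM.01G000010,360.7000522,395.2279977,323.2595994,361.5910696,"
    "327.7380499,386.8290979,336.3997167,333.0843759,317.4954424,"
    "377.756613,396.666783\n"
)
df = pd.read_csv(io.StringIO(csv_text), index_col="gene_name")

# Plain string sort is enough for replicates 1-3; it would misorder 10 or more.
controls = sorted(c for c in df.columns if c.startswith("VarXCRep."))
treated = sorted(c for c in df.columns if c.startswith("VarX1Rep."))

ordered = df[controls + treated]  # controls first, then VarX1 replicates
print(ordered.to_csv())
```

Because `gene_name` is the index, `to_csv` writes it as the first column, so no empty leading cell appears.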
Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/397067.html
