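My goal is to read a CSV file and write it to another CSV file. My CSV file has a header with x columns. Some rows have x+1, x+2, or x-1 values (some of which are empty). I want to go through each row, count its values, and if the count is not x, skip that row and move on to the next one.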
import os
import pandas as pd

# Loop over every CSV file in the folder and apply the changes below
for filename in os.listdir("csv files"):
    filenames = os.path.join("csv files", filename)
    print(filenames)
    df = pd.read_csv(filenames)
    # Rename the columns
    df.columns = ["id", "FirstName", "LastName", "UserName", "Phone", "IsContact", "RestrictionReason", "Status", "IsScam", "Date"]
    # Remove the Status column
    df.pop("Status")
    # Reorder the columns in the DataFrame
    df = df.reindex(columns=["id", "Phone", "FirstName", "LastName", "UserName", "IsContact", "IsScam", "Date", "RestrictionReason"])
    # Write the result back to the same file, without the index column
    df.to_csv(filenames, index=False)
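As you can see, after I remove the Status column and its values, there should be 10 columns at the end, but some rows have 11 or 12 values. I don't want those rows in my new CSV file, so I want to skip them, but I don't know how to do that.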
Here are the first 5 rows, including the DataFrame's header:
id Phone FirstName LastName UserName IsContact IsScam Date RestrictionReason
0 MT103 WIRE TRANSFER 9.477897e+10 Rooban Naan NaN False False 5/5/2022 11:51:37 PM NaN
1 MT103 WIRE TRANSFER 9.199007e+11 Vbanna Corp Vbannacorp True False 5/5/2022 11:51:14 PM NaN
2 MT103 WIRE TRANSFER 9.197899e+11 Chennail B Party RamaRaoTadimeti True False 5/5/2022 11:51:14 PM NaN
3 MT103 WIRE TRANSFER 9.196008e+11 Sahai NaN JAS2777 True False 5/5/2022 11:51:14 PM NaN
4 MT103 WIRE TRANSFER 8.801818e+12 Md Shah Alam NaN shahalamtrading True False 5/5/2022 11:51:14 PM NaN
Ignore the leftmost column, which has no header; it is there because I was not paying attention when I wrote the CSV file.
Answer:
With pandas 1.3.0 or later, you can:
df = pd.read_csv('your.csv', on_bad_lines='skip')
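As a rough illustration of what that option does, here is a minimal sketch using made-up in-memory data (the three-column header and the rows are hypothetical, not taken from the question's files). It assumes pandas 1.3.0 or later.

```python
import io
import pandas as pd

# Hypothetical sample: a 3-column header, one row with an extra field,
# and one row with a missing field.
raw = io.StringIO(
    "id,Phone,FirstName\n"
    "1,111,Ann\n"
    "2,222,Bob,EXTRA\n"
    "3,333\n"
    "4,444,Dan\n"
)

# Rows with more fields than the header are dropped instead of raising an error.
df = pd.read_csv(raw, on_bad_lines="skip")
print(df)
```

In this sketch the row with the extra field is dropped, but the short row (id 3) is kept and its missing value is filled with NaN, because `on_bad_lines` only treats lines with too many fields as bad. Rows with too few values would need a separate check.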
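With pandas 1.4.0 or later, you can do more by passing a callable to on_bad_lines; note that a callable is only supported with engine="python". See "Pandas dataframe read_csv on bad data" (the part on what is new in 1.4.0) for a more in-depth look at using a callable with on_bad_lines.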
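A minimal sketch of the callable form, again with made-up data: the `rejected` list and the `log_and_skip` function are hypothetical names chosen for illustration. pandas passes each bad line to the callable as a list of strings, and returning `None` drops the line. This assumes pandas 1.4.0 or later.

```python
import io
import pandas as pd

# Hypothetical sample: the second data row has one field too many.
raw = io.StringIO(
    "id,Phone,FirstName\n"
    "1,111,Ann\n"
    "2,222,Bob,EXTRA\n"
    "4,444,Dan\n"
)

rejected = []  # hypothetical list that collects the rows we skip


def log_and_skip(fields):
    """Record the bad line (a list of strings) and return None to drop it."""
    rejected.append(fields)
    return None


# A callable for on_bad_lines requires the Python parsing engine.
df = pd.read_csv(raw, engine="python", on_bad_lines=log_and_skip)
print(df)
print(rejected)
```

Collecting the rejected rows this way makes it possible to inspect or log what was skipped, rather than discarding it silently.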
