我是 python 新手。請幫助我。在這里,我有兩組 csv 檔案。我需要比較并輸出差異,例如更改的資料/洗掉的資料/添加的資料。這是我的例子
file 1:
Sn Name Subject Marks
1 Ram Maths 85
2 sita Engilsh 66
3 vishnu science 50
4 balaji social 60
file 2:
Sn Name Subject Marks
1 Ram computer 85 #subject name have changed
2 sita Engilsh 66
3 vishnu science 90 #marks have changed
4 balaji social 60
5 kishor chem 99 #added new line
Output - i need to get like this :
Changed Items:
1 Ram computer 85
3 vishnu science 90
Added item:
5 kishor chem 99
Deleted item:
.................
我匯入了 csv 并通過帶有紅線的 for 回圈進行了比較。我沒有得到期望的輸出。 標記檔案 1 和檔案 2(csv 檔案)之間添加和洗掉的專案時,這讓我很困惑。請建議有效的代碼人員。
uj5u.com熱心網友回復:
這里的想法是扁平化您的資料框melt以比較每個值:
# Load your csv files
df1 = pd.read_csv('file1.csv', ...)
df2 = pd.read_csv('file2.csv', ...)
# Select columns (not mandatory, it depends on your 'Sn' column)
cols = ['Name', 'Subject', 'Marks']
# Flat your dataframes
out1 = df1[cols].melt('Name', var_name='Item', value_name='Old')
out2 = df2[cols].melt('Name', var_name='Item', value_name='New')
out = pd.merge(out1, out2, on=['Name', 'Item'], how='outer')
# Flag the state of each item
condlist = [out['Old'] != out['New'],
out['Old'].isna(),
out['New'].isna()]
out['State'] = np.select(condlist, choicelist=['changed', 'added', 'deleted'],
default='unchanged')
輸出:
>>> out
Name Item Old New State
0 Ram Subject Maths computer changed
1 sita Subject Engilsh Engilsh unchanged
2 vishnu Subject science science unchanged
3 balaji Subject social social unchanged
4 Ram Marks 85 85 unchanged
5 sita Marks 66 66 unchanged
6 vishnu Marks 50 90 changed
7 balaji Marks 60 60 unchanged
8 kishor Subject NaN chem changed
9 kishor Marks NaN 99 changed
uj5u.com熱心網友回復:
count, flag = 0, 1
for i, j in zip(df1.values, df2.values):
if sum(i == j) != 4:
if flag:
print("Changed Items:")
flag = 0
print(j)
count = 1
if count != len(df2):
print("Newly added:")
print(*df2.iloc[count:, :].values)

轉載請註明出處,本文鏈接:https://www.uj5u.com/houduan/407258.html
標籤:
下一篇:排序和拆分csv
