I have two DataFrames, each with 20 rows and 4 columns. The column names and value types are the same. One column is title; the other 3 columns are values.
df1
title col1 col2 col3
apple a d g
pear b e h
grape c f i
df2
title col1 col2 col3
carrot q t w
pumpkin r u x
sprouts s v y
Now I want to create 3 separate tables/lists by subtracting each value: df1.col1 - df2.col1, df1.col2 - df2.col2, and df1.col3 - df2.col3. For df1.col1 - df2.col1, I would like the output to look like the following rows:
df1.title df2.title score
apple carrot (a - q)
apple pumpkin (a - r)
apple sprouts (a - s)
pear carrot (b - t)
pear pumpkin (b - u)
pear sprouts (b - v)
grape carrot (c - w)
grape pumpkin (c - x)
grape sprouts (c - y)
I tried creating a for loop with the following code:
for i in df1.iterrows():
    score_col1 = df1.col1[[i]] - df2.col2[[j]]
    score_col2 = df1.col2[[i]] - df2.col2[[j]]
    score_col3 = df1.col3[[i]] - df2.col3[[j]]
    score_total = score_col1 + score_col2 + score_col3
    i = i + 1
In return, I got output for score_col1 that looks like this:
df1.title df2.title score
apple carrot (a - q)
pear carrot (b - t)
grape carrot (c - w)
Can someone help me get the expected output?
A helpful uj5u.com user replied:
import pandas as pd

# Sample data: one title column and three numeric value columns per frame
a1 = ['apple', 'pear', 'banana']
b1 = [56, 32, 23]
c1 = [12, 34, 90]
d1 = [87, 65, 23]
a2 = ['carrot', 'pumpkin', 'sprouts']
b2 = [16, 12, 93]
c2 = [12, 32, 70]
d2 = [81, 55, 21]
df1 = pd.DataFrame({'title': a1, 'col1': b1, 'col2': c1, 'col3': d1})
df2 = pd.DataFrame({'title': a2, 'col1': b2, 'col2': c2, 'col3': d2})
cols = ['col1', 'col2', 'col3']
for c in cols:
    # Collect one row per (df1 row, df2 row) pair for this column.
    # DataFrame.append was removed in pandas 2.0, so build a list of dicts instead.
    rows = []
    for _, r1 in df1.iterrows():
        for _, r2 in df2.iterrows():
            rows.append({'title_df1': r1['title'], 'title_df2': r2['title'], 'score': r1[c] - r2[c]})
    res_df = pd.DataFrame(rows)
    print(res_df)
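The nested iterrows loops above can also be written as a cross join, which avoids row-by-row iteration. This is only a minimal sketch, assuming pandas 1.2 or later (where merge accepts how='cross'); the helper name pairwise_diff is hypothetical and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'title': ['apple', 'pear', 'banana'],
                    'col1': [56, 32, 23], 'col2': [12, 34, 90], 'col3': [87, 65, 23]})
df2 = pd.DataFrame({'title': ['carrot', 'pumpkin', 'sprouts'],
                    'col1': [16, 12, 93], 'col2': [12, 32, 70], 'col3': [81, 55, 21]})

def pairwise_diff(left, right, col):
    """Hypothetical helper: pair every row of `left` with every row of `right`
    and subtract the values of `col` (left minus right)."""
    pairs = left[['title', col]].merge(right[['title', col]], how='cross',
                                       suffixes=('_df1', '_df2'))
    pairs['score'] = pairs[f'{col}_df1'] - pairs[f'{col}_df2']
    return pairs[['title_df1', 'title_df2', 'score']]

# One result table per value column, keyed by column name
results = {c: pairwise_diff(df1, df2, c) for c in ['col1', 'col2', 'col3']}
print(results['col1'])
```

Each entry of results holds the same rows that the loop version prints for that column, in the same order (every df1 title paired with every df2 title).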
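A helpful uj5u.com user replied:
Since you want 3 separate DataFrames, we can use a loop (if you wanted a single DataFrame, we could do something similar but slightly different).
We can unstack df2 and iteratively subtract it from the repeated columns of df1: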
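out = []
# Flatten df2 column by column into (title, score) rows: col1 values, then col2, then col3
df2_stacked = df2.set_index('title').unstack().droplevel(0).reset_index(name='score')
for col in df1.filter(like='col'):
    # Repeat each df1 row len(df2) times, then align it with the flattened df2 by position
    tmp = (df1[['title', col]]
           .loc[df1.index.repeat(len(df2))]
           .reset_index(drop=True)
           .join(df2_stacked, lsuffix='_df1', rsuffix='_df2'))
    tmp['score'] = tmp[col] - tmp['score']
    out.append(tmp.drop(columns=col))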
Let's test it on a numeric example:
df1:
title col1 col2 col3
0 apple 1000 100 10
1 pear 2000 200 20
2 grape 3000 300 30
df2:
title col1 col2 col3
0 carrot 1 4 7
1 pumpkin 2 5 8
2 sprouts 3 6 9
Then, if you run the code above and print out, it contains the following three DataFrames:
title_df1 title_df2 score
0 apple carrot 999
1 apple pumpkin 998
2 apple sprouts 997
3 pear carrot 1996
4 pear pumpkin 1995
5 pear sprouts 1994
6 grape carrot 2993
7 grape pumpkin 2992
8 grape sprouts 2991
title_df1 title_df2 score
0 apple carrot 99
1 apple pumpkin 98
2 apple sprouts 97
3 pear carrot 196
4 pear pumpkin 195
5 pear sprouts 194
6 grape carrot 293
7 grape pumpkin 292
8 grape sprouts 291
title_df1 title_df2 score
0 apple carrot 9
1 apple pumpkin 8
2 apple sprouts 7
3 pear carrot 16
4 pear pumpkin 15
5 pear sprouts 14
6 grape carrot 23
7 grape pumpkin 22
8 grape sprouts 21
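For the single-DataFrame case mentioned at the start of this answer, one possible sketch is to tag each of the three tables with its source column and concatenate them. It reruns the same loop on the numeric example so that it is self-contained; the source column name and the use of pd.concat are assumptions for illustration, since the original answer does not spell out this variant.

```python
import pandas as pd

df1 = pd.DataFrame({'title': ['apple', 'pear', 'grape'],
                    'col1': [1000, 2000, 3000], 'col2': [100, 200, 300], 'col3': [10, 20, 30]})
df2 = pd.DataFrame({'title': ['carrot', 'pumpkin', 'sprouts'],
                    'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# Flatten df2 column by column: 9 rows of (title, score)
df2_stacked = df2.set_index('title').unstack().droplevel(0).reset_index(name='score')

out = {}
for col in df1.filter(like='col'):
    tmp = (df1[['title', col]]
           .loc[df1.index.repeat(len(df2))]
           .reset_index(drop=True)
           .join(df2_stacked, lsuffix='_df1', rsuffix='_df2'))
    tmp['score'] = tmp[col] - tmp['score']
    out[col] = tmp.drop(columns=col)

# Label each table with the df1 column it came from (assumed column name: 'source'),
# then stack the three tables into one DataFrame
combined = pd.concat([t.assign(source=c) for c, t in out.items()], ignore_index=True)
print(combined)
```

Filtering combined on the source column recovers each of the three tables shown above.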
