I have the following table as input:
| | x | y |
|---|---|---|
| 0 | -0.872803 | 137.097977 |
| 1 | -0.418766 | 821.549805 |
| 2 | -0.657833 | 712.427856 |
| 3 | -0.922091 | 126.871956 |
| 4 | -0.847130 | 217.126068 |
| 5 | 0.692070 | 2166.090820 |
| 6 | -0.858773 | 297.893188 |
| 7 | -0.466285 | 634.510315 |
| 8 | -0.774720 | 91.447876 |
| 9 | -0.111050 | 1200.390625 |
| 10 | 0.325138 | 1759.597900 |
I need to generate something like this:
| | x | y | pos_when_sorted_by_x | pos_when_sorted_by_y |
|---|---|---|---|---|
| 0 | -0.872803 | 137.097977 | 9 | 8 |
| 1 | -0.418766 | 821.549805 | 3 | 3 |
| 2 | -0.657833 | 712.427856 | 5 | 4 |
| 3 | -0.922091 | 126.871956 | 10 | 9 |
| 4 | -0.847130 | 217.126068 | 7 | 7 |
| 5 | 0.692070 | 2166.090820 | 0 | 0 |
| 6 | -0.858773 | 297.893188 | 8 | 6 |
| 7 | -0.466285 | 634.510315 | 4 | 5 |
| 8 | -0.774720 | 91.447876 | 6 | 10 |
| 9 | -0.111050 | 1200.390625 | 2 | 2 |
| 10 | 0.325138 | 1759.597900 | 1 | 1 |
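pos_when_sorted_by_x and pos_when_sorted_by_y give each row's zero-based position in the DataFrame when it is sorted by that column in descending order.
Answer from a uj5u.com user:
Use rank: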
df[['x_pos', 'y_pos']] = df.agg('rank', ascending=False).sub(1).astype(int)
print(df)
# Output:
x y x_pos y_pos
0 -0.872803 137.097977 9 8
1 -0.418766 821.549805 3 3
2 -0.657833 712.427856 5 4
3 -0.922091 126.871956 10 9
4 -0.847130 217.126068 7 7
5 0.692070 2166.090820 0 0
6 -0.858773 297.893188 8 6
7 -0.466285 634.510315 4 5
8 -0.774720 91.447876 6 10
9 -0.111050 1200.390625 2 2
10 0.325138 1759.597900 1 1
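The question's data has no duplicate values, so the default tie handling of rank never comes into play. As a minimal sketch of what happens when it does, the example below uses a small hypothetical column with one tie (the values are made up for illustration and are not from the question). With the default method="average", tied rows get a fractional rank that astype(int) truncates, so two rows end up with the same position; method="first" is one way to keep positions unique.

```python
import pandas as pd

# Hypothetical data with a tie in column "x" (not from the question).
df = pd.DataFrame({"x": [0.5, 0.2, 0.5, -0.1]})

# Default method="average": the two tied rows both get rank 1.5,
# so subtracting 1 and casting to int truncates both to 0.
avg_pos = df["x"].rank(ascending=False).sub(1).astype(int)

# method="first": ties are broken by order of appearance,
# so every row gets its own position.
first_pos = df["x"].rank(method="first", ascending=False).sub(1).astype(int)

print(avg_pos.tolist())
print(first_pos.tolist())
```

If duplicate positions are acceptable, method="min" or method="dense" may be a better fit than truncating the averaged rank.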
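An alternative using numpy and argsort:
import numpy as np
# Restrict to the x and y columns so the snippet still works if df already has extra columns.
df[['x_pos', 'y_pos']] = np.argsort(np.argsort(-1 * df[['x', 'y']].to_numpy(), axis=0), axis=0)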
print(df)
# Output:
x y x_pos y_pos
0 -0.872803 137.097977 9 8
1 -0.418766 821.549805 3 3
2 -0.657833 712.427856 5 4
3 -0.922091 126.871956 10 9
4 -0.847130 217.126068 7 7
5 0.692070 2166.090820 0 0
6 -0.858773 297.893188 8 6
7 -0.466285 634.510315 4 5
8 -0.774720 91.447876 6 10
9 -0.111050 1200.390625 2 2
10 0.325138 1759.597900 1 1
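Note: the -1 * is needed because argsort has no descending option.
Answer from a uj5u.com user:
You can use Series.rank with ascending=False and subtract 1 so that the ranks start from zero.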
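Why two argsort calls are needed may not be obvious, so here is a small sketch on a hypothetical four-element array (sample values chosen for illustration). The first argsort returns the indices that would sort the data; the second inverts that permutation, which turns "which element is in slot k" into "which slot holds element i".

```python
import numpy as np

# Hypothetical sample values, not the full data from the question.
values = np.array([-0.87, -0.42, 0.69, -0.11])

# First argsort on the negated values: order[k] is the index of the
# k-th largest element.
order = np.argsort(-values)

# Second argsort inverts the permutation: pos[i] is the position of
# values[i] in the descending sort.
pos = np.argsort(order)

# The same inversion written as a direct assignment.
pos_direct = np.empty_like(order)
pos_direct[order] = np.arange(len(values))

print(order.tolist())
print(pos.tolist())
```

The direct assignment avoids the second sort, which may matter for large arrays.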
import pandas as pd
df = pd.DataFrame({'x': [-0.872803,
-0.418766,
-0.657833,
-0.922091,
-0.84713,
0.69207,
-0.858773,
-0.466285,
-0.77472,
-0.11105,
0.325138],
'y': [137.097977,
821.549805,
712.427856,
126.871956,
217.126068,
2166.09082,
297.893188,
634.510315,
91.447876,
1200.390625,
1759.5979]})
df['pos_x'] = (df.x.rank(ascending=False)-1).astype(int)
df['pos_y'] = (df.y.rank(ascending=False)-1).astype(int)
Output:
x y pos_x pos_y
0 -0.872803 137.097977 9 8
1 -0.418766 821.549805 3 3
2 -0.657833 712.427856 5 4
3 -0.922091 126.871956 10 9
4 -0.847130 217.126068 7 7
5 0.692070 2166.090820 0 0
6 -0.858773 297.893188 8 6
7 -0.466285 634.510315 4 5
8 -0.774720 91.447876 6 10
9 -0.111050 1200.390625 2 2
10 0.325138 1759.597900 1 1
Answer from a uj5u.com user:
You can also do the following:
dfs_x = df.sort_values(by='x', ascending=False)
dfs_y = df.sort_values(by='y', ascending=False)
df['pos_x'] = df.index.map(dfs_x.index.get_loc)
df['pos_y'] = df.index.map(dfs_y.index.get_loc)
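The sort-based approach above can also be written without a get_loc lookup per row. The sketch below is one possible variant, assuming the index labels are unique; it uses only the first five x values from the question, and the names sorted_index and pos are illustrative.

```python
import pandas as pd

# First five x values from the question's table.
df = pd.DataFrame({"x": [-0.872803, -0.418766, -0.657833, -0.922091, -0.847130]})

# Index labels in descending order of x.
sorted_index = df.sort_values(by="x", ascending=False).index

# Map each original label to its position in the sorted order.
pos = pd.Series(range(len(df)), index=sorted_index)

# Assigning a Series aligns on the index, so each row gets its own position.
df["pos_x"] = pos

print(df)
```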
