Goal: for each Model subset (group of rows) in the DataFrame, remove the row with the maximum value (the slowest time).
DataFrame:
Model Time
1 bert-base-uncased 6.570979
2 bert-base-uncased 11.570979
3 bert-base-uncased 6.788779
4 albert-large-v1 5.785576
5 albert-large-v1 5.603203
6 albert-large-v1 9.727373
Desired DataFrame:
Model Time
1 bert-base-uncased 6.570979
2 bert-base-uncased 6.788779
3 albert-large-v1 5.785576
4 albert-large-v1 5.603203
Code:
subsets = df['Model']
for s in subsets:
    df[s]
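If there is anything else I should add to the post, please let me know.
Reply from a uj5u.com user:
You can use drop to remove the row holding each group's maximum, identified with idxmax: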
df.drop(df.groupby('Model')['Time'].idxmax().values)
Output:
Model Time
1 bert-base-uncased 6.570979
3 bert-base-uncased 6.788779
4 albert-large-v1 5.785576
5 albert-large-v1 5.603203
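For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same drop/idxmax idea. The 1-based index is an assumption taken from the printed output, and the names slowest and result are only illustrative. Note that idxmax returns the label of the first maximum in each group, so one row per group is removed.

```python
import pandas as pd

# Sample data from the question; the 1..6 index is assumed from the printout.
df = pd.DataFrame(
    {
        "Model": ["bert-base-uncased"] * 3 + ["albert-large-v1"] * 3,
        "Time": [6.570979, 11.570979, 6.788779, 5.785576, 5.603203, 9.727373],
    },
    index=range(1, 7),
)

# Index label of the slowest row in each Model group.
slowest = df.groupby("Model")["Time"].idxmax()

# Drop those labels; the remaining rows keep their original index.
result = df.drop(slowest.to_numpy())
print(result)
```

With this data the remaining index labels are 1, 3, 4 and 5, matching the output shown above.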
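Reply from a uj5u.com user:
A solution that removes all maximum values per group: compute each group's max with GroupBy.transform, compare for inequality with Series.ne, and filter with boolean indexing: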
print (df)
Model Time
1 bert-base-uncased 6.570979
2 bert-base-uncased 11.570979
3 bert-base-uncased 6.788779
2 bert-base-uncased 11.570979 <- added another maximum per bert-base-uncased
4 albert-large-v1 5.785576
5 albert-large-v1 5.603203
6 albert-large-v1 9.727373
df1 = df[df['Time'].ne(df.groupby('Model')['Time'].transform('max'))]
print (df1)
Model Time
1 bert-base-uncased 6.570979
3 bert-base-uncased 6.788779
4 albert-large-v1 5.785576
5 albert-large-v1 5.603203
If you only need to remove the first maximum, add a condition:
df2 = (df[df['Time'].ne(df.groupby('Model')['Time'].transform('max')) |
df['Time'].duplicated()])
print (df2)
Model Time
1 bert-base-uncased 6.570979
3 bert-base-uncased 6.788779
2 bert-base-uncased 11.570979
4 albert-large-v1 5.785576
5 albert-large-v1 5.603203
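One caveat with the condition above: df['Time'].duplicated() looks for repeated times across the whole column, not within a group, so a group's only maximum would be kept if the same value happened to appear earlier in a different Model. A hedged variant is sketched below; it checks duplicates on the (Model, Time) pair instead. The helper name drop_first_max and the 0-based index are assumptions for illustration only.

```python
import pandas as pd


def drop_first_max(frame: pd.DataFrame) -> pd.DataFrame:
    """Remove only the first occurrence of each Model's maximum Time."""
    # True where the row holds its group's maximum.
    is_max = frame["Time"].eq(frame.groupby("Model")["Time"].transform("max"))
    # True for the 2nd, 3rd, ... occurrence of the same (Model, Time) pair.
    repeat_in_group = frame.duplicated(["Model", "Time"])
    # Keep non-maximum rows, plus repeated maxima after the first one.
    return frame[~is_max | repeat_in_group]


# Sample data with a second 11.570979 added to bert-base-uncased.
df = pd.DataFrame(
    {
        "Model": ["bert-base-uncased"] * 4 + ["albert-large-v1"] * 3,
        "Time": [6.570979, 11.570979, 6.788779, 11.570979,
                 5.785576, 5.603203, 9.727373],
    }
)

df2 = drop_first_max(df)
print(df2)
```

On the sample data this gives the same rows as the answer's df2; the difference only shows up when two groups share a time value.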
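Reply from a uj5u.com user:
You can groupby 'Model' and use the idxmax method to find the index label of the maximum Time in each group. Then use the Index.isin method to filter out the rows with those index labels.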
max_time_indices = df.groupby('Model')['Time'].idxmax()
out = df[~df.index.isin(max_time_indices)]
Output:
Model Time
1 bert-base-uncased 6.570979
3 bert-base-uncased 6.788779
4 albert-large-v1 5.785576
5 albert-large-v1 5.603203
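A point worth checking before relying on index labels: idxmax returns labels, and isin (like drop) matches every row carrying that label. If the index is not unique, as in the frame above where label 2 appears twice, more rows than intended can disappear. The sketch below illustrates this on a small assumed frame and shows reset_index(drop=True) as one possible way to get unique labels first; the variable names are hypothetical.

```python
import pandas as pd

# Assumed example: the index label 2 is used for two different rows.
df = pd.DataFrame(
    {
        "Model": ["bert-base-uncased"] * 4,
        "Time": [6.570979, 11.570979, 6.788779, 11.570979],
    },
    index=[1, 2, 3, 2],
)

# Label-based filtering removes every row labelled 2.
max_labels = df.groupby("Model")["Time"].idxmax()
by_label = df[~df.index.isin(max_labels)]

# With a fresh unique index, only the first maximum is removed.
unique = df.reset_index(drop=True)
max_pos = unique.groupby("Model")["Time"].idxmax()
by_position = unique[~unique.index.isin(max_pos)]

print(len(by_label), len(by_position))
```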
Reply from a uj5u.com user:
Using groupby and a slice:
df = df.groupby('Model').apply(lambda sub_df: sub_df.sort_values('Time').iloc[:-1]).reset_index(drop=True)
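The one-liner above sorts each group and resets the index, so the original row order and index labels are lost. If those matter, one possible alternative (a sketch, not the answerer's method) is to rank the times within each group and keep everything except rank 1. It assumes the sample frame from the question with a 1-based index; rank_in_group and out are illustrative names.

```python
import pandas as pd

# Sample data from the question; the 1..6 index is assumed from the printout.
df = pd.DataFrame(
    {
        "Model": ["bert-base-uncased"] * 3 + ["albert-large-v1"] * 3,
        "Time": [6.570979, 11.570979, 6.788779, 5.785576, 5.603203, 9.727373],
    },
    index=range(1, 7),
)

# Rank times within each Model, slowest first. method="first" gives tied
# rows distinct ranks, so exactly one row per group receives rank 1.
rank_in_group = df.groupby("Model")["Time"].rank(method="first", ascending=False)

# Keep every row except the slowest one per group; index and order survive.
out = df[rank_in_group > 1]
print(out)
```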
