I have this DataFrame:
import numpy as np
import pandas as pd

data = {'Date': [np.datetime64('2005-02-25 01:30:10'), np.datetime64('2005-02-25 01:31:10'), np.datetime64('2005-02-25 02:36:10'),
                 np.datetime64('2005-02-25 02:45:10'), np.datetime64('2005-02-25 02:45:50'), np.datetime64('2005-02-25 03:54:20'),
                 np.datetime64('2005-02-25 03:55:10'), np.datetime64('2005-02-25 05:30:10'), np.datetime64('2005-02-25 06:30:10'),
                 np.datetime64('2005-02-25 06:30:30')],
        'Value': [1, 4, 6, 7, 3, 6, 7, 8, 3, 2]}
df = pd.DataFrame(data)
                 Date  Value
0 2005-02-25 01:30:10      1
1 2005-02-25 01:31:10      4
2 2005-02-25 02:36:10      6
3 2005-02-25 02:45:10      7
4 2005-02-25 02:45:50      3
5 2005-02-25 03:54:20      6
6 2005-02-25 03:55:10      7
7 2005-02-25 05:30:10      8
8 2005-02-25 06:30:10      3
9 2005-02-25 06:30:30      2
How can I drop a row whenever the next row follows it within one minute, without doing it by hand?
So my expected output is:
                 Date  Value
1 2005-02-25 01:31:10      4
2 2005-02-25 02:36:10      6
4 2005-02-25 02:45:50      3
6 2005-02-25 03:55:10      7
7 2005-02-25 05:30:10      8
9 2005-02-25 06:30:30      2
Please let me know how to do this.
A uj5u.com user replied:
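Use Series.shift with subtraction to get the gap to the next row, convert it with Series.dt.total_seconds, floor-divide by 60 with Series.floordiv to get whole minutes, and finally filter with boolean indexing, keeping rows where the result is greater than 1 or missing (the missing value matches the last row):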
s = df['Date'].shift(-1).sub(df['Date']).dt.total_seconds().floordiv(60)
df = df[s.isna() | s.gt(1)]
print(df)
                 Date  Value
1 2005-02-25 01:31:10      4
2 2005-02-25 02:36:10      6
4 2005-02-25 02:45:50      3
6 2005-02-25 03:55:10      7
7 2005-02-25 05:30:10      8
9 2005-02-25 06:30:30      2
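One thing to be aware of with the whole-minute version: a gap of, say, 90 seconds also floors to 1, so that row would be dropped as well. That does not matter for the sample data, but if the rule should be strictly "the next row is at most one minute later", comparing the gap against a Timedelta directly may be clearer. A minimal sketch under that assumption (the names gap, keep and result are only illustrative):

```python
import pandas as pd

# Same timestamps and values as in the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2005-02-25 01:30:10", "2005-02-25 01:31:10", "2005-02-25 02:36:10",
        "2005-02-25 02:45:10", "2005-02-25 02:45:50", "2005-02-25 03:54:20",
        "2005-02-25 03:55:10", "2005-02-25 05:30:10", "2005-02-25 06:30:10",
        "2005-02-25 06:30:30",
    ]),
    "Value": [1, 4, 6, 7, 3, 6, 7, 8, 3, 2],
})

# Time from each row to the next one; the last row has no successor, so it is NaT.
gap = df["Date"].shift(-1) - df["Date"]

# Keep a row when it has no successor, or when the next row is more than one minute later.
keep = gap.isna() | (gap > pd.Timedelta(minutes=1))
result = df[keep]

print(result.index.tolist())  # [1, 2, 4, 6, 7, 9]
```

Row 0 is dropped here because its successor is exactly 60 seconds later, which matches the expected output in the question.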
A uj5u.com user replied:
Check this solution; hopefully it helps:
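# Flag a row False when the next row is at most one minute later
df['flag'] = df[['Date']].apply(lambda x: [False if pd.Timedelta(x[i + 1] - x[i]).total_seconds() / 60 <= 1 else True for i in range(0, len(x) - 1)])
# Keep everything that was not flagged False, then drop the helper column
df = df[df['flag'] != False].drop('flag', axis=1)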
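The flag idea can also be written without a Python-level loop. The sketch below wraps it in a small helper. The function name drop_rows_followed_within and its column and threshold parameters are hypothetical, not part of pandas. It also sorts by the timestamp column first, on the assumption that the gaps are only meaningful for ordered data.

```python
import pandas as pd

def drop_rows_followed_within(frame, column="Date", threshold=pd.Timedelta(minutes=1)):
    """Drop each row whose successor in `column` is at most `threshold` later."""
    ordered = frame.sort_values(column)      # assumption: gaps are measured on sorted timestamps
    gap = ordered[column].diff().shift(-1)   # next timestamp minus the current one
    # The last row has no successor (NaT gap) and is always kept.
    return ordered[gap.isna() | (gap > threshold)]

# First five rows of the question's data.
demo = pd.DataFrame({
    "Date": pd.to_datetime(["2005-02-25 01:30:10", "2005-02-25 01:31:10",
                            "2005-02-25 02:36:10", "2005-02-25 02:45:10",
                            "2005-02-25 02:45:50"]),
    "Value": [1, 4, 6, 7, 3],
})

out = drop_rows_followed_within(demo)
print(out.index.tolist())  # [1, 2, 4]
```

Passing a different threshold, for example pd.Timedelta(minutes=10), changes the cutoff without touching the rest of the logic.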
