Current df:
Date Power
2011-04-18 17:00:00 243.56
2011-04-18 17:00:01 245.83
2011-04-18 17:00:02 246.02
2011-04-18 17:00:03 245.72
2011-04-18 17:00:04 244.71
2011-04-18 17:00:05 245.93
2011-04-18 17:00:06 243.12
2011-04-18 17:00:07 244.72
2011-04-18 17:00:08 242.44
2011-04-18 17:00:09 246.42
2011-04-18 17:00:10 245.02
... ...
I have a df with dates and floats. The dates are the index and are unique. I want to create a new df based on the dates found in the following df.
date start date end
0 2011-04-18 17:00:01 2011-04-18 17:00:02
1 2011-04-18 17:00:05 2011-04-18 17:00:06
2 2011-04-18 17:00:08 2011-04-18 17:00:10
... ... ...
I would like to get:
Date Power
2011-04-18 17:00:01 245.83
2011-04-18 17:00:02 246.02
2011-04-18 17:00:05 245.93
2011-04-18 17:00:06 243.12
2011-04-18 17:00:08 242.44
2011-04-18 17:00:09 246.42
2011-04-18 17:00:10 245.02
... ...
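In other words, I want to filter the initial df and keep every row that falls between any of the start/end date pairs found in the second df.
I thought of using pandas.DataFrame.between_time, but the problem is that it only works for one given date start and date end. How can I do this for many different date periods?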
Answer from a uj5u.com user:
Use np.logical_or.reduce with a list comprehension (this assumes Date is a regular column):
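import numpy as np
# one boolean mask per (start, end) pair; between() includes both endpoints
L = [df1['Date'].between(s, e) for s, e in df2[['date start','date end']].to_numpy()]
# OR the masks together and keep the rows matched by any interval
df = df1[np.logical_or.reduce(L)]
print (df)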
Date Power
1 2011-04-18 17:00:01 245.83
2 2011-04-18 17:00:02 246.02
5 2011-04-18 17:00:05 245.93
6 2011-04-18 17:00:06 243.12
8 2011-04-18 17:00:08 242.44
9 2011-04-18 17:00:09 246.42
10 2011-04-18 17:00:10 245.02
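For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach. The way df1 and df2 are built below is an assumption (the question only shows their printed form), and the names masks, keep and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question; 'Date' is a regular column here.
dates = pd.Timestamp("2011-04-18 17:00:00") + pd.to_timedelta(np.arange(11), unit="s")
df1 = pd.DataFrame({
    "Date": dates,
    "Power": [243.56, 245.83, 246.02, 245.72, 244.71, 245.93,
              243.12, 244.72, 242.44, 246.42, 245.02],
})
df2 = pd.DataFrame({
    "date start": pd.to_datetime(["2011-04-18 17:00:01",
                                  "2011-04-18 17:00:05",
                                  "2011-04-18 17:00:08"]),
    "date end": pd.to_datetime(["2011-04-18 17:00:02",
                                "2011-04-18 17:00:06",
                                "2011-04-18 17:00:10"]),
})

# One inclusive boolean mask per interval.
masks = [df1["Date"].between(s, e)
         for s, e in df2[["date start", "date end"]].to_numpy()]

# A row is kept if it falls inside at least one interval.
keep = np.logical_or.reduce(masks)
result = df1[keep]
print(result)
```

Because the masks are combined with a logical OR, a row that falls inside several overlapping intervals is still returned only once.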
If Date can be a DatetimeIndex, use:
L = [df1[s:e] for s, e in df2[['date start','date end']].to_numpy()]
df = pd.concat(L)
print (df)
Power
Date
2011-04-18 17:00:01 245.83
2011-04-18 17:00:02 246.02
2011-04-18 17:00:05 245.93
2011-04-18 17:00:06 243.12
2011-04-18 17:00:08 242.44
2011-04-18 17:00:09 246.42
2011-04-18 17:00:10 245.02
Or build the boolean masks directly on the index:
L = [(df1.index >= s) & (df1.index <= e)
     for s, e in df2[['date start','date end']].to_numpy()]
df = df1[np.logical_or.reduce(L)]
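The two DatetimeIndex variants can be compared side by side with a small runnable sketch. The data construction is again an assumption, and the sketch deviates from the answer in two small ways: it iterates over the interval columns with zip (so s and e are pandas Timestamps) and slices with .loc rather than df1[s:e].

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question; 'Date' is the (sorted, unique) index.
idx = pd.Timestamp("2011-04-18 17:00:00") + pd.to_timedelta(np.arange(11), unit="s")
df1 = pd.DataFrame(
    {"Power": [243.56, 245.83, 246.02, 245.72, 244.71, 245.93,
               243.12, 244.72, 242.44, 246.42, 245.02]},
    index=idx.rename("Date"),
)
df2 = pd.DataFrame({
    "date start": pd.to_datetime(["2011-04-18 17:00:01",
                                  "2011-04-18 17:00:05",
                                  "2011-04-18 17:00:08"]),
    "date end": pd.to_datetime(["2011-04-18 17:00:02",
                                "2011-04-18 17:00:06",
                                "2011-04-18 17:00:10"]),
})
pairs = list(zip(df2["date start"], df2["date end"]))

# Variant 1: label-based slices (both endpoints included), then concatenate.
sliced = pd.concat([df1.loc[s:e] for s, e in pairs])

# Variant 2: one boolean mask per interval on the index, combined with OR.
masks = [(df1.index >= s) & (df1.index <= e) for s, e in pairs]
masked = df1[np.logical_or.reduce(masks)]

print(sliced)
print(sliced.equals(masked))
```

On this sample the two results are identical. They can differ when intervals overlap: pd.concat repeats the rows shared by overlapping slices, while the OR-ed mask keeps each row once. Label slicing is also most dependable when the index is sorted.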
