I have a DataFrame with 3 datetime columns:
ItemUid HireStart DCompleteDate OffHire
14055 2021-01-01 2021-12-17 2021-01-09
14065 2021-08-12 2021-12-17 2021-11-17
14534 2018-12-21 NaT NaT
11639 NaT NaT NaT
43268 2020-09-07 2020-09-03 2020-11-03
36723 2021-01-03 NaT 2021-01-10
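I am trying to return a DataFrame containing the items that were on hire between two dates entered by the user.
That is, if the user enters start date = '2021-01-02' and end date = '2021-01-08', the expected result would be: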
ItemUid HireStart DCompleteDate OffHire
14055 2021-01-01 2021-01-23 2021-01-09
14534 2018-12-21 NaT NaT
36723 2021-01-03 NaT 2021-01-10
My code:
def date_range(df):
    start_date = input("Enter start date dd/mm/yyyy: ")
    end_date = input("Enter end date dd/mm/yyyy: ")
    df = df[(df['OffHire'] <= end_date) &
            ((df['HireStart'].notna()) | (df['HireStart'] >= start_date))]
    return df

result = df_hire.apply(date_range, axis=1)
This is the error I currently get:
TypeError Traceback (most recent call last)
<ipython-input-60-6d4d17020cba> in <module>()
9 return df
10
---> 11 result = df_hire.apply(date_range, axis=1)
4 frames
<ipython-input-60-6d4d17020cba> in date_range(df)
3 end_date = input("Enter end date dd/mm/yyyy: ")
4
----> 5 df = df[(df['OffHire'] <= end_date) &
6 ((df['HireStart'].notna()) | (df['HireStart'] >= start_date))]
7
TypeError: '<=' not supported between instances of 'Timestamp' and 'str'
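I can probably fix the error, but how to apply the function is what has me stuck!
Any help would be much appreciated; this will be another lesson for me!
Thanks in advance.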
Answer from a uj5u.com user:
IIUC, you want something like this:
import pandas as pd

# convert the date columns to datetime
df["HireStart"] = pd.to_datetime(df["HireStart"])
df["DCompleteDate"] = pd.to_datetime(df["DCompleteDate"])
df["OffHire"] = pd.to_datetime(df["OffHire"])

# convert the inputs (entered as dd/mm/yyyy) to datetime
start_date = pd.to_datetime(start_date, format="%d/%m/%Y")
end_date = pd.to_datetime(end_date, format="%d/%m/%Y")

# select rows hired on or before end_date and not completed before start_date
output = df[df["HireStart"].le(end_date)
            & df["DCompleteDate"].fillna(start_date).ge(start_date)]
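On the "how do I apply the function" part: `df_hire.apply(date_range, axis=1)` hands the function one row at a time (as a Series), so the boolean mask has nothing to filter. Calling the function once on the whole frame is usually enough. Below is a minimal sketch of that, using the sample data from the question. The helper name `on_hire_between` is made up for illustration, and the dates are passed as arguments instead of read with `input()` so the snippet runs non-interactively.

```python
import pandas as pd

# Sample data copied from the question (None becomes NaT after conversion).
df_hire = pd.DataFrame({
    "ItemUid": [14055, 14065, 14534, 11639, 43268, 36723],
    "HireStart": ["2021-01-01", "2021-08-12", "2018-12-21", None,
                  "2020-09-07", "2021-01-03"],
    "DCompleteDate": ["2021-12-17", "2021-12-17", None, None,
                      "2020-09-03", None],
    "OffHire": ["2021-01-09", "2021-11-17", None, None,
                "2020-11-03", "2021-01-10"],
})
date_cols = ["HireStart", "DCompleteDate", "OffHire"]
df_hire[date_cols] = df_hire[date_cols].apply(pd.to_datetime)


def on_hire_between(df, start_date, end_date):
    """Hypothetical helper: rows on hire at some point in [start_date, end_date].

    The dates are strings in dd/mm/yyyy form, as in the question's prompts.
    """
    start = pd.to_datetime(start_date, format="%d/%m/%Y")
    end = pd.to_datetime(end_date, format="%d/%m/%Y")
    # A missing HireStart compares as False, so never-hired items drop out.
    started = df["HireStart"].le(end)
    # A missing DCompleteDate is treated as "still open" by filling with start.
    not_finished = df["DCompleteDate"].fillna(start).ge(start)
    return df[started & not_finished]


# Called once on the whole frame, not through df.apply(..., axis=1).
result = on_hire_between(df_hire, "02/01/2021", "08/01/2021")
print(result["ItemUid"].tolist())  # [14055, 14534, 36723]
```

To keep the interactive prompts, the two `input()` results can be passed straight in as `start_date` and `end_date`.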
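Answer from a uj5u.com user:
I think the best approach is to use HireStart as the index and take advantage of pandas slicing on a DatetimeIndex. Something like: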
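# drop missing start dates and sort first; string slicing needs a monotonic DatetimeIndex
df.dropna(subset=['HireStart']).set_index('HireStart').sort_index().loc['2021-01-02':'2021-01-08']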
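A small sketch of what index slicing returns on the question's data, assuming HireStart has already been converted to datetime. Only the two columns needed for the demonstration are rebuilt here, and the variable names `by_start` and `started_in_range` are illustrative. Rows without a start date are dropped and the index is sorted before slicing.

```python
import pandas as pd

df = pd.DataFrame({
    "ItemUid": [14055, 14065, 14534, 11639, 43268, 36723],
    "HireStart": pd.to_datetime(
        ["2021-01-01", "2021-08-12", "2018-12-21", None,
         "2020-09-07", "2021-01-03"]
    ),
})

# Drop rows with no start date and sort so the DatetimeIndex is monotonic.
by_start = df.dropna(subset=["HireStart"]).set_index("HireStart").sort_index()

# Label-based slicing on a DatetimeIndex includes both endpoints.
started_in_range = by_start.loc["2021-01-02":"2021-01-08"]
print(started_in_range["ItemUid"].tolist())  # [36723]
```

This selects hires that started inside the range, so items 14055 and 14534 from the expected result, which started earlier and were still on hire, are not returned by the slice alone.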
