Pandas: output the date, start and end times, and event status based on datetime continuity

I have the following DataFrame:

                    Site
       Date
2021-07-01 08:00:00  54
2021-07-01 09:00:00  23
2021-07-01 10:00:00  13
2021-07-01 11:00:00  23
2021-07-01 15:00:00  345
2021-07-01 16:00:00  313
2021-07-05 08:00:00  3
2021-07-05 09:00:00  31
2021-07-13 08:00:00  76
2021-07-13 09:00:00  34
2021-07-13 10:00:00  94
2021-07-13 11:00:00  55
2021-07-13 12:00:00  43
2021-07-13 13:00:00  423
2021-07-13 14:00:00  231
2021-07-13 15:00:00  23
2021-07-13 16:00:00  563
2021-07-13 17:00:00  424
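
For anyone who wants to reproduce the question, the frame above can be rebuilt roughly as follows. This is a sketch that assumes `Date` is the index and `Site` is the only column, as the printout suggests; the variable names `stamps` and `values` are only illustrative.

```python
import pandas as pd

# Timestamps and values copied from the table above.
stamps = [
    "2021-07-01 08:00:00", "2021-07-01 09:00:00", "2021-07-01 10:00:00",
    "2021-07-01 11:00:00", "2021-07-01 15:00:00", "2021-07-01 16:00:00",
    "2021-07-05 08:00:00", "2021-07-05 09:00:00",
    "2021-07-13 08:00:00", "2021-07-13 09:00:00", "2021-07-13 10:00:00",
    "2021-07-13 11:00:00", "2021-07-13 12:00:00", "2021-07-13 13:00:00",
    "2021-07-13 14:00:00", "2021-07-13 15:00:00", "2021-07-13 16:00:00",
    "2021-07-13 17:00:00",
]
values = [54, 23, 13, 23, 345, 313, 3, 31,
          76, 34, 94, 55, 43, 423, 231, 23, 563, 424]

# Date becomes a DatetimeIndex, matching the layout shown in the question.
df = pd.DataFrame({"Site": values},
                  index=pd.DatetimeIndex(stamps, name="Date"))
print(df)
```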

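I am trying to get the date, start time, and end time of each event. The conditions are:

If the timestamps run without a break from 08:00:00 to 17:00:00 (as on 2021-07-13), it is a full-day event
If the continuity is broken, so the hours are not consecutive as they are on 2021-07-13, it is an incomplete-day event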
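
The two conditions can be written down as a small check on a single run of timestamps. This is only a sketch of the rule as I read it: the helper name `classify_run` is hypothetical, and the hourly spacing and the 08:00:00 to 17:00:00 window are assumptions taken from the example data.

```python
import pandas as pd


def classify_run(stamps, day_start="08:00:00", day_end="17:00:00"):
    """Return 'Full' if the run is hourly, gap-free and spans the whole window."""
    s = pd.Series(pd.to_datetime(stamps)).sort_values(ignore_index=True)
    # Every step between neighbouring timestamps must be exactly one hour.
    contiguous = bool((s.diff().dropna() == pd.Timedelta(hours=1)).all())
    # The run must begin and end on the assumed window boundaries.
    covers = (s.iloc[0].strftime("%H:%M:%S") == day_start
              and s.iloc[-1].strftime("%H:%M:%S") == day_end)
    return "Full" if contiguous and covers else "Incomplete"


full_day = [f"2021-07-13 {h:02d}:00:00" for h in range(8, 18)]
morning = [f"2021-07-01 {h:02d}:00:00" for h in range(8, 12)]
print(classify_run(full_day), classify_run(morning))
```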

The final result should look like this:

                Start       End      Result
   Date        
2021-07-01   08:00:00   11:00:00   Incomplete
2021-07-01   15:00:00   16:00:00   Incomplete
2021-07-05   08:00:00   09:00:00   Incomplete
2021-07-13   08:00:00   17:00:00      Full

Is there a simple way to do this in Pandas?

Answer from a uj5u.com community member:

Use:

import numpy as np
import pandas as pd

#if necessary convert to DatetimeIndex
df.index = pd.to_datetime(df.index)

#create column Date
df = df.reset_index()

#test consecutive hours
df['g'] = df['Date'].diff().dt.total_seconds().div(3600).ne(1)

date = df['Date'].dt.date
#created groups
df['g'] = df.groupby(date)['g'].cumsum()

#get minimal and maximal per dates
df1 = (df.groupby([date, 'g'])
         .agg(Start=('Date','min'),End=('Date','max'))
         .reset_index(level=1, drop=True))

#convert to HH:MM:SS
df1['Start'] = df1['Start'].dt.strftime('%H:%M:%S')
df1['End'] = df1['End'].dt.strftime('%H:%M:%S')

#result column
df1['Result'] = np.where(df1['Start'].eq('08:00:00') & 
                         df1['End'].eq('17:00:00'), 'Full','Incomplete')
print (df1)
               Start       End      Result
Date                                      
2021-07-01  08:00:00  11:00:00  Incomplete
2021-07-01  15:00:00  16:00:00  Incomplete
2021-07-05  08:00:00  09:00:00  Incomplete
2021-07-13  08:00:00  17:00:00        Full
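
The grouping step is the least obvious part of this answer, so here is a small sketch of how `diff` and a per-day `cumsum` label the runs. It uses a shortened, made-up subset of the timestamps; `new_run` and `run_id` are illustrative names, and the explicit `astype(int)` is my addition so that the cumulative sum is clearly an integer count.

```python
import pandas as pd

stamps = pd.Series(pd.to_datetime([
    "2021-07-01 08:00", "2021-07-01 09:00",
    "2021-07-01 15:00", "2021-07-01 16:00",
    "2021-07-05 08:00", "2021-07-05 09:00",
]), name="Date")

# True wherever the step from the previous row is not exactly one hour,
# i.e. on the first row of every run (the very first diff is NaT, so True).
new_run = stamps.diff().ne(pd.Timedelta(hours=1))

# Counting the run starts within each calendar day yields a run label
# that restarts at 1 for every date.
run_id = new_run.astype(int).groupby(stamps.dt.date).cumsum()

print(pd.DataFrame({"Date": stamps, "new_run": new_run, "run_id": run_id}))
```

Grouping by the date together with this label is what lets `min` and `max` pick out the start and end of each contiguous block.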

Alternative using time objects:

#start again from the original df (Date as index)
df.index = pd.to_datetime(df.index)

df = df.reset_index()

df['g'] = df['Date'].diff().dt.total_seconds().div(3600).ne(1)

date = df['Date'].dt.date
df['g'] = df.groupby(date)['g'].cumsum()

df1 = (df.groupby([date, 'g'])
         .agg(Start=('Date','min'),End=('Date','max'))
         .reset_index(level=1, drop=True))
df1['Start'] = df1['Start'].dt.time
df1['End'] = df1['End'].dt.time

from datetime import time

df1['Result'] = np.where(df1['Start'].eq(time(8,0,0)) & 
                         df1['End'].eq(time(17,0,0)), 'Full','Incomplete')
print (df1)
               Start       End      Result
Date                                      
2021-07-01  08:00:00  11:00:00  Incomplete
2021-07-01  15:00:00  16:00:00  Incomplete
2021-07-05  08:00:00  09:00:00  Incomplete
2021-07-13  08:00:00  17:00:00        Full
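
If the same summary is needed in more than one place, the steps above could be wrapped in a function. The following is a sketch rather than part of the original answer: `summarize_events`, its `day_start` and `day_end` parameters, and the helper column names `ts` and `run` are all hypothetical, and hourly data with a DatetimeIndex is assumed.

```python
from datetime import time

import numpy as np
import pandas as pd


def summarize_events(df, day_start=time(8, 0), day_end=time(17, 0)):
    """Collapse an hourly DatetimeIndex into one row per contiguous run."""
    stamps = pd.Series(pd.to_datetime(df.index)).sort_values(ignore_index=True)
    frame = pd.DataFrame({"ts": stamps})
    frame["Date"] = frame["ts"].dt.date
    # A new run starts wherever the gap to the previous row is not one hour.
    starts = frame["ts"].diff().ne(pd.Timedelta(hours=1)).astype(int)
    frame["run"] = starts.groupby(frame["Date"]).cumsum()
    out = (frame.groupby(["Date", "run"])
                .agg(Start=("ts", "min"), End=("ts", "max"))
                .reset_index(level="run", drop=True))
    out["Start"] = out["Start"].dt.time
    out["End"] = out["End"].dt.time
    # A run is gap-free by construction, so only its end points are checked.
    is_full = out["Start"].eq(day_start) & out["End"].eq(day_end)
    out["Result"] = np.where(is_full, "Full", "Incomplete")
    return out


# Small made-up sample: two short runs on one day, one complete day.
sample_index = pd.to_datetime(
    ["2021-07-01 08:00", "2021-07-01 09:00", "2021-07-01 15:00"]
    + [f"2021-07-13 {h:02d}:00" for h in range(8, 18)]
)
sample = pd.DataFrame({"Site": range(len(sample_index))}, index=sample_index)
result = summarize_events(sample)
print(result)
```

Keeping the window as parameters means a different working day (for example 09:00 to 18:00) only needs different arguments, not a change to the grouping logic.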


Tags: Python pandas datetime
