I have minute-level datetime data (sample below):
2021-11-08 00:10:00
2021-11-08 01:10:00
2021-11-08 02:25:00
2021-11-08 03:55:00
2021-11-08 06:55:00
2021-11-08 12:35:00
2021-11-08 16:05:00
2021-11-08 17:10:00
2021-11-08 18:45:00
2021-11-08 19:10:00
2021-11-08 20:25:00
2021-11-08 20:55:00
2021-11-08 22:55:00
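I need to assign each row of this dataset to one of the custom time slots below. Some slots start exactly on the hour (9:00), others in the middle of an hour (12:30):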
'0000-0259'
'0300-0859'
'0900-1229'
'1230-1659'
'1700-1929'
'1930-2029'
'2030-2359'
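I have been trying to do this with a dict that maps each hour to a time slot, but the slot starting at 12:30 is tricky.
Attempt 2 used between_time, but it requires a DatetimeIndex, so it does not work here: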
def time_slot(ref):
    if ref.between_time('00:00','02:59'):
        return '0000-0259'
    elif ref.between_time('03:00','08:59'):
        return '0300-0859'
    elif ref.between_time('09:00','12:29'):
        return '0900-1229'
    elif ref.between_time('12:30','16:59'):
        return '1230-1659'
    elif ref.between_time('17:00','19:29'):
        return '1700-1929'
    elif ref.between_time('19:30','20:29'):
        return '1930-2029'
    else:
        return '2030-2359'
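For reference, here is a minimal sketch of how between_time could be used once the timestamps are the index of the frame. The column name "row", the slots list and the loop are illustrative assumptions, not part of the original attempt, and the sketch relies on the data having minute resolution (a time such as 02:59:30 would fall between two slots).

```python
import pandas as pd

# A few sample timestamps taken from the question.
times = pd.to_datetime([
    "2021-11-08 00:10:00", "2021-11-08 03:55:00", "2021-11-08 12:35:00",
    "2021-11-08 19:10:00", "2021-11-08 20:25:00", "2021-11-08 22:55:00",
])

# between_time is a DataFrame/Series method that needs a DatetimeIndex,
# so the timestamps become the index ("row" is a placeholder column).
df = pd.DataFrame({"row": range(len(times))}, index=times)

# (start, end, label) for each slot; both ends are inclusive by default.
slots = [
    ("00:00", "02:59", "0000-0259"),
    ("03:00", "08:59", "0300-0859"),
    ("09:00", "12:29", "0900-1229"),
    ("12:30", "16:59", "1230-1659"),
    ("17:00", "19:29", "1700-1929"),
    ("19:30", "20:29", "1930-2029"),
    ("20:30", "23:59", "2030-2359"),
]

df["Time_Ref"] = ""
for start, end, label in slots:
    # Select the rows whose time of day falls in the slot, then label them.
    matching = df.between_time(start, end).index
    df.loc[matching, "Time_Ref"] = label

print(df)
```

The point is that between_time filters a whole frame rather than testing a single value, which is why calling it on one timestamp inside an if statement does not work.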
Attempt 3 used a chain of if/elif checks, testing whether the value is earlier (<) than each slot boundary:
format = '%H:%M'
def time_slot(ref):
    if ref < dt.strptime('03:00', format):
        return '0000-0259'
    elif ref < dt.strptime('09:00', format):
        return '0300-0859'
    elif ref < dt.strptime('12:30', format):
        return '0900-1229'
    elif ref < dt.strptime('17:00', format):
        return '1230-1659'
    elif ref < dt.strptime('19:30', format):
        return '1700-1929'
    elif ref < dt.strptime('20:30', format):
        return '1930-2029'
    else:
        return '2030-2359'
But this runs into a mismatch between datetime.time and datetime.datetime.
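One way around that mismatch, sketched here under the assumption that each value is a datetime.datetime (or a pandas Timestamp), is to compare only the time-of-day part against datetime.time boundaries. The BOUNDS list is a hypothetical helper introduced for this illustration.

```python
from datetime import datetime, time

# Exclusive upper bound of each slot, paired with the slot label
# (BOUNDS is an illustrative name, not from the original post).
BOUNDS = [
    (time(3, 0), "0000-0259"),
    (time(9, 0), "0300-0859"),
    (time(12, 30), "0900-1229"),
    (time(17, 0), "1230-1659"),
    (time(19, 30), "1700-1929"),
    (time(20, 30), "1930-2029"),
]

def time_slot(ref):
    """Return the slot label for a datetime by comparing time to time."""
    t = ref.time()  # drop the date so both sides are datetime.time
    for upper, label in BOUNDS:
        if t < upper:
            return label
    return "2030-2359"

print(time_slot(datetime(2021, 11, 8, 12, 35)))
```

Because both sides of the comparison are datetime.time objects, the < operator is well defined and the 12:30 boundary needs no special handling.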
Answer from a uj5u.com user:
Given that the initial time data is in string format, this is how I would approach it. Given a dataframe of the form:
Time
0 2021-11-08 00:10:00
1 2021-11-08 01:10:00
2 2021-11-08 02:25:00
3 2021-11-08 03:55:00
4 2021-11-08 06:55:00
5 2021-11-08 12:35:00
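Step 1. Add a timestamp column (du is assumed here to be the dateutil package, i.e. import dateutil as du together with import dateutil.parser)
df['TimeStamp'] = df.apply(lambda row: du.parser.parse(row.Time), axis=1)
which produces: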
Time TimeStamp
0 2021-11-08 00:10:00 2021-11-08 00:10:00
1 2021-11-08 01:10:00 2021-11-08 01:10:00
2 2021-11-08 02:25:00 2021-11-08 02:25:00
3 2021-11-08 03:55:00 2021-11-08 03:55:00
4 2021-11-08 06:55:00 2021-11-08 06:55:00
5 2021-11-08 12:35:00 2021-11-08 12:35:00
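As an alternative for this step, pandas can parse the whole string column at once with pd.to_datetime. This is a sketch that assumes every string follows the format shown in the sample; it is not what the answer itself uses.

```python
import pandas as pd

# Same six sample rows as in the frame above.
df = pd.DataFrame({"Time": [
    "2021-11-08 00:10:00", "2021-11-08 01:10:00", "2021-11-08 02:25:00",
    "2021-11-08 03:55:00", "2021-11-08 06:55:00", "2021-11-08 12:35:00",
]})

# Parse the column in one vectorized call instead of row by row.
df["TimeStamp"] = pd.to_datetime(df["Time"], format="%Y-%m-%d %H:%M:%S")

print(df.dtypes)
```

The resulting column has a datetime64 dtype, so accessors such as .dt.hour and .dt.minute become available.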
Step 2. Create a function that returns the time-slot label for each timestamp, as follows:
def getLabel(tval):
    """ Return the label associated with the timestamp """
    labels = ['0000-0259', '0300-0859', '0900-1229', '1230-1659', '1700-1929', '1930-2029', '2030-2359']
    # (hour, minute) at which each slot starts, in ascending order
    slot_start = [(0, 0), (3, 0), (9, 0), (12, 30), (17, 0), (19, 30), (20, 30)]
    for lidx, tme in enumerate(slot_start):
        # The first slot that starts after the timestamp means the previous slot applies
        if tme > (tval.hour, tval.minute):
            return labels[lidx-1]
    return labels[-1]
Step 3. Apply the getLabel function to create a Time_Ref column, as follows:
df['Time_Ref'] = df.apply(lambda row: getLabel(row.TimeStamp), axis=1)
which produces:
Time TimeStamp Time_Ref
0 2021-11-08 00:10:00 2021-11-08 00:10:00 0000-0259
1 2021-11-08 01:10:00 2021-11-08 01:10:00 0000-0259
2 2021-11-08 02:25:00 2021-11-08 02:25:00 0000-0259
3 2021-11-08 03:55:00 2021-11-08 03:55:00 0300-0859
4 2021-11-08 06:55:00 2021-11-08 06:55:00 0300-0859
5 2021-11-08 12:35:00 2021-11-08 12:35:00 1230-1659
6 2021-11-08 16:05:00 2021-11-08 16:05:00 1230-1659
7 2021-11-08 17:10:00 2021-11-08 17:10:00 1700-1929
8 2021-11-08 18:45:00 2021-11-08 18:45:00 1700-1929
9 2021-11-08 19:10:00 2021-11-08 19:10:00 1700-1929
10 2021-11-08 20:25:00 2021-11-08 20:25:00 1930-2029
11 2021-11-08 20:55:00 2021-11-08 20:55:00 2030-2359
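The lookup inside getLabel can also be written with the standard-library bisect module, since the slot start times are sorted. The sketch below is one possible variant; the name get_label_bisect and the module-level constants are illustrative assumptions.

```python
from bisect import bisect_right
from datetime import datetime

LABELS = ["0000-0259", "0300-0859", "0900-1229", "1230-1659",
          "1700-1929", "1930-2029", "2030-2359"]
# (hour, minute) at which each slot starts; must stay in ascending order.
SLOT_START = [(0, 0), (3, 0), (9, 0), (12, 30), (17, 0), (19, 30), (20, 30)]

def get_label_bisect(tval):
    """Return the slot label via binary search over the slot start times."""
    # bisect_right counts how many slot starts are <= the timestamp's
    # (hour, minute); the last of those is the slot the timestamp is in.
    pos = bisect_right(SLOT_START, (tval.hour, tval.minute))
    return LABELS[pos - 1]

print(get_label_bisect(datetime(2021, 11, 8, 19, 10)))
```

Comparing (hour, minute) tuples means a slot boundary in the middle of an hour, such as 12:30 or 19:30, is treated the same way as one that falls on the hour.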
You can also combine steps 1 and 3, which removes the need for the extra timestamp column, with the following:
df['Time_Ref'] = df.apply(lambda row: getLabel(du.parser.parse(row.Time)), axis=1)
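For larger frames, a vectorized variant may be worth considering. The sketch below is an assumption-laden alternative rather than the answer's method: it converts each timestamp to minutes since midnight and bins the result with pd.cut. The edges list is derived from the slot boundaries in the question.

```python
import pandas as pd

# The thirteen sample rows from the question.
df = pd.DataFrame({"Time": [
    "2021-11-08 00:10:00", "2021-11-08 01:10:00", "2021-11-08 02:25:00",
    "2021-11-08 03:55:00", "2021-11-08 06:55:00", "2021-11-08 12:35:00",
    "2021-11-08 16:05:00", "2021-11-08 17:10:00", "2021-11-08 18:45:00",
    "2021-11-08 19:10:00", "2021-11-08 20:25:00", "2021-11-08 20:55:00",
    "2021-11-08 22:55:00",
]})

ts = pd.to_datetime(df["Time"])
# Minutes elapsed since midnight, e.g. 12:30 -> 750.
minutes = ts.dt.hour * 60 + ts.dt.minute

# Slot start times in minutes, plus 1440 (24:00) as the final edge.
edges = [0, 180, 540, 750, 1020, 1170, 1230, 1440]
labels = ["0000-0259", "0300-0859", "0900-1229", "1230-1659",
          "1700-1929", "1930-2029", "2030-2359"]

# right=False makes each bin include its start and exclude its end.
df["Time_Ref"] = pd.cut(minutes, bins=edges, labels=labels, right=False)

print(df)
```

Note that pd.cut returns a categorical column; call .astype(str) on it if plain strings are needed.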
