How do I get the number of minutes per hour for a date range (Python)?

I have a table like this, containing the start time and end time of each trip.

start_time	end_time
2019-07-01 11:25:00	2019-07-01 11:40:00
2019-07-01 21:40:00	2019-07-01 22:10:00
2019-07-03 22:00:00	2019-07-04 22:00:00

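For every hour between start_time and end_time, I want the number of minutes of the trip that fall within that hour. In other words, I want to know how many minutes the trip was running during each clock hour, with each hour labelled by its end (end_hour).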

For example, the first row would return the following, because 15 minutes of the trip elapsed in the hour ending at 12:00.

end_hour	total_minutes
2019-07-01 12:00:00	15

Similarly, for the second row the output would be

end_hour	total_minutes
2019-07-01 22:00:00	20
2019-07-01 23:00:00	10

For the last row the output would be

end_hour	total_minutes
2019-07-03 23:00:00	60
2019-07-04 00:00:00	60
2019-07-04 01:00:00	60
...	...
2019-07-04 22:00:00	60

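How can I achieve this in Python?

Answer from a uj5u.com user:

You can use the built-in pandas function to_datetime to convert the strings to datetimes and then subtract start from end. Note that this gives the total duration of each trip, not the breakdown per hour: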

import pandas as pd
df = pd.DataFrame([['2019-07-01 11:25:00','2019-07-01 11:40:00'], ['2019-07-01 21:40:00', '2019-07-01 22:10:00'], ['2019-07-03 22:00:00', '2019-07-04 22:00:00']], columns=['start_time', 'end_time'])
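# Converting via total_seconds() avoids .astype('timedelta64[m]'), which recent pandas releases reject.
df['total_minutes'] = (pd.to_datetime(df['end_time']) - pd.to_datetime(df['start_time'])).dt.total_seconds() / 60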
>>> df
            start_time             end_time  total_minutes
0  2019-07-01 11:25:00  2019-07-01 11:40:00           15.0
1  2019-07-01 21:40:00  2019-07-01 22:10:00           30.0
2  2019-07-03 22:00:00  2019-07-04 22:00:00         1440.0
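
To get from these totals to the per-hour breakdown the question asks for, one option is to walk each trip from one hour boundary to the next. The following is only a minimal sketch under stated assumptions: `minutes_per_hour` is a hypothetical helper name (it is not part of pandas), trips are treated as half-open intervals [start, end), and each bucket is labelled by the end of its clock hour.

```python
import pandas as pd


def minutes_per_hour(start, end):
    """Split the trip [start, end) into clock hours, labelled by each hour's end.

    Hypothetical helper for illustration; not a pandas function.
    """
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    rows = []
    cursor = start
    while cursor < end:
        # End of the clock hour that contains `cursor`.
        hour_end = cursor.replace(minute=0, second=0, microsecond=0) + pd.Timedelta(hours=1)
        # The trip may stop before that hour is over.
        segment_end = min(hour_end, end)
        minutes = (segment_end - cursor).total_seconds() / 60
        rows.append({"end_hour": hour_end, "total_minutes": minutes})
        cursor = segment_end
    return pd.DataFrame(rows, columns=["end_hour", "total_minutes"])


first = minutes_per_hour("2019-07-01 11:25:00", "2019-07-01 11:40:00")
second = minutes_per_hour("2019-07-01 21:40:00", "2019-07-01 22:10:00")
third = minutes_per_hour("2019-07-03 22:00:00", "2019-07-04 22:00:00")

# `second` has two rows: end_hour 22:00 with 20.0 minutes, end_hour 23:00 with 10.0 minutes.
print(second)
```

The loop runs once per clock hour a trip touches, so it is simple to follow but may be slow for very many long trips; a vectorised approach is likely preferable in that case.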

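Answer from a uj5u.com user:

The durations have minute precision, so let's upsample to that frequency and count, for each hour, the minutes that fall inside one of the start_time - end_time intervals.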

import pandas as pd

df = pd.DataFrame(
       {"start_time": ["2019-07-01 11:25:00", "2019-07-01 21:40:00", "2019-07-03 22:00:00"],
        "end_time":   ["2019-07-01 11:40:00", "2019-07-01 22:10:00", "2019-07-04 22:00:00"]}
       )

df['start_time'] = pd.to_datetime(df['start_time'])
df['end_time'] = pd.to_datetime(df['end_time'])
df['minutes'] = (df['end_time'] - df['start_time']).dt.total_seconds()/60

# create an IntervalIndex which we can set as the axis (needed for re-indexing).
# subtract one minute from end_time so that the minute of the termination is excluded.
iv_idx = pd.IntervalIndex.from_arrays(df['start_time'],
                                      df['end_time']-pd.Timedelta(minutes=1),
                                      closed='both')

# create a new index with the extended frequency:
new_idx = pd.date_range(df['start_time'].min(), df['end_time'].max(), freq='min')

# set the new index to get the extended frequency;
# all minutes will have the value of the whole interval
result = df['minutes'].set_axis(iv_idx).reindex(new_idx)

# we can now calculate the duration per hour by resampling and summing the
# boolean representation of the duration (1/0):
# 'h' is the hourly alias; the upper-case 'H' is deprecated in recent pandas releases.
result = result.fillna(0).astype(int).astype(bool).resample('h').sum()
result.index.name = 'start_hour'

The result is now anchored to start_hour (you can easily switch to the end hour by shifting the index by one hour):

print(result.loc["2019-07-01 11:00:00":"2019-07-01 12:00:00"])
# start_hour
# 2019-07-01 11:00:00    15
# 2019-07-01 12:00:00     0
# Freq: H, Name: minutes, dtype: int64

print(result.loc["2019-07-01 20:00:00":"2019-07-01 23:00:00"])
# start_hour
# 2019-07-01 20:00:00     0
# 2019-07-01 21:00:00    20
# 2019-07-01 22:00:00    10
# 2019-07-01 23:00:00     0
# Freq: H, Name: minutes, dtype: int64
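
The shift to end hours mentioned above might look like the following sketch. It rebuilds a small stand-in for `result` from the values printed above so that it runs on its own; the name `by_end_hour` and the final step of dropping hours with zero minutes are assumptions made to match the output format in the question.

```python
import pandas as pd

# Toy stand-in for `result`: minutes per hour, indexed by the hour's start.
start_hours = pd.to_datetime([
    "2019-07-01 20:00:00",
    "2019-07-01 21:00:00",
    "2019-07-01 22:00:00",
    "2019-07-01 23:00:00",
])
result = pd.Series([0, 20, 10, 0], index=start_hours, name="minutes")
result.index.name = "start_hour"

# Move every label from the start of its hour to the end of it.
by_end_hour = result.copy()
by_end_hour.index = by_end_hour.index + pd.Timedelta(hours=1)
by_end_hour = by_end_hour.rename_axis("end_hour")

# Keep only the hours in which a trip was actually running.
by_end_hour = by_end_hour[by_end_hour > 0]
print(by_end_hour)
```

After the shift, the 20 minutes are labelled 2019-07-01 22:00:00 and the 10 minutes 2019-07-01 23:00:00, which is the layout requested for the second row of the table.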
