I am using the following code to find the nearest previous date in a list of dates:
def nearest_previous_date(list_of_dates, pivot_date):
    """ Helper function to find the nearest previous date in a list of dates
    Args:
        list_of_dates (list): list of datetime objects
        pivot_date (datetime): reference date
    Returns:
        (datetime): datetime immediately before or equal to reference date, if none satisfy criteria returns
        first date in list
    """
    return min(list_of_dates, key=lambda x: (pivot_date - x).days if x <= pivot_date else float("inf"))
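I need to call this function many times, so I would like it to be as efficient as possible. It currently takes about 200 microseconds to search a list of 23 dates and find the relevant one. That does not sound like much, but it does not scale well. Is there a way to make this function more efficient?
Here is an example: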
pivot_date = datetime(day=21, month=7, year=2019)
list_of_dates
DatetimeIndex(['2015-06-30', '2015-09-30', '2015-12-31', '2016-03-31',
'2016-06-30', '2016-09-30', '2016-12-30', '2017-03-31',
'2017-06-30', '2017-09-29', '2017-12-29', '2018-03-30',
'2018-06-30', '2018-10-01', '2019-01-01', '2019-03-29',
'2019-07-01', '2019-10-01', '2019-12-31', '2020-03-31',
'2020-06-30', '2020-09-30', '2020-12-31'],
dtype='datetime64[ns]', name='effectiveDate', freq=None)
%%timeit
min(list_of_dates, key=lambda x: (pivot_date - x).days if x <= pivot_date else float("inf"))
191 μs ± 5.4 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Reply from a uj5u.com user:
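Since datetime objects can be ordered, the nearest date on or before the reference date is simply the "largest" date that is not later than the reference date:
def nearest_previous_date(list_of_dates, pivot_date):
    return max((date for date in list_of_dates if date <= pivot_date), default=list_of_dates[0])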
If the list can be assumed to be sorted, you can use binary search instead, which scales better:
from bisect import bisect
def nearest_previous_date(list_of_dates, pivot_date):
    return list_of_dates[max(bisect(list_of_dates, pivot_date) - 1, 0)]
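As a quick sanity check, the sketch below compares the linear scan with the binary search version on a small sorted list of plain datetime objects. The names nearest_linear and nearest_bisect and the sample dates are made up for illustration, and the code assumes the list is non-empty and sorted ascending.

```python
from bisect import bisect_right
from datetime import datetime


def nearest_linear(list_of_dates, pivot_date):
    # O(n): scan every date and keep the latest one that is <= pivot_date.
    return max((d for d in list_of_dates if d <= pivot_date),
               default=list_of_dates[0])


def nearest_bisect(list_of_dates, pivot_date):
    # O(log n): bisect_right returns the index just past the last date <= pivot_date.
    # Assumes list_of_dates is sorted ascending.
    return list_of_dates[max(bisect_right(list_of_dates, pivot_date) - 1, 0)]


# Hypothetical sample data, sorted ascending.
dates = [datetime(2019, 1, 1), datetime(2019, 3, 29),
         datetime(2019, 7, 1), datetime(2019, 10, 1)]

# Pivot before all dates, pivot equal to an entry, pivot between two entries.
for pivot in (datetime(2018, 1, 1), datetime(2019, 7, 1), datetime(2019, 7, 21)):
    assert nearest_linear(dates, pivot) == nearest_bisect(dates, pivot)
```

Both versions fall back to the first date when nothing is on or before the pivot, so they should be interchangeable as long as the input really is sorted.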
Reply from a uj5u.com user:
The solution suggested by @jasonharper:
import bisect

def nearest_previous_date_NEW(list_of_dates, pivot_date):
    """ Helper function to find the nearest previous date in a list of dates
    Important: assumes list_of_dates is sorted ascending
    Args:
        list_of_dates (list): list of datetime objects
        pivot_date (datetime): reference date
    Returns:
        (datetime): datetime immediately before or equal to reference date, if
        none satisfy criteria returns first date in list
    """
    # bisect_right (not bisect_left) so that a date equal to pivot_date is itself returned
    return list_of_dates[max(0, bisect.bisect_right(list_of_dates, pivot_date) - 1)]
It is indeed much faster:
47.4 μs ± 1.84 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
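One detail worth checking when adapting the bisect approach: bisect_left and bisect_right give different results only when the pivot date itself appears in the list. The following minimal sketch, using made-up sample dates, illustrates the difference.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# Hypothetical sample data, sorted ascending.
dates = [datetime(2019, 3, 29), datetime(2019, 7, 1), datetime(2019, 10, 1)]
pivot = datetime(2019, 7, 1)  # exactly equal to an entry in the list

# bisect_left stops before the matching entry, so "- 1" lands on the strictly earlier date.
strictly_before = dates[max(0, bisect_left(dates, pivot) - 1)]

# bisect_right stops after the matching entry, so "- 1" lands on the matching date itself.
before_or_equal = dates[max(0, bisect_right(dates, pivot) - 1)]
```

Which variant is appropriate depends on whether a date equal to the pivot should count as a match; the "before or equal" wording in the docstring corresponds to bisect_right.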
