The function foo first aggregates the values in a given DataFrame by p1, then by p2, where p1 and p2 are offset aliases.
import pandas as pd
import numpy as np
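# Function
def foo(d, p1, p2, brk):
    # assert p2 > p1
    # Sum per p1 window, then flag the windows whose sum exceeds brk
    s1 = d.groupby(pd.Grouper(freq=p1)).sum().gt(brk)
    # Count the flagged p1 windows inside each p2 window
    s2 = s1.groupby(pd.Grouper(freq=p2)).sum()
    return s2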
# Data
df = pd.DataFrame({"datetime": pd.date_range("2017-01-01", "2017-03-31", freq="1H")})
np.random.seed(42)
df["val"] = np.random.sample(2137)
df = df.set_index("datetime")
foo(df, "7D", "1M", 80)
#             val
# datetime
# 2017-01-31    4
# 2017-02-28    3
# 2017-03-31    3
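The goal is to implement assert p2 > p1 so that the result of foo makes sense. One approach is to convert both p1 and p2 to Timedelta and compare them. However, some aliases (for example, 1M) are ambiguous when converted to Timedelta.
pandas.Timedelta("1M") gives the following warning:
FutureWarning: Units 'M', 'Y' and 'y' do not represent unambiguous timedelta values and will be removed in a future version
pd.Grouper(freq="1M") > pd.Grouper(freq="7D") gives the following error:
TypeError: '>' not supported between instances of 'TimeGrouper' and 'TimeGrouper'
What is the correct way to compare two Grouper freq windows?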
A uj5u.com user replied:
Based on this answer, you could do something like this:
def foo(d, p1, p2, brk):
    from pandas.tseries.frequencies import to_offset
    from datetime import datetime
    tmp = datetime.now()
    assert tmp + to_offset(p2) > tmp + to_offset(p1), 'p1 must be less than p2'

    s1 = d.groupby(pd.Grouper(freq=p1)).sum().gt(brk)
    s2 = s1.groupby(pd.Grouper(freq=p2)).sum()
    return s2
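Test (note that the traceback below comes from a run in which the comparison was reversed to p1 > p2, which is why a call with p1 shorter than p2 trips the assertion):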
>>> foo(df, "7D", "1M", 80)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [51], in <cell line: 1>()
----> 1 foo(df, "7D", "1M", 80)

Input In [50], in foo(d, p1, p2, brk)
      3     from datetime import datetime
      4     tmp = datetime.now()
----> 5     assert tmp + to_offset(p1) > tmp + to_offset(p2), 'p1 must be less than p2'
      7     s1 = d.groupby(pd.Grouper(freq=p1)).sum().gt(brk)
      8     s2 = s1.groupby(pd.Grouper(freq=p2)).sum()

AssertionError: p1 must be less than p2
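One caveat with anchoring the comparison on datetime.now(): for anchored offsets such as month end, the distance from an arbitrary date to the next boundary depends on that date. On the 30th of a 31-day month, for instance, the next month end is one day away, so a month-end window would compare as shorter than 7D. Below is a minimal sketch of one possible workaround, assuming it is acceptable to measure each window from a fixed reference date. The helper window_length and its ref parameter are hypothetical names introduced for illustration, not part of pandas. The month-end offset is passed as an object because its string alias differs between pandas versions.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def window_length(freq, ref=pd.Timestamp("2017-01-01")):
    """Approximate length of one `freq` window as a Timedelta.

    Hypothetical helper: the offset is applied twice so that the
    measured span starts on a window boundary rather than at an
    arbitrary point inside a window.
    """
    offset = to_offset(freq)  # accepts alias strings or DateOffset objects
    start = ref + offset      # first boundary after the reference date
    end = start + offset      # following boundary
    return end - start


# Assumption: windows are compared by their length near the reference date.
month_end = pd.offsets.MonthEnd(1)
print(window_length("7D"))
print(window_length(month_end))
assert window_length(month_end) > window_length("7D"), "p1 must be less than p2"
```

Because month lengths vary between 28 and 31 days, the result for windows of similar size (say, 30D against one month) still depends on the chosen ref, so this only removes the dependence on the current date rather than the ambiguity itself.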
