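I have time-off data for each employee. I need to split it by week (Monday to Sunday), but before that I need to calculate the number of working days off and the hours per working day off. That way, if a time off starts in the middle of a week (for example on a Wednesday), we know that only 3 working days (Wednesday, Thursday and Friday) of hours are allocated to that week.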
| Id | Name | Date Start | Date End | Time Off Hours |
|---|---|---|---|---|
| 1 | Tom Holland | 2022-04-22 | 2022-05-06 | 88.0 |
I was able to exclude weekends and calculate the number of working days off and the hours per working day off.
import numpy as np
import pandas as pd

test = {'Id': [1], 'Name': ['Tom Holland'], 'Date Start': ['2022-04-22'], 'Date End': ['2022-05-06'], 'Time Off Hours': [88.0]}
df = pd.DataFrame(data=test)

# np.busday_count excludes the end date, hence the + 1
# (this assumes the end date is itself a working day)
time_diff = []
for i in df.index:
    time_diff.append(np.busday_count(df["Date Start"][i], df["Date End"][i], weekmask=[1,1,1,1,1,0,0]) + 1)
df["Days Off (Working)"] = time_diff
df['Hours per Days Off (Working)'] = df["Time Off Hours"] / df["Days Off (Working)"]
The output is:

| Id | Name | Date Start | Date End | Time Off Hours | Days Off (Working) | Hours per Days Off (Working) |
|---|---|---|---|---|---|---|
| 1 | Tom Holland | 2022-04-22 | 2022-05-06 | 88.0 | 11 | 8.0 |
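The loop above can also be written without iterating over rows, since `np.busday_count` accepts whole arrays. The following is a minimal sketch, assuming both date columns parse cleanly with `pd.to_datetime`. Shifting the end date by one day (instead of adding 1 to the count) is a substitution that is not in the original code; it is chosen so the count also holds when a time off ends on a weekend.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Id": [1],
    "Name": ["Tom Holland"],
    "Date Start": ["2022-04-22"],
    "Date End": ["2022-05-06"],
    "Time Off Hours": [88.0],
})

# Convert both columns to day-resolution NumPy arrays.
start = pd.to_datetime(df["Date Start"]).to_numpy().astype("datetime64[D]")
end = pd.to_datetime(df["Date End"]).to_numpy().astype("datetime64[D]")

# busday_count treats the end date as exclusive, so move it one day forward
# to count the last day of the time off as well (default weekmask: Mon-Fri).
df["Days Off (Working)"] = np.busday_count(start, end + np.timedelta64(1, "D"))
df["Hours per Days Off (Working)"] = df["Time Off Hours"] / df["Days Off (Working)"]

print(df[["Days Off (Working)", "Hours per Days Off (Working)"]])
```

For the sample row this gives 11 working days and 8.0 hours per working day, matching the table above.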
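Now I need to split this record and then group it into 3 records (in this case), because the date range from 2022-04-22 to 2022-05-06 spans 3 weeks (Monday to Sunday):
- Week from 2022-04-18 to 2022-04-24 (1 working day off = 8 hours)
- Week from 2022-04-25 to 2022-05-01 (5 working days off = 40 hours)
- Week from 2022-05-02 to 2022-05-08 (5 working days off = 40 hours)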
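The three weeks listed above can be reproduced by hand for the sample record. This is only an illustrative sketch: the names `first_monday`, `last_monday`, `hours_per_day` and `weeks` are hypothetical, and the 88 hours over 11 working days come from the example row.

```python
import pandas as pd

start = pd.Timestamp("2022-04-22")
end = pd.Timestamp("2022-05-06")
hours_per_day = 88.0 / 11  # sample values from the question

# Monday of the week containing each date (weekday(): Monday == 0).
first_monday = start - pd.Timedelta(days=start.weekday())
last_monday = end - pd.Timedelta(days=end.weekday())

weeks = []
for week_start in pd.date_range(first_monday, last_monday, freq="7D"):
    week_end = week_start + pd.Timedelta(days=6)  # Sunday of the same week
    # Clip the time off to this week, then count the Monday-Friday days in it.
    days = len(pd.bdate_range(max(start, week_start), min(end, week_end)))
    weeks.append((str(week_start.date()), str(week_end.date()), days, days * hours_per_day))

for row in weeks:
    print(row)
```

Each tuple holds the week start, the week end, the working days off in that week, and the hours allocated to it.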
The desired output should look like:
| Id | Name | Week Start | Week End | Days Off (Working) | Hours per Days Off (Working) | Total Off Hours |
|---|---|---|---|---|---|---|
| 1 | Tom Holland | 2022-04-18 | 2022-04-24 | 1 | 8.0 | 8.0 |
| 1 | Tom Holland | 2022-04-25 | 2022-05-01 | 5 | 8.0 | 40.0 |
| 1 | Tom Holland | 2022-05-02 | 2022-05-08 | 5 | 8.0 | 40.0 |
A uj5u.com user replied:
This is not the most concise approach, but it gets the job done. First, I created your example df
import pandas as pd

test = {'Id': [1], 'Name': ['Tom Holland'], 'Date Start': ['2022-04-22'], 'Date End': ['2022-05-06'], 'Time Off Hours': [88.0]}
df = pd.DataFrame(data=test)
Then I created a helper function that will later help me count, for each week, the working days that fall between Date Start and Date End
# You could try to use np.select to optimize this part
def get_work_days(row: pd.Series) -> int:
    start = row["Date Start"]
    end = row["Date End"]
    week_start = row["Week Start"]
    week_end = row["Week End"]
    if week_start <= start <= week_end:
        bdays = len(pd.bdate_range(start, week_end))
    elif week_start <= end <= week_end:
        bdays = len(pd.bdate_range(week_start, end))
    elif week_start <= end <= week_end and week_start <= start <= week_end:
        bdays = len(pd.bdate_range(start, end))
    else:
        bdays = len(pd.bdate_range(week_start, week_end))
    return bdays
Finally, here is the part of the program that returns the desired output
def process_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    # Making sure that these columns are datetime
    df["Date Start"] = pd.to_datetime(df["Date Start"])
    df["Date End"] = pd.to_datetime(df["Date End"])
    # Calculating Working days between date start and date end
    df["Days Off (Working)"] = df.apply(lambda row: len(pd.bdate_range(row["Date Start"], row["Date End"])), axis=1)
    df['Hours per Days Off (Working)'] = df["Time Off Hours"] / df["Days Off (Working)"]
    # Creating Week Start values
    df["Week Start"] = df.apply(
        lambda row: pd.date_range(
            start=row["Date Start"].to_period("W").start_time,
            end=row["Date End"].to_period("W").start_time,
            freq="7D"
        ),
        axis=1
    )
    # Creating Week End values
    df["Week End"] = df.apply(
        lambda row: pd.date_range(
            start=row["Date Start"].to_period("W").end_time,
            end=row["Date End"].to_period("W").end_time,
            freq="7D"
        ),
        axis=1
    )
    # Exploding the values, since the way they were created made them as a DatetimeIndex
    # field.
    df = df.explode(["Week Start", "Week End"])
    # explode can leave object-dtype columns, so convert back to datetime first;
    # normalize() drops the end-of-day time that .end_time puts on Week End
    df["Week End"] = pd.to_datetime(df["Week End"]).dt.normalize()
    df["Week Start"] = pd.to_datetime(df["Week Start"]).dt.normalize()
    # Using the helper function to calculate the working days
    df["Days Off (Working)"] = df.apply(get_work_days, axis=1)
    df["Total Off Hours"] = df["Days Off (Working)"] * df["Hours per Days Off (Working)"]
    return df[["Name", "Week Start", "Week End", "Days Off (Working)", "Hours per Days Off (Working)", "Total Off Hours"]]
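The week boundaries in `process_dataframe` come from `to_period("W")`, which by default yields a weekly period running from Monday to Sunday. As a small illustration (the names `day` and `week` exist only in this sketch), the snippet below shows why the week end carries an odd time component until it is reduced to a plain date.

```python
import pandas as pd

day = pd.Timestamp("2022-04-22")  # a Friday
week = day.to_period("W")         # weekly period ending on Sunday ("W-SUN")

week_start = week.start_time      # Monday at midnight
week_end = week.end_time          # Sunday, just before the following midnight

print(week_start)
print(week_end)
print(week_end.normalize())       # Sunday at midnight, time part dropped
```

Because `end_time` points at the very end of Sunday rather than at midnight, the answer strips the time before comparing dates.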
Edit
A quick fix to the get_work_days function. We need to check first whether the start date and the end date both fall within the week, and only then check the individual dates, so the new version looks like
def get_work_days(row: pd.Series) -> int:
    start = row["Date Start"]
    end = row["Date End"]
    week_start = row["Week Start"]
    week_end = row["Week End"]
    if week_start <= end <= week_end and week_start <= start <= week_end:
        bdays = len(pd.bdate_range(start, end))
    elif week_start <= start <= week_end:
        bdays = len(pd.bdate_range(start, week_end))
    elif week_start <= end <= week_end:
        bdays = len(pd.bdate_range(week_start, end))
    else:
        bdays = len(pd.bdate_range(week_start, week_end))
    return bdays
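To see why the order of the checks matters, consider a time off that starts and ends within the same week. The sketch below uses a made-up row (Tuesday 2022-04-19 to Thursday 2022-04-21, not taken from the question) and compares the fixed function with what the first branch of the earlier ordering would have returned.

```python
import pandas as pd

def get_work_days(row: pd.Series) -> int:
    start = row["Date Start"]
    end = row["Date End"]
    week_start = row["Week Start"]
    week_end = row["Week End"]
    # Both dates inside the week must be checked before the single-date cases.
    if week_start <= end <= week_end and week_start <= start <= week_end:
        bdays = len(pd.bdate_range(start, end))
    elif week_start <= start <= week_end:
        bdays = len(pd.bdate_range(start, week_end))
    elif week_start <= end <= week_end:
        bdays = len(pd.bdate_range(week_start, end))
    else:
        bdays = len(pd.bdate_range(week_start, week_end))
    return bdays

# Hypothetical row: time off from Tuesday to Thursday of a single week.
row = pd.Series({
    "Date Start": pd.Timestamp("2022-04-19"),
    "Date End": pd.Timestamp("2022-04-21"),
    "Week Start": pd.Timestamp("2022-04-18"),
    "Week End": pd.Timestamp("2022-04-24"),
})

fixed = get_work_days(row)
# The earlier ordering matched on the start date first and counted up to the week end.
earlier = len(pd.bdate_range(row["Date Start"], row["Week End"]))
print(fixed, earlier)
```

The fixed version counts 3 working days (Tuesday to Thursday), while the earlier ordering would have counted 4 because it ran through Friday.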
