I have a DataFrame containing employee timesheets, with total hours worked and overtime hours.
import pandas as pd

example = pd.DataFrame({
    'Employee': ["Alex", "Alex", "Alex", "Bob", "Peter", "Peter"],
    'date': ['2021-01-01', '2021-01-01', '2021-01-03', '2021-01-02', '2021-01-01', '2021-01-02'],
    'Total Hour': [1.5, 2.2, 7, 1, 3, 6],
    'Overtime': [1.5, 0, 1.2, 2.3, 1.7, 5],
})
print(example)
  Employee        date  Total Hour  Overtime
0     Alex  2021-01-01         1.5       1.5
1     Alex  2021-01-01         2.2       0.0
2     Alex  2021-01-03         7.0       1.2
3      Bob  2021-01-02         1.0       2.3
4    Peter  2021-01-01         3.0       1.7
5    Peter  2021-01-02         6.0       5.0
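I want to create a monthly DataFrame that has a column for every day of the month and fills in only the timesheet entries that are available, like this: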
                         2021-01-01  2021-01-02  2021-01-03  2021-01-04  2021-01-05  ...  2021-01-31
   Employee
0  Alex     Total Hour          3.7                     7.0
1  Alex     Overtime            1.5                     1.2
2  Bob      Total Hour                        1
3  Bob      Overtime                        2.3
4  Peter    Total Hour          3.0           6
5  Peter    Overtime            1.7           5
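That way, the table becomes more complete as employees enter their working hours.
I have tried to figure this out, but I think I am missing something very basic here.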
Answer from a uj5u.com user:
Use pivot_table, then stack level 0 to move the outer level of the column MultiIndex into the row MultiIndex:
result_df = example.pivot_table(
    index='Employee',
    columns='date',
    values=['Total Hour', 'Overtime'],
    aggfunc='sum'
).stack(level=0)
Or, equivalently, use groupby and sum, then stack and unstack to swap the column and row MultiIndex levels:
result_df = example.groupby(['Employee', 'date']).sum().stack().unstack(level=1)
result_df:
date                 2021-01-01  2021-01-02  2021-01-03
Employee
Alex     Overtime           1.5         NaN         1.2
         Total Hour         3.7         NaN         7.0
Bob      Overtime           NaN         2.3         NaN
         Total Hour         NaN         1.0         NaN
Peter    Overtime           1.7         5.0         NaN
         Total Hour         3.0         6.0         NaN
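With either approach, reindex can then be used to order level 1 so that each employee shows Overtime first and Total Hour second. After that, rename_axis and reset_index clean up the index and column labels: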
result_df = result_df.reindex(
    ['Overtime', 'Total Hour'], level=1
).rename_axis(
    index=['Employee', 'Hours'], columns=None
).reset_index()
result_df:
  Employee       Hours  2021-01-01  2021-01-02  2021-01-03
0     Alex    Overtime         1.5         NaN         1.2
1     Alex  Total Hour         3.7         NaN         7.0
2      Bob    Overtime         NaN         2.3         NaN
3      Bob  Total Hour         NaN         1.0         NaN
4    Peter    Overtime         1.7         5.0         NaN
5    Peter  Total Hour         3.0         6.0         NaN
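The result above only has columns for dates that appear in the data, while the question asks for every day of the month. Below is a minimal sketch of one way to add the missing days, assuming the month is January 2021 and that the date column holds 'YYYY-MM-DD' strings as in the example. It builds the full list of day labels with pd.date_range and reindexes the columns before resetting the index. The names all_days and monthly_df are illustrative and not part of the original answer.

```python
import pandas as pd

example = pd.DataFrame({
    'Employee': ["Alex", "Alex", "Alex", "Bob", "Peter", "Peter"],
    'date': ['2021-01-01', '2021-01-01', '2021-01-03',
             '2021-01-02', '2021-01-01', '2021-01-02'],
    'Total Hour': [1.5, 2.2, 7, 1, 3, 6],
    'Overtime': [1.5, 0, 1.2, 2.3, 1.7, 5],
})

# Same pivot as in the answer: one row per (Employee, hour type),
# one column per date that occurs in the data.
pivoted = example.pivot_table(
    index='Employee',
    columns='date',
    values=['Total Hour', 'Overtime'],
    aggfunc='sum'
).stack(level=0)

# Assumption: the target month is January 2021. The labels are formatted
# as strings so that they match the string dates in the 'date' column.
all_days = pd.date_range('2021-01-01', '2021-01-31', freq='D').strftime('%Y-%m-%d')

# Reindexing the columns adds an all-NaN column for every day without entries.
monthly_df = pivoted.reindex(
    columns=all_days
).rename_axis(
    index=['Employee', 'Hours'], columns=None
).reset_index()

print(monthly_df)
```

Days without entries show up as NaN. If blank cells are preferred for display, as in the desired output in the question, monthly_df.fillna('') is one option, at the cost of turning the numeric columns into object dtype.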
