I have a CSV file containing user-behavior data from a web page. Here is some sample data:
_time,dataCenter,customer,user,SID,ACT,
2021-11-25T13:45:42.139+0000,dc1,customer1,user1,sid1,open_page,
2021-11-25T13:45:50.139+0000,dc1,customer1,user1,sid1,create_form,
2021-11-25T13:46:51.139+0000,dc1,customer1,user1,sid1,save_form,
2021-11-25T13:50:50.139+0000,dc1,customer2,user2,sid2,open_page,
2021-11-25T13:51:20.139+0000,dc1,customer2,user2,sid2,open_form_detail,
2021-11-25T13:53:50.139+0000,dc1,customer2,user2,sid2,back_to_form_list,
2021-11-25T23:59:50.139+0000,dc3,customer3,user3,sid3,open_page,
2021-11-26T00:02:50.139+0000,dc3,customer3,user3,sid3,show_more,
......
......
......
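I want to apply the following transformations:
- Group ACT by dataCenter, customer, user and SID.
- Extract the date from the _time column and attach it to the groupby result.
This is the expected result: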
date,dataCenter,customer,user,SID,ACT,
2021-11-25,dc1,customer1,user1,sid1,"open_page,create_form,save_form",
2021-11-25,dc1,customer2,user2,sid2,"open_page,open_form_detail,back_to_form_list"
2021-11-25,dc3,customer3,user3,sid3,"open_page,show_more"
......
......
......
What I have tried:
df = df.groupby(['dataCenter', 'customer', 'user', 'SID'])['ACT'].apply(','.join)
But I am not sure how to add date as a column in the groupby result.
Could you point me in the right direction?
Thanks, Cherie
A uj5u.com community member replied:
IIUC:
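# Per group, keep the first timestamp and join the actions in order.
df = df.groupby(['dataCenter', 'customer', 'user', 'SID']).agg(
    date=('_time', 'first'), ACT=('ACT', ','.join)).reset_index()
# Reduce that first timestamp to its calendar date.
df['date'] = pd.to_datetime(df['date']).dt.date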
Output:
  dataCenter  customer   user   SID   date        ACT
0 dc1         customer1  user1  sid1  2021-11-25  open_page,create_form,save_form
1 dc1         customer2  user2  sid2  2021-11-25  open_page,open_form_detail,back_to_form_list
2 dc3         customer3  user3  sid3  2021-11-25  open_page,show_more
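For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. A few things in it are my own assumptions rather than part of the answer: the inline CSV_TEXT sample (a shortened copy of the question's rows, with the UTC offset written as +0000), the helper name group_actions, the utc=True argument, and the final column reordering to match the layout the question asked for.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the question's rows.
# Assumption: the UTC offset is written as "+0000".
CSV_TEXT = """_time,dataCenter,customer,user,SID,ACT
2021-11-25T13:45:42.139+0000,dc1,customer1,user1,sid1,open_page
2021-11-25T13:45:50.139+0000,dc1,customer1,user1,sid1,create_form
2021-11-25T13:46:51.139+0000,dc1,customer1,user1,sid1,save_form
2021-11-25T23:59:50.139+0000,dc3,customer3,user3,sid3,open_page
2021-11-26T00:02:50.139+0000,dc3,customer3,user3,sid3,show_more
"""

KEYS = ["dataCenter", "customer", "user", "SID"]


def group_actions(df):
    """Join ACT per (dataCenter, customer, user, SID) and attach the date
    of the first event in each group. Helper name is illustrative only."""
    out = (
        df.groupby(KEYS, sort=False)  # keep groups in order of appearance
        .agg(date=("_time", "first"), ACT=("ACT", ",".join))
        .reset_index()
    )
    # Parse the first timestamp of each group and keep only the calendar date.
    out["date"] = pd.to_datetime(out["date"], utc=True).dt.date
    # Reorder columns to the layout requested in the question.
    return out[["date"] + KEYS + ["ACT"]]


result = group_actions(pd.read_csv(io.StringIO(CSV_TEXT)))
print(result.to_csv(index=False))
```

Note that a session crossing midnight (sid3 here) stays on a single row dated by its first event, which is what the expected result in the question shows. If you would rather have one row per calendar day, you could derive the date column first and add it to the groupby keys instead.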
Please credit the source when reposting. Link to this article: https://www.uj5u.com/gongcheng/397689.html
