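I have a large DataFrame containing sales opportunities. These opps change stage several times during their lifecycle, and we can see what each change was and when it was made. The possible stages are: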
Closed Won
Closed No Deal
Propose
Negotiate
Qualify
Closed Lost
Invalid
Identify
Implemented
Close Lost
Close Won
Close No Deal
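I also know the OldValue an opp had before each change. What I need is a separate DataFrame that gives the median time an opportunity stays in each stage. I have searched the web and SO but could not find a solution that fits my case.
Here is some sample data: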
reatedDate OpportunityId OldValue NewValue
2020-05-05T12:04:32.000Z 0060N00000TbLneQAF Propose Qualify
2020-07-06T08:44:08.000Z 0060N00000TbLneQAF Qualify Identify
2020-08-05T08:59:45.000Z 0060N00000TbLneQAF Identify Qualify
2020-08-05T12:02:59.000Z 0060N00000TbLneQAF Qualify Propose
2020-09-22T06:47:16.000Z 0060N00000TbLneQAF Propose Qualify
2020-10-08T15:33:29.000Z 0060N00000TbLneQAF Qualify Identify
2020-10-08T15:40:21.000Z 0060N00000TbLneQAF Identify Closed No Deal
2021-07-29T07:57:28.000Z 0060N00000TbLohQAF Identify Closed No Deal
2021-10-17T03:07:24.000Z 0060N00000TbLtwQAF Qualify Closed No Deal
2021-07-27T13:57:34.000Z 0060N00000TbMhkQAF Identify Closed No Deal
2020-04-22T13:35:30.000Z 0060N00000TbMkjQAF Negotiate Closed Lost
2020-09-25T09:37:32.000Z 0060N00000TbN8qQAF Qualify Propose
2020-09-25T09:37:41.000Z 0060N00000TbN8qQAF Propose Negotiate
2021-09-06T14:31:05.000Z 0060N00000TbN8qQAF Negotiate Propose
2021-11-03T11:09:56.000Z 0060N00000TbNF8QAN Identify Qualify
2020-04-29T15:43:58.000Z 0060N00000TbNFSQA3 Identify Invalid
2021-01-07T09:35:56.000Z 0060N00000TbNUDQA3 Identify Closed No Deal
2020-12-03T08:53:12.000Z 0060N00000TbNUSQA3 Qualify Identify
2021-09-08T09:41:54.000Z 0060N00000TbNUSQA3 Identify Closed Lost
2021-04-14T07:31:49.000Z 0060N00000TbNg4QAF Identify Closed No Deal
2020-04-27T12:19:51.000Z 0060N00000TbNwCQAV Qualify Identify
2020-05-04T11:15:00.000Z 0060N00000TbNxPQAV Identify Closed No Deal
2021-05-24T03:13:10.000Z 0060N00000TbNywQAF Qualify Closed No Deal
2021-05-28T14:51:32.000Z 0060N00000TbO3SQAV Identify Invalid
2021-07-27T13:25:50.000Z 0060N00000TbOBlQAN Identify Closed No Deal
2021-07-27T13:25:50.000Z 0060N00000TbOCeQAN Identify Closed No Deal
2021-07-27T13:25:50.000Z 0060N00000TbOELQA3 Identify Closed No Deal
2020-04-28T15:12:53.000Z 0060N00000TbOIrQAN Qualify Negotiate
2020-05-18T14:11:18.000Z 0060N00000TbOIrQAN Negotiate Closed Won
2021-07-27T13:22:09.000Z 0060N00000TbOJzQAN Identify Closed No Deal
2021-07-27T13:22:09.000Z 0060N00000TbOLbQAN Identify Closed No Deal
2020-08-13T05:04:36.000Z 0060N00000TbOOzQAN Propose Identify
2020-09-06T15:36:36.000Z 0060N00000TbOOzQAN Identify Invalid
2021-05-27T14:22:10.000Z 0060N00000TbOWKQA3 Qualify Identify
2021-05-27T14:22:27.000Z 0060N00000TbOWKQA3 Identify Closed Lost
2020-04-27T12:25:52.000Z 0060N00000TbOX3QAN Qualify Identify
2021-01-08T15:27:33.000Z 0060N00000TbOX3QAN Identify Qualify
2020-04-13T10:53:57.000Z 0060N00000TbOY6QAN Qualify Identify
2020-12-03T10:38:35.000Z 0060N00000TbOY6QAN Identify Closed Lost
2020-04-13T10:54:57.000Z 0060N00000TbOYaQAN Qualify Identify
2020-12-01T10:50:41.000Z 0060N00000TbOYaQAN Identify Closed No Deal
2021-07-27T13:57:34.000Z 0060N00000TbOb5QAF Identify Closed No Deal
2021-07-27T13:57:34.000Z 0060N00000TbOeYQAV Identify Closed No Deal
2020-05-29T12:28:44.000Z 0060N00000TbOgyQAF Identify Qualify
2020-12-18T07:34:18.000Z 0060N00000TbOgyQAF Qualify Identify
2020-12-18T07:34:43.000Z 0060N00000TbOgyQAF Identify Invalid
2021-07-27T13:22:09.000Z 0060N00000TbOhSQAV Identify Closed No Deal
2020-04-15T11:30:09.000Z 0060N00000TbOk7QAF Identify Invalid
2020-08-26T03:16:46.000Z 0060N00000TbOnVQAV Qualify Closed No Deal
2020-04-03T13:08:23.000Z 0060N00000TbOy4QAF Identify Closed Lost
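So far I have tried using the group_by function and then taking the median, but I am stuck, because I need to look at the date an opp was changed and compare it with the date of its next change to get the time spent in the new value. I am looking for a result like this: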
Stage Average Time
Closed Won x days
Closed No Deal x days
Propose x days
Negotiate x days
Qualify x days
Closed Lost x days
Invalid x days
Identify x days
Implemented x days
Close Lost x days
Close Won x days
Close No Deal x days
I hope this makes sense. I am a newbie, by the way, so thanks for your help!
Answer from a uj5u.com user:
This works for me, but you did not mention which OldValue represents the creation of the opp.
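import pandas as pd
df['reatedDate'] = pd.to_datetime(df['reatedDate'])
# Time elapsed since the previous change of the same opportunity
df['difference'] = df.groupby('OpportunityId').reatedDate.diff()
# Label each transition, e.g. "Propose - Qualify"
df['aux'] = df['OldValue'] + ' - ' + df['NewValue']
df['days_diff'] = df['difference'].dt.days
df.groupby('aux')['days_diff'].mean()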
Output:
aux days_difference
Identify - Closed Lost 170.666667
Identify - Closed No Deal 115.500000
Identify - Invalid 12.000000
Identify - Qualify 143.000000
Negotiate - Closed Lost NaN
Negotiate - Closed Won 19.000000
Negotiate - Propose 346.000000
Propose - Identify NaN
Propose - Negotiate 0.000000
Propose - Qualify 47.000000
Qualify - Closed No Deal NaN
Qualify - Identify 93.000000
Qualify - Negotiate NaN
Qualify - Propose 0.000000
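The NaN rows in this output come from the first recorded change of each opportunity: the grouped diff() has no earlier row to subtract from, so it returns NaT, and a transition that only ever appears as a first change averages to NaN. A minimal sketch on three rows taken from the sample data (the frame name demo is only for illustration):

```python
import pandas as pd

# Three rows from the sample data: one opp with two changes, one with a single change
demo = pd.DataFrame({
    "reatedDate": pd.to_datetime([
        "2020-04-28T15:12:53.000Z",
        "2020-05-18T14:11:18.000Z",
        "2020-04-22T13:35:30.000Z",
    ]),
    "OpportunityId": ["0060N00000TbOIrQAN", "0060N00000TbOIrQAN", "0060N00000TbMkjQAF"],
    "OldValue": ["Qualify", "Negotiate", "Negotiate"],
    "NewValue": ["Negotiate", "Closed Won", "Closed Lost"],
})

# The first row of every opportunity has no predecessor, so its difference is NaT
demo["difference"] = demo.groupby("OpportunityId")["reatedDate"].diff()
# .dt.days keeps only whole days and turns NaT into NaN
demo["days_diff"] = demo["difference"].dt.days
print(demo[["OldValue", "NewValue", "days_diff"]])
```

Only the Negotiate to Closed Won row gets a value here; the other two rows are first changes and stay NaN.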
Answer from a uj5u.com user:
I think one possible solution is to first sort the DataFrame by OpportunityId and reatedDate:
sorted_df = df.sort_values(by=['OpportunityId','reatedDate'])
Then you can take the row-wise difference:
sorted_df['time_diff'] = sorted_df.reatedDate.diff()
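Note that diff() only yields time deltas if reatedDate holds real datetimes; if the column was read in as strings like those in the sample, the subtraction will fail. A small sketch of the conversion, assuming the timestamps are ISO 8601 strings as in the question:

```python
import pandas as pd

# Two changes of one opportunity from the sample data, still stored as strings
sorted_df = pd.DataFrame({
    "OpportunityId": ["0060N00000TbOWKQA3", "0060N00000TbOWKQA3"],
    "reatedDate": ["2021-05-27T14:22:10.000Z", "2021-05-27T14:22:27.000Z"],
})

# Parse the strings into timezone-aware datetimes before subtracting
sorted_df["reatedDate"] = pd.to_datetime(sorted_df["reatedDate"], utc=True)
sorted_df["time_diff"] = sorted_df["reatedDate"].diff()
print(sorted_df["time_diff"])
```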
You also need to identify the rows where the ID changes, since those must be excluded from the statistics:
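# OpportunityId is a string, so compare with the previous row instead of using diff()
sorted_df['id_diff'] = sorted_df.OpportunityId != sorted_df.OpportunityId.shift()
filtered_df = sorted_df[~sorted_df.id_diff]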
Then you should be able to compute your statistics:
filtered_df.groupby(['OldValue'])[['time_diff']].median()
If you only need whole days, you can append .dt.days to the time differences.
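Putting the steps above together, here is one possible end-to-end sketch on a few rows of the sample data. It assumes that the time difference on a row is the time spent in that row's OldValue, drops the first change of every opportunity (the creation date is not in the data), and reports fractional days via total_seconds() rather than whole days. The names events and median_days are illustrative, not from the question.

```python
import pandas as pd

# A few rows copied from the sample data
rows = [
    ("2020-05-05T12:04:32.000Z", "0060N00000TbLneQAF", "Propose", "Qualify"),
    ("2020-07-06T08:44:08.000Z", "0060N00000TbLneQAF", "Qualify", "Identify"),
    ("2020-08-05T08:59:45.000Z", "0060N00000TbLneQAF", "Identify", "Qualify"),
    ("2020-08-05T12:02:59.000Z", "0060N00000TbLneQAF", "Qualify", "Propose"),
    ("2020-04-28T15:12:53.000Z", "0060N00000TbOIrQAN", "Qualify", "Negotiate"),
    ("2020-05-18T14:11:18.000Z", "0060N00000TbOIrQAN", "Negotiate", "Closed Won"),
]
events = pd.DataFrame(rows, columns=["reatedDate", "OpportunityId", "OldValue", "NewValue"])
events["reatedDate"] = pd.to_datetime(events["reatedDate"], utc=True)

# Order the changes chronologically within each opportunity
events = events.sort_values(["OpportunityId", "reatedDate"])

# Time since the previous change of the same opp = time spent in OldValue
events["time_diff"] = events.groupby("OpportunityId")["reatedDate"].diff()

# The first change of each opp has no known start date, so leave it out
events = events.dropna(subset=["time_diff"])

# Convert to fractional days and take the median per stage
events["days"] = events["time_diff"].dt.total_seconds() / 86400
median_days = events.groupby("OldValue")["days"].median()
print(median_days.round(2))
```

Stages that an opp never leaves (such as the Closed ones) do not appear as an OldValue, so they get no duration with this approach.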
