I have a matrix that stores values, as shown in the table below:
| | play_tv | play_series | Null | purchase | Conversion |
|---|---|---|---|---|---|
| Start | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 |
| play_series | 0.07 | 0.08 | 0.09 | 0.10 | 0.11 |
| play_tv | 0.12 | 0.13 | 0.14 | 0.15 | 0.16 |
| Null | 0.17 | 0.18 | 0.19 | 0.20 | 0.21 |
| purchase | 0.22 | 0.23 | 0.24 | 0.25 | 0.26 |
| Conversion | 0.27 | 0.28 | 0.29 | 0.30 | 0.31 |
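I also have the following DataFrame:
| session_id | path | path_pair |
|---|---|---|
| T01 | [Start, play_series, Null] | [(Start, play_series), (play_series, Null)] |
| T02 | [Start, play_tv, purchase, Conversion] | [(Start, play_tv), (play_tv, purchase), (purchase, Conversion)] |
I want to look up values in the matrix and use them either to replace the path_pair column or to create a new column in my current DataFrame. Each pair is (row label, column label), and the result for each row should be a list of the selected values. How can I do this?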
[(Start, play_series), (play_series, Null)] -> [0.03, 0.09]
[(Start, play_tv), (play_tv, purchase), (purchase, Conversion)] -> [0.02, 0.15, 0.26]
The result I want:
| session_id | path | path_pair |
|---|---|---|
| T01 | [Start, play_series, Null] | [0.03, 0.09] |
| T02 | [Start, play_tv, purchase, Conversion] | [0.02, 0.15, 0.26] |
The script I tried for getting a single value from the matrix:
trans_matrix[trans_matrix.index=="Start"]["play_series"].values[0]
Answer from a uj5u.com user:
Given your input:
import pandas as pd

# Transition matrix: the index holds the "from" state, the columns the "to" state
df1 = pd.DataFrame({'play_tv': [0.02, 0.07, 0.12, 0.17, 0.22, 0.27],
                    'play_series': [0.03, 0.08, 0.13, 0.18, 0.23, 0.28],
                    'Null': [0.04, 0.09, 0.14, 0.19, 0.24, 0.29],
                    'purchase': [0.05, 0.1, 0.15, 0.2, 0.25, 0.3],
                    'Conversion': [0.06, 0.11, 0.16, 0.21, 0.26, 0.31]},
                   index=['Start', 'play_series', 'play_tv', 'Null', 'purchase', 'Conversion'])
# Sessions with their paths and the consecutive (from, to) pairs along each path
df2 = pd.DataFrame({'session_id': ['T01', 'T02'],
                    'path': [['Start', 'play_series', 'Null'],
                             ['Start', 'play_tv', 'purchase', 'Conversion']],
                    'path_pair': [[('Start', 'play_series'), ('play_series', 'Null')],
                                  [('Start', 'play_tv'), ('play_tv', 'purchase'), ('purchase', 'Conversion')]]})
You can update df2 by applying a function to the 'path_pair' column that looks up each (row, column) pair in df1:
df2['path_pair'] = df2['path_pair'].apply(lambda lst: [df1.loc[x,y] for (x,y) in lst])
Output:
session_id path path_pair
0 T01 [Start, play_series, Null] [0.03, 0.09]
1 T02 [Start, play_tv, purchase, Conversion] [0.02, 0.15, 0.26]
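The question also mentions creating a new column instead of replacing path_pair. A minimal sketch of that variant follows. The helper name `lookup_pairs` and the column name `pair_values` are hypothetical and not part of the original answer. The choice to return NaN for a pair whose label is missing from the matrix is an assumption added here; the one-liner above would raise a KeyError in that case.

```python
import pandas as pd

# Transition matrix from the question: index = "from" state, columns = "to" state.
df1 = pd.DataFrame(
    [[0.02, 0.03, 0.04, 0.05, 0.06],
     [0.07, 0.08, 0.09, 0.10, 0.11],
     [0.12, 0.13, 0.14, 0.15, 0.16],
     [0.17, 0.18, 0.19, 0.20, 0.21],
     [0.22, 0.23, 0.24, 0.25, 0.26],
     [0.27, 0.28, 0.29, 0.30, 0.31]],
    index=['Start', 'play_series', 'play_tv', 'Null', 'purchase', 'Conversion'],
    columns=['play_tv', 'play_series', 'Null', 'purchase', 'Conversion'])

df2 = pd.DataFrame({
    'session_id': ['T01', 'T02'],
    'path_pair': [[('Start', 'play_series'), ('play_series', 'Null')],
                  [('Start', 'play_tv'), ('play_tv', 'purchase'), ('purchase', 'Conversion')]],
})


def lookup_pairs(pairs, matrix):
    """Hypothetical helper: map each (row, column) pair to its matrix value.

    Assumption: a pair whose row or column label is absent yields NaN
    rather than raising KeyError.
    """
    values = []
    for src, dst in pairs:
        if src in matrix.index and dst in matrix.columns:
            # .at is a scalar accessor for a single row/column label pair
            values.append(float(matrix.at[src, dst]))
        else:
            values.append(float('nan'))
    return values


# Write the looked-up values to a new column; 'path_pair' is left untouched.
df2['pair_values'] = df2['path_pair'].apply(lambda pairs: lookup_pairs(pairs, df1))
print(df2[['session_id', 'pair_values']])
```

Keeping path_pair intact makes it easy to check each value against the pair it came from. If unknown labels should be treated as errors, drop the membership check and let the lookup raise.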
