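Apologies if this has already been asked. I'm trying to set up a small A/B test. My records fall into 3 categories: Low Intent, Medium Intent, High Intent. Within each of the 3 categories, I want to randomly assign 50% of the records to a control group and the other 50% to a treatment group, recorded in a new column.
Sample data: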
|ID|Buyer Intent |Email |
|:--:|:-----------:|:-------------|
|1 |Low Intent |[email protected] |
|2 |Medium Intent|[email protected] |
|3 |Medium Intent|[email protected] |
|4 |Low Intent |[email protected] |
|5 |High Intent |[email protected] |
|6 |High Intent |[email protected] |
Desired data:
|ID|Buyer Intent |Email |Group |
|:--|:-----------:|:--------------:|:----------:|
|1 |Low Intent |[email protected] |Control |
|2 |Medium Intent|[email protected] |Treatment |
|3 |Medium Intent|[email protected] |Control |
|4 |Low Intent |[email protected] |Treatment |
|5 |High Intent |[email protected] |Treatment |
|6 |High Intent |[email protected] |Control |
A uj5u.com user replied:
Use groupby.sample to select 50% of the records in each group, then assign the labels with np.where:
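import numpy as np
# index labels of a random 50% of the rows in each 'Buyer Intent' group
control = df.groupby('Buyer Intent').sample(frac=0.5).index
# sampled rows become 'Control', all other rows 'Treatment'
df['Group'] = np.where(df.index.isin(control), 'Control', 'Treatment')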
# ID Buyer Intent Email Group
# 0 1 Low Intent [email protected] Control
# 1 2 Medium Intent [email protected] Control
# 2 3 Medium Intent [email protected] Treatment
# 3 4 Low Intent [email protected] Treatment
# 4 5 High Intent [email protected] Control
# 5 6 High Intent [email protected] Treatment
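For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question without the Email column (an assumption made for brevity, since that column plays no part in the assignment), and it passes `random_state=42` purely so the draw is repeatable; neither choice comes from the original answer. It needs pandas 1.1.0 or newer.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the Email column is left out on purpose.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Buyer Intent": ["Low Intent", "Medium Intent", "Medium Intent",
                     "Low Intent", "High Intent", "High Intent"],
})

# random_state is an assumption added for repeatability; drop it for a fresh draw.
control = df.groupby("Buyer Intent").sample(frac=0.5, random_state=42).index
df["Group"] = np.where(df.index.isin(control), "Control", "Treatment")

# Rows per category and group; with two rows per category every cell should be 1.
balance = pd.crosstab(df["Buyer Intent"], df["Group"])
print(balance)
```

The crosstab is only a sanity check: it makes it easy to see at a glance that each intent category was split evenly.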
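Note that groupby.sample already randomizes. From the pandas documentation:
"Return a random sample of items from each group."
To shuffle explicitly anyway, you can first call DataFrame.sample with frac=1: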
# shuffle df
df = df.sample(frac=1)
# same as before
control = df.groupby('Buyer Intent').sample(frac=0.5).index
df['Group'] = np.where(df.index.isin(control), 'Control', 'Treatment')
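If the explicit shuffle is being done anyway, a different technique is to alternate labels within each group in shuffled order. This is not what the answer above does; it is a sketch swapped in for illustration, using `groupby.cumcount`, which also works on pandas versions older than 1.1.0. The seven-row frame (with an odd-sized High Intent group) and the `random_state=0` seed are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: like the question's sample, plus one extra High Intent row.
df = pd.DataFrame({
    "ID": range(1, 8),
    "Buyer Intent": ["Low Intent", "Medium Intent", "Medium Intent",
                     "Low Intent", "High Intent", "High Intent", "High Intent"],
})

# Shuffle the rows (seeded so the run is repeatable); the original index is kept.
shuffled = df.sample(frac=1, random_state=0)

# Position of each row within its group, counted in shuffled order: 0, 1, 2, ...
position = shuffled.groupby("Buyer Intent").cumcount()

# Even positions go to Control, odd positions to Treatment.
shuffled = shuffled.assign(
    Group=np.where(position % 2 == 0, "Control", "Treatment")
)

# Restore the original row order.
result = shuffled.sort_index()
print(result)
```

With this approach the two group sizes within a category never differ by more than one row, and for an odd-sized category the extra row always lands in Control.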
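If you don't have groupby.sample (pandas < 1.1.0), try one of the following.
groupby.apply with DataFrame.sample (group_keys=False keeps the original index, so .index can be passed to isin):
control = df.groupby('Buyer Intent', group_keys=False).apply(lambda g: g.sample(frac=0.5)).index
df['Group'] = np.where(df.index.isin(control), 'Control', 'Treatment')
Or groupby.apply with np.random.choice (replace=False so the same row is not drawn twice; np.concatenate flattens the per-group arrays):
control = np.concatenate(df.groupby('Buyer Intent').apply(lambda g: np.random.choice(g.index, len(g) // 2, replace=False)).values)
df['Group'] = np.where(df.index.isin(control), 'Control', 'Treatment')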
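A self-contained sketch of the np.random.choice route follows. As a variation, it loops over the groups with a list comprehension instead of groupby.apply, so it does not depend on how apply packs array results. The Email column is omitted and `np.random.seed(0)` is set only for repeatability; both are assumptions for the example. Note that np.random.choice samples with replacement unless `replace=False` is passed.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the Email column is left out on purpose.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Buyer Intent": ["Low Intent", "Medium Intent", "Medium Intent",
                     "Low Intent", "High Intent", "High Intent"],
})

np.random.seed(0)  # assumption: seeded only so the run is repeatable

# Draw half of each group's index labels, without replacement.
picks = [
    np.random.choice(g.index, size=len(g) // 2, replace=False)
    for _, g in df.groupby("Buyer Intent")
]
control = np.concatenate(picks)

df["Group"] = np.where(df.index.isin(control), "Control", "Treatment")
print(df)
```

Because `len(g) // 2` rounds down, an odd-sized group would send its extra row to Treatment.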
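A uj5u.com user replied:
To select 50%, you need something that returns a random sample of items from each group; that is groupby.sample.
Next, you need something that assigns a value to each row depending on a condition; that is np.where.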
