如果滿足所有關聯行的條件,我需要為每月與客戶關聯的所有行分配正確的值(合格或不合格)。
test_data = {'Client Id': [1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2],
'Client Name': ['Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
'Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',
'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',],
'Week': ['01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022',
'01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
'01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',
'02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022',
'01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022',
'01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
'01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',
'02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022'],
'Month': ['January', 'January', 'January', 'January',
"February", "February", "February", "February",
'January', 'January', 'January', 'January',
"February", "February", "February", "February"],
'Year': [2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022,
2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022],
'Payment Status': ["Pending", "Paid in Full", "Didn't Paid", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Pending"]}
test_df = pd.DataFrame(data=test_data)
資料:
Client Id Client Name Week Month Year Payment Status
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full
1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full
1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full
1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full
1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full
2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full
2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full
2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full
2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending
如果每個月與客戶關聯的每一行(周)都已全額支付,則每個月將合格分配給與客戶關聯的所有行(周)。即使 1 周沒有全額支付(3 周可以全額支付,但 1 周沒有支付或待定),那么所有行都分配給Not Qualified。
期望的輸出:
Client Id Client Name Week Month Year Payment Status Qualification
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid Not Qualified
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full Qualified
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending Not Qualified
我不知道如何實作這一點,但我認為回圈中的 value_counts 是:
for name, month in zip(list(test_df["Client Name"].unique()), list(test_df["Month"])):
print(test_df[(test_df["Client Name"] == name) & (test_df["Month"] == month)].value_counts(["Payment Status"]))
uj5u.com熱心網友回復:
關鍵是創建一個布爾掩碼: if Payment Statusis "Paid in full" then Trueelse False。現在您可以按Client Id, MonthANDYear檢查是否所有值都是True. 用于transform將結果廣播到組的每一行。最后,用True/False其各自的值替換。
通過向資料框添加新列is_paid來動態創建布爾掩碼:
df['Qualification'] = (
df.assign(is_paid=df['Payment Status'] == 'Paid in Full')
.groupby(['Client Id', 'Month', 'Year'])['is_paid']
.transform('all').replace({True: 'Qualified', False: 'Not Qualified'})
)
print(df)
# Output
Client Id Client Name Week Month Year Payment Status Qualification
0 1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
2 1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid Not Qualified
3 1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
4 1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full Qualified
5 1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full Qualified
6 1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full Qualified
7 1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full Qualified
8 2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
9 2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
10 2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
11 2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
12 2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full Not Qualified
13 2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full Not Qualified
14 2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full Not Qualified
15 2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending Not Qualified
uj5u.com熱心網友回復:
首先轉換Payment Status為布林值:
test_df['Paid'] = test_df['Payment Status'] == 'Paid in Full'
將它們分組Month并Client Id檢查Paid組中的所有值是True:
status = test_df[["Client Id", "Month", "Paid"]].groupby(["Month", "Client Id"]).all()
沒有轉換Paid回文本(Qualified或Not Qualified):
status['Paid'] = status['Paid'].map({True: 'Qualified', False:"Not Qualified"})
現在與原始表合并以獲得所需的結果:
output = pd.merge(test_df, a, on=['Client Id', 'Month']).drop(columns=['Paid_x', 'Paid_y'])
(我將在一秒鐘內添加每個部分的輸出)
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/454532.html
