Matching columns in a DataFrame and unpacking a list

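I have the following DataFrame, in which I am trying to match account codes. Assume the columns Account_Spread_v2 and Account_Codes_v2 have already been merged into the DataFrame. The idea is to match the Account_Codes_v2 column against Account_Codes. See the function below that I apply to do this.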

import numpy as np
import pandas as pd

df = pd.DataFrame([[31,1234567890,'USD',3.5,'D12',3.5,'D3'],
                    [10,7854567890,'USD',2.7,'TT',2.7,'TT'],
                    [10,7854567899,'AUS',8,'D1',8,'D1'],
                    [6,7854567893,'USD',2.7,'D55',2.7,'H1'],
                    [10,7854567893,'EUR',2.7,'JG',2.7,'JG'],
                    [31,9632587415,'USD',1.4,'D55',1.4,'D2']],
columns = ['branch','Account','Cur','Account_Spread','Account_Codes','Account_Spread_v2','Account_Codes_v2'])

Output:

   branch     Account  Cur  Account_Spread  Account_Codes  Account_Spread_v2  Account_Codes_v2
0      31  1234567890  USD             3.5            D12                3.5                D3
1      10  7854567890  USD             2.7             TT                2.7                TT
2      10  7854567899  AUS             8.0             D1                8.0                D1
3       6  7854567893  USD             2.7            D55                2.7                H1
4      10  7854567893  EUR             2.7             JG                2.7                JG
5      31  9632587415  USD             1.4            D55                1.4                D2

Function:

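def compute_match_codes(row):
    codes = ['D1','D2','D4','D3']  # not used yet; intended for the substitution rule described below
    if row['Account_Codes'] == row['Account_Codes_v2']:
        m = 'MatchOnCodes'
    else:
        m = 'MismatchOnCodes'
    return m

# The function returns a scalar per row, so apply() yields a Series and concat names the new column 0
df = pd.concat([df, df.apply(compute_match_codes, axis=1, result_type='expand')], axis=1)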

   branch     Account  Cur  Account_Spread  Account_Codes  Account_Spread_v2  Account_Codes_v2                0
0      31  1234567890  USD             3.5            D12                3.5                D3  MismatchOnCodes
1      10  7854567890  USD             2.7             TT                2.7                TT     MatchOnCodes
2      10  7854567899  AUS             8.0             D1                8.0                D1     MatchOnCodes
3       6  7854567893  USD             2.7            D55                2.7                H1  MismatchOnCodes
4      10  7854567893  EUR             2.7             JG                2.7                JG     MatchOnCodes
5      31  9632587415  USD             1.4            D55                1.4                D2  MismatchOnCodes

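The challenge I am facing is this: if an account is in USD, belongs to branch 31, and its code in the "Account_Codes" column is D12 or D55, then it may be substituted by any of the codes in the list named "codes".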

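With this rule applied, rows 0 and 5 would actually count as matches. I tried using the isin() method, but it did not work. Any ideas on how to edit the function to accommodate this?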

Reply from a uj5u.com user:

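I would first use a nested np.where() to take care of all the exact matches, and then handle the more complex logic you need. I believe this is also a faster solution, because it is vectorized rather than relying on apply with concat and a custom function. The code looks like this: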

import numpy as np

codes = ['D1','D2','D3','D4']
# Exact matches first, then the USD / branch 31 substitution rule
df['Match'] = np.where(df['Account_Codes'] == df['Account_Codes_v2'],'MatchOnCodes',
                       np.where((df['Cur'] == 'USD') & (df['branch'] == 31) & (df['Account_Codes'].isin(['D12','D55'])) & (df['Account_Codes_v2'].isin(codes)),'MatchOnCodes','NoMatchOnCodes'))

This outputs:

   branch     Account  Cur  ...  Account_Spread_v2 Account_Codes_v2           Match
0      31  1234567890  USD  ...                3.5               D3    MatchOnCodes
1      10  7854567890  USD  ...                2.7               TT    MatchOnCodes
2      10  7854567899  AUS  ...                8.0               D1    MatchOnCodes
3       6  7854567893  USD  ...                2.7               H1  NoMatchOnCodes
4      10  7854567893  EUR  ...                2.7               JG    MatchOnCodes
5      31  9632587415  USD  ...                1.4               D2    MatchOnCodes
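
As a variation on the nested np.where() above, the two conditions can be given names and combined with |, since both lead to the same label. This is only a sketch that rebuilds the sample data from the question; the mask names exact and substitute are made up here for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[31, 1234567890, 'USD', 3.5, 'D12', 3.5, 'D3'],
     [10, 7854567890, 'USD', 2.7, 'TT', 2.7, 'TT'],
     [10, 7854567899, 'AUS', 8, 'D1', 8, 'D1'],
     [6, 7854567893, 'USD', 2.7, 'D55', 2.7, 'H1'],
     [10, 7854567893, 'EUR', 2.7, 'JG', 2.7, 'JG'],
     [31, 9632587415, 'USD', 1.4, 'D55', 1.4, 'D2']],
    columns=['branch', 'Account', 'Cur', 'Account_Spread', 'Account_Codes',
             'Account_Spread_v2', 'Account_Codes_v2'])

codes = ['D1', 'D2', 'D3', 'D4']

# Rule 1: the two code columns are identical (hypothetical name: exact)
exact = df['Account_Codes'] == df['Account_Codes_v2']

# Rule 2: USD account in branch 31 whose code is D12 or D55 may be
# replaced by any entry of `codes` (hypothetical name: substitute)
substitute = (
    (df['Cur'] == 'USD')
    & (df['branch'] == 31)
    & df['Account_Codes'].isin(['D12', 'D55'])
    & df['Account_Codes_v2'].isin(codes)
)

# Either rule counts as a match, so one np.where is enough
df['Match'] = np.where(exact | substitute, 'MatchOnCodes', 'NoMatchOnCodes')
print(df[['Account_Codes', 'Account_Codes_v2', 'Match']])
```

Keeping each rule in its own variable makes it easier to inspect which rows satisfy which rule, and a further rule would just be one more mask joined with |.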

Per the OP's comment:

codes = ['D1','D2','D3','D4']
def matching_func(row):
  # Inside apply(axis=1) every value is a scalar, so use `and` / `in` rather than `&` / isin()
  if row['Account_Codes'] == row['Account_Codes_v2']:
    return 'MatchOnCodes'
  elif (row['Cur'] == 'USD') and (row['branch'] == 31) and (row['Account_Codes'] in ['D12','D55']) and (row['Account_Codes_v2'] in codes):
    return 'MatchOnCodes'
  else:
    return 'NoMatchOnCodes'
df['Match'] = df.apply(matching_func, axis=1)

Output:

   branch     Account  Cur  ...  Account_Spread_v2 Account_Codes_v2           Match
0      31  1234567890  USD  ...                3.5               D3    MatchOnCodes
1      10  7854567890  USD  ...                2.7               TT    MatchOnCodes
2      10  7854567899  AUS  ...                8.0               D1    MatchOnCodes
3       6  7854567893  USD  ...                2.7               H1  NoMatchOnCodes
4      10  7854567893  EUR  ...                2.7               JG    MatchOnCodes
5      31  9632587415  USD  ...                1.4               D2    MatchOnCodes
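
For what it's worth, a likely reason isin() did not work inside the original function is that with axis=1 each row['Account_Codes'] is a plain Python string, which has no isin method; the in operator is the scalar equivalent. The sketch below rebuilds the sample data and checks that the row-wise function and the vectorized version give the same labels. The names row_wise, vectorized and agree are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[31, 1234567890, 'USD', 3.5, 'D12', 3.5, 'D3'],
     [10, 7854567890, 'USD', 2.7, 'TT', 2.7, 'TT'],
     [10, 7854567899, 'AUS', 8, 'D1', 8, 'D1'],
     [6, 7854567893, 'USD', 2.7, 'D55', 2.7, 'H1'],
     [10, 7854567893, 'EUR', 2.7, 'JG', 2.7, 'JG'],
     [31, 9632587415, 'USD', 1.4, 'D55', 1.4, 'D2']],
    columns=['branch', 'Account', 'Cur', 'Account_Spread', 'Account_Codes',
             'Account_Spread_v2', 'Account_Codes_v2'])

codes = ['D1', 'D2', 'D3', 'D4']

def matching_func(row):
    # Scalar checks: `in` plays the role that isin() plays on a whole column
    if row['Account_Codes'] == row['Account_Codes_v2']:
        return 'MatchOnCodes'
    if (row['Cur'] == 'USD' and row['branch'] == 31
            and row['Account_Codes'] in ['D12', 'D55']
            and row['Account_Codes_v2'] in codes):
        return 'MatchOnCodes'
    return 'NoMatchOnCodes'

# Row-wise version: one Python function call per row
row_wise = df.apply(matching_func, axis=1)

# Vectorized version: whole-column comparisons, as in the np.where answer
vectorized = np.where(
    (df['Account_Codes'] == df['Account_Codes_v2'])
    | ((df['Cur'] == 'USD') & (df['branch'] == 31)
       & df['Account_Codes'].isin(['D12', 'D55'])
       & df['Account_Codes_v2'].isin(codes)),
    'MatchOnCodes', 'NoMatchOnCodes')

# Compare the two label sequences element by element
agree = list(row_wise) == list(vectorized)
print(agree)
```

On a frame this small either version is fine; the vectorized form is generally the one to prefer as the number of rows grows.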


Tags: Python, pandas, dataframe, numpy
