pandas nested if and logic not returning the correct result

I'm not sure what is wrong with my process.

Here is a sample df:

import pandas as pd

df = pd.DataFrame({'customer':['A','B','C','D','E','F'],
                   'Traveled':[1,1,1,0,1,0],
                   'Travel_count':[2,3,5,0,1,0],
                   'country1':['UK','Italy','CA', '0','UK','0'],
                   'country2':['JP','IN','CO','0','EG','0'],
                   'shopping':['High','High','High','High','Medium','Medium']
                   })

This gives:

  customer  Traveled  Travel_count country1 country2 shopping
0        A         1             2       UK       JP     High
1        B         1             3    Italy       IN     High
2        C         1             5       CA       CO     High
3        D         0             0        0        0     High
4        E         1             1       UK       EG   Medium
5        F         0             0        0        0   Medium

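I want to create some functions that filter automatically and then build a customized df. Here are two functions meant to check customers on the columns Traveled == 1 and shopping == High: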

def travel():
    if (df['Traveled'] == 1):
        return True
    else:
        return False

def shop_high():
    if (df['shopping'] == 'High'):
        return True
    else:
        return False

Here is the nested-if code. If the conditions above are true, it checks which customers traveled more or fewer than 3 times:

def select(df):
    if(travel and shop_high):
        if (df['Travel_count'] > 3):
            return (df['customer'], df['shopping'], ('Customer {} traveled more than 3 times').format(df['customer']))
        elif (df['Travel_count'] < 3):
            return (df['customer'], df['shopping'], ('Customer {} traveled less than 3 times').format(df['customer']))

If I apply this function to the original df to filter automatically and check the travel count, I get the wrong result:

pd.DataFrame(list(df.apply(select, axis = 1).dropna()))

Result:

   0       1                                      2
0  A    High  Customer A traveled less than 3 times
1  C    High  Customer C traveled more than 3 times
2  D    High  Customer D traveled less than 3 times
3  E  Medium  Customer E traveled less than 3 times
4  F  Medium  Customer F traveled less than 3 times

Expected:

   0       1                                      2
0  A    High  Customer A traveled less than 3 times
1  C    High  Customer C traveled more than 3 times

A uj5u.com user replied:

I would use boolean indexing and numpy.sign:

import numpy as np

travel = (np.sign(df['Travel_count'].sub(3))
            .map({1: ' traveled more than 3 times',
                  -1: ' traveled less than 3 times'})
         )

m1 = df['Traveled'].eq(1)
m2 = df['shopping'].eq('High')
m3 = travel.notna()

out = (df.loc[m1&m2&m3, ['customer', 'shopping']]
         .assign(new='Customer ' + df['customer'] + travel)
      )

Output:

  customer shopping                                    new
0        A     High  Customer A traveled less than 3 times
2        C     High  Customer C traveled more than 3 times
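
To see why this answer drops customer B, it may help to look at the intermediate values. The following is only a minimal sketch that rebuilds the Travel_count column from the question as a standalone Series (the names counts, signs and labels are introduced here for illustration): np.sign returns 0 when the count is exactly 3, and since 0 has no key in the mapping, map leaves NaN there, which is what the m3 mask filters out.

```python
import numpy as np
import pandas as pd

# Same Travel_count values as the sample df in the question
counts = pd.Series([2, 3, 5, 0, 1, 0], name='Travel_count')

# sign(count - 3): -1 below 3, 0 at exactly 3, +1 above 3
signs = np.sign(counts.sub(3))

# 0 is not a key of the mapping, so that row becomes NaN
labels = signs.map({1: ' traveled more than 3 times',
                    -1: ' traveled less than 3 times'})

print(signs.tolist())           # [-1, 0, 1, -1, -1, -1]
print(labels.notna().tolist())  # [True, False, True, True, True, True]
```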

A uj5u.com user replied:

Use isin:

new_df = (df[df[['Traveled', 'shopping']].isin(['High', 1]).all(axis=1)
             & df['Travel_count'].ne(3)].reset_index(drop=True))
new_df['new'] = ('Customer ' + new_df['customer'] + ' traveled ' +
                 pd.Series(np.where(new_df['Travel_count'].lt(3), 'less', 'more')) +
                 ' than 3 times')
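
One thing to be aware of: passing a flat list to isin tests every cell against the whole list, so a 1 in the shopping column would also count as a match. A possible variation, sketched here under the assumption that each column should only be compared with its own value, is the dict form of DataFrame.isin. The names allowed, mask and direction are introduced for this example only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'customer': ['A', 'B', 'C', 'D', 'E', 'F'],
                   'Traveled': [1, 1, 1, 0, 1, 0],
                   'Travel_count': [2, 3, 5, 0, 1, 0],
                   'shopping': ['High', 'High', 'High', 'High', 'Medium', 'Medium']})

# Dict form of isin: each column is checked only against its own allowed values
allowed = {'Traveled': [1], 'shopping': ['High']}
mask = (df[['Traveled', 'shopping']].isin(allowed).all(axis=1)
        & df['Travel_count'].ne(3))

new_df = df[mask].reset_index(drop=True)

# Build 'less' / 'more' per row, aligned on the reset index
direction = pd.Series(np.where(new_df['Travel_count'].lt(3), 'less', 'more'),
                      index=new_df.index)
new_df['new'] = 'Customer ' + new_df['customer'] + ' traveled ' + direction + ' than 3 times'

print(new_df[['customer', 'shopping', 'new']])
```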

A uj5u.com user replied:

You can filter the DataFrame on the 3 conditions and apply a simple function to build the message:

des = lambda row: f'Customer {row["customer"]} traveled {"more" if row["Travel_count"] > 3 else "less"} than 3 times'

# .copy() avoids a SettingWithCopyWarning when the new column is added
df = df.loc[(df['Traveled'] == 1) & (df['shopping'] == 'High') & (df['Travel_count'] != 3)].copy()
df['description'] = df.apply(des, axis=1)
df = df[['customer', 'shopping', 'description']]

Output:

  customer shopping                            description
0        A     High  Customer A traveled less than 3 times
2        C     High  Customer C traveled more than 3 times
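
For comparison, the row-wise approach from the question can also be made to work. In the original select, `travel and shop_high` names the two functions without calling them, and a function object is always truthy, so the outer if passed for every row. Calling them would not have helped either, because they compare a whole column of the global df and the truth value of a Series is ambiguous. A minimal sketch of one possible fix, which tests the scalar values of the row passed in by apply:

```python
import pandas as pd

df = pd.DataFrame({'customer': ['A', 'B', 'C', 'D', 'E', 'F'],
                   'Traveled': [1, 1, 1, 0, 1, 0],
                   'Travel_count': [2, 3, 5, 0, 1, 0],
                   'shopping': ['High', 'High', 'High', 'High', 'Medium', 'Medium']})

def select(row):
    # Compare this row's own scalar values instead of whole columns
    if row['Traveled'] == 1 and row['shopping'] == 'High':
        if row['Travel_count'] > 3:
            return (row['customer'], row['shopping'],
                    f"Customer {row['customer']} traveled more than 3 times")
        elif row['Travel_count'] < 3:
            return (row['customer'], row['shopping'],
                    f"Customer {row['customer']} traveled less than 3 times")
    # Rows that fail a condition (or have exactly 3 trips) yield None
    return None

# dropna() discards the None results before the tuples become rows
result = pd.DataFrame(list(df.apply(select, axis=1).dropna()))
print(result)
```

This keeps the shape of the original code, but it runs Python-level logic once per row, so the vectorized filters above are likely to scale better on large frames.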
