Applying a function to all rows of a Pandas DataFrame (lambda)

I have the following function, which returns the name of the column holding the last non-zero value in a row:

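import pandas as pd

def myfunc(X, Y):
    # Row Y of DataFrame X, as a Series
    row = X.iloc[Y]
    counter = len(row) - 1
    # Scan from the last column towards the first, skipping zeros
    while counter >= 0:
        if row.iloc[counter] == 0:  # positional access; row[counter] is deprecated for labelled Series
            counter -= 1
        else:
            break
    return X.columns[counter]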

Using the following example data:

data = {'id':  ['1', '2', '3', '4', '5', '6'],
        'name': ['AAA', 'BBB', 'CCC', 'DDD', 'EEE', 'GGG'],
        'A1': [1, 1, 1, 0, 1, 1],
        'B1': [0, 0, 1, 0, 0, 1],
        'C1': [1, 0, 1, 1, 0, 0],
        'A2': [1, 0, 1, 0, 1, 0]}

df = pd.DataFrame(data)
df

myfunc(df, 5) # 'B1'

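I would like to know how to apply this function to all rows of the DataFrame and put the results into df.

I was thinking of either looping over all rows (which is probably not a good approach) or using lambdas with the apply function. However, I have not managed to get the latter approach to work. Any help?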

A uj5u.com user replied:

I modified your function slightly so that it works on a single row:

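def myfunc(row):
    counter = len(row) - 1
    # Scan from the last column towards the first, skipping zeros
    while counter >= 0:
        if row.iloc[counter] == 0:  # positional access into the row
            counter -= 1
        else:
            break
    return row.index[counter]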

Now just call df.apply with your function and axis=1, which calls the function once for each row of the DataFrame:

>>> df.apply(myfunc, axis=1)
0    A2
1    A1
2    A2
3    C1
4    A2
5    B1
dtype: object
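To put the result into df, as the question asks, the Series returned by apply can be assigned to a new column. A minimal sketch, assuming a hypothetical column name last_nonzero (not from the original thread) and using positional .iloc access inside the function:

```python
import pandas as pd

data = {'id':  ['1', '2', '3', '4', '5', '6'],
        'name': ['AAA', 'BBB', 'CCC', 'DDD', 'EEE', 'GGG'],
        'A1': [1, 1, 1, 0, 1, 1],
        'B1': [0, 0, 1, 0, 0, 1],
        'C1': [1, 0, 1, 1, 0, 0],
        'A2': [1, 0, 1, 0, 1, 0]}
df = pd.DataFrame(data)

def myfunc(row):
    # Scan the row from its last entry towards the first, skipping zeros
    counter = len(row) - 1
    while counter >= 0:
        if row.iloc[counter] == 0:
            counter -= 1
        else:
            break
    return row.index[counter]

# 'last_nonzero' is an illustrative column name, chosen for this sketch
df['last_nonzero'] = df.apply(myfunc, axis=1)
print(df)
```

Because the function is applied before the new column exists, the added column does not affect its own computation.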

However, you can drop the custom function and use this code to do what you are looking for in a faster, more concise way:

>>> df[df.columns[2:]].T.cumsum().idxmax()
0    A2
1    A1
2    A2
3    C1
4    A2
5    B1
dtype: object
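The one-liner is easier to follow when its intermediate result is shown. The sketch below computes the running sum along each row (equivalent to transposing and calling cumsum): the sum stops growing after the last non-zero entry, and idxmax returns the first label at which the maximum is reached. This relies on the values being non-negative, as they are in the example data; the names values and running are illustrative only.

```python
import pandas as pd

data = {'id':  ['1', '2', '3', '4', '5', '6'],
        'name': ['AAA', 'BBB', 'CCC', 'DDD', 'EEE', 'GGG'],
        'A1': [1, 1, 1, 0, 1, 1],
        'B1': [0, 0, 1, 0, 0, 1],
        'C1': [1, 0, 1, 1, 0, 0],
        'A2': [1, 0, 1, 0, 1, 0]}
df = pd.DataFrame(data)

# Keep only the numeric columns (everything after 'id' and 'name')
values = df[df.columns[2:]]

# Running sum across the columns of each row
running = values.cumsum(axis=1)
print(running)

# First column at which each row's running sum reaches its maximum
res = running.idxmax(axis=1)
print(res)
```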

A uj5u.com user replied:

Here is a way using DataFrame.idxmax.

>>> res = df.iloc[:, :1:-1].idxmax(axis=1)
>>> res

0    A2
1    A1
2    A2
3    C1
4    A2
5    B1
dtype: object

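The idea is to select only the value columns (A1, B1, C1, A2) and reverse their order (df.iloc[:, :1:-1]), then return, for each row, the label of the column where the maximum value (1 in this case) first occurs (.idxmax(axis=1)).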

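Note that this solution (like the other answer) assumes that every row contains at least one entry greater than zero.

If we first mask the non-zero entries (using .ne(0)), this assumption can be relaxed to "every row contains at least one non-zero entry". This works because .ne(0) produces a boolean mask, and True > False <=> 1 > 0. With the mask, the result also no longer depends on the magnitude of the non-zero values.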

>>> res = df.iloc[:, :1:-1].ne(0).idxmax(axis=1)
>>> res

0    A2
1    A1
2    A2
3    C1
4    A2
5    B1
dtype: object
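If some rows may contain no non-zero entry at all, idxmax still returns a label for them (every value in the mask ties at False, so the first column of the reversed selection wins). One possible way to deal with that, offered as a sketch rather than as part of the answer above, is to blank out those rows with Series.where. The three-row data set with an all-zero row and the name mask are assumptions made for illustration.

```python
import pandas as pd

# Small illustrative data set; the second row has no non-zero entry
data = {'id':  ['1', '2', '3'],
        'name': ['AAA', 'BBB', 'CCC'],
        'A1': [1, 0, 1],
        'B1': [0, 0, 1],
        'C1': [1, 0, 0],
        'A2': [1, 0, 0]}
df = pd.DataFrame(data)

# Boolean mask of non-zero entries, with the value columns in reverse order
mask = df.iloc[:, :1:-1].ne(0)

# Label of the last non-zero column; missing where the row is entirely zero
res = mask.idxmax(axis=1).where(mask.any(axis=1))
print(res)
```

Rows without any non-zero entry come out as missing values, which can then be filled or dropped as needed.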
