Applying a lambda function to multiple columns

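I would like my lambda function to use several columns, but I keep running into key errors that I don't think should occur. I want this line to create a new column: if "Decision" appears in Task, label the row Decision; otherwise, if "Milestone" appears in Projects, label it Milestone; otherwise, keep the current task Type.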

today['New_Type'] = today[['Task','Projects','Type'].apply(lambda x,y,z: "Decision" if "Decision" in x else "Milestone" if "Milestone" in y else z)

Any ideas on how to adjust this?

A uj5u.com user replied:

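This is easier to debug if you use a regular named function. Be sure to pass the axis argument when calling apply. With axis=1, the function you write receives a single argument, one row holding the three column values, so it is best to unpack them right away for readability: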

import pandas as pd

def task_type(row):
    task, project, old_type = row
    if 'decision' in task.lower():
        return 'Decision'
    if 'milestone' in project.lower():
        return 'Milestone'
    return old_type


today = pd.DataFrame({'Task': ['Make a decision.', 
                               'Do something else.',
                               'Write a function.'],
                      'Projects': ['alpha', 'Milestone 7',
                                   'gamma'],
                      'Type': ['old 1', 'old 2', 'old 3']})

today['New_Type'] = today.apply(task_type, axis=1)
today

    Task                Projects     Type   New_Type
0   Make a decision.    alpha        old 1  Decision
1   Do something else.  Milestone 7  old 2  Milestone
2   Write a function.   gamma        old 3  old 3
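
If you would still rather keep a one-line lambda, the same idea can be sketched as below. This is only a sketch that reuses the sample data from this answer: the lambda takes one argument (the row) instead of three, and looks each value up by column name. The case-insensitive match mirrors task_type above and is an assumption about what the asker wants.

```python
import pandas as pd

today = pd.DataFrame({'Task': ['Make a decision.',
                               'Do something else.',
                               'Write a function.'],
                      'Projects': ['alpha', 'Milestone 7', 'gamma'],
                      'Type': ['old 1', 'old 2', 'old 3']})

# With axis=1, apply passes each row to the lambda as a single Series,
# so the lambda takes one parameter and indexes it by column name.
today['New_Type'] = today[['Task', 'Projects', 'Type']].apply(
    lambda row: 'Decision' if 'decision' in row['Task'].lower()
    else 'Milestone' if 'milestone' in row['Projects'].lower()
    else row['Type'],
    axis=1,
)

print(today['New_Type'].tolist())
```

A lambda written as `lambda x, y, z: ...` cannot work here, because apply never calls the function with three separate arguments.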

A uj5u.com user replied:

Avoid apply (a hidden loop) and consider vectorized conditional logic with numpy.where or numpy.select:

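import numpy as np

# Option 1: nested np.where, checked in priority order.
today['New_Type'] = np.where(
    today['Task'].str.contains('Decision', regex=False),
    'Decision',
    np.where(
        # "Milestone" is looked up in Projects, not Task.
        today['Projects'].str.contains('Milestone', regex=False),
        'Milestone',
        today['Type']  # otherwise keep the existing type
    )
)

# Option 2: np.select, where the first matching condition wins.
today['New_Type'] = np.select(
    condlist=[
        today['Task'].str.contains('Decision', regex=False),
        today['Projects'].str.contains('Milestone', regex=False)
    ],
    choicelist=['Decision', 'Milestone'],
    default=today['Type']
)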
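
For comparison with the first answer, here is a self-contained sketch of the numpy.select approach run on the same sample data. It assumes a case-insensitive match is wanted (case=False), since the sample task text reads "decision" in lower case; drop that argument if the match should be case-sensitive. The names is_decision and is_milestone are illustrative only.

```python
import numpy as np
import pandas as pd

today = pd.DataFrame({'Task': ['Make a decision.',
                               'Do something else.',
                               'Write a function.'],
                      'Projects': ['alpha', 'Milestone 7', 'gamma'],
                      'Type': ['old 1', 'old 2', 'old 3']})

# Boolean masks, one per rule; regex=False treats the pattern as plain text.
is_decision = today['Task'].str.contains('decision', case=False, regex=False)
is_milestone = today['Projects'].str.contains('milestone', case=False, regex=False)

# np.select picks the choice for the first condition that is True in each
# row and falls back to the existing Type when none match.
today['New_Type'] = np.select(
    condlist=[is_decision, is_milestone],
    choicelist=['Decision', 'Milestone'],
    default=today['Type'],
)

print(today['New_Type'].tolist())
```

Because the conditions are evaluated in list order, putting the "Decision" mask first gives it priority over "Milestone", which matches the order of checks the question describes.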

