Get column names as values if a condition is met in the row

I have this example df:

import pandas as pd

data = pd.DataFrame({'id':[1,  2 , 3],
                   'question': ['first country visited?', 'first city visited?' , 'two cities we love?'],
                   'answer1': ['UK', 'Paris', 'CA'],
                   'answer2': ['US', 'New York', 'Paris'],
                   'answer3': ['CA', 'London', 'London'],
                   'answer4': ['JP', 'Toronto', 'Los Angeles'],
                   'correct': [['UK'], ['London'], ['London','Paris']]
                   })

which gives:

   id                question answer1   answer2 answer3      answer4          correct
0   1  first country visited?      UK        US      CA           JP             [UK]
1   2     first city visited?   Paris  New York  London      Toronto         [London]
2   3     two cities we love?      CA     Paris  London  Los Angeles  [London, Paris]

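I want a new column, data['correct_column'], that holds the name of every answer column whose value appears in data['correct'].

This is what I have done so far: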

data['correct_column'] = data.loc[:,'answer1':'answer4'].isin(data['correct']).idxmax(1)

Every row of data['correct_column'] ends up with the same value, answer1, and I don't know why.

Desired output:

   id                question answer1   answer2 answer3      answer4          correct   correct_column
0   1  first country visited?      UK        US      CA           JP             [UK]          answer1
1   2     first city visited?   Paris  New York  London      Toronto         [London]          answer3
2   3     two cities we love?      CA     Paris  London  Los Angeles  [London, Paris]  answer3,answer2

A uj5u.com user replied:

I see several ways to do this:

Using apply:

cols = data.filter(like='answer').columns
data['correct_column'] = data[cols].apply(lambda s: ','.join((m:=s.isin(data.loc[s.name, 'correct']))[m].index), axis=1)
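The one-liner above packs the membership test and the join into a single lambda with an assignment expression. A minimal sketch of the same idea spelled out step by step follows; the helper name `matching_columns` and the variable `answer_cols` are hypothetical names chosen for illustration, not part of the original answer.

```python
import pandas as pd

data = pd.DataFrame({
    'id': [1, 2, 3],
    'question': ['first country visited?', 'first city visited?', 'two cities we love?'],
    'answer1': ['UK', 'Paris', 'CA'],
    'answer2': ['US', 'New York', 'Paris'],
    'answer3': ['CA', 'London', 'London'],
    'answer4': ['JP', 'Toronto', 'Los Angeles'],
    'correct': [['UK'], ['London'], ['London', 'Paris']],
})

# Columns that hold candidate answers (assumed to share the 'answer' prefix).
answer_cols = [c for c in data.columns if c.startswith('answer')]

def matching_columns(row):
    # Hypothetical helper: keep each answer column whose value is in this
    # row's own 'correct' list, in column order, joined with commas.
    return ','.join(col for col in answer_cols if row[col] in row['correct'])

data['correct_column'] = data.apply(matching_columns, axis=1)
print(data[['id', 'correct', 'correct_column']])
```

The names come out in column order, so the third row reads answer2,answer3 rather than the answer3,answer2 shown in the question.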

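Using a more involved approach: explode the correct column, test for equality, and combine the rows again per group:

cols = data.filter(like='answer').columns
# one row per correct value; the original row index is repeated
df2 = data.explode('correct')
# compare every answer with the exploded correct value, then merge rows sharing an index
mask = (df2[cols].eq(df2['correct'].values, axis=0)
           .groupby(level=0).any()
        )
data.join(mask.mul(cols).where(mask).apply(lambda x: x.str.cat(sep=','), axis=1).rename('correct_column'))

Output: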
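The final line multiplies a boolean frame by the column labels to turn True cells into names. As a hedged variant, assuming that step is the hardest part to read, the same explode-and-compare mask can be turned into names by selecting the True entries of each row directly. The lambda here is an illustrative substitute, not the original answer's code.

```python
import pandas as pd

data = pd.DataFrame({
    'id': [1, 2, 3],
    'question': ['first country visited?', 'first city visited?', 'two cities we love?'],
    'answer1': ['UK', 'Paris', 'CA'],
    'answer2': ['US', 'New York', 'Paris'],
    'answer3': ['CA', 'London', 'London'],
    'answer4': ['JP', 'Toronto', 'Los Angeles'],
    'correct': [['UK'], ['London'], ['London', 'Paris']],
})

cols = data.filter(like='answer').columns

# One row per correct value; rows from the same question share an index label.
df2 = data.explode('correct')

# True where an answer equals the exploded correct value, then collapse
# the duplicated index labels back to one row per question.
mask = df2[cols].eq(df2['correct'].values, axis=0).groupby(level=0).any()

# For each row of the mask, keep the labels of the True cells.
data['correct_column'] = mask.apply(lambda row: ','.join(row[row].index), axis=1)
print(data[['id', 'correct', 'correct_column']])
```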

   id                question answer1   answer2 answer3      answer4          correct   correct_column
0   1  first country visited?      UK        US      CA           JP             [UK]          answer1
1   2     first city visited?   Paris  New York  London      Toronto         [London]          answer3
2   3     two cities we love?      CA     Paris  London  Los Angeles  [London, Paris]  answer2,answer3
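As for why the attempt in the question returned answer1 everywhere, a plausible explanation (an assumption, not something the answer states) is that isin compared each cell against the list objects in correct instead of looking inside them, so nothing matched. On a row with no True values, idxmax(axis=1) returns the first column label. The sketch below shows that behaviour on a hand-built mask, along with one way to guard against it.

```python
import pandas as pd

answer_cols = ['answer1', 'answer2', 'answer3', 'answer4']

# Hand-built boolean mask: rows 0 and 2 have no match at all,
# row 1 matches in 'answer3'.
mask = pd.DataFrame(False, index=[0, 1, 2], columns=answer_cols)
mask.loc[1, 'answer3'] = True

# idxmax returns the first label holding the maximum; for an all-False
# row the maximum is False, so the first column wins.
first_hit = mask.idxmax(axis=1)
print(first_hit.tolist())

# Guard: blank out rows that contain no True value.
safe = first_hit.where(mask.any(axis=1))
print(safe.tolist())
```

Note that idxmax reports a single column per row, so it cannot produce a result such as answer2,answer3 even when the mask is correct.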


Tags: Python, python-3.x, pandas, dataframe
