How do I drop rows based on a condition in pandas?

I have the following DataFrame:

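Index	Description
0	Tab tab_1 of type yyy opened by User A
1	some_value
2	Tab tab_1 of type xxx opened by User B
3	Tab tab_4 of type yyy opened by User A
4	some_value
5	Tab tab_1 of type yyy closed by User A
6	some_value
7	Tab tab_1 of type xxx closed by User B
8	Tab tab_2 of type yyy closed by User A
9	some_value
10	Tab tab_3 of type zzz closed by User C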
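
For anyone who wants to reproduce this, here is a minimal sketch that builds the frame above. The column name Description and the English wording of rows 0, 1, 2, 5 and 7 come from the answers further down; the wording of rows 3, 8 and 10 is assumed to follow the same pattern.

```python
import pandas as pd

# Sample data reconstructed from the table above.
# Rows 3, 8 and 10 are assumed to follow the same wording pattern as the others.
df = pd.DataFrame(
    {
        "Description": [
            "Tab tab_1 of type yyy opened by User A",
            "some_value",
            "Tab tab_1 of type xxx opened by User B",
            "Tab tab_4 of type yyy opened by User A",
            "some_value",
            "Tab tab_1 of type yyy closed by User A",
            "some_value",
            "Tab tab_1 of type xxx closed by User B",
            "Tab tab_2 of type yyy closed by User A",
            "some_value",
            "Tab tab_3 of type zzz closed by User C",
        ]
    }
)
df.index.name = "Index"
```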

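I want to drop the rows whose "Description" cell has no matching pair. By a pair I mean, for example, rows 0 and 5, or rows 2 and 7: the same tab, type and user, once opened and once closed. Rows 3, 8 and 10 have no pair, because the tab was either opened by a user and never closed, or closed without having been opened.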

Expected output:

Index	Description
0	Tab tab_1 of type yyy opened by User A
1	some_value
2	Tab tab_1 of type xxx opened by User B
4	some_value
5	Tab tab_1 of type yyy closed by User A
6	some_value
7	Tab tab_1 of type xxx closed by User B
9	some_value

Is there a way to do this?

uj5u.com user replied:

You can try the duplicated method: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.duplicated.html

For example:

df_new = df.duplicated(subset=['Description'])

uj5u.com user replied:

df.drop_duplicates('Description')
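
A note on the two suggestions above: both duplicated and drop_duplicates compare whole cell values, so as far as I can tell they flag the repeated some_value rows rather than the unpaired open/close rows. A small sketch on a made-up five-row frame (the data is an assumption, shortened from the question):

```python
import pandas as pd

# Hypothetical shortened sample: one open/close pair, two plain rows,
# and one "closed" row that has no matching "opened" row.
df = pd.DataFrame({"Description": [
    "Tab tab_1 of type yyy opened by User A",
    "some_value",
    "Tab tab_1 of type yyy closed by User A",
    "some_value",
    "Tab tab_2 of type yyy closed by User A",
]})

# duplicated() returns a boolean Series, not a DataFrame:
# True marks the second and later occurrences of an identical value.
dup = df.duplicated(subset=["Description"])

# Equivalent to df.drop_duplicates("Description") with default settings.
kept = df[~dup]

print(dup.tolist())
print(kept)
```

Here only the second some_value row is flagged, and the unpaired tab_2 row stays in, so exact-duplicate detection alone does not seem to match the pairing rule in the question.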

uj5u.com user replied:

Honestly, I'm not sure this is what you need, but you can try this anyway:

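# Group rows by the description with "opened"/"closed" stripped out,
# then keep only the groups that contain both an "opened" and a "closed" row.
key = df['Description'].str.replace('opened|closed', '', regex=True)
mask = df.groupby(key)['Description'].transform(
    lambda x: x.str.contains('opened').any() & x.str.contains('closed').any())

res = df.loc[mask]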

>>> res
                                  Description
Index
0      Tab tab_1 of type yyy opened by User A
2      Tab tab_1 of type xxx opened by User B
5      Tab tab_1 of type yyy closed by User A
7      Tab tab_1 of type xxx closed by User B
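
Note that the result above also drops the some_value rows, which the expected output in the question keeps. One possible variation, sketched here under the assumption that any row containing neither "opened" nor "closed" should always be kept (the names is_event, key, has_open and has_close are my own):

```python
import pandas as pd

# Sample data as in the question; rows 3, 8 and 10 assume the same wording pattern.
df = pd.DataFrame({"Description": [
    "Tab tab_1 of type yyy opened by User A",
    "some_value",
    "Tab tab_1 of type xxx opened by User B",
    "Tab tab_4 of type yyy opened by User A",
    "some_value",
    "Tab tab_1 of type yyy closed by User A",
    "some_value",
    "Tab tab_1 of type xxx closed by User B",
    "Tab tab_2 of type yyy closed by User A",
    "some_value",
    "Tab tab_3 of type zzz closed by User C",
]})

desc = df["Description"]

# Rows that describe an open/close event at all.
is_event = desc.str.contains("opened|closed", regex=True)

# Pairing key: the description with the action word removed.
key = desc.str.replace("opened|closed", "", regex=True)

# For every row, does its group contain an "opened" row / a "closed" row?
has_open = desc.str.contains("opened").groupby(key).transform("any")
has_close = desc.str.contains("closed").groupby(key).transform("any")

# Keep non-event rows, plus event rows whose group has both actions.
res = df[~is_event | (has_open & has_close)]
print(res)
```

On the sample data this keeps rows 0, 1, 2, 4, 5, 6, 7 and 9.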

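uj5u.com user replied:

Replace the "opened" and "closed" text with an empty string, group by the result, use filter (a DataFrameGroupBy method) to select the descriptions that then occur only once, and drop those rows: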

data.drop(data.groupby(data['Description'].str.replace('opened|closed','',regex=True)).filter(lambda x: x['Description'].count() == 1).index)

Index   Description
    0   Tab tab_1 of type yyy opened by User A
    1   some_value
    2   Tab tab_1 of type xxx opened by User B
    4   some_value
    5   Tab tab_1 of type yyy closed by User A
    6   some_value
    7   Tab tab_1 of type xxx closed by User B
    9   some_value
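
One thing to watch with the count() == 1 approach: it works on the sample because some_value appears several times. A non-event row whose text is unique would form a group of size one and be dropped as well. A small sketch with a made-up frame (the another_value row is a hypothetical addition, not from the question):

```python
import pandas as pd

data = pd.DataFrame({"Description": [
    "Tab tab_1 of type yyy opened by User A",
    "some_value",
    "Tab tab_1 of type yyy closed by User A",
    "some_value",
    "another_value",                            # hypothetical unique non-event row
    "Tab tab_2 of type yyy closed by User A",   # closed but never opened
]})

# Same grouping key as in the answer above.
key = data["Description"].str.replace("opened|closed", "", regex=True)

# Rows belonging to groups that contain exactly one row.
singletons = data.groupby(key).filter(lambda g: g["Description"].count() == 1)

res = data.drop(singletons.index)
print(res)
```

Both the unpaired tab_2 row and the unique another_value row are removed here, so if such rows can occur in the real data, restricting the filter to rows that mention "opened" or "closed" may be safer.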


Tags: python python-3.x pandas dataframe
