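I want to filter rows based on whether a value appears in another column. However, the data needs to be grouped before the isin filter is applied. When I do this, I get the error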
'SeriesGroupBy' object has no attribute 'isin'
An example of what I am trying to do:
import pandas as pd
# Named `data` rather than `dict` so the built-in dict type is not shadowed.
data = {'AttributeName': {0: 'John', 1: 'John', 2: 'John', 3: 'John', 4: 'Sally', 5: 'Sally'}, 'Lineage Step': {0: 1, 1: 2, 2: 3, 3: 4, 4:1, 5:2}, 'From Country': {0: 'Spain', 1: 'Scotland', 2: 'England', 3: 'England', 4: 'Scotland', 5:'England'}, 'From Town': {0: 'Madrid', 1: 'Edinburgh', 2: 'London', 3: 'London', 4: 'Edinburgh', 5: 'Manchester'}, 'FromStreet': {0: 'Spanish St', 1: 'Main St', 2: 'Lower St', 3: 'Middle St', 4: 'London St', 5: 'Scotland St'}, 'ToCountry': {0: 'Scotland', 1: 'England', 2: 'England', 3: 'England', 4: 'England', 5: 'England'}, 'ToTown': {0: 'Edinburgh', 1: 'London', 2: 'London', 3: 'London', 4: 'Liverpool', 5: 'London'}, 'ToStreet': {0: 'Lower St', 1: 'Middle St', 2: 'Upper St', 3: 'Upper St', 4: 'new St', 5: 'Old St'}}
sample_data = pd.DataFrame.from_dict(data)
# Example data set. I want to find every unique 'From Country' for both John and Sally. For John we would keep just the first row, where he goes from Spain to Scotland. The second row would be filtered out because Scotland appears in his 'ToCountry' column. Sally would keep just the row whose 'From Country' is Scotland ('From Town' Edinburgh).
This is what I tried:
sample_grouped = sample_data.groupby('AttributeName')
sample_grouped[~sample_grouped['From Country'].isin(sample_grouped['ToCountry'])]
But that is where I get the error 'SeriesGroupBy' object has no attribute 'isin'
Does anyone know how to use isin (or a comparable function) when the data is grouped?
Thanks
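uj5u.com user reply:
The error is self-explanatory: the isin method you are trying to use is defined on Series and DataFrame objects, not on a pandas GroupBy object.
You can call apply on the pandas GroupBy object instead and pass it a lambda function that returns only the rows that meet the condition.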
out = (sample_data.groupby('AttributeName')
.apply(lambda x: x[~x['From Country'].isin(x['ToCountry'])])
)
Output:
                 AttributeName  Lineage Step  From Country  From Town  FromStreet  ToCountry     ToTown  ToStreet
AttributeName
John          0           John             1         Spain     Madrid  Spanish St   Scotland  Edinburgh  Lower St
Sally         4          Sally             1      Scotland  Edinburgh   London St    England  Liverpool    new St
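For comparison, the same per-person filter can be sketched without apply, which also keeps a flat index rather than adding an AttributeName index level. This is only one possible approach, assuming the goal is to drop every row whose 'From Country' appears anywhere in that person's 'ToCountry' values. The names visited, merged and result are illustrative, and the frame is trimmed to the three columns the filter needs.

```python
import pandas as pd

# Reduced copy of the sample data: only the columns the filter uses.
sample_data = pd.DataFrame({
    "AttributeName": ["John", "John", "John", "John", "Sally", "Sally"],
    "From Country": ["Spain", "Scotland", "England", "England", "Scotland", "England"],
    "ToCountry": ["Scotland", "England", "England", "England", "England", "England"],
})

# Distinct (person, destination) pairs, renamed so they line up with 'From Country'.
visited = (
    sample_data[["AttributeName", "ToCountry"]]
    .drop_duplicates()
    .rename(columns={"ToCountry": "From Country"})
)

# Left merge with indicator=True: rows marked 'left_only' have a 'From Country'
# that this person never has as a 'ToCountry'.
merged = sample_data.merge(
    visited,
    on=["AttributeName", "From Country"],
    how="left",
    indicator=True,
)
result = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(result)
```

On the sample data this keeps rows 0 (John, Spain) and 4 (Sally, Scotland), the same rows the apply version returns.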
uj5u.com user reply:
Use pandas query to find the matching records, then use unique() to collect the distinct values
import pandas as pd
# Named `data` rather than `dict` so the built-in dict type is not shadowed.
data = {'AttributeName': {0: 'John', 1: 'John', 2: 'John', 3: 'John', 4: 'Sally', 5: 'Sally'}, 'Lineage Step': {0: 1, 1: 2, 2: 3, 3: 4, 4:1, 5:2}, 'From Country': {0: 'Spain', 1: 'Scotland', 2: 'England', 3: 'England', 4: 'Scotland', 5:'England'}, 'From Town': {0: 'Madrid', 1: 'Edinburgh', 2: 'London', 3: 'London', 4: 'Edinburgh', 5: 'Manchester'}, 'FromStreet': {0: 'Spanish St', 1: 'Main St', 2: 'Lower St', 3: 'Middle St', 4: 'London St', 5: 'Scotland St'}, 'ToCountry': {0: 'Scotland', 1: 'England', 2: 'England', 3: 'England', 4: 'England', 5: 'England'}, 'ToTown': {0: 'Edinburgh', 1: 'London', 2: 'London', 3: 'London', 4: 'Liverpool', 5: 'London'}, 'ToStreet': {0: 'Lower St', 1: 'Middle St', 2: 'Upper St', 3: 'Upper St', 4: 'new St', 5: 'Old St'}}
df = pd.DataFrame.from_dict(data)
results=df.query('`From Country` not in ToCountry')
print(results['From Country'].unique())
Output:
['Spain']
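Note that the query above compares 'From Country' against the whole 'ToCountry' column rather than per AttributeName, which is why Sally's Scotland row does not appear in the result. If per-person unique values are wanted, a rough sketch along the following lines may help. Here to_sets, mask and unique_from are hypothetical names, and the data is reduced to the relevant columns.

```python
import pandas as pd

# Reduced copy of the sample data: only the columns the filter uses.
df = pd.DataFrame({
    "AttributeName": ["John", "John", "John", "John", "Sally", "Sally"],
    "From Country": ["Spain", "Scotland", "England", "England", "Scotland", "England"],
    "ToCountry": ["Scotland", "England", "England", "England", "England", "England"],
})

# Map each person to the set of countries that appear in their 'ToCountry' values.
to_sets = {name: set(col) for name, col in df.groupby("AttributeName")["ToCountry"]}

# Keep a row only if its 'From Country' is not among that person's destinations.
mask = [
    country not in to_sets[name]
    for name, country in zip(df["AttributeName"], df["From Country"])
]
result = df[mask]

# Distinct remaining 'From Country' values for each person.
unique_from = result.groupby("AttributeName")["From Country"].unique()
print(unique_from)
```

With the sample data, John is left with Spain and Sally with Scotland, which matches what the question describes.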
