Count and sum of distinct values in Python

Basically, the dataset I have has 4 columns, as shown below:

Uid	Account	Name	Amount
w1	A1	Rohit	10
w2	A1	Rohit	10
w3	A2	Rohit	100
w4	B1	Sakshi	10
w5	B2	Sakshi	20
w6	B3	Sakshi	30

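Now, for each name, I am trying to use Python to find:

the count of distinct accounts,
the sum of the amounts across those distinct accounts,
the count of UIDs for each name.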

The output should look like this:

Name	Account count	UID count	Amount sum
Rohit	2	3	110
Sakshi	3	3	60

So far I can get the counts with the snippet below, but I can't work out how to compute the amount sum.

df = df.groupby('Name')[['Account', 'Uid']].nunique()

Reply from a uj5u.com user:

At first glance, I would suggest:

df.groupby('Name').agg({'Account': 'count', 'Uid': 'count', 'Amount': 'sum'})

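But as you pointed out, you want to de-duplicate the Amount values for each name before summing, so I think I would do it in two steps (though there may be a smarter way involving a lambda function):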

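# Step 1: drop repeated (Name, Amount) rows, then sum Amount per name
s1 = df.drop_duplicates(subset=['Name', 'Amount']).groupby('Name')['Amount'].sum()
# Step 2: distinct counts per name (selecting several columns needs a list)
df1 = df.groupby('Name')[['Account', 'Uid']].nunique()
# Combine the two results on the shared Name index
df1.merge(s1.to_frame(), left_index=True, right_index=True)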
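To check the two-step idea end to end, here is a minimal sketch that rebuilds the sample data from the question and compares it with the first-glance aggregation. One deliberate difference, which is an assumption about the intent: this version de-duplicates on Account rather than Amount, so two different accounts that happen to hold the same amount are not collapsed into one. The variable names (naive, amount_sum, counts, result) are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Uid': ['w1', 'w2', 'w3', 'w4', 'w5', 'w6'],
    'Account': ['A1', 'A1', 'A2', 'B1', 'B2', 'B3'],
    'Name': ['Rohit', 'Rohit', 'Rohit', 'Sakshi', 'Sakshi', 'Sakshi'],
    'Amount': [10, 10, 100, 10, 20, 30],
})

# First-glance aggregation: counts every row and sums account A1 twice
naive = df.groupby('Name').agg({'Account': 'count', 'Uid': 'count', 'Amount': 'sum'})

# Step 1: keep one row per (Name, Account), then sum Amount per name
# (assumption: duplicates are defined by Account, not by Amount)
amount_sum = (
    df.drop_duplicates(subset=['Name', 'Account'])
      .groupby('Name')['Amount']
      .sum()
)

# Step 2: distinct counts per name
counts = df.groupby('Name')[['Account', 'Uid']].nunique()

# Combine on the shared Name index
result = counts.join(amount_sum)

print(naive)
print(result)
```

On the sample data, naive reports 3 accounts and an amount of 120 for Rohit, while result gives 2 accounts, 3 UIDs and 110, which matches the output asked for in the question.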

Reply from a uj5u.com user:

Using DataFrameGroupBy.apply, we can combine the nunique and sum operations into a single function, as follows:

def f(group):
  # Define transformation dictionary
  transform = {}
  # Set Account to be number of unique accounts
  transform['Account'] = group['Account'].nunique()
  # Set Uid to be number of unique UIDs
  transform['Uid'] = group['Uid'].nunique()
  # Find names of the unique accounts
  unique_accounts = group['Account'].unique()
  # For each unique account, get the corresponding amount of the first matching row
  unique_amounts = [group[group['Account'] == u].iloc[0]['Amount'] for u in unique_accounts]
  # Set Amount to the sum of the unique amounts
  transform['Amount'] = sum(unique_amounts)
  # Return a new Series with our transformed data and its labels
  return pd.Series(transform, index=['Account', 'Uid', 'Amount'])


df = df.groupby('Name').apply(f)

The result is the following table:

        Account  Uid  Amount
Name                        
Rohit         2    3     110
Sakshi        3    3      60

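For more on applying multiple functions to groupby columns, see this post. Also, for getting only the first row of a DataFrame that matches a given condition, see this post.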
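If the per-group Python loop in the apply version becomes slow on larger data, the same "first amount per account, then sum" logic can probably be expressed with grouped operations only. The following is a rough sketch under that assumption, not a drop-in replacement: it rebuilds the sample data so it runs on its own, and the names first_amounts, amount_sum, counts and result are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Uid': ['w1', 'w2', 'w3', 'w4', 'w5', 'w6'],
    'Account': ['A1', 'A1', 'A2', 'B1', 'B2', 'B3'],
    'Name': ['Rohit', 'Rohit', 'Rohit', 'Sakshi', 'Sakshi', 'Sakshi'],
    'Amount': [10, 10, 100, 10, 20, 30],
})

# Distinct accounts and UIDs per name
counts = df.groupby('Name')[['Account', 'Uid']].nunique()

# Amount taken from the first row of each (Name, Account) pair
first_amounts = df.groupby(['Name', 'Account'])['Amount'].first()

# Sum those per-account amounts for each name
amount_sum = first_amounts.groupby(level='Name').sum()

# Attach the sums; assign aligns on the Name index
result = counts.assign(Amount=amount_sum)

print(result)
```

Like the apply version, this takes the amount from the first row of each account, so it assumes every row of a given account carries the same Amount.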

Please credit the source when reposting. Link to this article: https://www.uj5u.com/gongcheng/456788.html

Tags: Python python-3.x pandas
