Applying the Herfindahl-Hirschman Index function to each individual's group of rows in R

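I have a data frame with multiple rows per person. Each person has an ID. In addition, each of a person's rows has a column holding a percentage, and those percentages sum to 100 across that person's rows.

The data frame df looks like this:

ID	Percentage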
1	50
1	50
2	25
2	20
2	45
2	10

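I want to apply the Herfindahl-Hirschman Index (the hhi function) to compute the index per person. The function is hhi(x, s) and takes two arguments: x = the object, s = the column of values (in this case, the percentages). So far I have tried the following, but it does not work; it still computes the index over the entire data frame.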

setDT(df)[,hhi(df, "percentage"), ID]

uj5u.com user reply:

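Summary: You misspelled Percentage, although that seems to be the result of not copying your code accurately. As you noted, the real problem is that the data.table call uses the entire column of percentage values on every pass through the by-loop. The correct way to refer to the subset constructed for each group is the .SD (Subset of Data) construct.

Here is an MCVE: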

library(hhi)

df <- read.table(text="ID  Percentage
1  50
1  50
2  25
2  20
2  45
2  10", header=TRUE)

library(data.table)

setDT(df)
df[,hhi(df, "percentage"), ID]
#------------------
Error in `[.data.frame`(x, i, j) : undefined columns selected
Error in `[.data.frame`(x, i, j) : undefined columns selected
In addition: Warning message:
In hhi(df, "percentage") : shares, "s", do not sum to 100
#-----------------
df[,hhi(df, "Percentage"), ID]  # correct spelling
   ID   V1
1:  1 8150
2:  2 8150
Warning messages:
1: In hhi(df, "Percentage") : shares, "s", do not sum to 100
2: In hhi(df, "Percentage") : shares, "s", do not sum to 100

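This is evidently what you were seeing. It happens because the call never tells `[.data.table` to evaluate the function on the per-group subset: inside j, df still refers to the whole table. To do this correctly, you need to use the .SD (Subset of Data) self-reference.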

df[,hhi(.SD, "Percentage"), by=ID]

#-----------
   ID   V1
1:  1 5000
2:  2 3150    # no warnings, more sensible indices of concentration
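
To make the difference between df and .SD concrete, here is a minimal sketch. It assumes that hhi() returns the sum of squared shares, which is the standard definition of the index and matches the values shown above; the name dt and the output column names are illustrative only.

```r
library(data.table)

# Same toy data as in the question
dt <- data.table(ID = c(1, 1, 2, 2, 2, 2),
                 Percentage = c(50, 50, 25, 20, 45, 10))

# Inside j, .SD holds only the current group's rows,
# while the name dt still refers to the whole table.
sizes <- dt[, .(rows_in_SD = nrow(.SD), rows_in_dt = nrow(dt)), by = ID]

# Assumption: the index is the sum of squared shares.
# Column names used in j are evaluated per group, so no .SD is needed here.
manual <- dt[, .(hhi = sum(Percentage^2)), by = ID]

print(sizes)
print(manual)
```

The first result shows that .SD shrinks to each group while dt does not, which is why passing the whole table to hhi() gives the same value for every ID.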

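It is interesting to compare the base-R version of this operation with the data.table version and the other poster's dplyr version. I happen to think that base R wins on elegance, although for large datasets there is certainly motivation to learn the somewhat idiosyncratic, and sometimes elegant, syntax of `[.data.table`.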

lapply( split(df, df$ID), hhi, s="Percentage")
$`1`
[1] 5000

$`2`
[1] 3150
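
If the hhi package is not available, the same split-and-apply idea can be sketched in base R alone. This assumes the index is simply the sum of squared shares, which agrees with the values shown above; hhi_by_id is a name chosen here for illustration.

```r
# Same toy data as in the question
df <- data.frame(ID = c(1, 1, 2, 2, 2, 2),
                 Percentage = c(50, 50, 25, 20, 45, 10))

# Assumption: the index is the sum of squared shares.
# tapply() splits Percentage by ID and applies the function to each piece.
hhi_by_id <- tapply(df$Percentage, df$ID, function(p) sum(p^2))

print(hhi_by_id)
```

Unlike the lapply() version, this returns a named vector rather than a list, which can be easier to merge back into a table.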

uj5u.com user reply:

IRTFM's solution is excellent and elegant. Here is a dplyr solution as well. There may be a simpler way using an anonymous function or dplyr's group_by.

library(dplyr)
library(hhi)
library(purrr)

compute_hhi<-function(df){
  hhi=hhi( df %>% as.data.frame(.),"Percentage")
  id=df %>% pluck("ID") %>% head(1)
  data.frame(id,hhi)
}

df_hhi<-df %>%
  group_split(ID, .keep=TRUE) %>%
  map(compute_hhi) %>%
  bind_rows()

df_hhi
#>   id  hhi
#> 1  1 5000
#> 2  2 3150

Created on 2022-01-14 by the reprex package (v2.0.1)
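
The simpler group_by route mentioned above might look like the following sketch. It replaces the call to hhi() with the sum of squared shares, on the assumption that this is what the function computes (it matches the output above), so it is not a drop-in use of the hhi package; df_hhi2 is an illustrative name.

```r
library(dplyr)

# Same toy data as in the question
df <- data.frame(ID = c(1, 1, 2, 2, 2, 2),
                 Percentage = c(50, 50, 25, 20, 45, 10))

# Assumption: the index is the sum of squared shares.
# summarise() evaluates the expression once per ID group.
df_hhi2 <- df %>%
  group_by(ID) %>%
  summarise(hhi = sum(Percentage^2))

print(df_hhi2)
```

This avoids the group_split/map/bind_rows round trip, at the cost of skipping the check that hhi() performs on whether the shares sum to 100.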

