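I'm trying to get correlation matrices by group for an arbitrary number of factors, ideally using dplyr. Getting the correlation matrix for one group by filtering and then summarizing works fine, but with group_by I'm not sure how to pass the factor data to cor.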
library(dplyr)
numRows <- 20
myData <- tibble(A = rnorm(numRows),
B = rnorm(numRows),
C = rnorm(numRows),
Group = c(rep("Group1", numRows/2), rep("Group2", numRows/2)))
# Essentially what I'm doing is trying to get these matrices, but for all groups
myData %>%
filter(Group == "Group1") %>%
select(-Group) %>%
summarize(CorMat = cor(.))
# However, I don't know what to pass into "cor". The code below fails
myData %>%
group_by(Group) %>%
summarize(CorMat = cor(.))
# Error looks like this
Error: Problem with `summarise()` column `CorMat`.
i `CorMat = cor(.)`.
x 'x' must be numeric
i The error occurred in group 1: Group = "Group1".
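I've seen solutions for grouped correlations between specific factors ("Correlation matrix by group") and for correlations between all factors and one specific factor ("Correlation matrix of grouped variables in dplyr"), but nothing for a grouped correlation matrix of all factors against all factors.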
Answer:
You can try nest_by, which puts your data (without Group) into a list column named data. You can then pass that column to cor like this:
myData %>%
nest_by(Group) %>%
summarise(CorMat = cor(data))
Output
Group CorMat[,1] [,2] [,3]
<chr> <dbl> <dbl> <dbl>
1 Group1 1 -0.132 0.638
2 Group1 -0.132 1 -0.284
3 Group1 0.638 -0.284 1
4 Group2 1 0.429 -0.228
5 Group2 0.429 1 -0.235
6 Group2 -0.228 -0.235 1
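The group_by attempt in the question fails because, inside a magrittr pipe, the dot stands for the whole piped data frame, including the character column Group, so cor reports that 'x' must be numeric. Below is a minimal sketch of a variant that keeps group_by, assuming dplyr 1.1.0 or later (where pick() is available). The set.seed call and the name corByGroup are illustrative additions, not part of the original post.

```r
library(dplyr)

set.seed(1)  # assumption: added only so this sketch is reproducible
numRows <- 20
myData <- tibble(A = rnorm(numRows),
                 B = rnorm(numRows),
                 C = rnorm(numRows),
                 Group = rep(c("Group1", "Group2"), each = numRows / 2))

# pick() returns only the current group's rows and never includes the
# grouping column, so cor() receives numeric columns only.
# Wrapping the result in list() stores one whole matrix per group.
corByGroup <- myData %>%
  group_by(Group) %>%
  summarise(CorMat = list(cor(pick(everything()))))

corByGroup$CorMat[[1]]  # 3 x 3 correlation matrix for Group1
```

This gives one row per group with the matrix held in a list column, rather than three rows per group as in the nest_by output above.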
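If you want a named list with one correlation matrix per group, you can also try the following. Add split (or try group_split, which does not keep the names) and then use map to drop the Group column.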
library(tidyverse)
myData %>%
nest_by(Group) %>%
summarise(CorMat = cor(data)) %>%
ungroup %>%
split(f = .$Group) %>%
map(~ .x %>% select(-Group))
Output
$Group1
# A tibble: 3 x 1
CorMat[,1] [,2] [,3]
<dbl> <dbl> <dbl>
1 1 -0.132 0.638
2 -0.132 1 -0.284
3 0.638 -0.284 1
$Group2
# A tibble: 3 x 1
CorMat[,1] [,2] [,3]
<dbl> <dbl> <dbl>
1 1 0.429 -0.228
2 0.429 1 -0.235
3 -0.228 -0.235 1
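Each element of the list above is still a tibble holding a matrix column. If plain matrices are preferred, one possible sketch is to split first and call cor inside map. This assumes purrr is installed; set.seed and the name corList are illustrative additions, not from the original answer.

```r
library(dplyr)
library(purrr)

set.seed(1)  # assumption: added only so this sketch is reproducible
numRows <- 20
myData <- tibble(A = rnorm(numRows),
                 B = rnorm(numRows),
                 C = rnorm(numRows),
                 Group = rep(c("Group1", "Group2"), each = numRows / 2))

# split() yields a named list of data frames, one per group;
# map() then drops the Group column and computes cor() on each piece.
corList <- myData %>%
  split(.$Group) %>%
  map(~ cor(select(.x, -Group)))

corList$Group1  # plain 3 x 3 matrix with row and column names A, B, C
```

Because cor is applied to the data frame directly, each matrix keeps the column names as its dimnames.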
