R loop question: iterating over the rows of a data frame

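I am new to R and need some help with a problem. I have a data frame with 10 rows and 6 columns. Each column represents one variable: column 1 is n1, column 2 is mean1, column 3 is variance1, column 4 is n2, column 5 is mean2, and column 6 is variance2. Each row is a different combination of these variables. I want to loop over the rows and generate two samples per row: sample 1, made of n1 random normal values with mean mean1 and standard deviation sd1 (the square root of variance1), and sample 2, made of n2 random normal values with mean mean2 and standard deviation sd2 (the square root of variance2). Could someone tell me the best way to do this? Thanks for your help.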

Here is my sample data, produced with dput():

structure(list(n1 = c(5, 10, 5, 10, 5, 10), n2 = c(3, 3, 6, 6, 
3, 3), mean1 = c(4, 4, 4, 4, 6, 6), mean2 = c(15, 15, 15, 15, 
15, 15), sd1 = c(1, 1, 1, 1, 1, 1), sd2 = c(10, 10, 10, 10, 10, 
10)), out.attrs = list(dim = c(n1 = 2L, n2 = 2L, mean1 = 2L, 
mean2 = 2L, sd1 = 2L, sd2 = 2L), dimnames = list(n1 = c("n1= 5", 
"n1=10"), n2 = c("n2=3", "n2=6"), mean1 = c("mean1=4", "mean1=6"
), mean2 = c("mean2=15", "mean2=20"), sd1 = c("sd1=1", "sd1=5"
), sd2 = c("sd2=10", "sd2= 4"))), row.names = c(NA, 6L), class = "data.frame")
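
The replies in this thread call the data frame params. A minimal sketch of creating it, assuming the values from the dput() output above and leaving out the out.attrs attribute. The variance1 vector at the end is hypothetical and only illustrates the sqrt() conversion mentioned in the question:

```r
# Assumption: the data frame is named params, as in the replies.
params <- data.frame(
  n1    = c(5, 10, 5, 10, 5, 10),
  n2    = c(3, 3, 6, 6, 3, 3),
  mean1 = c(4, 4, 4, 4, 6, 6),
  mean2 = rep(15, 6),
  sd1   = rep(1, 6),
  sd2   = rep(10, 6)
)

# Hypothetical case: if a column held variances, as in the description,
# rnorm() would need the square root, because its sd argument is a
# standard deviation.
variance1 <- params$sd1^2
sd1_from_variance <- sqrt(variance1)

str(params)
```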

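Reply from a uj5u.com user:

You did not say how you plan to use the results. This stores both sets of random numbers in a list. It assumes the data frame is named params and that its columns are in the order n1, n2, mean1, mean2, sd1, sd2, as in the dput() output, because the values are picked by position: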

set.seed(42)   # For reproducibility
results <- apply(params, 1, function(x) list(first = rnorm(x[1], x[3], x[5]),
                                             second = rnorm(x[2], x[4], x[6])))
results[[1]]
# $first
# [1] 5.370958 3.435302 4.363128 4.632863 4.404268
# 
# $second
# [1] 13.93875 30.11522 14.05341
# 
results[[1]]$first
# [1] 5.370958 3.435302 4.363128 4.632863 4.404268
results[[1]]$second
# [1] 13.93875 30.11522 14.05341
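
apply() converts the data frame to a matrix, and the function above picks values by position, so it depends on the column order. Here is a sketch of the same idea with Map() and column names. The data frame is rebuilt from the question's values so the block runs on its own, and the name results_named is made up for illustration:

```r
# Assumption: same values as the dput() output in the question.
params <- data.frame(n1 = c(5, 10, 5, 10, 5, 10), n2 = c(3, 3, 6, 6, 3, 3),
                     mean1 = c(4, 4, 4, 4, 6, 6), mean2 = 15, sd1 = 1, sd2 = 10)

set.seed(42)
# Map() walks the six columns in parallel, one row at a time, and
# always returns a list with one element per row.
results_named <- with(params, Map(
  function(n1, mean1, sd1, n2, mean2, sd2) {
    list(first  = rnorm(n1, mean1, sd1),
         second = rnorm(n2, mean2, sd2))
  },
  n1, mean1, sd1, n2, mean2, sd2
))

# The sample sizes follow the n1 and n2 columns.
lengths(lapply(results_named, `[[`, "first"))
lengths(lapply(results_named, `[[`, "second"))
```

Because the columns are referred to by name, reordering the data frame does not change which value ends up as the mean or the standard deviation.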

If you want to use these samples in a t-test, you can do that directly, without storing the randomly generated values:

set.seed(42)
results.t <- apply(params, 1, function(x) t.test(rnorm(x[1], x[3], x[5]), 
             rnorm(x[2], x[4], x[6])))
results.t[[1]]
# 
#   Welch Two Sample t-test
# 
# data:  rnorm(x[1], x[3], x[5]) and rnorm(x[2], x[4], x[6])
# t = -2.7736, df = 2.0133, p-value = 0.1083
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -37.938884   8.083236
# sample estimates:
# mean of x mean of y 
#  4.441304 19.369128

Or you can reuse results:

results.t2 <- lapply(results, function(x) t.test(x$first, x$second))
results.t2[[1]]
# 
#   Welch Two Sample t-test
# 
# data:  x$first and x$second
# t = -2.7736, df = 2.0133, p-value = 0.1083
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -37.938884   8.083236
# sample estimates:
# mean of x mean of y 
#  4.441304 19.369128
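
Each element of such a list is an htest object, so individual numbers can be pulled out by name. Here is a sketch that collects the t statistic and p-value of every row into one data frame. The names tests and summary_t are made up for illustration, and the data frame is rebuilt from the question's values so the block runs on its own:

```r
# Assumption: same values as the dput() output in the question.
params <- data.frame(n1 = c(5, 10, 5, 10, 5, 10), n2 = c(3, 3, 6, 6, 3, 3),
                     mean1 = c(4, 4, 4, 4, 6, 6), mean2 = 15, sd1 = 1, sd2 = 10)

set.seed(42)
# One Welch two-sample t-test per row of params.
tests <- lapply(seq_len(nrow(params)), function(i) {
  t.test(rnorm(params$n1[i], params$mean1[i], params$sd1[i]),
         rnorm(params$n2[i], params$mean2[i], params$sd2[i]))
})

# vapply() checks that every extracted value is a single number.
summary_t <- data.frame(
  row     = seq_len(nrow(params)),
  t       = vapply(tests, function(res) unname(res$statistic), numeric(1)),
  p_value = vapply(tests, function(res) res$p.value, numeric(1))
)
summary_t
```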

Reply from a uj5u.com user:

You can save the generated data in lists. params is the data frame of parameters.

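data1 <- list()   # first sample of each row
data2 <- list()   # second sample of each row
for (i in seq_len(nrow(params))) {
  # draw both samples using the parameters in row i
  data_1i <- rnorm(n = params$n1[i], mean = params$mean1[i], sd = params$sd1[i])
  data_2i <- rnorm(n = params$n2[i], mean = params$mean2[i], sd = params$sd2[i])

  data1[[i]] <- data_1i
  data2[[i]] <- data_2i
}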
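
Once the loop has run, the two lists can be summarised row by row. Here is a sketch that preallocates the lists and then checks the sizes and means of the generated samples. The name check is made up for illustration, and the data frame is rebuilt from the question's values so the block runs on its own:

```r
# Assumption: same values as the dput() output in the question.
params <- data.frame(n1 = c(5, 10, 5, 10, 5, 10), n2 = c(3, 3, 6, 6, 3, 3),
                     mean1 = c(4, 4, 4, 4, 6, 6), mean2 = 15, sd1 = 1, sd2 = 10)

set.seed(42)
n_rows <- nrow(params)
data1 <- vector("list", n_rows)   # preallocated, one slot per row
data2 <- vector("list", n_rows)
for (i in seq_len(n_rows)) {
  data1[[i]] <- rnorm(n = params$n1[i], mean = params$mean1[i], sd = params$sd1[i])
  data2[[i]] <- rnorm(n = params$n2[i], mean = params$mean2[i], sd = params$sd2[i])
}

# Compare what was generated with what was requested.
check <- data.frame(
  n1           = lengths(data1),
  sample_mean1 = sapply(data1, mean),
  n2           = lengths(data2),
  sample_mean2 = sapply(data2, mean)
)
check
```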

Reply from a uj5u.com user:

A purrr approach:

library(purrr)
library(dplyr)
params %>%
  group_nest(row_number()) %>%   # one nested one-row tibble per row
  pull(data) %>%
  map(~ list(first = rnorm(n = .x$n1, mean = .x$mean1, sd = .x$sd1),
             second = rnorm(n = .x$n2, mean = .x$mean2, sd = .x$sd2)))
# each element is a list rather than a tibble, because first and second
# can have different lengths (n1 != n2) and tibble columns cannot
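
An alternative sketch uses purrr::pmap(), which iterates over the rows of a data frame and passes each row's columns to the function as named arguments. The name samples is made up for illustration, the data frame is rebuilt from the question's values so the block runs on its own, and the purrr package is assumed to be installed:

```r
library(purrr)

# Assumption: same values as the dput() output in the question.
params <- data.frame(n1 = c(5, 10, 5, 10, 5, 10), n2 = c(3, 3, 6, 6, 3, 3),
                     mean1 = c(4, 4, 4, 4, 6, 6), mean2 = 15, sd1 = 1, sd2 = 10)

set.seed(42)
# The argument names must match the column names of params.
samples <- pmap(params, function(n1, n2, mean1, mean2, sd1, sd2) {
  list(first  = rnorm(n1, mean1, sd1),
       second = rnorm(n2, mean2, sd2))
})

# Each element is a plain list, so the two samples may differ in length.
lengths(samples[[1]])
```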

