How to create repeatable random shuffles of datetime data to calculate the mean time difference

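I have a dataset of 1,000 events, with a datetime for event A and a datetime for event B in each row. I want to test whether there is some dependency between them. To do this, I want to randomly shuffle the times in A and in B, calculate the time difference for each observation (A to B), and then take the mean of all the time differences. I want to repeat this test 100 times.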

So I am looking for a loop or a function rather than copying and pasting code.


# the data frame is structured like this with many more observations

set.seed(10)

A <- sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 12)

B <- sample(seq(as.Date('2000/01/01'), as.Date('2010/01/01'), by="day"), 12)

df <- data.frame(A, B)

I have been able to generate the desired output as shown below, but it needs to be repeated many times, i.e. I need 100 mean_shuffled results:


shuffled_A = sample(df$A)
shuffled_B = sample(df$B)

df_shuffled <- data.frame(shuffled_A, shuffled_B)

df_shuffled$diff <- difftime(df_shuffled$shuffled_B, df_shuffled$shuffled_A)

mean_shuffled <- mean(df_shuffled$diff)

The following was added in response to a comment from @jblood94:


# the data frame is structured like this with many more observations

set.seed(100)

A <- sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 120)

B <- A + 2 # as I am testing that B is dependent on A, so B always takes place after A

df <- data.frame(A, B)

df = transform(df, C = sample(A), D = sample(B), E = sample(A), G = sample(B) ) # to create two shuffled diff times

df$diff <- difftime(df$B, df$A) # observed data
df$diff_shuffle1 <- abs(difftime(df$D, df$C, units = "days")) # A and B are at random times but I have added abs() as the diff time can be positive or negative
df$diff_shuffle2 <- abs(difftime(df$G, df$E, units = "days")) # A and B are at random times 2

mean(df$diff) # observed mean
mean(df$diff_shuffle1) # shuffled time difference between A and B if they happen at random times
mean(df$diff_shuffle2) # shuffled time difference between A and B if they happen at random times

Answer:

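You can wrap what you have already done in a for() loop that runs for a given number of simulations, nsims. Keep track of the current simulation in sim and append each result to output. Note the static name data for the original data frame and the name df for the data frame that is rebuilt on every pass through the loop.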

set.seed(100)

A <- sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 120)
B <- A + 2 # as I am testing that B is dependent on A, so B always takes place after A
data <- data.frame(A, B)

nsims <- 100
sim <- 1
output <- data.frame()

for(i in 1:nsims){
  df <- transform(data, C = sample(A), D = sample(B), E = sample(A), G = sample(B)) # to create two shuffled diff times
  df$diff <- difftime(df$B, df$A) # observed data
  df$diff_shuffle1 <- abs(difftime(df$D, df$C, units = "days")) # A and B are at random times; abs() is used because the diff time can be positive or negative
  df$diff_shuffle2 <- abs(difftime(df$G, df$E, units = "days")) # A and B are at random times 2
  obsM <- mean(df$diff) # observed mean
  shuf1M <- mean(df$diff_shuffle1) # shuffled time difference between A and B if they happen at random times
  shuf2M <- mean(df$diff_shuffle2) # shuffled time difference between A and B if they happen at random times
  out <- data.frame(obsM, shuf1M, shuf2M, sim) # one row of results for this simulation
  output <- rbind(output, out) # append the row to the running results
  sim <- sim + 1
}

output
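Since the question also asks about a function, the same idea can be sketched without an explicit loop. This is only one possible variant, not part of the answer above: shuffled_mean_diff is a hypothetical helper name, the example uses one shuffle per simulation instead of two, and it relies on base R's replicate() to repeat the call nsims times.

```r
# Hypothetical helper: shuffle A and B independently once and return the
# mean absolute time difference in days as a plain number.
shuffled_mean_diff <- function(data) {
  a <- sample(data$A)
  b <- sample(data$B)
  mean(abs(as.numeric(difftime(b, a, units = "days"))))
}

set.seed(100) # fixes the random number stream so the shuffles are repeatable

A <- sample(seq(as.Date("1999/01/01"), as.Date("2000/01/01"), by = "day"), 120)
B <- A + 2 # B always takes place two days after A
data <- data.frame(A, B)

nsims <- 100

# replicate() evaluates the expression nsims times and collects the results
# in a numeric vector of length nsims.
shuffled_means <- replicate(nsims, shuffled_mean_diff(data))

# Observed mean difference in days, for comparison with the shuffled means.
observed_mean <- mean(as.numeric(difftime(data$B, data$A, units = "days")))

summary(shuffled_means)

# Share of shuffles whose mean is at least as small as the observed mean.
mean(shuffled_means <= observed_mean)
```

Calling set.seed() once before replicate() is what makes the whole run reproducible; rerunning the script from that line gives the same 100 shuffled means. The last line is an assumption about how the shuffled means might be compared with the observed mean, and the right comparison depends on the hypothesis being tested.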
