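I am trying to write a loop that gives me all possible pairs of columns in a data frame. I want to use these pairs later to fit lm() models and run adf.test(). For example, I have a data frame like this:
df <- as.data.frame(cbind(1, 2, 3, 4))
From this I want to get all possible pairs:
pairs <- as.data.frame(cbind(c(1, 1, 1, 2, 2, 3), c(2,3,4,3,4,4)))
To do this, I tried several variations of nested for loops, such as: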
all_pairs = matrix(0, ((length(df)*(length(df)-1))/2), 2)
for (ij in 1:((length(df)*(length(df)-1))/2)) {
for (i in 1:(length(df)-1)) {
for (j in (i + 1):length(df)) {
all_pairs[ij, 1] = df[i,]
all_pairs[ij,2] = df[j,]
}
}
}
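The reason for ((length(df)*(length(df)-1))/2) is that n(n-1)/2 is the number of ways to choose two items out of n without replacement, which is how I count all the pairs.
As mentioned, I have tried several ways of doing this, but none of them work. Is this a good way to reach my goal? If so, how can I make it work?
Thanks in advance!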
A uj5u.com user replied:
Use combn with m = 2 to get the pairs:
data.frame(t(combn(1:4, m = 2)))
X1 X2
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
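For comparison, the nested loop from the question can probably be made to work as well. The sketch below assumes two things were going wrong: the outer loop over ij re-ran both inner loops for every row, so each row ended up holding the last pair, and df[i,] indexes rows although df has one row and four columns. The counter k is a name introduced here for illustration.

```r
df <- as.data.frame(cbind(1, 2, 3, 4))
n <- length(df)                         # number of columns in df

# Preallocate one row per pair: choose(n, 2) equals n * (n - 1) / 2
all_pairs <- matrix(0, nrow = choose(n, 2), ncol = 2)

k <- 1                                  # running row index into all_pairs
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    # Take the values from columns i and j of the single row
    all_pairs[k, ] <- c(df[1, i], df[1, j])
    k <- k + 1
  }
}

all_pairs
```

The two inner loops already visit every pair exactly once, so a single counter is enough and no third loop is needed. combn does the same job in one call and is usually the simpler choice.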
A uj5u.com user replied:
Try combn:
> as.data.frame(t(combn(df, 2)))
V1 V2
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
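Since the goal is to fit lm() on each pair, it may be easier to take combinations of the column names rather than of the values. The following is a minimal sketch under assumptions: the data frame dat and its columns x1 to x4 are made up for illustration, and the first name in each pair is arbitrarily treated as the response.

```r
# Hypothetical example data, not taken from the question
set.seed(1)
dat <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))

# All unordered pairs of column names, as a list of length-2 character vectors
name_pairs <- combn(names(dat), 2, simplify = FALSE)

# Fit one regression per pair: first name as response, second as predictor
fits <- lapply(name_pairs, function(p) {
  lm(reformulate(p[2], response = p[1]), data = dat)
})
names(fits) <- vapply(name_pairs, paste, character(1), collapse = "~")

# One fitted model per pair; inspect a single one with, e.g., summary(fits[["x1~x2"]])
length(fits)
```

The residuals of each fit could then be passed to adf.test(), for example with lapply(fits, function(m) tseries::adf.test(residuals(m))), assuming the tseries package is installed.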
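A uj5u.com user replied:
As I understand it, you are trying to build every possible combination of predictors and then fit a linear regression model for each. I wrote this function a few days ago, and you may be able to reuse it:
Here x holds the names of all predictors and y is the name of the target variable. The function returns a table listing every combination of predictors together with its metrics (RMSE, MAE, R-squared, and adjusted R-squared).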
# Requires the caTools, dplyr, stringi, and MLmetrics packages
library(caTools)  # sample.split()
library(dplyr)    # %>%, arrange(), desc()

LinearRegressionDA <- function(y, x, DatasetName, Split_Ratio = 0.75) {
  set.seed(12334)
  # Split the rows into training and test sets based on the target column
  split <- sample.split(DatasetName[[y]], SplitRatio = Split_Ratio)
  train <- subset(DatasetName, split == TRUE)
  test <- subset(DatasetName, split == FALSE)
  # Every non-empty subset of the predictor names, as a list of character vectors
  Data_list <- do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))
  # The same subsets as a table, one row per subset, padded with empty strings
  Data_dataframe <- data.frame(stringi::stri_list2matrix(Data_list, byrow = TRUE))
  Data_dataframe[is.na(Data_dataframe)] <- ""
  RMSE <- list()
  MAE <- list()
  Adj_R2 <- list()
  R2 <- list()
  for (i in seq_along(Data_list)) {
    # Build a formula such as Y1 ~ X1 + X3 and fit it on the training set
    model <- lm(as.formula(paste(y, "~", paste(Data_list[[i]], collapse = " + "))), data = train)
    predictions <- model %>% predict(test)
    # Model performance on the test set
    RMSE_ <- MLmetrics::RMSE(predictions, test[[y]])
    RMSE <- append(RMSE, RMSE_)
    MAE_ <- MLmetrics::MAE(predictions, test[[y]])
    MAE <- append(MAE, MAE_)
    # Goodness of fit on the training set
    Adj_R2_ <- summary(model)$adj.r.squared
    Adj_R2 <- append(Adj_R2, Adj_R2_)
    R2_ <- summary(model)$r.squared
    R2 <- append(R2, R2_)
  }
  Data_dataframe$RMSE <- round(unlist(RMSE), 3)
  Data_dataframe$MAE <- round(unlist(MAE), 5)
  Data_dataframe$Adj_R2 <- round(unlist(Adj_R2), 3)
  Data_dataframe$R2 <- round(unlist(R2), 3)
  # Return the table sorted by R-squared, wrapped in a list
  list(Data_dataframe %>% arrange(desc(R2)))
}
You can use this function as follows:
LinearRegressionDA(y = "Y1", x = c("X1" ,"X2", "X3","X4"), DatasetName = df)[[1]]
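To see what the function iterates over, the subset-building line can be run on its own. This is a small sketch with three made-up predictor names. The target name Y1 is taken from the call above, and the formulas vector is introduced here only for illustration.

```r
x <- c("X1", "X2", "X3")

# combn(x, m) for m = 1, 2, 3, concatenated into one flat list of subsets
subsets <- do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))

# Number of non-empty subsets of 3 predictors: 2^3 - 1
length(subsets)

# One model formula string per subset, joined with " + "
formulas <- vapply(
  subsets,
  function(s) paste("Y1 ~", paste(s, collapse = " + ")),
  character(1)
)
formulas
```

The number of models grows as 2^n - 1 with n predictors, so this exhaustive approach is only practical for a small number of predictors.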
Please credit the source when reposting. Link to this article: https://www.uj5u.com/gongcheng/466652.html
