str1<-c("A","B","C","D","E","F")
str2<-c("Apple", "Mango", "Avocado", "Watermelon", "Banana", "Pineapple")
str3<-c("Mouse","Cat", "Lion", "Shark", "Eagle", "Ladybug")
num1<-c(1:6)
num2<-c(2.3, 3.5, 4, 7, 6.2, 3)
binary1<-c(0,1,0,1,0,0)
binary2<-c(1,1,0,0,0,1)
mydata<-data.frame(str1,str2, str3,num1,num2, binary1, binary2)
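People always say that vectorization is better than loops.
So I would like to know how to recode many variables using vectorization instead of loops:
My first task is to convert str1, str2 and str3 to factors. I used: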
for (i in c("str1", "str2", "str3")) {
  # [[i]] extracts the column vector; [i] would return a one-column data frame
  mydata[[i]] <- as.factor(mydata[[i]])
}
My second task is to convert binary1 and binary2 to factors and recode their values as 0 = No, 1 = Yes. I used:
for (i in c("binary1", "binary2")) {
  # [[i]] extracts the column vector; [i] would return a one-column data frame
  mydata[[i]] <- factor(mydata[[i]], levels = c(0, 1), labels = c("No", "Yes"))
}
How can I use vectorization instead of a loop in each case?
uj5u.com user reply:
For example, by using dplyr:
library(dplyr)
mydata %>%
  mutate(across(1:3, as.factor),
         # factor() with levels/labels keeps the binary columns as factors
         across(starts_with("bin"), ~factor(.x, levels = c(0, 1), labels = c("No", "Yes"))))
  str1       str2    str3 num1 num2 binary1 binary2
1    A      Apple   Mouse    1  2.3      No     Yes
2    B      Mango     Cat    2  3.5     Yes     Yes
3    C    Avocado    Lion    3  4.0      No      No
4    D Watermelon   Shark    4  7.0     Yes      No
5    E     Banana   Eagle    5  6.2      No      No
6    F  Pineapple Ladybug    6  3.0      No     Yes
uj5u.com user reply:
You can use the map() function from purrr.
# Convert str1, str2 and str3 to factors with map()
mydata[, c("str1", "str2", "str3")] <-
  purrr::map(mydata[, c("str1", "str2", "str3")],
             .f = as.factor)
str(mydata)
# Convert binary1 and binary2 to factors, recoding 0 = No and 1 = Yes, with map()
mydata[, c("binary1", "binary2")] <-
  purrr::map(mydata[, c("binary1", "binary2")],
             .f = factor, levels = c(0, 1), labels = c("No", "Yes"))
str(mydata)
'data.frame': 6 obs. of  7 variables:
 $ str1   : Factor w/ 6 levels "A","B","C","D",..: 1 2 3 4 5 6
 $ str2   : Factor w/ 6 levels "Apple","Avocado",..: 1 4 2 6 3 5
 $ str3   : Factor w/ 6 levels "Cat","Eagle",..: 5 1 4 6 2 3
 $ num1   : int  1 2 3 4 5 6
 $ num2   : num  2.3 3.5 4 7 6.2 3
 $ binary1: Factor w/ 2 levels "No","Yes": 1 2 1 2 1 1
 $ binary2: Factor w/ 2 levels "No","Yes": 2 2 1 1 1 2
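The same pattern can also be written in base R with lapply(), without loading any package. This is only a minimal sketch: the names str_cols and bin_cols are introduced here for illustration, and the data frame is rebuilt from the question so the snippet runs on its own. Note that lapply() is an apply-style idiom that still iterates over the columns internally, so it is a more compact way to write the loop rather than vectorization in the strict sense.

```r
# Rebuild the example data from the question (base R only, no packages needed).
mydata <- data.frame(
  str1    = c("A", "B", "C", "D", "E", "F"),
  str2    = c("Apple", "Mango", "Avocado", "Watermelon", "Banana", "Pineapple"),
  str3    = c("Mouse", "Cat", "Lion", "Shark", "Eagle", "Ladybug"),
  num1    = 1:6,
  num2    = c(2.3, 3.5, 4, 7, 6.2, 3),
  binary1 = c(0, 1, 0, 1, 0, 0),
  binary2 = c(1, 1, 0, 0, 0, 1)
)

# Hypothetical helper names for the two groups of columns.
str_cols <- c("str1", "str2", "str3")
bin_cols <- c("binary1", "binary2")

# lapply() returns a list with one element per column; assigning that list
# back to mydata[str_cols] replaces the selected columns.
mydata[str_cols] <- lapply(mydata[str_cols], as.factor)

# Extra arguments after the function name are passed on to factor().
mydata[bin_cols] <- lapply(mydata[bin_cols], factor,
                           levels = c(0, 1), labels = c("No", "Yes"))

# Inspect the resulting column classes.
sapply(mydata, class)
```

Selecting columns by name, as here, keeps the code independent of column order, which positional selections such as 1:3 do not.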
uj5u.com user reply:
Here is an alternative solution using data.table.
- Code
library(data.table)
sel_cols1 <- c("str1", "str2", "str3")
sel_cols2 <- c("binary1", "binary2")
setDT(mydata)[, (sel_cols1) := lapply(.SD, as.factor), .SDcols = sel_cols1
][, (sel_cols2) := lapply(.SD, function(x) as.factor(fifelse(x == 0, "No", "Yes"))), .SDcols = sel_cols2][]
- Output
#>    str1       str2    str3 num1 num2 binary1 binary2
#> 1:    A      Apple   Mouse    1  2.3      No     Yes
#> 2:    B      Mango     Cat    2  3.5     Yes     Yes
#> 3:    C    Avocado    Lion    3  4.0      No      No
#> 4:    D Watermelon   Shark    4  7.0     Yes      No
#> 5:    E     Banana   Eagle    5  6.2      No      No
#> 6:    F  Pineapple Ladybug    6  3.0      No     Yes
- Check
Class of each variable:
sapply(mydata,class)
#>      str1      str2      str3      num1      num2   binary1   binary2
#>  "factor"  "factor"  "factor" "integer" "numeric"  "factor"  "factor"
Created on 2021-11-16 by the reprex package (v2.0.1)
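Beyond checking the class, it may be worth confirming that the recoding really maps 0 to No and 1 to Yes. The following base R sketch does this for the binary1 values from the question; the names raw and recoded are hypothetical and used only for illustration.

```r
# Original binary1 values from the question.
raw <- c(0, 1, 0, 1, 0, 0)

# Same recoding as in the answers: 0 -> "No", 1 -> "Yes".
recoded <- factor(raw, levels = c(0, 1), labels = c("No", "Yes"))

# Cross-tabulate: every 0 should fall under "No" and every 1 under "Yes".
table(raw, recoded)

# as.integer() on a factor returns the level codes (1 = "No", 2 = "Yes"),
# so subtracting 1 recovers the original 0/1 values.
stopifnot(identical(as.integer(recoded) - 1L, as.integer(raw)))
```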
