I want to use a function to do the following:
categoricalToNumeric <- function(data,...) {
for(i in list(...)) {
data$i <- as.numeric(as.factor(data$i))
}
summary(data)
}
and then call it like this:
categoricalToNumeric(data, 'school', 'sex', 'address', 'famsize', 'Pstatus', 'Mjob', 'Fjob', 'reason', 'nursery', 'internet', 'guardian.x', 'schoolsup.x', 'famsup.x', 'paid.x', 'activities.x', 'higher.x', 'romantic.x', 'guardian.y', 'schoolsup.y', 'famsup.y', 'paid.y', 'activities.y', 'higher.y', 'romantic.y')
Currently there is no error, but the data variable is unchanged after calling categoricalToNumeric.
Data: https://archive.ics.uci.edu/ml/machine-learning-databases/00320/student.zip
Setup:
data_mat=read.table("./data/csv/student-mat.csv",sep=";",header=TRUE)
data_por=read.table("./data/csv/student-por.csv",sep=";",header=TRUE)
data=merge(data_mat,data_por,by=c("school","sex","age","address","famsize","Pstatus","Medu","Fedu","Mjob","Fjob","reason","nursery","internet"))
print(nrow(data)) # 382 data
head(data,5)
Reply from a uj5u.com user:
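This is odd, but it runs. For convenience, I replaced ... with colnames. Note that every column in the output below ends up with identical statistics, so the loop is not converting each column separately: data$i looks up a column literally named i (most likely partial-matching internet) rather than using the loop variable.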
categoricalToNumeric2 <- function(data,...) {
for(i in colnames(data)) {
data[i] <- as.numeric(as.factor(data$i))
}
summary(data)
}
categoricalToNumeric2(data)
school sex age address famsize Pstatus Medu Fedu
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
Mjob Fjob reason nursery internet guardian.x traveltime.x studytime.x
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
failures.x schoolsup.x famsup.x paid.x activities.x higher.x romantic.x famrel.x
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
freetime.x goout.x Dalc.x Walc.x health.x absences.x G1.x G2.x
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
G3.x guardian.y traveltime.y studytime.y failures.y schoolsup.y famsup.y paid.y
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
activities.y higher.y romantic.y famrel.y freetime.y goout.y Dalc.y Walc.y
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
health.y absences.y G1.y G2.y G3.y
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848 Mean :1.848
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
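The identical statistics in every column above are consistent with how $ resolves names: it does not evaluate the loop variable, and when no column is called exactly i it falls back to partial matching. The following is a minimal sketch on a made-up toy data frame (df and its values are assumptions for illustration, not the student data) that contrasts $ with [[:

```r
# Toy data frame (hypothetical values) with one column whose name starts with "i".
df <- data.frame(
  school   = c("GP", "MS", "GP"),
  internet = c("yes", "no", "yes"),
  stringsAsFactors = FALSE
)

i <- "school"

# `$` ignores the variable i and looks for a column named "i";
# with no exact match it partial-matches "internet".
via_dollar <- df$i

# `[[` evaluates i, so it returns the "school" column.
via_brackets <- df[[i]]

print(via_dollar)
print(via_brackets)
```

In the merged student data, internet appears to be the only column name beginning with "i", which would explain why every column was overwritten with the same two-level values.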
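Reply from a uj5u.com user:
data$i is not a valid way to extract a column inside a loop, because $ treats i as a literal column name instead of evaluating the loop variable. You can use [[ for a single column or [ for multiple columns. An alternative to the for loop is to use lapply.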
categoricalToNumeric <- function(data,...) {
cols <- c(...)
data[cols] <- lapply(data[cols], function(x) as.numeric(as.factor(x)))
summary(data)
}
categoricalToNumeric(data, 'school', 'sex', ...rest of the columns)
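The function above still returns only the summary, so the caller's data stays as it was: R modifies a local copy inside the function. One possible variation, sketched here under assumptions (the name categorical_to_numeric and the toy data are hypothetical, not from the original post), returns the converted data frame and assigns it back:

```r
# Hypothetical helper name; converts the named columns to numeric codes
# and returns the modified data frame instead of its summary.
categorical_to_numeric <- function(data, ...) {
  cols <- c(...)
  for (col in cols) {
    # [[ evaluates col, so each iteration touches a different column
    data[[col]] <- as.numeric(as.factor(data[[col]]))
  }
  data
}

# Toy data (made-up values) standing in for the student data set
toy <- data.frame(
  school = c("GP", "MS", "GP"),
  sex    = c("F", "M", "F"),
  age    = c(15, 16, 17),
  stringsAsFactors = FALSE
)

# Changes inside the function affect only a local copy,
# so assign the returned data frame back to the variable.
toy <- categorical_to_numeric(toy, "school", "sex")
str(toy)
```

Calling summary(toy) afterwards gives the same overview as before, while keeping the converted columns available for later steps.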
