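I'm just getting started with the validate package. Unfortunately, I ran into a problem right away and cannot find a proper solution. I want to create a validation rule that can then be applied to multiple variables. Here is an example. I have a tibble like this: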
library(tidyverse)
library(validate)
df = tibble(
id = rep(1:10, each=20),
name = rep(paste0("v", 1:20), 10),
value = rnorm(length(name))
) %>% pivot_wider()
Output:
# A tibble: 10 x 21
id v1 v2 v3 v4 v5 v6 v7 v8 v9 v10
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1.20 0.182 -1.53 2.73 -1.60 -0.976 -0.767 -2.28 -0.257 0.736
2 2 0.484 0.913 -0.873 -0.801 0.172 1.11 -1.71 0.0125 0.0450 0.374
3 3 -0.604 -0.405 0.482 0.998 -0.634 0.212 0.717 0.598 -0.876 0.139
4 4 -0.324 -1.83 0.0195 -1.70 0.506 -0.139 3.21 -0.00169 -0.200 -1.03
5 5 0.268 1.40 0.349 0.667 1.76 0.926 -1.09 -0.487 2.03 0.203
6 6 0.646 0.516 0.849 -0.619 -2.18 0.126 -0.0956 -0.471 0.0342 0.530
7 7 -1.03 -1.27 -0.0716 -2.13 -0.340 1.20 0.746 -0.366 -2.82 -0.431
8 8 0.415 0.313 0.591 -0.0552 0.132 1.86 -0.427 0.390 -0.506 -0.470
9 9 0.309 1.13 -0.472 0.760 -0.549 -0.954 -0.219 -0.653 0.335 -0.870
10 10 1.06 1.30 1.12 0.646 0.279 -1.45 -0.891 -0.278 0.637 0.236
# ... with 10 more variables: v11 <dbl>, v12 <dbl>, v13 <dbl>, v14 <dbl>, v15 <dbl>,
# v16 <dbl>, v17 <dbl>, v18 <dbl>, v19 <dbl>, v20 <dbl>
I can validate a single variable with the following rules:
df %>%
confront(
validator(
num.val = is.numeric(v1),
big.val = !(v1>10),
low.val = !(v1< -10),
NA.val = !is.na(v1)
)
) %>% summary()
# name items passes fails nNA error warning expression
# 1 num.val 1 1 0 0 FALSE FALSE is.numeric(v1)
# 2 big.val 10 10 0 0 FALSE FALSE v1 <= 10
# 3 low.val 10 10 0 0 FALSE FALSE v1 >= -10
# 4 NA.val 10 10 0 0 FALSE FALSE !is.na(v1)
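However, I would like to apply these rules to multiple columns using some simple notation. Unfortunately, the code below does not work: every rule raises a warning and reports only a single item.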
df %>%
confront(
validator(
num.val = is.numeric(v1:v20),
big.val = !(v1:v20>10),
low.val = !(v1:v20< -10),
NA.val = !is.na(v1:v20)
)
) %>% summary()
# name items passes fails nNA error warning expression
# 1 num.val 1 1 0 0 FALSE TRUE is.numeric(v1:v20)
# 2 big.val 1 1 0 0 FALSE TRUE v1:v20 <= 10
# 3 low.val 1 1 0 0 FALSE TRUE v1:v20 >= -10
# 4 NA.val 1 1 0 0 FALSE TRUE !is.na(v1:v20)
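A likely explanation, assuming validate evaluates rule expressions as ordinary R code on the columns: inside a rule, `v1:v20` is not a tidyselect column range but base R's sequence operator applied to two numeric vectors. In current R versions that uses only the first element of each vector and emits a warning, which would match the `warning = TRUE` column above. The sketch below reproduces this in base R with two made-up vectors (`v1` and `v20` here are illustrative values, not the question's data).

```r
# Two made-up numeric vectors standing in for columns v1 and v20.
v1  <- c(1.2, 0.5, -0.6)
v20 <- c(3.7, 0.2, 1.1)

# Base R's `:` builds a sequence from v1[1] to v20[1]; the remaining
# elements are ignored with a warning, which is suppressed here.
seq_result <- suppressWarnings(v1:v20)
print(seq_result)

# Column names for a range have to be built explicitly instead.
cols <- paste0("v", 1:20)
print(cols)
```

So the comparison `v1:v20 > 10` checks an unrelated numeric sequence rather than the twenty columns.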
I know I can always convert my data to long format:
df %>%
pivot_longer(v1:v20) %>%
confront(
validator(
num.val = is.numeric(value),
big.val = !(value>10),
low.val = !(value< -10),
NA.val = !is.na(value)
)
) %>% summary()
# name items passes fails nNA error warning expression
# 1 num.val 1 1 0 0 FALSE FALSE is.numeric(value)
# 2 big.val 200 200 0 0 FALSE FALSE value <= 10
# 3 low.val 200 200 0 0 FALSE FALSE value >= -10
# 4 NA.val 200 200 0 0 FALSE FALSE !is.na(value)
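In that case, however, I can no longer tell which variable failed validation.
Any suggestions on how to easily apply one validation rule to multiple selected variables?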
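Answer from a uj5u.com user:
This approach comes from validate::syntax: `.` stands for the whole data set. It does not work for num.val, though, because is.numeric(.) tests the data frame itself rather than each column, so that rule fails (see the output below). I looked through the Data Validation Cookbook but could not find a simple way to select multiple columns.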
df %>%
select(-id) %>%
confront(
validator(
num.val = is.numeric(.),
big.val = !(.>10),
low.val = !(.< -10),
NA.val = !is.na(.)
)
) %>% summary()
name items passes fails nNA error warning expression
1 num.val 1 0 1 0 FALSE FALSE is.numeric(.)
2 big.val 200 200 0 0 FALSE FALSE . <= 10
3 low.val 200 200 0 0 FALSE FALSE . >= -10
4 NA.val 200 200 0 0 FALSE FALSE !is.na(.)
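One possible way to keep per-variable results without reshaping is to generate the rules as text and pass them to `validator()` through its `.data` argument, which accepts a data frame with `rule`, `name` and `description` columns. The following is a minimal sketch under that assumption, not a built-in column-selection feature of validate. The three-column data frame, the `templates` vector and the `variable.check` naming scheme are illustrative choices.

```r
library(validate)

set.seed(1)
# Small wide data frame standing in for the tibble in the question.
df <- data.frame(id = 1:10, v1 = rnorm(10), v2 = rnorm(10), v3 = rnorm(10))

# Columns to validate (everything except the id).
vars <- setdiff(names(df), "id")

# Rule templates; "%s" is replaced by each column name.
templates <- c(
  num.val = "is.numeric(%s)",
  big.val = "%s <= 10",
  low.val = "%s >= -10",
  NA.val  = "!is.na(%s)"
)

# One row per (check, variable) combination.
grid <- expand.grid(check = names(templates), var = vars,
                    stringsAsFactors = FALSE)

rule_df <- data.frame(
  rule        = sprintf(unname(templates[grid$check]), grid$var),
  name        = paste(grid$var, grid$check, sep = "."),
  description = "",
  stringsAsFactors = FALSE
)

# Build the rule set from the data frame and confront the data with it.
rules <- validator(.data = rule_df)
res   <- summary(confront(df, rules))
print(res[, c("name", "items", "passes", "fails")])
```

Each row of the summary is then named after both the variable and the check (for example `v2.big.val`), so a failing column can be identified directly.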
Answer from a uj5u.com user:
If we extend the OP's pivot_longer code with group_split, so that each variable is validated separately, it should work:
library(purrr)
library(dplyr)
library(tidyr)
out <- df %>%
pivot_longer(v1:v20) %>%
group_split(name) %>%
map(~ .x %>% confront(
validator(
num.val = is.numeric(value),
big.val = !(value>10),
low.val = !(value< -10),
NA.val = !is.na(value)
)
) %>% summary())
Output:
> out[1:4]
[[1]]
name items passes fails nNA error warning expression
1 num.val 1 1 0 0 FALSE FALSE is.numeric(value)
2 big.val 10 10 0 0 FALSE FALSE value <= 10
3 low.val 10 10 0 0 FALSE FALSE value >= -10
4 NA.val 10 10 0 0 FALSE FALSE !is.na(value)
[[2]]
name items passes fails nNA error warning expression
1 num.val 1 1 0 0 FALSE FALSE is.numeric(value)
2 big.val 10 10 0 0 FALSE FALSE value <= 10
3 low.val 10 10 0 0 FALSE FALSE value >= -10
4 NA.val 10 10 0 0 FALSE FALSE !is.na(value)
[[3]]
name items passes fails nNA error warning expression
1 num.val 1 1 0 0 FALSE FALSE is.numeric(value)
2 big.val 10 10 0 0 FALSE FALSE value <= 10
3 low.val 10 10 0 0 FALSE FALSE value >= -10
4 NA.val 10 10 0 0 FALSE FALSE !is.na(value)
[[4]]
name items passes fails nNA error warning expression
1 num.val 1 1 0 0 FALSE FALSE is.numeric(value)
2 big.val 10 10 0 0 FALSE FALSE value <= 10
3 low.val 10 10 0 0 FALSE FALSE value >= -10
4 NA.val 10 10 0 0 FALSE FALSE !is.na(value)
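Note that `group_split()` returns an unnamed list ordered by the sorted group key, so with character names the first four elements correspond to v1, v10, v11 and v12 rather than v1 to v4. A small variation, sketched here with base R's `split()` on a made-up three-variable long table, keeps the variable names attached to each summary. The names `long`, `per_var` and `combined` are illustrative.

```r
library(validate)

set.seed(42)
# Long-format stand-in for the pivoted data: 3 variables, 10 ids each.
long <- data.frame(
  id    = rep(1:10, times = 3),
  name  = rep(c("v1", "v2", "v3"), each = 10),
  value = rnorm(30)
)

# The rule set is defined once and reused for every variable.
rules <- validator(
  num.val = is.numeric(value),
  big.val = value <= 10,
  low.val = value >= -10,
  NA.val  = !is.na(value)
)

# split() returns a list named by variable, unlike group_split().
by_var  <- split(long, long$name)
per_var <- lapply(by_var, function(d) summary(confront(d, rules)))

# Stack the summaries into one table with a column naming the variable.
combined <- do.call(
  rbind,
  Map(function(s, v) cbind(variable = v, s), per_var, names(per_var))
)
print(combined[, c("variable", "name", "items", "passes", "fails")])
```

Filtering `combined` on `fails > 0` would then show which variable broke which rule.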
