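I have a dataset in R in which some of the variable names are dates; see the simplified example of the input data below (in Excel):

What I want to do with this data is delete the columns whose names are dates earlier than or equal to a given date, for example 2019-01-31. See the simplified example of the desired output data below (in Excel):

Right now I can achieve this by transposing the data, filtering out the rows whose date is on or before 2019-01-31, and finally transposing the data back. But I would like to know whether there is a different way that works on the column names alone, without reshaping back and forth.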
# Example data to copy and paste into R for easy reproduction of problem:
df <- data.frame(
  id = c("apples", "pears", "grapes", "tomatoes", "carrots", "cucumber", "rabbit", "cat", "dog"),
  type = c("fruit", "fruit", "fruit", "veggies", "veggies", "veggies", "pets", "pets", "pets"),
  color = c("red", "green", "purple", "red", "orange", "green", "grey", "black", "brown"),
  '2019-04-30' = c(353, 91, 270, 2029, 107, 62, 30, 61, 137),
  '2019-03-31' = c(349, 90, 267, 2028, 104, 60, 29, 59, 133),
  '2019-02-28' = c(345, 89, 264, 2027, 101, 58, 28, 57, 129),
  '2019-01-31' = c(341, 88, 261, 2026, 98, 56, 27, 55, 125),
  '2018-12-31' = c(337, 87, 258, 2025, 95, 54, 26, 53, 121),
  '2018-11-30' = c(333, 86, 255, 2024, 92, 52, 25, 51, 117),
  check.names = FALSE
)
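Answer from a uj5u.com user:
We can do this in base R. Your dates are conveniently in YYYY-MM-DD format, which means that plain string comparison with the >= and <= operators orders them chronologically. We can also use a simple regular expression to keep any column whose name is not in date format: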
df[!grepl('\\d{4}-\\d{2}-\\d{2}', colnames(df)) | colnames(df) >= '2019-02-28']
        id    type  color 2019-04-30 2019-03-31 2019-02-28
1   apples   fruit    red        353        349        345
2    pears   fruit  green         91         90         89
3   grapes   fruit purple        270        267        264
4 tomatoes veggies    red       2029       2028       2027
5  carrots veggies orange        107        104        101
6 cucumber veggies  green         62         60         58
7   rabbit    pets   grey         30         29         28
8      cat    pets  black         61         59         57
9      dog    pets  brown        137        133        129
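The condition above compares against '2019-02-28', which happens to be the first month-end after the cutoff in this data. A minimal sketch of a slightly more general variant follows, assuming the cutoff is supplied as a string in the same YYYY-MM-DD format. The names cutoff, is_date and keep are illustrative and not part of the original answer, and the data frame is a shortened stand-in for the question's data. The regular expression is anchored so that only names consisting entirely of a date match, and a strict > drops the cutoff column itself.

```r
# Shortened stand-in for the question's data (check.names = FALSE keeps the date names intact)
df <- data.frame(
  id = c("apples", "pears"),
  type = c("fruit", "fruit"),
  "2019-02-28" = c(345, 89),
  "2019-01-31" = c(341, 88),
  "2018-12-31" = c(337, 87),
  check.names = FALSE
)

cutoff <- "2019-01-31"  # assumed cutoff: drop columns dated on or before this

# TRUE only for names that consist of a YYYY-MM-DD date and nothing else
is_date <- grepl("^\\d{4}-\\d{2}-\\d{2}$", colnames(df))

# Keep non-date columns, plus date columns strictly after the cutoff.
# String comparison is enough here because zero-padded ISO dates sort chronologically.
keep <- !is_date | colnames(df) > cutoff

result <- df[keep]
print(colnames(result))
```

Note that this relies on every date name being zero-padded; a name such as 2019-1-31 would neither match the pattern nor compare correctly as a string.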
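Answer from a uj5u.com user:
Here is one approach:
- Extract the column names.
- Convert each name to Date if possible, or to NA if it is not a date.
- Create a logical vector that keeps the non-date columns (the NAs from the previous step) and filters the date columns against the cutoff.
Sample data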
## sample data frame
m <- matrix(1, 3, 10)
colnames(m) <- c("a", "b", as.character(seq.Date(as.Date("2021-1-1"), length.out = 8, by = "days")))
(d <- as.data.frame(m))
#   a b 2021-01-01 2021-01-02 2021-01-03 2021-01-04 2021-01-05 2021-01-06 2021-01-07 2021-01-08
# 1 1 1          1          1          1          1          1          1          1          1
# 2 1 1          1          1          1          1          1          1          1          1
# 3 1 1          1          1          1          1          1          1          1          1
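Filtering
## with optional = TRUE, as.Date() returns NA for names that are not dates
r <- vapply(names(d), as.Date, numeric(1), optional = TRUE)
## keep the non-date columns and the columns dated on or before 2021-01-03
d[, is.na(r) | r <= as.Date("2021-1-3")]
#   a b 2021-01-01 2021-01-02 2021-01-03
# 1 1 1          1          1          1
# 2 1 1          1          1          1
# 3 1 1          1          1          1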
## the same approach on the question's data: drop the columns dated on or before 2019-01-31
r <- vapply(names(df), as.Date, numeric(1), optional = TRUE)
df[, is.na(r) | r > as.Date("2019-01-31")]
#         id    type  color 2019-04-30 2019-03-31 2019-02-28
# 1   apples   fruit    red        353        349        345
# 2    pears   fruit  green         91         90         89
# 3   grapes   fruit purple        270        267        264
# 4 tomatoes veggies    red       2029       2028       2027
# 5  carrots veggies orange        107        104        101
# 6 cucumber veggies  green         62         60         58
# 7   rabbit    pets   grey         30         29         28
# 8      cat    pets  black         61         59         57
# 9      dog    pets  brown        137        133        129
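As a possible simplification, as.Date() is vectorised and returns NA for strings that do not match when an explicit format is supplied, so the vapply() step can be written as a single call. The sketch below wraps this in a small helper. The function name drop_cols_through and its arguments are hypothetical and not part of the original answer, and the data frame is a shortened stand-in for the question's data.

```r
# Hypothetical helper: drop columns whose name is a date on or before `cutoff`.
# Columns whose names are not dates are always kept.
drop_cols_through <- function(data, cutoff, format = "%Y-%m-%d") {
  # With an explicit format, names that cannot be parsed become NA instead of raising an error
  col_dates <- as.Date(names(data), format = format)
  keep <- is.na(col_dates) | col_dates > as.Date(cutoff)
  data[, keep, drop = FALSE]  # drop = FALSE returns a data frame even if one column remains
}

# Shortened stand-in for the question's data
df <- data.frame(
  id = c("apples", "pears"),
  type = c("fruit", "fruit"),
  "2019-02-28" = c(345, 89),
  "2019-01-31" = c(341, 88),
  "2018-12-31" = c(337, 87),
  check.names = FALSE
)

trimmed <- drop_cols_through(df, "2019-01-31")
print(names(trimmed))
```

Comparing parsed dates rather than strings means the cutoff does not have to be zero-padded, at the cost of one parsing step over the column names.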
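Answer from a uj5u.com user:
Description
The data can be reshaped to long format and filtered on the date column.
Data
The same data as provided in the question's example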
df <- data.frame(
  id = c("apples", "pears", "grapes", "tomatoes", "carrots", "cucumber", "rabbit", "cat", "dog"),
  type = c("fruit", "fruit", "fruit", "veggies", "veggies", "veggies", "pets", "pets", "pets"),
  color = c("red", "green", "purple", "red", "orange", "green", "grey", "black", "brown"),
  '2019-04-30' = c(353, 91, 270, 2029, 107, 62, 30, 61, 137),
  '2019-03-31' = c(349, 90, 267, 2028, 104, 60, 29, 59, 133),
  '2019-02-28' = c(345, 89, 264, 2027, 101, 58, 28, 57, 129),
  '2019-01-31' = c(341, 88, 261, 2026, 98, 56, 27, 55, 125),
  '2018-12-31' = c(337, 87, 258, 2025, 95, 54, 26, 53, 121),
  '2018-11-30' = c(333, 86, 255, 2024, 92, 52, 25, 51, 117),
  check.names = FALSE
)
Solution
library(dplyr)
library(tidyr)
df %>%
  # one row per id and date column
  tidyr::pivot_longer(cols = !c(id, type, color), names_to = 'date', values_to = 'value') %>%
  dplyr::mutate(date = as.Date(date, format = '%Y-%m-%d')) %>%
  # strict > so that the 2019-01-31 column is dropped as well, as the question asks
  dplyr::filter(date > as.Date('2019-01-31')) %>%
  # back to one column per remaining date
  tidyr::pivot_wider(names_from = 'date', values_from = 'value')
Expected output
  id       type    color  `2019-04-30` `2019-03-31` `2019-02-28`
  <chr>    <chr>   <chr>         <dbl>        <dbl>        <dbl>
1 apples   fruit   red             353          349          345
2 pears    fruit   green            91           90           89
3 grapes   fruit   purple          270          267          264
4 tomatoes veggies red            2029         2028         2027
5 carrots  veggies orange          107          104          101
6 cucumber veggies green            62           60           58
7 rabbit   pets    grey             30           29           28
8 cat      pets    black            61           59           57
9 dog      pets    brown           137          133          129
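Since the question asked for a way to avoid reshaping, the same result can probably be reached in dplyr by selecting on the column names directly. This is a sketch under that assumption and is not part of the original answer. The names col_dates and too_old are illustrative, and the data frame is a shortened stand-in for the question's data.

```r
library(dplyr)

# Shortened stand-in for the question's data
df <- data.frame(
  id = c("apples", "pears"),
  type = c("fruit", "fruit"),
  color = c("red", "green"),
  "2019-02-28" = c(345, 89),
  "2019-01-31" = c(341, 88),
  "2018-12-31" = c(337, 87),
  check.names = FALSE
)

# Parse the column names as dates; non-date names such as "id" become NA
col_dates <- as.Date(names(df), format = "%Y-%m-%d")

# Names of the columns dated on or before the cutoff
too_old <- names(df)[!is.na(col_dates) & col_dates <= as.Date("2019-01-31")]

# Drop those columns by name, with no pivoting
result <- df %>% select(!all_of(too_old))
print(names(result))
```

This keeps the column types and order untouched, whereas the pivot round trip rebuilds the date columns from the long-format values.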
