How to add a column based on the values of the columns indicated by another column in a tibble in R

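In the example below, I want to add a column value whose entries are taken from whichever column the variable column names in each row (i.e. 1 and 20).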

toy_data <-
  tibble::tribble(
    ~x, ~y, ~variable,
    1,  2,  "x",
    10, 20, "y"
  )

Like this:

x	y	variable	value
1	2	x	1
10	20	y	20

However, none of the following approaches work:

toy_data %>%
  dplyr::mutate(
    value = get(variable)
  )

toy_data %>%
  dplyr::mutate(
    value = mget(variable)
  )

toy_data %>%
  dplyr::mutate(
    value = mget(variable, inherits = TRUE)
  )

toy_data %>%
  dplyr::mutate(
    value = !!variable
  )

How can I do this?

A uj5u.com user replied:

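Here are a few options that should scale reasonably well.

The first is a base R option that iterates over the variable column together with its row index. (I made a copy of the data frame so that the original is preserved for the later examples.)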

library(dplyr)

toy2 <- toy_data
toy2$value <- mapply(function(v, i) toy_data[[v]][i], toy_data$variable, seq_along(toy_data$variable))
toy2
#> # A tibble: 2 × 4
#>       x     y variable value
#>   <dbl> <dbl> <chr>    <dbl>
#> 1     1     2 x            1
#> 2    10    20 y           20
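As a possible alternative to the mapply() call above, base R matrix indexing can do the same lookup without a per-row function call. This is only a sketch and is not part of the original answer: it assumes every column other than variable is numeric, and the names value_cols, idx and toy4 are made up for illustration.

```r
library(tibble)

toy_data <- tribble(
  ~x, ~y, ~variable,
  1,  2,  "x",
  10, 20, "y"
)

# Columns that `variable` may point to (assumption: all other columns, all numeric)
value_cols <- setdiff(names(toy_data), "variable")

# Two-column index matrix: row number, and position of the named column
idx <- cbind(
  seq_len(nrow(toy_data)),
  match(toy_data$variable, value_cols)
)

toy4 <- toy_data
# Indexing a matrix with a two-column matrix picks one cell per row of `idx`
toy4$value <- as.matrix(toy_data[value_cols])[idx]
toy4
```

A name in variable that does not match any column yields NA from match(), and therefore NA in value, rather than an error.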

The second uses purrr::imap_dbl to iterate over variable and its index and return a double vector.

toy_data %>%
  mutate(value = purrr::imap_dbl(variable, function(v, i) toy_data[[v]][i]))
#> # A tibble: 2 × 4
#>       x     y variable value
#>   <dbl> <dbl> <chr>    <dbl>
#> 1     1     2 x            1
#> 2    10    20 y           20

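The third is the least direct, but it is the one I would personally be most likely to use, perhaps just because it fits many of my workflows. Pivoting produces a long version of the data in which you can see both the value of variable and the corresponding values of x and y. You can then filter for the rows where those two columns match (variable == name) and self-join the result back to the data frame.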

inner_join(
  toy_data,
  toy_data %>%
    tidyr::pivot_longer(cols = -variable, values_to = "value") %>%
    filter(variable == name),
  by = "variable"
) %>%
  select(-name)
#> # A tibble: 2 × 4
#>       x     y variable value
#>   <dbl> <dbl> <chr>    <dbl>
#> 1     1     2 x            1
#> 2    10    20 y           20
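To make the pivot step easier to follow, the pipeline can be split so that the intermediate long table is visible. This is a sketch of the same technique, not new behavior; the names long and matched are introduced here only for illustration.

```r
library(dplyr)

toy_data <- tibble::tribble(
  ~x, ~y, ~variable,
  1,  2,  "x",
  10, 20, "y"
)

# One row per (original row, pivoted column) pair.
# pivot_longer() stores the column name in `name` by default.
long <- toy_data %>%
  tidyr::pivot_longer(cols = -variable, values_to = "value")
long
# rows (variable, name, value): (x, x, 1), (x, y, 2), (y, x, 10), (y, y, 20)

# Keep only the rows where the pivoted column is the one `variable` asks for
matched <- long %>%
  filter(variable == name)
matched
```

Only the rows of matched are joined back, which is why the join key has to identify each original row uniquely.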

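Edit: @jpiversen correctly pointed out that the self-join will not work if variable contains duplicates. In that case, add a row number to the data and use it as an additional join column. Here I first add an extra observation to illustrate.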

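# Add a third row that repeats variable == "x", then give every row an id
toy3 <- toy_data %>%
  tibble::add_row(x = 5, y = 4, variable = "x") %>%
  tibble::rowid_to_column()
inner_join(
  toy3,
  toy3 %>%
    # pivot_longer() lives in tidyr, which is not attached by library(dplyr)
    tidyr::pivot_longer(cols = c(-rowid, -variable), values_to = "value") %>%
    filter(variable == name),
  # Joining on rowid as well keeps duplicated `variable` values apart
  by = c("rowid", "variable")
) %>%
  select(-name, -rowid)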

A uj5u.com user replied:

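If you know in advance which variables are in the data frame: use simple logic such as ifelse() or dplyr::case_when() to choose between them.

If not: use functional programming. Here is an example: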

library(dplyr)

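f <- function(data, variable_col) {

  # .x is the column name stored in variable_col, .y is the row index;
  # data[[row, column]] extracts a single cell from the tibble
  data[[variable_col]] %>%
    purrr::imap_dbl(~ data[[.y, .x]])

}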

toy_data$value <- f(toy_data, "variable")
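For the first case mentioned above, where the candidate columns are known in advance, a minimal sketch with dplyr::case_when() might look like the following. It assumes x and y are the only columns variable can name; toy_known is a name chosen here for illustration.

```r
library(dplyr)

toy_data <- tibble::tribble(
  ~x, ~y, ~variable,
  1,  2,  "x",
  10, 20, "y"
)

toy_known <- toy_data %>%
  mutate(
    value = case_when(
      variable == "x" ~ x,  # take the value from column x
      variable == "y" ~ y   # take the value from column y
    )
  )
toy_known
```

Rows whose variable matches none of the listed conditions get NA in value, so this only suits a short, fixed set of columns.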
