I have a data frame named fruits in which each row holds up to three fruits with their corresponding colors. Color1 goes with Fruit1, Color2 with Fruit2, and Color3 with Fruit3.
  Color1 Color2 Color3 Fruit1 Fruit2 Fruit3
1    red  green  green  apple  mango   kiwi
2 yellow  green    red banana   plum  mango
3  green    red         grape  apple
4 yellow                apple
Using dplyr, I can return the rows that contain an apple (1, 3, and 4), and I can return the rows that contain red (1, 2, and 3).
red <- filter_at(fruits, vars(Color1:Color3), any_vars(. == "red"))
apple <- filter_at(fruits, vars(Fruit1:Fruit3), any_vars(. == "apple"))
But how do I return only the red apples, i.e. only row 1 (Color1 = red, Fruit1 = apple) and row 3 (Color2 = red, Fruit2 = apple)?
Thanks.
P.S. Here is the code for the table:
Color1 <- c("red", "yellow", "green", "yellow")
Color2 <- c("green", "green", "red", "")
Color3 <- c("green", "red", "", "")
Fruit1 <- c("apple", "banana", "grape", "apple")
Fruit2 <- c("mango", "plum", "apple", "")
Fruit3 <- c("kiwi", "mango", "", "")
fruits <- data.frame (Color1, Color2, Color3, Fruit1, Fruit2, Fruit3)
uj5u.com user reply:
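You can process each set of columns independently to build logical matrices, then combine them with &.
Up front:
- If your data contains NA values, this needs some modification to work correctly.
- This assumes both sets of columns are in the same order; for example, if your columns are ordered "Color1, Color2, Color3" and "Fruit3, Fruit2, Fruit1", they will not be paired correctly.
Assuming dplyr: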
select(fruits, starts_with("Color")) == "red"
# Color1 Color2 Color3
# 1 TRUE FALSE FALSE
# 2 FALSE FALSE TRUE
# 3 FALSE TRUE FALSE
# 4 FALSE FALSE FALSE
select(fruits, starts_with("Fruit")) == "apple"
# Fruit1 Fruit2 Fruit3
# 1 TRUE FALSE FALSE
# 2 FALSE FALSE FALSE
# 3 FALSE TRUE FALSE
# 4 TRUE FALSE FALSE
select(fruits, starts_with("Color")) == "red" & select(fruits, starts_with("Fruit")) == "apple"
# Color1 Color2 Color3
# 1 TRUE FALSE FALSE
# 2 FALSE FALSE FALSE
# 3 FALSE TRUE FALSE
# 4 FALSE FALSE FALSE
From here, keep the rows in which at least one Color/Fruit pair matches:
fruits %>%
  filter(
    rowSums(
      select(., starts_with("Color")) == "red" &
        select(., starts_with("Fruit")) == "apple"
    ) > 0)
# Color1 Color2 Color3 Fruit1 Fruit2 Fruit3
# 1 red green green apple mango kiwi
# 3 green red . grape apple .
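The NA caveat mentioned at the top of this answer can be sketched as follows. This is a minimal illustration, assuming the empty slots are stored as NA rather than "" or "."; the names fruits_na, is_red, is_apple, and red_apples are made up for the example. Without na.rm = TRUE, any row containing an NA pair would get an NA row sum and would be dropped even when another pair matches.

```r
library(dplyr)

# The question's data, with NA (an assumption for this sketch) in the empty slots
fruits_na <- data.frame(
  Color1 = c("red", "yellow", "green", "yellow"),
  Color2 = c("green", "green", "red", NA),
  Color3 = c("green", "red", NA, NA),
  Fruit1 = c("apple", "banana", "grape", "apple"),
  Fruit2 = c("mango", "plum", "apple", NA),
  Fruit3 = c("kiwi", "mango", NA, NA)
)

# Logical matrices, one column per slot; NA cells stay NA
is_red   <- select(fruits_na, starts_with("Color")) == "red"
is_apple <- select(fruits_na, starts_with("Fruit")) == "apple"

# NA & NA is NA, so ignore those cells when counting matches per row
red_apples <- fruits_na[rowSums(is_red & is_apple, na.rm = TRUE) > 0, ]
red_apples  # rows 1 and 3
```

Row 3 is the case that matters here: its third pair is NA, so a plain rowSums() would return NA for that row.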
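Data. Since I did not have yours at first, I built this version with "." in place of the empty strings (reading in empty columns took more effort than I had time for initially).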
fruits <- structure(list(Color1 = c("red", "yellow", "green", "yellow"), Color2 = c("green", "green", "red", "."), Color3 = c("green", "red", ".", "."), Fruit1 = c("apple", "banana", "grape", "apple"), Fruit2 = c("mango", "plum", "apple", "."), Fruit3 = c("kiwi", "mango", ".", ".")), class = "data.frame", row.names = c("1", "2", "3", "4"))
uj5u.com user reply:
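I think your data is in a less than ideal shape (it is not tidy).
If you tidy the data first, the task you are trying to accomplish will probably be easier.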
library(tidyverse)
Color1 <- c("red", "yellow", "green", "yellow")
Color2 <- c("green", "green", "red", "")
Color3 <- c("green", "red", "", "")
Fruit1 <- c("apple", "banana", "grape", "apple")
Fruit2 <- c("mango", "plum", "apple", "")
Fruit3 <- c("kiwi", "mango", "", "")
fruits <- data.frame (Color1, Color2, Color3, Fruit1, Fruit2, Fruit3)
long_fruits <- fruits %>%
  ## following r2evans' suggestion: include a row identifier so the data can be re-pivoted if needed
  rownames_to_column("row_id") %>%
  pivot_longer(-"row_id", names_to = c(".value", "ID"), names_pattern = "(\\w+)(\\d+)")
long_fruits
#> # A tibble: 12 × 4
#> row_id ID Color Fruit
#> <chr> <chr> <chr> <chr>
#> 1 1 1 "red" "apple"
#> 2 1 2 "green" "mango"
#> 3 1 3 "green" "kiwi"
#> 4 2 1 "yellow" "banana"
#> 5 2 2 "green" "plum"
#> 6 2 3 "red" "mango"
#> 7 3 1 "green" "grape"
#> 8 3 2 "red" "apple"
#> 9 3 3 "" ""
#> 10 4 1 "yellow" "apple"
#> 11 4 2 "" ""
#> 12 4 3 "" ""
long_fruits %>%
filter(Fruit == "apple", Color == "red")
#> # A tibble: 2 × 4
#> row_id ID Color Fruit
#> <chr> <chr> <chr> <chr>
#> 1 1 1 red apple
#> 2 3 2 red apple
Created on 2021-12-21 by the reprex package (v2.0.1)
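The filtered long table identifies the matching row_id values, but the question asks for the original wide rows. One possible way to get back to them, sketched here under the assumption that a semi_join() on row_id is acceptable, is shown below; the names matches and red_apples are made up for the example.

```r
library(dplyr)
library(tidyr)
library(tibble)

fruits <- data.frame(
  Color1 = c("red", "yellow", "green", "yellow"),
  Color2 = c("green", "green", "red", ""),
  Color3 = c("green", "red", "", ""),
  Fruit1 = c("apple", "banana", "grape", "apple"),
  Fruit2 = c("mango", "plum", "apple", ""),
  Fruit3 = c("kiwi", "mango", "", "")
)

# Same tidy reshaping as above: one row per (row_id, slot) pair
long_fruits <- fruits %>%
  rownames_to_column("row_id") %>%
  pivot_longer(-row_id, names_to = c(".value", "ID"), names_pattern = "(\\w+)(\\d+)")

matches <- long_fruits %>%
  filter(Fruit == "apple", Color == "red")

# Keep the original wide rows whose row_id appears among the matches
red_apples <- fruits %>%
  rownames_to_column("row_id") %>%
  semi_join(matches, by = "row_id")
red_apples
```

Because semi_join() only filters, each original row is returned at most once, even if it contains a red apple in more than one slot.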
uj5u.com user reply:
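Here is an alternative solution using tidyverse/purrr:
This matches columns that end in the same number (i.e., Color1 and Fruit1, Color20 and Fruit20). Note that ends_with() matches on the suffix only, so with more than nine pairs ends_with("1") would also pick up Color11 and Fruit11.
It assumes every color has a matching fruit; otherwise the positional indexing inside filter() (the second and third selected columns, after id) will fail. You can also replace "red" and "apple" with different values if needed.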
subset_func <- function(data, num) {
  # Rows whose Color<num>/Fruit<num> pair is a red apple
  out <- data %>%
    mutate(id = row_number()) %>%
    select(id, ends_with(num)) %>%
    filter(.[[2]] == "red" & .[[3]] == "apple")
  # Return the complete original rows for those ids
  data %>%
    mutate(id = row_number()) %>%
    filter(id %in% out$id) %>%
    select(-id)
}
map_df(as.character(1:3), ~subset_func(fruits, .))
This gives us:
Color1 Color2 Color3 Fruit1 Fruit2 Fruit3
1 red green green apple mango kiwi
3 green red . grape apple .
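One thing to be aware of with the row-binding approach: a row that holds a red apple in two slots would be returned twice, and the result is ordered by slot rather than by original row. A possible variation, sketched below, builds one logical vector per slot and combines them with OR, so each row is tested once. The helper is_red_apple and the column-name construction with paste0() are assumptions for this example, not part of the answer above.

```r
library(dplyr)
library(purrr)

fruits <- data.frame(
  Color1 = c("red", "yellow", "green", "yellow"),
  Color2 = c("green", "green", "red", ""),
  Color3 = c("green", "red", "", ""),
  Fruit1 = c("apple", "banana", "grape", "apple"),
  Fruit2 = c("mango", "plum", "apple", "")
  ,
  Fruit3 = c("kiwi", "mango", "", "")
)

# Hypothetical helper: TRUE for rows where slot `num` holds a red apple.
# Columns are looked up by exact name, so Color1 never matches Color11.
is_red_apple <- function(data, num) {
  data[[paste0("Color", num)]] == "red" &
    data[[paste0("Fruit", num)]] == "apple"
}

# One logical vector per slot, combined element-wise with OR
keep <- map(1:3, ~ is_red_apple(fruits, .x)) %>%
  reduce(`|`)

red_apples <- fruits[keep, ]
red_apples  # rows 1 and 3, in their original order
```

Pairing by exact column name also sidesteps the column-order assumption of the matrix approach.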
Please credit the source when reposting. Link to this article: https://www.uj5u.com/net/389582.html
