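I have a multi-part lookup table problem in R. I have a data frame in which the numbers in each column stand for an item name. The item names can be found in a corresponding lookup table.
Data: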
> food.dat
  Fruit Vegetable Meat Dairy
1     1         2    2     3
2     3         2    1     1
3     3         2    2     2
4     2         2    1     1
5     1         1    1     2
Lookup table:
> food.lookup
    FoodItem Number FoodName
1      Fruit      1   Banana
2      Fruit      2    Apple
3      Fruit      3    Mango
4  Vegetable      1   Carrot
5  Vegetable      2 Broccoli
6       Meat      1  Chicken
7       Meat      2     Fish
8      Dairy      1   Cheese
9      Dairy      2   Yogurt
10     Dairy      3 IceCream
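Note that the numbers are not unique across food types. For example, 1 stands for one FoodName in the Fruit column (Banana) and a different FoodName in the Vegetable column (Carrot).
I would like to recode the food.dat data frame so that it holds the FoodName values from the lookup table. If possible, I would also like a simple function that takes a FoodName and returns a data frame containing only the rows of food.dat that include that FoodName.
Thank you for your time and ideas :)
uj5u.com user reply:
split the named vector from 'food.lookup' by 'FoodItem' into a list. Then loop across the columns of 'food.dat', extract the matching list element, and replace the values by matching on the names.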
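library(dplyr)
# One named vector per food type: names are the codes, values are the food names
lst1 <- with(food.lookup, split(setNames(FoodName, Number), FoodItem))
# For each column, look up its codes (as character names) in the matching vector
food.dat %>%
  mutate(across(all_of(names(lst1)), ~ lst1[[cur_column()]][as.character(.)]))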
Output:
   Fruit Vegetable    Meat    Dairy
1 Banana  Broccoli    Fish IceCream
2  Mango  Broccoli Chicken   Cheese
3  Mango  Broccoli    Fish   Yogurt
4  Apple  Broccoli Chicken   Cheese
5 Banana    Carrot Chicken   Yogurt
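For readers who prefer to avoid the dplyr dependency, the same named-vector lookup can be sketched in base R. This is only an illustration of the idea in this reply, not the reply author's code: the helper name recode_base is hypothetical, and the sample data is retyped from the question so that the block runs on its own.

```r
# Sample data retyped from the question
food.dat <- data.frame(Fruit = c(1L, 3L, 3L, 2L, 1L),
                       Vegetable = c(2L, 2L, 2L, 2L, 1L),
                       Meat = c(2L, 1L, 2L, 1L, 1L),
                       Dairy = c(3L, 1L, 2L, 1L, 2L))
food.lookup <- data.frame(
  FoodItem = rep(c("Fruit", "Vegetable", "Meat", "Dairy"), times = c(3, 2, 2, 3)),
  Number = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  FoodName = c("Banana", "Apple", "Mango", "Carrot", "Broccoli",
               "Chicken", "Fish", "Cheese", "Yogurt", "IceCream"),
  stringsAsFactors = FALSE)

# One named character vector per food type; the names are the codes as strings
lst1 <- with(food.lookup, split(setNames(FoodName, Number), FoodItem))

# recode_base is a hypothetical helper name, not part of any package
recode_base <- function(dat, lookup_list) {
  # Only touch columns that have a lookup vector
  cols <- intersect(names(dat), names(lookup_list))
  dat[cols] <- lapply(cols, function(nm) {
    # Index by name, so a code missing from the lookup becomes NA
    unname(lookup_list[[nm]][as.character(dat[[nm]])])
  })
  dat
}

food.named <- recode_base(food.dat, lst1)
food.named
```

Because the lookup goes through names rather than positions, the codes do not have to be consecutive or sorted in the lookup table.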
Data:
food.dat <- structure(list(Fruit = c(1L, 3L, 3L, 2L, 1L), Vegetable = c(2L,
2L, 2L, 2L, 1L), Meat = c(2L, 1L, 2L, 1L, 1L), Dairy = c(3L,
1L, 2L, 1L, 2L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5"))
food.lookup <- structure(list(FoodItem = c("Fruit", "Fruit",
"Fruit", "Vegetable",
"Vegetable", "Meat", "Meat", "Dairy", "Dairy", "Dairy"), Number = c(1L,
2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L), FoodName = c("Banana", "Apple",
"Mango", "Carrot", "Broccoli", "Chicken", "Fish", "Cheese", "Yogurt",
"IceCream")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10"))
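uj5u.com user reply:
Alternatively, you can make use of the "position" of each name. To do so, split the lookup table into the respective food types (or type them in by hand), then simply use indexing to fill in the result. This relies on the names being listed in code order, so that position k holds the name for code k.
An example for one column follows; you can easily extend it to all of them. I store the result in Dairy2 so that you can compare the two columns and see how the indexing works.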
dairy <- c("Cheese","Yogurt","IceCream")
food.dat <- data.frame(Dairy = c(3,1,2,1,2))
food.dat$Dairy2 = dairy[food.dat$Dairy]
food.dat
  Dairy   Dairy2
1     3 IceCream
2     1   Cheese
3     2   Yogurt
4     1   Cheese
5     2   Yogurt
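One way the single-column example might be extended to every column is sketched below. The helper name recode_by_position is hypothetical, the sample data is retyped from the question, and the sketch assumes that each food type uses the codes 1, 2, ..., n with no gaps; it stops with an error when that assumption does not hold, because positional indexing would otherwise return the wrong names silently.

```r
# Sample data retyped from the question
food.dat <- data.frame(Fruit = c(1L, 3L, 3L, 2L, 1L),
                       Vegetable = c(2L, 2L, 2L, 2L, 1L),
                       Meat = c(2L, 1L, 2L, 1L, 1L),
                       Dairy = c(3L, 1L, 2L, 1L, 2L))
food.lookup <- data.frame(
  FoodItem = rep(c("Fruit", "Vegetable", "Meat", "Dairy"), times = c(3, 2, 2, 3)),
  Number = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  FoodName = c("Banana", "Apple", "Mango", "Carrot", "Broccoli",
               "Chicken", "Fish", "Cheese", "Yogurt", "IceCream"),
  stringsAsFactors = FALSE)

# recode_by_position is a hypothetical helper name
recode_by_position <- function(dat, lookup) {
  for (nm in intersect(names(dat), unique(lookup$FoodItem))) {
    sub <- lookup[lookup$FoodItem == nm, ]
    # Position k must hold the name for code k, so sort by code first
    sub <- sub[order(sub$Number), ]
    # Assumption: codes run 1..n without gaps; fail loudly otherwise
    stopifnot(identical(as.integer(sub$Number), seq_len(nrow(sub))))
    # Use the codes in the column as positions into the sorted names
    dat[[nm]] <- sub$FoodName[dat[[nm]]]
  }
  dat
}

food.named <- recode_by_position(food.dat, food.lookup)
food.named
```

If the codes are not consecutive, a name-based lookup is the safer choice.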
uj5u.com user reply:
We can reshape the data to long format, with one row per food item, join the lookup table, and then reshape back to wide format.
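library(tidyr)
library(dplyr)
food.dat %>%
  tibble::rowid_to_column() %>%
  # Long format: one row per (rowid, FoodItem) pair, with its code in Number
  pivot_longer(-rowid, names_to = "FoodItem",
               values_to = "Number") %>%
  # Join on both keys explicitly, since Number alone is not unique
  left_join(food.lookup, by = c("FoodItem", "Number")) %>%
  pivot_wider(id_cols = rowid, names_from = FoodItem,
              values_from = FoodName)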
#> # A tibble: 5 x 5
#> rowid Fruit Vegetable Meat Dairy
#> <int> <chr> <chr> <chr> <chr>
#> 1 1 Banana Broccoli Fish IceCream
#> 2 2 Mango Broccoli Chicken Cheese
#> 3 3 Mango Broccoli Fish Yogurt
#> 4 4 Apple Broccoli Chicken Cheese
#> 5 5 Banana Carrot Chicken Yogurt
Data:
food.dat <- read.table(text =
'Fruit Vegetable Meat Dairy
1 2 2 3
3 2 1 1
3 2 2 2
2 2 1 1
1 1 1 2', header = TRUE)
food.lookup <- read.table(text =
'FoodItem Number FoodName
Fruit 1 Banana
Fruit 2 Apple
Fruit 3 Mango
Vegetable 1 Carrot
Vegetable 2 Broccoli
Meat 1 Chicken
Meat 2 Fish
Dairy 1 Cheese
Dairy 2 Yogurt
Dairy 3 IceCream', header = TRUE)
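The question also asks for a simple function that takes a FoodName and returns only the rows of food.dat that contain it. A minimal base R sketch of one possible approach follows. The function name rows_with_food is hypothetical, the sample data is retyped from the question, and the sketch assumes the rows should come back still coded as numbers, exactly as they are stored in food.dat.

```r
# Sample data retyped from the question
food.dat <- data.frame(Fruit = c(1L, 3L, 3L, 2L, 1L),
                       Vegetable = c(2L, 2L, 2L, 2L, 1L),
                       Meat = c(2L, 1L, 2L, 1L, 1L),
                       Dairy = c(3L, 1L, 2L, 1L, 2L))
food.lookup <- data.frame(
  FoodItem = rep(c("Fruit", "Vegetable", "Meat", "Dairy"), times = c(3, 2, 2, 3)),
  Number = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  FoodName = c("Banana", "Apple", "Mango", "Carrot", "Broccoli",
               "Chicken", "Fish", "Cheese", "Yogurt", "IceCream"),
  stringsAsFactors = FALSE)

# rows_with_food is a hypothetical helper name
rows_with_food <- function(dat, lookup, food_name) {
  # Every (FoodItem, Number) pair that carries the requested name
  hits <- lookup[lookup$FoodName == food_name, ]
  keep <- rep(FALSE, nrow(dat))
  for (i in seq_len(nrow(hits))) {
    col <- hits$FoodItem[i]
    # %in% treats NA as a non-match, so missing codes never select a row
    if (col %in% names(dat)) keep <- keep | (dat[[col]] %in% hits$Number[i])
  }
  dat[keep, , drop = FALSE]
}

# Mango is Fruit code 3, which appears in rows 2 and 3
rows_with_food(food.dat, food.lookup, "Mango")
```

A name that is absent from the lookup table yields a data frame with zero rows. To get the matching rows with names instead of codes, the same row selection could be applied to the recoded data frame.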
