I want to update three columns at once based on one column
My data looks like this:
df <- data.frame(input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.",
"Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
n2 = c("Antidesma", NA, "Alchornea", NA),
n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA))
                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg.                 <NA>      <NA>           <NA>
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg.                 <NA>      <NA>           <NA>
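If the first two words of the input column are the same in two rows, I want to treat those rows as referring to the same thing. In this example, that means the NA values of n1, n2 and n3 in rows 2 and 4 should be filled in with the values from rows 1 and 3.
My desired output is: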
                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
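Any suggestions for this case?
Reply from a uj5u.com user:
You can use the dplyr package. First, I create a column gr that contains only the first two words of input. Then I change (mutate) the columns n1, n2 and n3 so that each holds the first non-NA value of its group.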
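library(dplyr)
df %>%
  # gr = the first two words of input, used as the grouping key
  group_by(gr = gsub("(^\\w+ \\w+) .*", "\\1", input)) %>%
  # replace every value in a group with the group's first non-NA value
  mutate(across(c(n1, n2, n3), ~.x[!is.na(.x)][1])) %>%
  ungroup() %>%
  select(-gr)  # drop the helper column to match the desired output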
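To see what the grouping key looks like, the regular expression from the group_by() step can be run on its own. This is only a small sketch: it assumes the pattern is written with `+` quantifiers (one or more word characters), and the vector name `gr` is reused here purely for illustration.

```r
# The input column from the question
input <- c("Antidesma cuspidatum Mull.Arg.",
           "Antidesma cuspidatum Müll.Arg.",
           "Alchornea parviflora (Benth.) Mull.Arg.",
           "Alchornea parviflora (Benth.) Müll.Arg.")

# Keep the first two words and drop everything after them
gr <- gsub("(^\\w+ \\w+) .*", "\\1", input)
print(gr)
```

Rows 1 and 2 share one key and rows 3 and 4 share another, which is why the first non-NA value of each group ends up in the NA rows.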
Reply from a uj5u.com user:
Base R solution:
# Resolve the names of the columns prefixed with "n":
# na_col_names => character vector
na_col_names <- grep(
  "^n\\d+$",
  names(df),
  value = TRUE
)
# Carry the last non-NA value forward in each column: df => data.frame
# (assumes the first row of every column is not NA)
df[, na_col_names] <- lapply(
  na_col_names,
  function(x) {
    na.omit(df[, x])[cumsum(!is.na(df[, x]))]
  }
)
Tidyverse:
library(tidyverse)
# fill() works on data frame columns, so select the n-columns directly
df %>%
  fill(matches("^n\\d+$"), .direction = "down")
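Both versions in this answer carry the previous value forward, so they depend on each NA row sitting directly below its filled-in row. If the row order cannot be relied on, a base R variant that groups by the first two words of input (as the dplyr answer does) might look like the sketch below. The names `key` and `cols` are illustrative, and the pattern assumes the first two words are separated by a single space.

```r
df <- data.frame(
  input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.",
            "Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
  n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
  n2 = c("Antidesma", NA, "Alchornea", NA),
  n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA)
)

# Grouping key: the first two space-separated words of input
key <- sub("^(\\S+ \\S+).*$", "\\1", df$input)

# For each target column, replace every value in a group
# with that group's first non-NA value (NA if the group has none)
cols <- c("n1", "n2", "n3")
df[cols] <- lapply(df[cols], function(v) {
  ave(v, key, FUN = function(g) g[!is.na(g)][1])
})

print(df)
```

Because the fill is done per group rather than by position, the result does not change if the rows are shuffled.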
