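Suppose my data is
df <- c("Author1","Reference1","Abstract1","Author2","Reference2","Abstract2","Author3","Reference3","Author4","Reference4","Abstract4")
This is a series in the order author, reference, abstract. In some cases, however, the abstract is missing (in this example, the third abstract is missing). How can I insert an NA value in place of each missing abstract?
In other words, if an element of the vector starts with the word "Reference" but the next element does not start with the word "Abstract", I want to insert an NA right after the element that starts with "Reference". The resulting vector should be
result <- c("Author1","Reference1","Abstract1","Author2","Reference2","Abstract2","Author3","Reference3",NA,"Author4","Reference4","Abstract4")
How can I do this?
I have tried R's append() function, but to use it I need the index of the element after which the NA should be added, so the index has to be entered by hand for every NA.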
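For reference, the manual approach described in the question might look like the sketch below. The position 8 is counted by hand from the example data, and the variable name manual is only illustrative.

```r
df <- c("Author1", "Reference1", "Abstract1", "Author2", "Reference2", "Abstract2",
        "Author3", "Reference3", "Author4", "Reference4", "Abstract4")

# "Reference3" sits at position 8, so the NA goes after index 8.
# The 8 is typed in by hand, which is the part that does not scale.
manual <- append(df, NA, after = 8)
manual
```

This reproduces the desired result for the example, but every additional gap would need its own hand-counted index.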
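Reply from a uj5u.com user:
Here is one approach.
Basically, you build two logical vectors:
- one tests whether an element contains Reference, the other tests whether an element does not contain Abstract
- you offset one of the vectors by 1, because you want to test whether an abstract follows a reference
- you take the logical AND of the two
- then you use append() to insert NA at the position where an abstract should be but is not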
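# TRUE where an element is a Reference and the next element is not an Abstract
ab_missing <- grepl("Reference", df) & c(!grepl("Abstract", df)[-1], FALSE)
# Insert NA after that position (append() expects a single `after` index, so this covers one gap)
df <- append(df, NA, which(ab_missing))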
df
[1] "Author1" "Reference1" "Abstract1" "Author2" "Reference2" "Abstract2" "Author3" "Reference3" NA "Author4"
[11] "Reference4" "Abstract4"
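The append() call in this answer is given one position, which is enough for the example because only one abstract is missing. If several abstracts could be missing, one possible extension is to insert from the back so that earlier positions stay valid. The following is only a sketch: the helper name fill_missing_abstracts is hypothetical, and it assumes a Reference at the very end of the vector should also be followed by NA, which the original answer does not do.

```r
# Hypothetical helper, not part of the original answer.
fill_missing_abstracts <- function(v) {
  is_ref <- grepl("^Reference", v)
  # TRUE where the next element is an Abstract; the last element has no successor
  next_is_abs <- c(grepl("^Abstract", v)[-1], FALSE)
  gap_after <- which(is_ref & !next_is_abs)
  # Work backwards so that inserting an NA does not shift the remaining positions
  for (pos in rev(gap_after)) {
    v <- append(v, NA, after = pos)
  }
  v
}

x <- c("Author1", "Reference1", "Author2", "Reference2", "Abstract2",
       "Author3", "Reference3")
filled <- fill_missing_abstracts(x)
filled
```

Here the abstracts for groups 1 and 3 are both missing, so two NA values are inserted, one after "Reference1" and one at the end.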
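Reply from a uj5u.com user:
One way (and the only way I manage to do these things) is to think in terms of tibbles or data frames, so this is not necessarily the best way:
- we create a tibble with a column called x,
- then we group by the number in each label, e.g. 1,1,1; the parse_number() function comes from readr (I love parse_number()),
- with summarise(cur_data()[seq(3),]) we expand each group to three rows (see the linked question on expanding each group to the maximum number of rows); stop here and pull if NA is needed, otherwise continue,
- finally we use paste, which relies on R's recycling, and pull the vector.
1. If NA is needed: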
library(dplyr)
library(readr)
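my_vector <- tibble(x = c("Author1","Reference1","Abstract1","Author2","Reference2",
                          "Abstract2","Author3","Reference3","Author4","Reference4","Abstract4")) %>%
  group_by(group = parse_number(x)) %>%   # group by the number in each label
  summarise(cur_data()[seq(3),]) %>%      # pad every group to 3 rows; missing rows become NA
  pull(x)                                 # back to a plain vector
my_vector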
[1] "Author1" "Reference1" "Abstract1" "Author2" "Reference2" "Abstract2" "Author3"
[8] "Reference3" NA "Author4" "Reference4" "Abstract4"
2. If the missing label should be filled in instead:
library(dplyr)
library(readr)
my_vector <- tibble(x = c("Author1","Reference1","Abstract1","Author2","Reference2",
                          "Abstract2","Author3","Reference3","Author4","Reference4","Abstract4")) %>%
  group_by(group = parse_number(x)) %>%   # group by the number in each label
  summarise(cur_data()[seq(3),]) %>%      # pad every group to 3 rows
  mutate(group = paste0(c("Author", "Reference", "Abstract"), group)) %>%  # rebuild all three labels per group
  pull(group)
my_vector
[1] "Author1" "Reference1" "Abstract1" "Author2" "Reference2" "Abstract2" "Author3"
[8] "Reference3" "Abstract3" "Author4" "Reference4" "Abstract4"
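The same idea (derive the group number from each label, then lay out the three expected labels per group) can also be sketched in base R without dplyr or readr. This is not the answerer's code; the names ids, expected and with_na are illustrative, and the sketch assumes every label ends in its group number.

```r
x <- c("Author1", "Reference1", "Abstract1", "Author2", "Reference2", "Abstract2",
       "Author3", "Reference3", "Author4", "Reference4", "Abstract4")

# Group numbers: strip everything that is not a digit, keep each number once
ids <- unique(as.integer(gsub("[^0-9]", "", x)))

# Variant 2: the full set of labels, three per group (paste0 recycles the prefixes)
expected <- paste0(c("Author", "Reference", "Abstract"), rep(ids, each = 3))

# Variant 1: look each expected label up in x; labels that are absent become NA
with_na <- x[match(expected, x)]
with_na
```

With the example data, with_na has NA in the ninth position and expected has "Abstract3" there.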
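Reply from a uj5u.com user:
A slightly different approach could be the following, with x holding the input vector (called df in the question):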
c(sapply(split(x, cumsum(grepl("Author", x))), function(x) head(c(x, NA_character_), 3)))
[1] "Author1" "Reference1" "Abstract1" "Author2" "Reference2" "Abstract2" "Author3"
[8] "Reference3" NA "Author4" "Reference4" "Abstract4"
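To see what this one-liner does, it may help to run it step by step. The sketch below uses lapply() and unlist() instead of sapply() and c(), which is a stylistic substitution and not the answerer's exact code; the intermediate names are illustrative. It assumes every group starts with an Author element and that only the trailing Abstract can be missing.

```r
x <- c("Author1", "Reference1", "Abstract1", "Author3", "Reference3",
       "Author4", "Reference4", "Abstract4")

# A new group starts at every "Author", so the running count labels the groups
group_id <- cumsum(grepl("Author", x))

# One character vector per group
groups <- split(x, group_id)

# Append an NA to each group and keep the first three elements:
# complete groups drop the extra NA, short groups keep it
padded <- lapply(groups, function(g) head(c(g, NA_character_), 3))

result <- unlist(padded, use.names = FALSE)
result
```

Because head(..., 3) keeps at most one padding NA, a group that is missing two elements would come out with only two entries, so this approach fits data where at most the abstract is absent.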
Please credit the source when reposting. Link to this article: https://www.uj5u.com/gongcheng/536391.html
Tags: r, stringr