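I have a column of strings with varying numbers of items, separated by ",". I want to split each row of this column into separate columns, fill the missing values with "NA", and count the frequency of each string. Here is an example: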
M <- data.frame(name = c("A", "B", "C"), mapped = c("X1, X3, X4", "X2, X4", "X2,X3, X4"))
  name     mapped
1    A X1, X3, X4
2    B     X2, X4
3    C  X2,X3, X4
I would like to get a result data frame like this:
df <- data.frame(name = c("A","B", "C"), V1 = c("X1","NA", "NA"), V2 = c("NA", "X2","X2"), V3 = c("X3","NA", "X3"), V4 = c("X4","X4", "X4"))
  name V1 V2 V3 V4
1    A X1 NA X3 X4
2    B NA X2 NA X4
3    C NA X2 X3 X4
Then I want to count the number of X1, X2, X3 and X4 values in each column of the new data frame.
Thanks!
A uj5u.com user replied:
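You can use separate_rows and pivot_wider:
library(tidyverse)
# Store the reshaped result so it can be reused for the counts below
res <- M %>%
  separate_rows(mapped) %>%
  pivot_wider(names_from = mapped, values_from = mapped) %>%
  relocate(order(colnames(.)))
res
# A tibble: 3 x 5
  name  X1    X2    X3    X4
  <chr> <chr> <chr> <chr> <chr>
1 A     X1    NA    X3    X4
2 B     NA    X2    NA    X4
3 C     NA    X2    X3    X4
Then, to count the non-missing values in each column of the reshaped result (M itself still holds the original data), use:
colSums(!is.na(res[,-1]))
# X1 X2 X3 X4
#  1  2  2  3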
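If only the frequencies are needed, the wide reshape can probably be skipped. The following is a minimal sketch, assuming the same example data M from the question; the variable name freq is just an illustrative choice, and count() comes from dplyr rather than from the answer above.

```r
library(tidyr)
library(dplyr)

# Example data from the question
M <- data.frame(name = c("A", "B", "C"),
                mapped = c("X1, X3, X4", "X2, X4", "X2,X3, X4"),
                stringsAsFactors = FALSE)

# One row per (name, value) pair, then tally how often each value occurs
freq <- M %>%
  separate_rows(mapped) %>%
  count(mapped)

freq
```

This yields one row per distinct value with its count in column n, which may be easier to join or plot than a named vector.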
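A uj5u.com user replied:
Split on the commas (together with any whitespace after them, so that " X3" and "X3" are not counted as different values), unlist, then count:
table(unlist(strsplit(M$mapped, ",\\s*")))
# X1 X2 X3 X4
#  1  2  2  3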
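The same strsplit idea can also be stretched to build the wide table the question asks for, without the tidyverse. This is only a sketch under the assumption that M is the example data from the question; parts, lev, wide, res and counts are illustrative names, not part of either answer.

```r
# Example data from the question
M <- data.frame(name = c("A", "B", "C"),
                mapped = c("X1, X3, X4", "X2, X4", "X2,X3, X4"),
                stringsAsFactors = FALSE)

# Split each row on a comma followed by optional whitespace
parts <- strsplit(M$mapped, ",\\s*")

# All distinct values, sorted, become the column names
lev <- sort(unique(unlist(parts)))

# For each row, keep a value where it is present and NA where it is absent
wide <- do.call(rbind, lapply(parts, function(p) {
  ifelse(lev %in% p, lev, NA_character_)
}))
colnames(wide) <- lev

res <- data.frame(name = M$name, wide, stringsAsFactors = FALSE)

# Number of non-missing values per column
counts <- colSums(!is.na(res[, -1]))
```

Here the missing entries are real NA values rather than the string "NA", which is what lets is.na() do the counting.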
