I am trying to create a path sequence. Here is a sample dataset:
df <- structure(list(
  sess_id = c(4, 4, 4, 4, 4, 4, 4, 7, 7, 7, 7, 7),
  Page = c("A", "B", "C", "D", "A", "C", "B", "B", "C", "D", "A", "D")),
  .Names = c("sess_id", "Page"),
  row.names = c(NA, -12L),
  class = "data.frame")
Here is the table:
| sess_id | Page |
|---|---|
| 4 | A |
| 4 | B |
| 4 | C |
| 4 | D |
| 4 | A |
| 4 | C |
| 4 | B |
| 7 | B |
| 7 | C |
| 7 | D |
| 7 | A |
| 7 | D |
I would like to add three columns, like this:
| sess_id | Page | Path | Start | End |
|---|---|---|---|---|
| 4 | A | | | |
| 4 | B | AB | A | B |
| 4 | C | ABC | A | C |
| 4 | D | ABCD | A | D |
| 4 | A | ABCDA | A | A |
| 4 | C | BCDAC | B | C |
| 4 | B | CDACB | C | B |
| 7 | B | | | |
| 7 | C | BC | B | C |
| 7 | D | BCD | B | D |
| 7 | A | BCDA | B | A |
| 7 | D | BCDAD | B | D |
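Within each session, I am trying to build a path sequence covering at most the five most recent pages, and to record the start and end page of that five-page sequence.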
A helpful uj5u.com user replied:
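Use rollapplyr from the zoo package to build a rolling sequence within each sess_id group. The first and last characters of each sequence then give the Start and End columns, respectively.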
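df <- structure(list(
  sess_id = c(4, 4, 4, 4, 4, 4, 4, 7, 7, 7, 7, 7),
  Page = c("A", "B", "C", "D", "A", "C", "B", "B", "C", "D", "A", "D")),
  .Names = c("sess_id", "Page"),
  row.names = c(NA, -12L),
  class = "data.frame")

# Rolling paste over one session's pages, with windows of at most `width` pages
fun <- function(x, width) {
  # Growing windows (1, 2, ..., width - 1 pages) for the first width - 1 rows
  y1 <- zoo::rollapplyr(x, width = seq(width), paste, collapse = "")[1:(width - 1L)]
  # Full windows of `width` pages for the remaining rows
  y2 <- zoo::rollapplyr(x, width = width, paste, collapse = "")
  c(y1, y2)
}

# One character vector of pages per session
sp <- split(df$Page, df$sess_id)
l <- 5L
df$Path <- unlist(lapply(sp, fun, width = l))
# Start and End are the first and last characters of each path
df$Start <- substr(df$Path, 1, 1)
df$End <- substring(df$Path, nchar(df$Path))
df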
#>    sess_id Page  Path Start End
#> 1        4    A     A     A   A
#> 2        4    B    AB     A   B
#> 3        4    C   ABC     A   C
#> 4        4    D  ABCD     A   D
#> 5        4    A ABCDA     A   A
#> 6        4    C BCDAC     B   C
#> 7        4    B CDACB     C   B
#> 8        7    B     B     B   B
#> 9        7    C    BC     B   C
#> 10       7    D   BCD     B   D
#> 11       7    A  BCDA     B   A
#> 12       7    D BCDAD     B   D
Created on 2022-11-08 with reprex v2.0.2
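For readers who would rather not depend on zoo, the same rolling window can be sketched in base R. This is only an illustrative variant, not the answerer's code: the helper name rolling_path and its width argument are made up here, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Hypothetical helper (not from the answer above): for each position i,
# paste the pages from max(1, i - width + 1) up to i.
rolling_path <- function(pages, width = 5L) {
  vapply(seq_along(pages), function(i) {
    first <- max(1L, i - width + 1L)  # clamp the window at the session start
    paste(pages[first:i], collapse = "")
  }, character(1))
}

# Sample data from the question
df <- data.frame(
  sess_id = c(4, 4, 4, 4, 4, 4, 4, 7, 7, 7, 7, 7),
  Page = c("A", "B", "C", "D", "A", "C", "B", "B", "C", "D", "A", "D"),
  stringsAsFactors = FALSE
)

# ave() applies the helper within each sess_id and writes results back in row order
df$Path <- ave(df$Page, df$sess_id, FUN = rolling_path)
df$Start <- substr(df$Path, 1, 1)
df$End <- df$Page  # the last character of a path is always the current page
df
```

Because the window start is clamped at 1, this sketch also copes with sessions shorter than five pages, and ave() does not require the rows to be sorted by sess_id.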
A helpful uj5u.com user replied:
You can use accumulate and substr, as shown below:
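library(dplyr)
library(purrr)

df %>%
  group_by(sess_id) %>%
  # Cumulative paste within each session: "A", "AB", "ABC", ...
  mutate(Path = accumulate(Page, paste0)) %>%
  ungroup() %>%
  mutate(
    # Keep only the last five characters of each cumulative path
    Path = substr(Path, nchar(Path) - 4, nchar(Path)),
    Start = substr(Path, 1, 1),
    End = Page
  )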
This gives:
# A tibble: 12 × 5
   sess_id Page  Path  Start End
     <dbl> <chr> <chr> <chr> <chr>
 1       4 A     A     A     A
 2       4 B     AB    A     B
 3       4 C     ABC   A     C
 4       4 D     ABCD  A     D
 5       4 A     ABCDA A     A
 6       4 C     BCDAC B     C
 7       4 B     CDACB C     B
 8       7 B     B     B     B
 9       7 C     BC    B     C
10       7 D     BCD   B     D
11       7 A     BCDA  B     A
12       7 D     BCDAD B     D
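The trimming step relies on substr treating a start position below 1 as 1, which is why paths shorter than five characters pass through unchanged. A small base R sketch of the two steps for one session (Reduce with accumulate = TRUE stands in for purrr's accumulate here; the variable names are illustrative):

```r
# Pages of session 4 from the question
pages <- c("A", "B", "C", "D", "A", "C", "B")

# Cumulative paste: "A", "AB", "ABC", ..., "ABCDACB"
cumulative <- Reduce(paste0, pages, accumulate = TRUE)

# Last five characters; for short strings the start is <= 0,
# which substr() treats as position 1, so the whole string is kept
last_five <- substr(cumulative, nchar(cumulative) - 4, nchar(cumulative))
last_five
```

The last two elements come out as BCDAC and CDACB, matching rows 6 and 7 of the output above.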
A helpful uj5u.com user replied:
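The following works and uses the tidyverse. Path is first built by pasting together all the letters within each sess_id. We then keep the first through n-th letters, where n is the row number, and from the end of that string take between 0 and 5 characters.
Start and End are simply the first and last letters of Path.
Finally, we set Path, Start and End to "" when the length of Path is 1.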
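library(tidyverse)

df <- df %>%
  group_by(sess_id) %>%
  mutate(
    # Paste the whole session, keep the first n letters (n = row number),
    # then keep at most the last five of those
    Path = paste0(Page, collapse = "") %>%
      str_sub(1, row_number()) %>%
      str_extract("\\w{0,5}$"),
    Start = str_extract(Path, "^\\w"),
    End = str_extract(Path, "\\w$")
  ) %>%
  # Blank all three columns on rows whose path is a single page
  mutate(across(c(Path, Start, End), ~ if_else(str_length(Path) == 1, "", .)))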
> df
# A tibble: 12 x 5
# Groups:   sess_id [2]
   sess_id Page  Path    Start End
     <dbl> <chr> <chr>   <chr> <chr>
 1       4 A     ""      ""    ""
 2       4 B     "AB"    "A"   "B"
 3       4 C     "ABC"   "A"   "C"
 4       4 D     "ABCD"  "A"   "D"
 5       4 A     "ABCDA" "A"   "A"
 6       4 C     "BCDAC" "B"   "C"
 7       4 B     "CDACB" "C"   "B"
 8       7 B     ""      ""    ""
 9       7 C     "BC"    "B"   "C"
10       7 D     "BCD"   "B"   "D"
11       7 A     "BCDA"  "B"   "A"
12       7 D     "BCDAD" "B"   "D"
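To see what the three string steps do, here is a walk-through for a single row (row 6 of session 4), written in base R so it runs without extra packages. It is only a sketch of the same idea: substr and regmatches/regexpr stand in for str_sub and str_extract, and the variable names are illustrative.

```r
# All pages of session 4 pasted together
full <- paste0(c("A", "B", "C", "D", "A", "C", "B"), collapse = "")  # "ABCDACB"

n <- 6L                         # row number within the session
prefix <- substr(full, 1, n)    # first n letters: "ABCDAC"

# Up to five word characters anchored at the end of the prefix
path <- regmatches(prefix, regexpr("\\w{0,5}$", prefix, perl = TRUE))

start <- substr(path, 1, 1)                    # first letter of the path
end <- substr(path, nchar(path), nchar(path))  # last letter of the path
c(Path = path, Start = start, End = end)
```

For this row the path is BCDAC, with Start B and End C, which agrees with row 6 of the table above.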
