I want to count how many absences each student had before their most recent absence, and add those counts as a column in the data frame.
Student ID Absent Date Subject
4567 08/30/2018 M
4567 09/22/2019 M
8345 09/01/2019 S
8345 03/30/2019 PE
8345 07/18/2017 M
5601 01/08/2019 SS
This is the desired output:
Student ID Absent Date Subject Previous Absence
4567 08/30/2018 M 1
4567 09/22/2019 M 1
8345 09/01/2019 S 2
8345 03/30/2019 PE 2
8345 07/18/2017 M 2
5601 01/08/2019 SS 0
Then I want to count each student's prior absences in Math (M), and add those counts as another column in the data frame.
Student ID Absent Date Subject Previous Absence
4567 08/30/2018 M 1
4567 09/22/2019 M 1
8345 09/01/2019 S 2
8345 03/30/2019 PE 2
8345 07/18/2017 M 2
5601 01/08/2019 SS 0
Desired output:
Student ID Absent Date Subject Prior Absence Prior M Absence
4567 08/30/2018 M 1 1
4567 09/22/2019 M 1 1
8345 09/01/2019 S 2 0
8345 03/30/2019 PE 2 0
8345 07/18/2017 M 2 0
5601 01/08/2019 SS 0 0
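Thanks!
Answer from a uj5u.com community member:
This assumes the data is already sorted by Absent_Date (at least within each Student_ID), so that the last row of each group is the most recent absence: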
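library(dplyr)

df %>%
  group_by(Student_ID) %>%
  mutate(
    # every row in the group except the most recent one is a prior absence
    n_prior_absence = n() - 1,
    # drop the last row of the group, then count the remaining "M" rows
    n_prior_absence_math = sum(head(Subject, -1) == "M")
  )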
# # A tibble: 6 × 5
# # Groups: Student_ID [3]
#   Student_ID Absent_Date Subject n_prior_absence n_prior_absence_math
#        <int> <chr>       <chr>             <dbl>                <int>
# 1       4567 08/30/2018  M                     1                    1
# 2       4567 09/22/2019  M                     1                    1
# 3       8345 09/01/2019  S                     2                    0
# 4       8345 03/30/2019  PE                    2                    0
# 5       8345 07/18/2017  M                     2                    0
# 6       5601 01/08/2019  SS                    0                    0
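To make the math count easier to follow, here is a minimal sketch of what `sum(head(Subject, -1) == "M")` does for a single student. The vectors `subjects`, `prior`, `n_prior_math`, and `n_single` are illustrative names, not part of the original answer, and the subject values are made up for the example.

```r
# Subjects for one hypothetical student, ordered oldest to newest
subjects <- c("M", "PE", "S")

# head(x, -1) returns every element except the last one,
# i.e. everything before the most recent absence
prior <- head(subjects, -1)

# Comparing with "M" gives a logical vector; sum() counts the TRUE values
n_prior_math <- sum(prior == "M")

# A student with a single absence has no prior rows, so the count is 0
n_single <- sum(head("SS", -1) == "M")
```

Because only the last element is dropped, the count is meaningful only when the last row in each group really is the most recent absence.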
Using this data:
df <- read.table(text = 'Student_ID Absent_Date Subject
4567 08/30/2018 M
4567 09/22/2019 M
8345 09/01/2019 S
8345 03/30/2019 PE
8345 07/18/2017 M
5601 01/08/2019 SS', header = TRUE)
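If the rows are not guaranteed to be in date order, the sorting assumption can be made explicit. The following is a sketch only, not the original answer's code: it assumes the dates are in month/day/year format, parses them with `as.Date()`, and sorts within each student before applying the same two counts. The name `result` is introduced here for illustration.

```r
library(dplyr)

df <- read.table(text = 'Student_ID Absent_Date Subject
4567 08/30/2018 M
4567 09/22/2019 M
8345 09/01/2019 S
8345 03/30/2019 PE
8345 07/18/2017 M
5601 01/08/2019 SS', header = TRUE)

result <- df %>%
  # assumption: dates are month/day/year strings
  mutate(Absent_Date = as.Date(Absent_Date, format = "%m/%d/%Y")) %>%
  # oldest absence first within each student
  arrange(Student_ID, Absent_Date) %>%
  group_by(Student_ID) %>%
  mutate(
    n_prior_absence = n() - 1,
    n_prior_absence_math = sum(head(Subject, -1) == "M")
  ) %>%
  ungroup()
```

Note that with the rows sorted this way, student 8345 gets a math count of 1 rather than 0, because the 07/18/2017 M absence then comes before the most recent absence (09/01/2019). The sample rows for that student are listed newest first, which is why the unsorted version drops the M row instead.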
