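I have a data frame with information about a race, in which one column shows the distance a car has traveled at a given point in time. It looks like this: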
data.frame(id = rep(c("A"), each = 15),
distance = seq(from = 1, to = 20, length.out = 15))
id distance
1 A 1.000000
2 A 2.357143
3 A 3.714286
4 A 5.071429
5 A 6.428571
6 A 7.785714
7 A 9.142857
8 A 10.500000
9 A 11.857143
10 A 13.214286
11 A 14.571429
12 A 15.928571
13 A 17.285714
14 A 18.642857
15 A 20.000000
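Given that one lap is 5 units, I would like to create a new column that gives the lap number of each data point based on the distance traveled. The result should look like this: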
data.frame(id = rep("A", each = 15),
distance = seq(from = 1, to = 20, length.out = 15),
lap = c(1,1,1,2,2,2,2,3,3,3,3,4,4,4,4))
id distance lap
1 A 1.000000 1
2 A 2.357143 1
3 A 3.714286 1
4 A 5.071429 2
5 A 6.428571 2
6 A 7.785714 2
7 A 9.142857 2
8 A 10.500000 3
9 A 11.857143 3
10 A 13.214286 3
11 A 14.571429 3
12 A 15.928571 4
13 A 17.285714 4
14 A 18.642857 4
15 A 20.000000 4
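How can I do this, preferably with the tidyverse?
A uj5u.com user replied:
This is essentially a division problem. Divide the distance by 5 and take the ceiling, which rounds the result up to the next whole number (not to the nearest one). This gives you the current lap: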
dplyr::mutate(df, lap = ceiling(distance/5))
id distance lap
1 A 1.000000 1
2 A 2.357143 1
3 A 3.714286 1
4 A 5.071429 2
5 A 6.428571 2
6 A 7.785714 2
7 A 9.142857 2
8 A 10.500000 3
9 A 11.857143 3
10 A 13.214286 3
11 A 14.571429 3
12 A 15.928571 4
13 A 17.285714 4
14 A 18.642857 4
15 A 20.000000 4
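The answer above does not define `df`, and it does not say what happens when a distance falls exactly on a lap boundary. The following is a minimal sketch, assuming `df` is the data frame from the question and using `lap_length` as a hypothetical name for the lap length of 5. It compares the ceiling rule with a floor-based alternative; which one is appropriate depends on whether a car at exactly 5 units should still count as being in lap 1.

```r
library(dplyr)

# Rebuild the example data from the question and name it `df`
df <- data.frame(id = rep("A", each = 15),
                 distance = seq(from = 1, to = 20, length.out = 15))

lap_length <- 5  # hypothetical variable name; the value comes from the question

laps <- df %>%
  mutate(
    lap_ceiling = ceiling(distance / lap_length),   # boundary stays in the lap just completed
    lap_floor   = floor(distance / lap_length) + 1  # boundary starts the next lap
  )

# The two rules disagree only at exact multiples of lap_length
boundary <- c(5, 10)
boundary_ceiling <- ceiling(boundary / lap_length)   # 1 2
boundary_floor   <- floor(boundary / lap_length) + 1 # 2 3
```

The ceiling rule is the one that reproduces the desired output in the question, where the final row at distance 20 belongs to lap 4.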
A uj5u.com user replied:
Another option is to use group_by(), case_when(), and between().
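Code:
library(dplyr)  # provides %>%, group_by(), case_when(), and between()
# group_by() creates `lap` and groups by it, so the result is a grouped tibble.
# between() includes both endpoints and case_when() keeps the first match,
# so a distance of exactly 5 is assigned to lap "1".
df_desired <- df %>%
group_by(id, distance,
lap = case_when(between(distance, 1, 5)~"1",
between(distance, 5, 10)~ "2",
between(distance, 10, 15)~"3",
between(distance, 15, 20)~"4"))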
# Desired output
# A tibble: 15 × 3
# Groups: id, distance, lap [15]
id distance lap
<chr> <dbl> <chr>
1 A 1 1
2 A 2.36 1
3 A 3.71 1
4 A 5.07 2
5 A 6.43 2
6 A 7.79 2
7 A 9.14 2
8 A 10.5 3
9 A 11.9 3
10 A 13.2 3
11 A 14.6 3
12 A 15.9 4
13 A 17.3 4
14 A 18.6 4
15 A 20 4
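Note that the output above is grouped by all three columns (15 groups) and that `lap` is a character column. If an ungrouped data frame with a numeric `lap` is preferred, the same case_when() idea can be used inside mutate(). This is a minimal sketch, assuming `df` is the data frame from the question; the upper-bound conditions are a substitute for the between() calls, not the answerer's original code.

```r
library(dplyr)

# Rebuild the example data from the question and name it `df`
df <- data.frame(id = rep("A", each = 15),
                 distance = seq(from = 1, to = 20, length.out = 15))

# case_when() checks the conditions in order and keeps the first match,
# so each condition only needs the upper bound of its lap.
df_laps <- df %>%
  mutate(lap = case_when(
    distance <= 5  ~ 1,
    distance <= 10 ~ 2,
    distance <= 15 ~ 3,
    distance <= 20 ~ 4
  ))
```

Distances above 20 would get `NA` here, since no condition matches them. The hard-coded bounds also have to be extended by hand for longer races, which is where the ceiling-based approach is more convenient.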
