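I am trying to see how occupations change within industries over time. I have occupation IDs (engineers, teachers, lawyers, and so on) and several IDs corresponding to industries such as construction, education, mineral extraction, and fishing. I want to extract the largest and smallest changes for each occupation. A data sample is below. In this example, I want to extract the 3 largest positive changes and the 3 largest negative changes. Can you help me?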
x <- data.frame("occ_id" = c(1010, 1010, 1010, 1010, 1010, 1010, 1010,1234,1234,1234,1234, 4321, 4321,4321,4321,4321),
"Ind_id" = c(52418,52417,28339,27138,31224,33103,1112,27138,31224,1112,52418,33103,31224,1112,52417,26301),
"Change_occ_2000_2022" = c(1, -5 , 8 ,9 , - 11 ,15 ,16 ,-50,10,30,-5,20,10,50,30,-50))
Then I tried this:
x %>%
  count(Change_occ_2000_2022) %>%
  arrange(Change_occ_2000_2022) %>%
  slice(c(head(row_number(), 3), tail(row_number(), 3)))
But this way I cannot capture which occ-ind pair each change belongs to. I would like the changes to appear like this:
x <- data.frame("occ_id" = c(4321, 4321, 1234, 1234, 4321, 1010),
"Ind_id" = c(1112,52417,1112,27138,26301, 31224 ),
"Change_occ_2000_2022" = c(50,30,30, -50, -50, -11))
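A uj5u.com user replied:
library(dplyr)
x %>%
  # arrange() ignores grouping by default, so the rows are sorted across all occupations
  group_by(occ_id) %>%
  arrange(-Change_occ_2000_2022) %>%
  ungroup() %>%
  # keep the first 3 and the last 3 rows of the sorted data
  slice(c(head(row_number(), 3), tail(row_number(), 3)))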
Output:
  occ_id Ind_id Change_occ_2000_2022
   <dbl>  <dbl>                <dbl>
1   4321   1112                   50
2   1234   1112                   30
3   4321  52417                   30
4   1010  31224                  -11
5   1234  27138                  -50
6   4321  26301                  -50
A uj5u.com user replied:
library(dplyr)
x %>%
  arrange(desc(Change_occ_2000_2022)) %>%
  slice(c(1:3, (nrow(.) - 2):nrow(.)))
Output:
  occ_id Ind_id Change_occ_2000_2022
1   4321   1112                   50
2   1234   1112                   30
3   4321  52417                   30
4   1010  31224                  -11
5   1234  27138                  -50
6   4321  26301                  -50
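Both dplyr answers above sort the whole table and then pick rows by position. A possible alternative, offered only as a sketch, is to use `slice_max()` and `slice_min()` (available in dplyr 1.0.0 and later) and stack the two results. The names `top3`, `bottom3`, and `result` are made up for illustration, and `with_ties = FALSE` is an assumption that exactly 3 rows per side are wanted even when values tie.

```r
library(dplyr)

# Same sample data as in the question
x <- data.frame(
  occ_id = c(1010, 1010, 1010, 1010, 1010, 1010, 1010, 1234, 1234, 1234, 1234,
             4321, 4321, 4321, 4321, 4321),
  Ind_id = c(52418, 52417, 28339, 27138, 31224, 33103, 1112, 27138, 31224, 1112,
             52418, 33103, 31224, 1112, 52417, 26301),
  Change_occ_2000_2022 = c(1, -5, 8, 9, -11, 15, 16, -50, 10, 30, -5, 20, 10,
                           50, 30, -50)
)

# 3 rows with the largest change (hypothetical helper name)
top3 <- slice_max(x, Change_occ_2000_2022, n = 3, with_ties = FALSE)

# 3 rows with the smallest change (hypothetical helper name)
bottom3 <- slice_min(x, Change_occ_2000_2022, n = 3, with_ties = FALSE)

# Stack them; occ_id and Ind_id stay attached to each change
result <- bind_rows(top3, bottom3)
result
```

With the default `with_ties = TRUE`, rows tied with the third value would also be returned, so the result could contain more than 6 rows on other data.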
A uj5u.com user replied:
Based on LMc's solution:
df <- data.frame("occ_id" = c(1010, 1010, 1010, 1010, 1010, 1010, 1010,1234,1234,1234,1234, 4321, 4321,4321,4321,4321),
"Ind_id" = c(52418,52417,28339,27138,31224,33103,1112,27138,31224,1112,52418,33103,31224,1112,52417,26301),
"Change_occ_2000_2022" = c(1, -5 , 8 ,9 , - 11 ,15 ,16 ,-50,10,30,-5,20,10,50,30,-50))
library(data.table)
setDT(df)[order(Change_occ_2000_2022), .SD[c(1:3, (.N-2):.N)]]
#> occ_id Ind_id Change_occ_2000_2022
#> 1: 1234 27138 -50
#> 2: 4321 26301 -50
#> 3: 1010 31224 -11
#> 4: 1234 1112 30
#> 5: 4321 52417 30
#> 6: 4321 1112 50
Created on 2022-05-19 by the reprex package (v2.0.1)
Or:
setDT(df)[frankv(Change_occ_2000_2022, ties.method = "dense") <= 2 |
frankv(-Change_occ_2000_2022, ties.method = "dense") <= 2][order(Change_occ_2000_2022)]
Use this version if you need to account for duplicate values.
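To see what the `<= 2` condition on a dense rank selects, here is a small base R sketch (it does not use data.table). A dense rank gives tied values the same rank and skips no ranks, so `<= 2` keeps the two smallest (or two largest) distinct values, and the number of rows returned depends on how many ties there are. The variable names `change`, `dense_asc`, `dense_desc`, and `keep` are made up for illustration.

```r
# The Change_occ_2000_2022 column from the question
change <- c(1, -5, 8, 9, -11, 15, 16, -50, 10, 30, -5, 20, 10, 50, 30, -50)

# Dense rank, ascending: position of each value among the sorted distinct values
dense_asc <- match(change, sort(unique(change)))

# Dense rank, descending
dense_desc <- match(change, sort(unique(change), decreasing = TRUE))

# Keep rows whose value is among the 2 smallest or 2 largest distinct values
keep <- dense_asc <= 2 | dense_desc <= 2

sort(change[keep])
# -50 -50 -11  30  30  50
```

In this sample the result happens to have 3 rows on each side because -50 and 30 each occur twice. On other data the count will differ.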
A uj5u.com user replied:
library(dplyr)
# sort ascending, then stack the first 3 and the last 3 rows
x <- x %>%
  arrange(Change_occ_2000_2022)
x <- rbind(head(x, 3), tail(x, 3))
Output:
> x
occ_id Ind_id Change_occ_2000_2022
1 1234 27138 -50
2 4321 26301 -50
3 1010 31224 -11
14 1234 1112 30
15 4321 52417 30
16 4321 1112 50
