Suppose I have a dataset of people born in different years:
ID year birth_year outcome
1 10021 2015 1960 1
2 10021 2016 1960 1
3 10021 2017 1960 1
4 10021 2018 1960 0
5 10021 2019 1960 0
6 10022 2015 1968 1
7 10022 2016 1968 0
8 10022 2017 1968 0
9 10022 2018 1968 0
10 10022 2019 1968 0
11 10023 2015 1968 1
12 10023 2016 1968 1
13 10023 2017 1968 1
14 10023 2018 1968 1
15 10023 2019 1968 1
16 10024 2015 1961 0
17 10024 2016 1961 0
18 10024 2017 1961 0
19 10024 2018 1961 1
20 10024 2019 1961 1
I want to split this dataset into smaller datasets by birth year and store them as year1960, year1961, and year1968. Specifically:
> year1960
ID year birth_year outcome
1 10021 2015 1960 1
2 10021 2016 1960 1
3 10021 2017 1960 1
4 10021 2018 1960 0
5 10021 2019 1960 0
> year1961
ID year birth_year outcome
1 10024 2015 1961 0
2 10024 2016 1961 0
3 10024 2017 1961 0
4 10024 2018 1961 1
5 10024 2019 1961 1
> year1968
ID year birth_year outcome
1 10022 2015 1968 1
2 10022 2016 1968 0
3 10022 2017 1968 0
4 10022 2018 1968 0
5 10022 2019 1968 0
6 10023 2015 1968 1
7 10023 2016 1968 1
8 10023 2017 1968 1
9 10023 2018 1968 1
10 10023 2019 1968 1
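How can I do this in the fewest steps?
A uj5u.com user replied:
There may be a shorter or better way to do this, but this will work, and you will end up with a separate data frame for each birth year.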
# Read the data (assumes it is saved as 'data.csv')
df <- read.csv('data.csv')
# Split by 'birth_year' into a named list of data frames
df_split <- split(df, df$birth_year)
# Prefix each list name with 'year' (e.g. 'year1960')
names(df_split) <- paste0('year', names(df_split))
# Create one data frame per list element in the global environment
list2env(df_split, envir = .GlobalEnv)
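To try the approach above without a data.csv file, here is a minimal self-contained sketch. It rebuilds the sample data from the question inline, which is an assumption about how the data is stored. It also adds one step that is not in the answer: resetting the row names. split() keeps the original row names, so without that step year1961 would be numbered 16 to 20 rather than 1 to 5 as in the desired output.

```r
# Rebuild the question's sample data inline (stands in for 'data.csv').
df <- data.frame(
  ID         = rep(c(10021, 10022, 10023, 10024), each = 5),
  year       = rep(2015:2019, times = 4),
  birth_year = rep(c(1960, 1968, 1968, 1961), each = 5),
  outcome    = c(1, 1, 1, 0, 0,
                 1, 0, 0, 0, 0,
                 1, 1, 1, 1, 1,
                 0, 0, 0, 1, 1)
)

# One data frame per birth year; the list names are "1960", "1961", "1968".
df_split <- split(df, df$birth_year)

# Extra step (not in the answer): renumber rows from 1 within each piece.
df_split <- lapply(df_split, function(d) {
  rownames(d) <- NULL
  d
})

# Prefix the names so they are valid variable names.
names(df_split) <- paste0("year", names(df_split))

# Copy each list element into the global environment as its own data frame.
# invisible() only suppresses printing of the returned environment.
invisible(list2env(df_split, envir = .GlobalEnv))

print(year1961)
```

If the pieces will be processed in the same way, it may be simpler to skip the list2env() step and work with df_split directly, for example df_split$year1968 or lapply(df_split, nrow).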