Aggregate (sum) across multiple columns by FIPS, i.e. county ID, within two time period ranges

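Given the data below, how can I sum Inflow, Outflow, NetMigration, InAGI and OutAGI by FIPS across the two time periods 2011-2015 and 2016-2020? Some counties may not have data for a particular fiscal year, but that does not matter, because the idea is to add up the data within each of the two time period ranges. Of course, some NAs will appear in the final dataset. I am using FIPS because there are several counties with the same name. The Key column is therefore no longer needed, since it is the concatenation of FIPS and Year.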

Desired output schema/columns:

    FIPS   County          State  TotInflow  TotOutflow  TotNetMigration  TotInAGI  TotOutAGI  Time_Period
    12001  Alachua County  FL                                                                  2011-2015
    12001  Alachua County  FL                                                                  2016-2020
    08001  Adams County    CO                                                                  2011-2015
    08001  Adams County    CO                                                                  2016-2020

Sample data:

df = structure(list(Key = c("080012020", "120012020", "120012018", 
"120012017", "080012017", "120012016", "120012015", "080012014", 
"120012013", "120012012", "080012012", "080012011", "080012016"
), County = c("Adams County", "Alachua County", "Alachua County", 
"Alachua County", "Adams County", "Alachua County", "Alachua County", 
"Adams County", "Alachua County", "Alachua County", "Adams County", 
"Adams County", "Adams County"), State = c("CO", "FL", "FL", 
"FL", "CO", "FL", "FL", "CO", "FL", "FL", "CO", "CO", "CO"), 
    FIPS = c("08001", "12001", "12001", "12001", "08001", "12001", 
    "12001", "08001", "12001", "12001", "08001", "08001", "08001"
    ), Inflow = c(38L, 261L, 321L, 339L, 58L, 288L, 254L, 46L, 
    413L, 433L, 30L, 42L, NA), InAGI = c(1817L, 6287L, 8423L, 
    8364L, 1865L, 14720L, 5224L, 1074L, 11774L, 10151L, 921L, 
    500L, NA), FiscalYear = c("2019- 2020", "2019- 2020", "2017 - 2018", 
    "2016 - 2017", "2016 - 2017", "2015 - 2016", "2014 - 2015", 
    "2013 - 2014", "2012 - 2013", "2011 - 2012", "2011 - 2012", 
    "2010 - 2011", "2015 - 2016"), Year = c(2020L, 2020L, 2018L, 
    2017L, 2017L, 2016L, 2015L, 2014L, 2013L, 2012L, 2012L, 2011L, 
    2016L), Outflow = c(54L, 447L, 444L, 558L, 44L, 436L, 334L, 
    49L, 466L, 495L, 39L, 31L, 51L), OutAGI = c(1879L, 13106L, 
    15409L, 16496L, 2408L, 12675L, 7448L, 733L, 10309L, 11677L, 
    847L, 605L, 1114L), NetMigration = c(-16L, -186L, -123L, 
    -219L, 14L, -148L, -80L, -3L, -53L, -62L, -9L, 11L, NA)), row.names = c(NA, 
-13L), class = "data.frame")
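
The question states that Key is just FIPS and Year pasted together. A minimal sketch to check that claim, using three rows copied from the sample data above (the names check and key_matches are only illustrative, not part of the question):

```r
# Three rows copied from the sample data; FIPS stays character so the
# leading zero of "08001" is preserved
check <- data.frame(
  Key  = c("080012020", "120012020", "120012018"),
  FIPS = c("08001", "12001", "12001"),
  Year = c(2020L, 2020L, 2018L)
)

# Rebuild the key from its two parts and compare with the stored Key
key_matches <- check$Key == paste0(check$FIPS, check$Year)
all(key_matches)
```

If this holds for every row, dropping Key loses no information, because FIPS and Year together already identify a row.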

Answer from a uj5u.com user:

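Group by 'FIPS', 'County', 'State', and a Time_Period column created from 'Year' according to whether 'Year' falls between a given start year and end year, then get the sum of the columns of interest by looping across those column names.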

library(dplyr)

df %>%
  # Label each row with its period; years outside both ranges become NA
  group_by(FIPS, County, State,
           Time_Period = case_when(
             between(Year, 2011, 2015) ~ '2011-2015',
             between(Year, 2016, 2020) ~ '2016-2020')) %>%
  # Sum each column of interest within a group, ignoring missing values
  summarise(across(c(Inflow, InAGI, Outflow, OutAGI, NetMigration),
                   ~ sum(.x, na.rm = TRUE), .names = "Total{.col}"),
            .groups = "drop")

-output

# A tibble: 4 × 9
  FIPS  County         State Time_Period TotalInflow TotalInAGI TotalOutflow TotalOutAGI TotalNetMigration
  <chr> <chr>          <chr> <chr>             <int>      <int>        <int>       <int>             <int>
1 08001 Adams County   CO    2011-2015           118       2495          119        2185                -1
2 08001 Adams County   CO    2016-2020            96       3682          149        5401                -2
3 12001 Alachua County FL    2011-2015          1100      27149         1295       29434              -195
4 12001 Alachua County FL    2016-2020          1209      37794         1885       57686              -676
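
One thing to keep in mind with na.rm = TRUE: if a group has no non-missing values at all, sum() returns 0 rather than NA, so the NAs the question expects would not show up in the result. A minimal sketch of one way to keep them, assuming that behaviour is wanted. The toy data and the helper sum_or_na are hypothetical and not part of the original answer:

```r
library(dplyr)

# Toy data (made-up values, not the question's data set)
toy <- data.frame(
  FIPS   = c("00001", "00001", "00002", "00002"),
  Year   = c(2011L, 2012L, 2016L, 2017L),
  Inflow = c(10L, NA, NA, NA)
)

# Hypothetical helper: NA when every value is missing,
# otherwise the sum of the non-missing values
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)
}

res <- toy %>%
  group_by(FIPS) %>%
  summarise(
    TotalInflow     = sum(Inflow, na.rm = TRUE),  # all-NA group gives 0
    TotalInflowOrNA = sum_or_na(Inflow),          # all-NA group gives NA
    .groups = "drop"
  )
res
```

The same helper could be passed to across() in place of ~ sum(.x, na.rm = TRUE) if an NA total is preferred for counties with no data in a period.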
