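I have data made up of three columns:
household ID, product ID (H14aq2), and value.
I have about 7000 rows (household IDs), which can be divided into 12 districts and 160 products. A household ID can appear multiple times because a household consumes several products. My goal is to sum the household values for each product so that I get a district-level total value per product. I know how to do this manually, but I want to use a loop because I will be doing this for multiple datasets.
This is my current code. It actually runs without errors and shows 156 iterations, but when I look at the total_values_05 object, only one additional vector, val_i, has been appended.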
for (i in 105:161) {
  total_val_i <- cons_05 %>%
    filter(H14aq2 == i) %>%
    group_by(Districtn05) %>%
    summarise(val_i = sum(total_val_yr)) %>%
    ungroup()
  total_values_05 <- total_values_05 %>%
    left_join(total_val_i)
  rm(total_val_i)
}
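There are 161 products (indexed by the variable H14aq2, from 101 to 161). Before this loop I created the object total_values_05, in which I handle products 101 to 104 for other reasons.
In each iteration I want to filter for a single product, sum the total_val_yr variable that contains the values, and then append the new vector val_i to the existing object total_values_05. In the end I want an object structured like this: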
| District | val_101 | val_102 | val_103 |
|---|---|---|---|
| First | row | row | row |
| Second | row | row | row |
(up to val_161 and 12 districts)
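It seems to me that I am only missing one small thing to get this working, since the code runs and a variable named val_i has in fact been appended. I think the problem is with using i to index multiple things.
This is my first attempt at a loop! Any help is much appreciated :)
Here is sample data (containing only the 4 variables needed for my question):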
structure(list(Hhid = structure(c("1033000301", "1033000301",
"1033000301", "1033000301", "1033000301", "1033000301"), label = "Unique hh identifier across panel waves", format.stata = "%-10s"),
Districtn05 = structure(c("Kiboga", "Kiboga", "Kiboga", "Kiboga",
"Kiboga", "Kiboga"), label = "District name as in 2005/06", format.stata = "%-13s"),
H14aq2 = structure(c(150, 135, 140, 136, 112, 103), label = "Consumption item code", format.stata = ".0g", labels = c(Matooke = 101,
Matooke = 102, Matooke = 103, Matooke = 104, `Sweet potatoes fresh` = 105,
`Sweet potatoes dry` = 106, `Cassava fresh` = 107, `Cassava dry/flour` = 108,
`Irish potatoes` = 109, Rice = 110, `Maize grains` = 111,
`Maize cobs` = 112, `Maize flour` = 113, Bread = 114, Millet = 115,
Sorghum = 116, Beef = 117, Pork = 118, `Goat meat` = 119,
`Other meat` = 120, Chicken = 121, `Fresh fish` = 122, `Dry/smoked fish` = 123,
Eggs = 124, `Fresh milk` = 125, `Infant formula foods` = 126,
`Cooking oil` = 127, Ghee = 128, `Margarine,butter` = 129,
`Passion fruits` = 130, `Sweet bananas` = 131, Mangoes = 132,
Oranges = 133, `Other fruits` = 134, Onions = 135, Tomatoes = 136,
Cabbages = 137, Dodo = 138, `Other vegetables` = 139, `Beans fresh` = 140,
`Beans dry` = 141, `Ground nuts in shell` = 142, `Ground nuts shelled` = 143,
`Ground nuts pounded` = 144, Peas = 145, Simsim = 146, Sugar = 147,
Coffee = 148, Tea = 149, Salt = 150, Soda = 151, Beer = 152,
`Other alcoholic drinks` = 153, `Other drinks` = 154, Cigarettes = 155,
`Other tobbaco` = 156, `Expenditure in restaurants on food` = 157,
`Expenditure in restaurants on soda` = 158, `Expenditure in restaurants on beer` = 159,
`Other juice` = 160, `Other foods` = 161), class = c("haven_labelled",
"vctrs_vctr", "double")), total_val_yr = c(3250, 10400, 156000,
10400, 260000, 312000)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
Answer:
You can group by multiple columns and then reshape the summarised result into wide format, like this:
library(tidyverse)
data <- structure(list(
Hhid = structure(c(
"1033000301", "1033000301",
"1033000301", "1033000301", "1033000301", "1033000301"
), label = "Unique hh identifier across panel waves", format.stata = "%-10s"),
Districtn05 = structure(c(
"Kiboga", "Kiboga", "Kiboga", "Kiboga",
"Kiboga", "Kiboga"
), label = "District name as in 2005/06", format.stata = "%-13s"),
H14aq2 = structure(c(150, 135, 140, 136, 112, 103), label = "Consumption item code", format.stata = ".0g", labels = c(
Matooke = 101,
Matooke = 102, Matooke = 103, Matooke = 104, `Sweet potatoes fresh` = 105,
`Sweet potatoes dry` = 106, `Cassava fresh` = 107, `Cassava dry/flour` = 108,
`Irish potatoes` = 109, Rice = 110, `Maize grains` = 111,
`Maize cobs` = 112, `Maize flour` = 113, Bread = 114, Millet = 115,
Sorghum = 116, Beef = 117, Pork = 118, `Goat meat` = 119,
`Other meat` = 120, Chicken = 121, `Fresh fish` = 122, `Dry/smoked fish` = 123,
Eggs = 124, `Fresh milk` = 125, `Infant formula foods` = 126,
`Cooking oil` = 127, Ghee = 128, `Margarine,butter` = 129,
`Passion fruits` = 130, `Sweet bananas` = 131, Mangoes = 132,
Oranges = 133, `Other fruits` = 134, Onions = 135, Tomatoes = 136,
Cabbages = 137, Dodo = 138, `Other vegetables` = 139, `Beans fresh` = 140,
`Beans dry` = 141, `Ground nuts in shell` = 142, `Ground nuts shelled` = 143,
`Ground nuts pounded` = 144, Peas = 145, Simsim = 146, Sugar = 147,
Coffee = 148, Tea = 149, Salt = 150, Soda = 151, Beer = 152,
`Other alcoholic drinks` = 153, `Other drinks` = 154, Cigarettes = 155,
`Other tobbaco` = 156, `Expenditure in restaurants on food` = 157,
`Expenditure in restaurants on soda` = 158, `Expenditure in restaurants on beer` = 159,
`Other juice` = 160, `Other foods` = 161
), class = c(
"haven_labelled",
"vctrs_vctr", "double"
)), total_val_yr = c(
3250, 10400, 156000,
10400, 260000, 312000
)
), row.names = c(NA, -6L), class = c(
"tbl_df",
"tbl", "data.frame"
))
data %>%
  group_by(Districtn05, H14aq2) %>%
  summarise(total_val_yr = sum(total_val_yr)) %>%
  select(total_val_yr, H14aq2) %>%
  pivot_wider(names_from = H14aq2, values_from = total_val_yr, names_prefix = "val_")
#> `summarise()` has grouped output by 'Districtn05'. You can override using the
#> `.groups` argument.
#> Adding missing grouping variables: `Districtn05`
#> # A tibble: 1 × 7
#> # Groups: Districtn05 [1]
#>   Districtn05 val_103 val_112 val_135 val_136 val_140 val_150
#>   <chr>         <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#> 1 Kiboga       312000  260000   10400   10400  156000    3250
Created on 2022-05-25 by the reprex package (v2.0.0)
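As for why the original loop added only one column: inside `summarise(val_i = ...)` the name `val_i` is taken literally, so `i` is never substituted into it. After the first iteration `total_values_05` already has a `val_i` column, and `left_join()` without `by` joins on every shared column name, so later iterations match on `Districtn05` and `val_i` together and add nothing new. If you do want to keep the loop, here is a minimal sketch. It assumes `total_values_05` has one row per district, and the toy `cons_05` below is made up for illustration. It builds the column name with `paste0()` and joins explicitly on `Districtn05`:

```r
library(dplyr)

# Toy stand-in for cons_05 (made-up values, same column names as in the question)
cons_05 <- tibble(
  Districtn05  = c("Kiboga", "Kiboga", "Kiboga", "Mukono", "Mukono"),
  H14aq2       = c(105, 105, 106, 105, 107),
  total_val_yr = c(100, 50, 20, 30, 70)
)

# Start from one row per district (in the question this object already
# holds the columns for products 101 to 104)
total_values_05 <- distinct(cons_05, Districtn05)

for (i in 105:107) {
  col_name <- paste0("val_", i)  # e.g. "val_105"

  total_val_i <- cons_05 %>%
    filter(H14aq2 == i) %>%
    group_by(Districtn05) %>%
    # `!!col_name :=` uses the string as the name of the new column
    summarise(!!col_name := sum(total_val_yr), .groups = "drop")

  # Join only on the district so earlier val_* columns are not used as keys
  total_values_05 <- left_join(total_values_05, total_val_i, by = "Districtn05")
}

total_values_05
```

Districts with no purchases of a given product end up with `NA` in that column, since `left_join()` finds no matching row for them.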
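Because the same table is needed for several datasets, the group-and-pivot approach can also be wrapped in a small function. The sketch below is one possible shape, not a definitive implementation: the function name `district_product_totals()` and the toy data are hypothetical, and it assumes every dataset uses the column names `H14aq2` and `total_val_yr`, with the district column passed in as an argument. It uses `.groups = "drop"` so the result is ungrouped and no `select()` step is needed.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: sum total_val_yr per district and product,
# then spread the products into one val_<code> column each
district_product_totals <- function(data, district_col) {
  data %>%
    group_by({{ district_col }}, H14aq2) %>%
    summarise(total_val_yr = sum(total_val_yr), .groups = "drop") %>%
    pivot_wider(
      names_from   = H14aq2,
      values_from  = total_val_yr,
      names_prefix = "val_"
    )
}

# Made-up data using the question's column names
cons_05 <- tibble(
  Districtn05  = c("Kiboga", "Kiboga", "Kiboga", "Mukono"),
  H14aq2       = c(103, 103, 112, 103),
  total_val_yr = c(10, 5, 7, 2)
)

totals_05 <- district_product_totals(cons_05, Districtn05)
totals_05
```

The same call can then be repeated for each dataset, passing that dataset's district column as the second argument.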
