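I have 11 data frames containing various observations from a Chesapeake seagrass survey. Each data frame contains the following variables (shown with example values), and each one holds the observations from a single SAMPYR. So: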
> head(density.2007)
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
1 HI 2 1.0 50 2006 2007 1 6.0
2 HI 5 0.5 100 2006 2007 1 11.6
3 HI 7 0.5 50 2006 2007 1 6.0
4 HI 9 0.5 100 2006 2007 1 9.6
5 HI 10 1.0 100 2006 2007 1 30.0
6 HI 23 1.0 50 2006 2007 1 40.4
> head(density.2008)
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS NOTES id
29 HI 1 1.0 100 2007 2008 1 39.6 29
30 HI 2 1.0 50 2006 2008 2 54.8 30
31 HI 3 0.5 100 2007 2008 1 11.2 31
32 HI 4 1.0 100 2007 2008 1 8.8 32
33 HI 5 0.5 100 2006 2008 2 24.0 33
34 HI 7 0.5 50 2006 2008 2 0.0 34
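I want to write a for loop that takes the unique values in the PLOT column and counts how often each one occurs (so that I can filter and list only those that appear more than once).
What I have so far is: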
density.names <- c("density.2007",
"density.2008",
"density.2009",
"density.2010",
"density.2011",
"density.2012",
"density.2013",
"density.2014",
"density.2015",
"density.2016",
"density.2017"
)
for(i in 1:length(density.names)) {
get(density.names[i]) %>%
count(PLOT) %>%
print()
}
This code prints the following with print():
PLOT n
1 HI 1 1
2 HI 10 1
3 HI 100 1
4 HI 103 1
5 HI 104 1
6 HI 11 1
7 HI 13 1
8 HI 14 1
9 HI 15 1
10 HI 17 1
11 HI 18 1
12 HI 2 1
13 HI 20 1
14 HI 21 1
15 HI 23 1
16 HI 25 1
17 HI 27 1
18 HI 29 1
19 HI 3 1
20 HI 31 1
21 HI 32 1
22 HI 36 1
23 HI 37 1
24 HI 38 1
25 HI 39 1
26 HI 4 1
27 HI 40 1
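But I can't do anything with this output. Is there a way to filter the rows so that only those with n = 2 appear? Or a way to get the 11 data frames out of the for loop, so that I can manipulate them further or at least have copies of them in the global environment?
Thanks! I can provide more details if that helps.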
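Answer from a uj5u.com user:
Don't do this in a loop! There is a completely different way, and I will walk you through it step by step. My first step is to prepare a function that generates data similar to yours.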
library(tidyverse)
dens = function(year, n) tibble(
PLOT = paste("HI", sample(1:(n/7), n, replace = T)),
SIZE = runif(n, 0.1, 3),
DENSITY = sample(seq(50,200, by=50), n, replace = T),
SEEDYR = year-1,
SAMPYR = year,
AGE = sample(1:5, n, replace = T),
SHOOTS = runif(n, 0.1, 3)
)
Let's see how it works and generate some sample data frames
set.seed(123)
density.2007 = dens(2007, 120)
density.2008 = dens(2008, 88)
density.2009 = dens(2009, 135)
density.2010 = dens(2010, 156)
The density.2007 data frame looks like this
# A tibble: 120 x 7
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 HI 15 1.67 200 2006 2007 4 1.80
2 HI 14 0.270 150 2006 2007 2 2.44
3 HI 3 0.856 50 2006 2007 3 0.686
4 HI 10 1.25 200 2006 2007 5 1.43
5 HI 11 0.673 50 2006 2007 5 1.40
6 HI 5 2.51 150 2006 2007 3 2.23
7 HI 14 0.543 150 2006 2007 2 2.17
8 HI 5 2.43 200 2006 2007 5 2.51
9 HI 9 1.69 100 2006 2007 4 2.67
10 HI 3 2.02 50 2006 2007 2 2.86
# ... with 110 more rows
Now they need to be combined into a single data frame
df = density.2007 %>%
bind_rows(density.2008) %>%
bind_rows(density.2009) %>%
bind_rows(density.2010)
Output
# A tibble: 499 x 7
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 HI 15 1.67 200 2006 2007 4 1.80
2 HI 14 0.270 150 2006 2007 2 2.44
3 HI 3 0.856 50 2006 2007 3 0.686
4 HI 10 1.25 200 2006 2007 5 1.43
5 HI 11 0.673 50 2006 2007 5 1.40
6 HI 5 2.51 150 2006 2007 3 2.23
7 HI 14 0.543 150 2006 2007 2 2.17
8 HI 5 2.43 200 2006 2007 5 2.51
9 HI 9 1.69 100 2006 2007 4 2.67
10 HI 3 2.02 50 2006 2007 2 2.86
# ... with 489 more rows
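With all 11 data frames, chaining bind_rows() by hand gets long. Since the question already stores the object names in density.names, one possible shortcut is to fetch them with mget() and stack the resulting list in one call. This is only a sketch: the two small data frames below are made-up stand-ins, and the "source" column name is my own choice, not something from the question.

```r
library(dplyr)

# Two tiny stand-in data frames (hypothetical values, not the survey data).
density.2007 <- tibble(PLOT = c("HI 2", "HI 5"), SAMPYR = 2007,
                       SHOOTS = c(6.0, 11.6))
density.2008 <- tibble(PLOT = c("HI 1", "HI 2", "HI 5"), SAMPYR = 2008,
                       SHOOTS = c(39.6, 54.8, 24.0), NOTES = NA_character_)

density.names <- c("density.2007", "density.2008")

# mget() looks the objects up by name and returns a named list.
# bind_rows() stacks the list; columns missing from a frame are filled with NA,
# and .id stores the list name of each row's origin in a "source" column.
df <- bind_rows(mget(density.names), .id = "source")
df
```

Because bind_rows() matches columns by name, frames that carry extra columns (such as NOTES and id in density.2008) should still stack, with NA where a column is absent.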
In the next step, count how many times each value of the PLOT variable occurs
PLOT.count = df %>%
group_by(PLOT) %>%
summarise(PLOT.n = n()) %>%
arrange(PLOT.n)
Output
# A tibble: 22 x 2
PLOT PLOT.n
<chr> <int>
1 HI 20 3
2 HI 22 5
3 HI 21 7
4 HI 18 12
5 HI 2 19
6 HI 1 20
7 HI 15 20
8 HI 17 21
9 HI 6 22
10 HI 11 23
# ... with 12 more rows
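The question asked specifically for plots that appear more than once (or exactly twice). As a minimal sketch on made-up data, count() does the same job as group_by() plus summarise(), and a filter() on the count keeps only the repeated plots. Adding SAMPYR to count() gives the frequency within each sampling year, which is closer to what the original per-year loop computed. The object names repeated and per.year are illustrative only.

```r
library(dplyr)

# Hypothetical stand-in for the combined data frame.
df <- tibble(
  PLOT   = c("HI 1", "HI 2", "HI 2", "HI 5", "HI 5", "HI 5"),
  SAMPYR = c(2007, 2007, 2008, 2007, 2008, 2008)
)

# count() is shorthand for group_by() + summarise(n = n()).
# Keep only the plots that occur more than once overall.
repeated <- df %>%
  count(PLOT, name = "PLOT.n") %>%
  filter(PLOT.n > 1)

# Frequency of each plot within each sampling year,
# keeping only plots seen exactly twice in a given year.
per.year <- df %>%
  count(SAMPYR, PLOT) %>%
  filter(n == 2)
```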
In the second-to-last step, attach these counts to the original data frame
df = df %>% left_join(PLOT.count, by="PLOT")
Output
# A tibble: 499 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 15 1.67 200 2006 2007 4 1.80 20
2 HI 14 0.270 150 2006 2007 2 2.44 32
3 HI 3 0.856 50 2006 2007 3 0.686 27
4 HI 10 1.25 200 2006 2007 5 1.43 25
5 HI 11 0.673 50 2006 2007 5 1.40 23
6 HI 5 2.51 150 2006 2007 3 2.23 38
7 HI 14 0.543 150 2006 2007 2 2.17 32
8 HI 5 2.43 200 2006 2007 5 2.51 38
9 HI 9 1.69 100 2006 2007 4 2.67 26
10 HI 3 2.02 50 2006 2007 2 2.86 27
# ... with 489 more rows
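If the separate count table is not needed, the count and the join can probably be collapsed into a single call with add_count() from dplyr, which appends the group size as a new column without dropping any rows. A small sketch on made-up data (df.counted is an illustrative name):

```r
library(dplyr)

# Hypothetical stand-in for the combined data frame.
df <- tibble(
  PLOT   = c("HI 1", "HI 2", "HI 2", "HI 5", "HI 5", "HI 5"),
  SHOOTS = c(6.0, 11.6, 54.8, 9.6, 24.0, 0.0)
)

# add_count() keeps every row and adds the per-PLOT frequency as PLOT.n.
df.counted <- df %>% add_count(PLOT, name = "PLOT.n")
df.counted
```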
Now filter it at will
df %>% filter(PLOT.n > 30)
Output
# A tibble: 139 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 14 0.270 150 2006 2007 2 2.44 32
2 HI 5 2.51 150 2006 2007 3 2.23 38
3 HI 14 0.543 150 2006 2007 2 2.17 32
4 HI 5 2.43 200 2006 2007 5 2.51 38
5 HI 8 0.598 50 2006 2007 1 1.70 34
6 HI 7 1.94 50 2006 2007 4 1.61 35
7 HI 14 2.91 50 2006 2007 4 0.215 32
8 HI 7 0.846 150 2006 2007 4 0.506 35
9 HI 7 2.38 150 2006 2007 3 1.34 35
10 HI 7 2.62 100 2006 2007 3 0.167 35
# ... with 129 more rows
Or this way
df %>% filter(PLOT.n == min(PLOT.n))
df %>% filter(PLOT.n == median(PLOT.n))
df %>% filter(PLOT.n == max(PLOT.n))
Output
# A tibble: 3 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 20 0.392 200 2009 2010 1 0.512 3
2 HI 20 0.859 150 2009 2010 5 2.62 3
3 HI 20 0.882 200 2009 2010 5 1.06 3
> df %>% filter(PLOT.n == median(PLOT.n))
# A tibble: 26 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 9 1.69 100 2006 2007 4 2.67 26
2 HI 9 2.20 50 2006 2007 4 1.49 26
3 HI 9 0.587 200 2006 2007 3 1.13 26
4 HI 9 1.27 50 2006 2007 1 2.55 26
5 HI 9 1.56 150 2006 2007 3 2.01 26
6 HI 9 0.198 100 2006 2007 3 2.08 26
7 HI 9 2.72 150 2007 2008 3 0.421 26
8 HI 9 0.251 200 2007 2008 2 0.328 26
9 HI 9 1.83 50 2007 2008 1 0.192 26
10 HI 9 1.97 100 2007 2008 1 0.900 26
# ... with 16 more rows
> df %>% filter(PLOT.n == max(PLOT.n))
# A tibble: 38 x 8
PLOT SIZE DENSITY SEEDYR SAMPYR AGE SHOOTS PLOT.n
<chr> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 HI 5 2.51 150 2006 2007 3 2.23 38
2 HI 5 2.43 200 2006 2007 5 2.51 38
3 HI 5 2.06 100 2006 2007 5 1.93 38
4 HI 5 1.25 150 2006 2007 4 2.29 38
5 HI 5 2.29 200 2006 2007 1 2.97 38
6 HI 5 0.789 150 2006 2007 2 1.59 38
7 HI 5 1.11 100 2007 2008 4 2.61 38
8 HI 5 2.38 150 2007 2008 4 2.95 38
9 HI 5 2.67 200 2007 2008 3 1.77 38
10 HI 5 2.63 100 2007 2008 1 1.90 38
# ... with 28 more rows
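If the per-year data frames really do need to stay separate, as the question's second option suggests, a named list is usually easier to work with than printing inside a for loop. The following is only a sketch on made-up data: it applies the same count-and-filter step to each data frame and keeps the results under the original object names. repeated.by.year is an illustrative name.

```r
library(dplyr)

# Tiny stand-in data frames (hypothetical values).
density.2007 <- tibble(PLOT = c("HI 2", "HI 5", "HI 5"), SAMPYR = 2007)
density.2008 <- tibble(PLOT = c("HI 1", "HI 1", "HI 2"), SAMPYR = 2008)
density.names <- c("density.2007", "density.2008")

# mget() returns a named list of the data frames; lapply() runs the same
# count-and-filter step on each one and returns a list with the same names.
repeated.by.year <- lapply(mget(density.names), function(d) {
  d %>%
    count(PLOT) %>%
    filter(n == 2)
})

# Each result stays available for further work, e.g. the 2008 table:
repeated.by.year[["density.2008"]]
```

Keeping the results in one list avoids creating 11 more objects in the global environment, and bind_rows(repeated.by.year, .id = "source") can still stack them later if a single table is preferred.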
