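I have a data.frame that shows the correlations between water quality and land use parameters. It has the columns xName, yName, corr, and p.value. However, xName and yName are not dedicated columns for water quality and land use respectively: each one holds a mix of both.
For example, I want only water quality variables in the xName column and only land use variables in the yName column, since the order of the pair does not change corr or p.value.
Initially, I thought of using mutate to create new columns that show whether xName is land use or water quality and whether yName is land use or water quality, and then using case_when and recode to swap the names where needed.
But I could not get any further, because the code returns an error and does not change the names.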
df=read.table(text="xName yName corr p.value classe_tipox2 classe_tipoy2
IQA_mean soil 0.639727697 0.025073852 water land
OD_mean veg 0.60989011 0.03029001 water land
OD_mean grass -0.576923077 0.042537186 water land
soil N_Total_mean 0.604577823 0.037305053 land water
crop N_Total_mean 0.695600561 0.012007646 land water
crop P_Total_mean -0.624544589 0.029931227 land water
DBO_mean veg 0.797202797 0.003161252 water land
DBO_mean city 0.756757445 0.004382975 water land
DBO_mean grass -0.825174825 0.001718596 water land
veg P_Total_mean -0.587412587 0.048844858 land water
P_Total_mean grass 0.629370629 0.03239475 water land", sep="", header=TRUE)%>%
  # Change the positions of the names
  mutate(xName = case_when(
    classe_tipox2 == 'land' & classe_tipoy2 == 'water' ~ recode(xName = yName)
  ))
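I would like xName to take the value of yName whenever xName is of the land use type.
uj5u.com user reply:
You can paste the two columns into a single column, which lets you extract the string patterns you need with regular expressions.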
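library(tidyverse)
df |>
  # Paste x and y names into a single column
  mutate(names = paste(xName, yName, sep = ";")) |>
  # Extract by regular expression
  mutate(
    # Water quality name: the ";"-free run that ends in "_mean"
    xName2 = str_extract(names, "[^;]*_mean"),
    # Land use name: first run of letters that starts with a character
    # that is neither ";" nor one of the characters of xName2
    yName2 = str_extract(names, paste0("[^;", regex(xName2), "][A-z]+"))
  )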
# xName2 yName2
#1 IQA_mean soil
#2 OD_mean veg
#3 OD_mean grass
#4 N_Total_mean soil
#5 N_Total_mean crop
#6 P_Total_mean crop
#7 DBO_mean veg
#8 DBO_mean city
#9 DBO_mean grass
#10 P_Total_mean veg
#11 P_Total_mean grass
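The pattern above already assumes that every water quality name ends in `_mean`. Under that same assumption, the swap can be sketched without pasting the columns together, by testing the suffix directly. The small `demo` data frame, the helper column `x_is_water`, and the result name `res` are illustrative and not part of the original answer.

```r
library(dplyr)
library(stringr)

# Illustrative three-row subset of the question's data (assumption:
# water quality names always end in "_mean", land use names never do)
demo <- data.frame(
  xName = c("IQA_mean", "soil", "P_Total_mean"),
  yName = c("soil", "N_Total_mean", "grass"),
  stringsAsFactors = FALSE
)

res <- demo |>
  mutate(
    # TRUE when xName already holds the water quality variable
    x_is_water = str_detect(xName, "_mean$"),
    # Keep the pair as is when x is water, otherwise swap the two names
    xName2 = if_else(x_is_water, xName, yName),
    yName2 = if_else(x_is_water, yName, xName)
  ) |>
  select(xName2, yName2)

res
```

This avoids building a character class out of `xName2`, which only works as long as the land use name starts with a character that does not occur in the water quality name.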
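uj5u.com user reply:
Another option (though a bit verbose). First, I put the data into long format, then use ifelse statements to "fix" xName and yName, and finally pivot the data back to wide format.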
library(tidyverse)
df %>%
  pivot_longer(c(xName, yName)) %>%
  mutate(
    name = ifelse(
      name == "xName" &
        classe_tipox2 == 'land' &
        classe_tipoy2 == 'water',
      "yName",
      ifelse(
        name == "yName" &
          classe_tipox2 == 'land' &
          classe_tipoy2 == 'water',
        "xName",
        name
      )
    )
  ) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(5:6, 1:4) %>%
  mutate(classe_tipox2 = "water", classe_tipoy2 = "land")
Output
# A tibble: 11 × 6
xName yName corr p.value classe_tipox2 classe_tipoy2
<chr> <chr> <dbl> <dbl> <chr> <chr>
1 IQA_mean soil 0.640 0.0251 water land
2 OD_mean veg 0.610 0.0303 water land
3 OD_mean grass -0.577 0.0425 water land
4 N_Total_mean soil 0.605 0.0373 water land
5 N_Total_mean crop 0.696 0.0120 water land
6 P_Total_mean crop -0.625 0.0299 water land
7 DBO_mean veg 0.797 0.00316 water land
8 DBO_mean city 0.757 0.00438 water land
9 DBO_mean grass -0.825 0.00172 water land
10 P_Total_mean veg -0.587 0.0488 water land
11 P_Total_mean grass 0.629 0.0324 water land
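uj5u.com user reply:
Please find below a simple solution using dplyr and two ifelse statements. Note that I simplified your conditions inside ifelse, since a single test per column seems to be sufficient.
Reprex
- Code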
library(dplyr)
df %>%
  mutate(xName2 = ifelse(classe_tipox2 == 'land', yName, xName),
         yName = ifelse(classe_tipoy2 == 'water', xName, yName)) %>%
  select(-xName) %>%
  relocate(xName2, .before = yName) %>%
  rename_with(.cols = 1, ~"xName")
- Output
#> xName yName corr p.value classe_tipox2 classe_tipoy2
#> 1 IQA_mean soil 0.6397277 0.025073852 water land
#> 2 OD_mean veg 0.6098901 0.030290010 water land
#> 3 OD_mean grass -0.5769231 0.042537186 water land
#> 4 N_Total_mean soil 0.6045778 0.037305053 land water
#> 5 N_Total_mean crop 0.6956006 0.012007646 land water
#> 6 P_Total_mean crop -0.6245446 0.029931227 land water
#> 7 DBO_mean veg 0.7972028 0.003161252 water land
#> 8 DBO_mean city 0.7567574 0.004382975 water land
#> 9 DBO_mean grass -0.8251748 0.001718596 water land
#> 10 P_Total_mean veg -0.5874126 0.048844858 land water
#> 11 P_Total_mean grass 0.6293706 0.032394750 water land
Created on 2021-12-22 by the reprex package (v2.0.1)
Please credit the source when reposting. Link to this article: https://www.uj5u.com/net/390422.html
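In the output above, classe_tipox2 and classe_tipoy2 still read "land" and "water" for the rows whose names were swapped. If those labels should follow the names, one possible sketch is to compute the swap condition once and reuse it for all four columns. The `demo` data frame, the helper columns `swap` and `x_old`, and the result name `fixed` are illustrative and not part of the original answer.

```r
library(dplyr)

# Illustrative three-row subset of the question's data
demo <- data.frame(
  xName = c("IQA_mean", "soil", "crop"),
  yName = c("soil", "N_Total_mean", "P_Total_mean"),
  classe_tipox2 = c("water", "land", "land"),
  classe_tipoy2 = c("land", "water", "water"),
  stringsAsFactors = FALSE
)

fixed <- demo %>%
  mutate(
    # Rows where the land use name sits in xName and must be swapped
    swap = classe_tipox2 == "land" & classe_tipoy2 == "water",
    # Keep the original xName, because mutate() updates columns in order
    x_old = xName,
    xName = if_else(swap, yName, xName),
    yName = if_else(swap, x_old, yName),
    # Make the class labels follow the swapped names
    classe_tipox2 = if_else(swap, "water", classe_tipox2),
    classe_tipoy2 = if_else(swap, "land", classe_tipoy2)
  ) %>%
  select(-swap, -x_old)

fixed
```

Storing the condition in `swap` before any column is overwritten keeps the later `if_else` calls from seeing already modified values.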
