R: Strange result when looking at the unique elements of two simple strings

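I am completely puzzled by what I am seeing. I read in an Excel file, and when I look at the unique values in a column of strings, I do not understand the result.

I can reproduce the problem in a minimal reprex (see below): why does dd have two unique elements, while dd2 has only one?

Any suggestions are welcome.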

dd <- c(" Grant", "Grant")

dd2 <- c("Grant", "Grant")

unique(dd)
#> [1] " Grant" "Grant"
length(unique(dd))
#> [1] 2

unique(dd2)
#> [1] "Grant"
length(unique(dd2))
#> [1] 1

sessionInfo()
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#> 
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#> 
#> locale:
#> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C 
#> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 
#> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 
#> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C 
#> [9] LC_ADDRESS=C LC_TELEPHONE=C 
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C 
#> 
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base 
#> 
#> loaded via a namespace (and not attached):
#> [1] knitr_1.33 magrittr_2.0.1 rlang_0.4.11 fansi_0.5.0 
#> [5] stringr_1.4.0 styler_1.5.1 highr_0.9 tools_4.1.1 
#> [9] xfun_0.25 utf8_1.2.2 withr_2.4.2 htmltools_0.5.1.1
#> [13] ellipsis_0.3.2 yaml_2.2.1 digest_0.6.27 tibble_3.1.3 
#> [17] lifecycle_1.0.0 crayon_1.4.1 purrr_0.3.4 vctrs_0.3.8 
#> [21] fs_1.5.0 glue_1.4.2 evaluate_0.14 rmarkdown_2.10 
#> [25] reprex_2.0.1 stringi_1.7.3 compiler_4.1.1 pillar_1.6.2 
#> [29] backports_1.2.1 pkgconfig_2.0.3

Created on 2021-09-13 by the reprex package (v2.0.1)

Reply from a uj5u.com user:

The raw values of the two elements appear to be different, possibly as a result of copying and pasting:

sapply(dd, charToRaw)
$`Grant`
[1] ef bb bf 47 72 61 6e 74

$Grant
[1] 47 72 61 6e 74

For dd2, on the other hand, the raw values are identical:

sapply(dd2, charToRaw)
     Grant Grant
[1,]    47    47
[2,]    72    72
[3,]    61    61
[4,]    6e    6e
[5,]    74    74
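The comparison above can be reproduced without relying on an invisible character surviving copy and paste. This is a minimal sketch, assuming the extra character is U+FEFF (its UTF-8 encoding is the byte sequence ef bb bf); the names `bytes`, `code_points`, and `has_bom` are illustrative only.

```r
# Build the two strings explicitly. "\ufeff" is an assumed stand-in for the
# invisible leading character in the first element of dd.
dd <- c("\ufeffGrant", "Grant")

# Compare the UTF-8 bytes of each element.
bytes <- lapply(dd, charToRaw)
print(bytes)

# Code points can be easier to read than raw bytes.
code_points <- lapply(dd, utf8ToInt)
print(code_points)

# Flag elements whose first code point is U+FEFF.
has_bom <- vapply(
  code_points,
  function(cp) length(cp) > 0 && cp[1] == 0xFEFF,
  logical(1)
)
print(has_bom)
```

Checking code points with `utf8ToInt` may be more convenient than reading raw bytes when the strings contain other multi-byte characters.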

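In the first case there appears to be an extra leading character. The bytes ef bb bf are the UTF-8 encoding of U+FEFF (the byte order mark), which is invisible when printed:

nchar(dd)
[1] 6 5

If we remove the first character, unique() returns a single value:

unique(c(substring(dd[1], 2), dd[2]))
[1] "Grant"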
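Removing the first character by position works for this one element, but a whole column read from a file may need the same treatment only where the character is present. The following is a rough sketch under the assumption that the stray character is U+FEFF at the start of a string; `strip_bom` is a hypothetical helper name, not a base R function.

```r
# Example data; "\ufeff" is an assumed stand-in for the invisible character.
dd <- c("\ufeffGrant", "Grant")

# Hypothetical helper: drop a leading U+FEFF where present and leave
# all other elements unchanged.
strip_bom <- function(x) sub("^\ufeff", "", x)

cleaned <- strip_bom(dd)

# After cleaning, both elements compare equal.
print(unique(cleaned))
print(length(unique(cleaned)))
```

Anchoring the pattern with `^` keeps the helper from touching the same code point elsewhere in a string, where it may be intentional.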
