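I am reading PDF files with R. I want to transform the text so that whenever a run of multiple spaces is detected, it is replaced with some value (for example "_"). I have come across questions where every run of one or more whitespace characters is replaced using "\\s+" (merging multiple spaces into a single space; removing trailing/leading spaces), but that does not work for me. I have a string that looks like this:
"[1]This is the first address     This is the second one
[2]This is the third one     
[3]This is the fourth one     This is the fifth"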
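When I apply the answer I found, which replaces every run of one or more whitespace characters with a single space, I can no longer tell the separate addresses apart, because the result looks like this:
gsub("\\s+", " ", str_trim(PDF))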
"[1]This is the first address This is the second one
[2]This is the third one
[3]This is the fourth one This is the fifth"
So what I am looking for is this:
"[1]This is the first address_This is the second one
[2]This is the third one_
[3]This is the fourth one_This is the fifth"
However, if I adapt the code from that example, I get the following result:
gsub("\\s+", "_", str_trim(PDF))
"[1]This_is_the_first_address_This_is_the_second_one
[2]This_is_the_third_one_
[3]This_is_the_fourth_one_This_is_the_fifth"
Does anyone know a way around this? Any help would be greatly appreciated.
A uj5u.com user replied:
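Whenever I run into a string or regular expression problem, I like to consult the stringr cheat sheet: https://raw.githubusercontent.com/rstudio/cheatsheets/master/strings.pdf
On the second page there is a section titled "Quantifiers", which shows how to solve this problem: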
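library(tidyverse)
s <- "This is the first address     This is the second one"
# {2,} matches two or more whitespace characters in a row, so single spaces are left alone.
# str_replace_all() replaces every such run; str_replace() would stop after the first one.
str_replace_all(s, "\\s{2,}", "_")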
(Out of habit, I load the full tidyverse here rather than just stringr.) Any run of 2 or more whitespace characters will now be replaced with _.
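One caveat for the multi-line string in the question: \\s also matches newlines, so a run of trailing spaces followed by a line break would be swallowed as a single match and the lines would be joined. Below is a minimal base R sketch that avoids this by matching literal spaces only. The variable name pdf_text and the exact number of spaces in the sample are assumptions made for illustration, not taken from the original PDF.

```r
# Hypothetical sample: three lines, with addresses separated by runs of spaces.
pdf_text <- paste(
  "[1]This is the first address     This is the second one",
  "[2]This is the third one     ",
  "[3]This is the fourth one     This is the fifth",
  sep = "\n"
)

# " {2,}" matches two or more literal space characters only,
# so the "\n" line breaks are never part of a match.
result <- gsub(" {2,}", "_", pdf_text)

cat(result, "\n")
# [1]This is the first address_This is the second one
# [2]This is the third one_
# [3]This is the fourth one_This is the fifth
```

If the extracted text may also contain tabs between addresses, the character class "[ \t]{2,}" can be used instead of " {2,}".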
