Replace multiple spaces in a string, but keep single spaces

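I'm using R to read a PDF file. I would like to transform the extracted text so that whenever a run of multiple spaces is detected, it is replaced with some value (for example "_"). I've come across questions where every run of 1 or more spaces is replaced using "\\s+" (Merge multiple spaces to single space; remove trailing/leading spaces), but that does not work for me. I have a string that looks something like this: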

"[1]This is the first address                                          This is the second one
 [2]This is the third one                                                                     
 [3]This is the fourth one                                             This is the fifth"

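When I apply the answer I found, which replaces every run of 1 or more spaces with a single space, I can no longer tell the separate addresses apart, because the result looks like this:

gsub("\\s+", " ", str_trim(PDF))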

"[1]This is the first address This is the second one
 [2]This is the third one                                                                     
 [3]This is the fourth one This is the fifth"

So what I'm looking for is something like this:

"[1]This is the first address_This is the second one
 [2]This is the third one_                                                                     
 [3]This is the fourth one_This is the fifth"

However, if I adapt the code from that example, I get the following result, because single spaces match the pattern too:

gsub("\\s+", "_", str_trim(PDF))

"[1]This_is_the_first_address_This_is_the_second_one
 [2]This_is_the_third_one_                                                                     
 [3]This_is_the_fourth_one_This_is_the_fifth"

Does anyone know a workaround? Any help would be greatly appreciated.

Answer from a uj5u.com user:

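Whenever I run into string and regular expression problems, I like to refer to the stringr cheat sheet: https://raw.githubusercontent.com/rstudio/cheatsheets/master/strings.pdf

On the second page there is a section titled "Quantifiers", which tells us how to solve this: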

library(tidyverse)

s <- "This is the first address                                          This is the second one"

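# \\s{2,} matches a run of 2 or more whitespace characters.
# str_replace() changes only the first match in each string;
# str_replace_all() handles strings that contain several runs.
str_replace(s, "\\s{2,}", "_")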

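(Out of habit, I'm loading the full tidyverse here rather than just stringr.) Any run of 2 or more whitespace characters is now replaced with _, while single spaces are left alone.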
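
For the multi-line text in the question, a base R variant may be a useful starting point. This is only a sketch under stated assumptions: pdf_text is a hypothetical stand-in for the PDF string, with shorter runs of spaces than the real one, and the pattern uses [[:blank:]] (spaces and tabs only) instead of \\s so that a line break followed by a single leading space is not swallowed as a 2-character whitespace run.

```r
# Hypothetical sample mirroring the layout in the question (not the real PDF text).
pdf_text <- paste(
  "[1]This is the first address      This is the second one",
  " [2]This is the third one      ",
  " [3]This is the fourth one     This is the fifth",
  sep = "\n"
)

# [[:blank:]] matches spaces and tabs but not newlines, so line breaks survive.
# {2,} requires at least two in a row, so single spaces are left untouched.
# gsub() replaces every match, not just the first one.
result <- gsub("[[:blank:]]{2,}", "_", pdf_text)

cat(result, sep = "\n")
```

With stringr, str_replace_all(pdf_text, " {2,}", "_") should behave the same way for plain spaces. Trailing runs of spaces also become a single "_", so apply trimws() to each line first if that is not wanted.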
