Simplifying regex code in C#: adding a space between a digit/decimal and its unit

I have some regex code written in C# that basically adds a space between a number and its unit, with a few exceptions:

dosage_value = Regex.Replace(dosage_value, @"(\d)\s+", @"$1");
dosage_value = Regex.Replace(dosage_value, @"(\d)%\s+", @"$1%");
dosage_value = Regex.Replace(dosage_value, @"(\d+(\.\d+)?)", @"$1 ");
dosage_value = Regex.Replace(dosage_value, @"(\d)\s+%", @"$1% ");
dosage_value = Regex.Replace(dosage_value, @"(\d)\s+:", @"$1:");
dosage_value = Regex.Replace(dosage_value, @"(\d)\s+e", @"$1e");
dosage_value = Regex.Replace(dosage_value, @"(\d)\s+E", @"$1E");

Examples:

10ANYUNIT
10:something
10 : something
10 %
40 e-5
40 E-05

should become

10 ANYUNIT
10:something
10: something
10%
40e-5
40E-05

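The exceptions are: %, E, e and :. I have tried, but since my regex knowledge is not top-notch, could someone help me reduce this code while still getting the same expected results?

Thanks!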

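A uj5u.com user replied:

For your sample data, you can use 2 capture groups, where the second group sits inside an optional part.

In the callback of the replace, check whether capture group 2 matched. If it did, use it in the replacement; otherwise add a space.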

(\d+(?:\.\d+)?)(?:\s*([%:eE]))?

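( Capture group 1
- \d+(?:\.\d+)? Match 1 or more digits with an optional decimal part
) Close group 1
(?: Non-capturing group, matched as a whole
- \s*([%:eE]) Match optional whitespace characters, then capture one of % : e E in group 2
)? Close the non-capturing group and make it optional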

.NET regex demo

using System;
using System.Linq;
using System.Text.RegularExpressions;

string[] strings = new string[]
{
    "10ANYUNIT",
    "10:something",
    "10 : something",
    "10 %",
    "40 e-5",
    "40 E-05",
};
string pattern = @"(\d+(?:\.\d+)?)(?:\s*([%:eE]))?";
var result = strings.Select(s =>
    Regex.Replace(
        s, pattern, m =>
        // Keep the captured symbol when group 2 matched; otherwise insert a space.
        m.Groups[1].Value + (m.Groups[2].Success ? m.Groups[2].Value : " ")
    )
);

Array.ForEach(result.ToArray(), Console.WriteLine);

Output

10 ANYUNIT
10:something
10: something
10%
40e-5 
40E-05
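
One thing to watch in the output above: the exponent digits in `40 e-5` and `40 E-05` are matched a second time on their own, with nothing after them, so the callback appends a space at the end of the string. A possible variant (not from the original answer, only a sketch) swaps the optional group for an alternation whose second branch is a lookahead for a letter, so a space is added only when a unit actually follows. The `Normalize` helper name is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant: the second alternative is a lookahead that requires a
// letter right after the number, so a number at the end of the string is skipped.
const string pattern = @"(\d+(?:\.\d+)?)(?:\s*([%:eE])|(?=\p{L}))";

string Normalize(string s) =>
    Regex.Replace(s, pattern, m =>
        // Group 2 holds %, :, e or E when present; otherwise a unit follows.
        m.Groups[1].Value + (m.Groups[2].Success ? m.Groups[2].Value : " "));

foreach (var s in new[] { "10ANYUNIT", "10 %", "40 e-5", "40 E-05" })
{
    // Quotes make any leading or trailing whitespace visible.
    Console.WriteLine($"'{Normalize(s)}'");
}
```

With this sketch the four inputs should come out as `10 ANYUNIT`, `10%`, `40e-5` and `40E-05`, without a trailing space. The trade-off is that a unit starting with a non-letter would no longer get a space.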

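Since in .NET \d can also match digits from other scripts, \s can also match a newline, and the start of the pattern might be a partial match, a slightly more precise pattern could be: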

\b([0-9]+(?:\.[0-9]+)?)(?:[\p{Zs}\t]*([%:eE]))?
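
The three differences mentioned above can be checked in isolation. This is only a small illustration with made-up inputs, not part of the original answer:

```csharp
using System;
using System.Text.RegularExpressions;

// \d matches any Unicode decimal digit in .NET, not only ASCII 0-9.
string arabicIndicThree = "\u0663"; // ARABIC-INDIC DIGIT THREE
Console.WriteLine(Regex.IsMatch(arabicIndicThree, @"\d"));     // True
Console.WriteLine(Regex.IsMatch(arabicIndicThree, @"[0-9]"));  // False

// \s also matches line breaks; [\p{Zs}\t] only covers space separators and tabs.
Console.WriteLine(Regex.IsMatch("\n", @"\s"));           // True
Console.WriteLine(Regex.IsMatch("\n", @"[\p{Zs}\t]"));   // False

// \b keeps the pattern from starting in the middle of a word such as "abc10".
Console.WriteLine(Regex.IsMatch("abc10", @"\b[0-9]+"));  // False
Console.WriteLine(Regex.IsMatch("abc 10", @"\b[0-9]+")); // True
```

Whether the stricter classes are needed depends on the input: for dosage strings that only ever contain ASCII digits on a single line, both patterns behave the same.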

A uj5u.com user replied:

I think you need something like this:

dosage_value = Regex.Replace(dosage_value, @"(\d+(\.\d*)?)\s*((E|e|%|:)+)\s*", @"$1$3 ");

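Group 1 - (\d+(\.\d*)?)

Any number, such as 123 or 1241.23

Group 2 - ((E|e|%|:)+) (this is capture group 3 in the pattern, hence $3 in the replacement)

Any of the special symbols, e.g. E e % :

Group 1 and group 2 can be separated by any number of whitespace characters.

If it doesn't work the way you need, please provide some samples to test with.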
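
To take up that request for samples, one way to test is to run the pattern over the input/expected pairs from the question and print any mismatch. This is a rough sketch: the names `samples` and `verdict` are illustrative, and it assumes the quantifiers in the pattern read `\d+` and `(...)+`.

```csharp
using System;
using System.Text.RegularExpressions;

// Input/expected pairs copied from the question.
var samples = new (string Input, string Expected)[]
{
    ("10ANYUNIT", "10 ANYUNIT"),
    ("10:something", "10:something"),
    ("10 : something", "10: something"),
    ("10 %", "10%"),
    ("40 e-5", "40e-5"),
    ("40 E-05", "40E-05"),
};

// Assumption: the quantifiers in the pattern read \d+ and (...)+.
const string pattern = @"(\d+(\.\d*)?)\s*((E|e|%|:)+)\s*";

foreach (var (input, expected) in samples)
{
    string actual = Regex.Replace(input, pattern, "$1$3 ");
    string verdict = actual == expected ? "ok" : "MISMATCH";
    // Quotes make leading or trailing whitespace visible.
    Console.WriteLine($"{verdict}: '{input}' -> '{actual}' (expected '{expected}')");
}
```

Under that assumption, `10ANYUNIT` for example has none of `E e % :` directly after the number, so the pattern does not match and the string is left unchanged. A plain unit therefore still needs separate handling.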

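A uj5u.com user replied:

To me, this is too complex to handle with a single regex. I suggest splitting it into separate checks. See the code sample below - I used four different regexes; the first is described in detail, and the rest can be deduced from that explanation :)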

using System.Text.RegularExpressions;

var testStrings = new string[]
{
    "10mg",
    "10:something",
    "10  :   something",
    "10 %",
    "40 e-5",
    "40 E-05",
};

foreach (var testString in testStrings)
{
    Console.WriteLine($"Input: '{testString}', parsed: '{RegexReplace(testString)}'");
}


string RegexReplace(string input)
{
    // First look for exponential notation.
    // Pattern is: match zero or more whitespaces \s*
    // Then match one or more digits and store it in first capturing group (\d+)
    // Then match one or more whitespaces again.
    // Then match part with exponent ([eE][-+]?\d+) and store it in second capturing group.
    // It will match lower or uppercase 'e' with optional (due to ? operator) dash/plus sign and one or more digits.
    // Then match zero or more white spaces.
    var expForMatch = Regex.Match(input, @"\s*(\d+)\s+([eE][-+]?\d+)\s*");
    if(expForMatch.Success)
    {
        return $"{expForMatch.Groups[1].Value}{expForMatch.Groups[2].Value}";
    }

    var matchWithColon = Regex.Match(input, @"\s*(\d+)\s*:\s*(\w+)");
    if (matchWithColon.Success)
    {
        return $"{matchWithColon.Groups[1].Value}:{matchWithColon.Groups[2].Value}";
    }

    var matchWithPercent = Regex.Match(input, @"\s*(\d+)\s*%");
    if (matchWithPercent.Success)
    {
        return $"{matchWithPercent.Groups[1].Value}%";
    }

    var matchWithUnit = Regex.Match(input, @"\s*(\d+)\s*(\w+)");
    if (matchWithUnit.Success)
    {
        return $"{matchWithUnit.Groups[1].Value} {matchWithUnit.Groups[2].Value}";
    }

    return input;
}

The output is

Input: '10mg', parsed: '10 mg'
Input: '10:something', parsed: '10:something'
Input: '10  :   something', parsed: '10:something'
Input: '10 %', parsed: '10%'
Input: '40 e-5', parsed: '40e-5'
Input: '40 E-05', parsed: '40E-05'

Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/410296.html
