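I am trying to detect non-printable characters in a string ('\n', '\r', etc.) and insert a backslash before them. So, for example, if I have the string "Hello\nWorld", I want it to become "Hello\\nWorld". I have a code sample that is supposed to do this, but it inserts a double backslash ('\\'), so the result is "Hello\\\nWorld". Is there a way to insert a single backslash into the string?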
expression = Regex.Replace(expression, @"\p{Cc}", m =>
{
int code = m.Value[0];
return code < 32
? @"\" + $"{Convert.ToChar(code)}"
: Convert.ToChar(code).ToString();
});
Answer from a uj5u.com user:
Skip to the end if you want the TL;DR.
When you write this:
var s = "Hello\nWorld";
The compiler converts \n into a newline character, giving you:
Hello
World
When you write this:
var s = "Hello\\nWorld";
The compiler converts \\ into a single backslash character, giving you:
Hello\nWorld
When you write this verbatim string:
var s = @"Hello\nWorld";
The leading @ turns off compiler conversion of any slash characters, so you get:
Hello\nWorld
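The three literal forms above can be compared by length. This is a minimal sketch; the variable names are made up for illustration.

```csharp
using System;

// Three ways of writing a string literal in C# source code.
var regular  = "Hello\nWorld";   // compiler turns \n into one newline character
var escaped  = "Hello\\nWorld";  // compiler turns \\ into one backslash; the n stays an n
var verbatim = @"Hello\nWorld";  // @ switches escape processing off

Console.WriteLine(regular.Length);         // 11: Hello + newline + World
Console.WriteLine(escaped.Length);         // 12: Hello + backslash + n + World
Console.WriteLine(escaped == verbatim);    // True: same characters, different spelling
Console.WriteLine(regular.Contains("n"));  // False: there is no letter n in it
```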
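When you look at the string in the debugger tooltip or the autos/locals window, it shows you the non-verbatim form. That is, it shows you the string you would have to paste into source code to get the string you want as output: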

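If you want to see the string as it would actually appear if you wrote it to a file and opened it in Notepad, click the magnifying glass next to the string value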

If you edit the value by writing into the tooltip or the autos window, and you write a verbatim string by preceding it with an @:

Remember that it will go back to being a non-verbatim string when the debugger tooltip shows it to you next:

Here there are now 4 slashes because we edited it by making a verbatim string that had 2 slashes, and 2 real slashes double up to 4 source-code slashes. This is so that, if you pasted it into code as a non-verbatim string, the compiler would convert those 4 slashes back down to 2 when compiling.
Hopefully you're now down with "compiler slashes". Here's the next thing to get on board with.
The regex engine is also a compiler of sorts, that also does these conversions.
When you have a regex of "a word character":
\w
You need to get past the C# compiler conversion first: the C# compiler conversion happens at compile time, but the Regex engine conversion happens at runtime.
If you just write this:
var r = new Regex("\w");
The compiler will try to convert that \w and choke on it with an "Unrecognized escape sequence" error, because it has no slash conversion for \w like it does for \n (newline) or \t (tab).
This means to get the regex engine to see \w you need to do either:
var r = new Regex("\\w");
var r = new Regex(@"\w");
The C# compiler turns both of these into \w, so that's what the Regex engine sees.
Some slashed-characters have meaning to both the compiler and the regex engine
The regex engine can understand either \n (2 chars: literally a slash followed by an n) or a newline (1 char, character 10 in the ASCII table), so to get Regex to hunt for a newline you could write any of:
var r = new Regex("\n"); //compiler converts to newline char
var r = new Regex(@"
"); //source code literally contains a newline char
var r = new Regex(@"\n"); //compiler ignores, regex engine interprets \n as newline
var r = new Regex("\\n"); //compiler converts \\ to \, regex engine interprets \n as newline
So bear in mind this two step conversion. It's probably easiest to use @ strings to turn off compiler conversions and then your slashes get through to the regex engine as you wrote them in the source. If you need to get a " through to Regex, write ""
var r = new Regex(@"He said ""I don't know"" to me");
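The two-step conversion can be checked with a small sketch like the one below. The input string and the patterns array are assumptions made for illustration. The variant with a physical line break inside a verbatim string is left out, because what it contains depends on the line endings of the source file.

```csharp
using System;
using System.Text.RegularExpressions;

var input = "Hello\nWorld";  // contains one real newline character at index 5

// Three spellings of "match a newline"; all of them find the same position.
var patterns = new[]
{
    "\n",   // compiler produces a newline char; the regex matches it literally
    "\\n",  // compiler produces backslash + n; the regex reads that as newline
    @"\n",  // verbatim: backslash + n reaches the regex engine untouched
};

foreach (var p in patterns)
    Console.WriteLine(Regex.Match(input, p).Index);  // 5 each time

// Both spellings of the word-character class are the same two-character string.
Console.WriteLine("\\w" == @"\w");  // True
```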
And also note that in recent versions of Visual Studio, strings inside a regex get extra helpful syntax highlighting for what the regex engine sees:

Now that we have all that out of the way, and you appreciate the multiple levels of conversion going on, hopefully you can appreciate that you can't do what you're asking with Regex. There isn't any notion that the following string:
Hello
World
Which, in source code would be either:
var s1 = "Hello\nWorld";
var s2 = @"Hello
World";
Could "have a slash placed in front of the newline" and pop back out as \n, because there isn't an n in the string. The string "Hello World" with some whitespace between the words doesn't contain an n at all, anywhere.
The compiler has essentially done:
code = code.Replace(@"\n", @"
"); //change slash-n to newline char 10
You cannot invert that with:
var x = code.IndexOf("
"); //find newline char
code = code.Insert(x, @"\"); //insert slash before newline
A string of "slash-newline" is not "slash-n"
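The difference can be checked character by character. This is a small sketch with illustrative variable names, not the asker's actual code.

```csharp
using System;

var code = "Hello\nWorld";  // Hello, newline, World

// Put a backslash in front of the newline, as in the snippet above.
var x = code.IndexOf('\n');
var inserted = code.Insert(x, @"\");

// The newline is still there; no letter n has appeared.
Console.WriteLine((int)inserted[x]);             // 92, the backslash
Console.WriteLine((int)inserted[x + 1]);         // 10, still the newline character
Console.WriteLine(inserted == @"Hello\nWorld");  // False
```

Following the debugger display rules described earlier, this value would presumably be shown as "Hello\\\nWorld", which is the result reported in the question: one real backslash followed by one real newline.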
The only reversion is:
code = code.Replace(@"
", @"\n"); //replace newline char with slash-n
There aren't slash codes for everything you'll find. About the only thing I guess you could do with your current approach would be something like:
expression = Regex.Replace(expression, @"\p{Cc}", m => $@"\u{(int)m.Value[0]:x4}");
This will take some string like:
Hello
World
And turn it into
Hello\u000aWorld
If you want it to be \n you'll have to code for it (and all the other slash-whatevers) specifically by having a big table of replacements:

Table courtesy of https://www.tutorialspoint.com/csharp/csharp_character_escapes.htm
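A table-driven version might look like the sketch below. The names shortEscapes and EscapeControlChars are made up for illustration, and the table only lists the control characters that have a short C# escape; anything else matched by \p{Cc} falls back to the \uXXXX form shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Control characters that have a short C# escape sequence.
var shortEscapes = new Dictionary<char, string>
{
    ['\0'] = @"\0",  // null
    ['\a'] = @"\a",  // alert
    ['\b'] = @"\b",  // backspace
    ['\t'] = @"\t",  // horizontal tab
    ['\n'] = @"\n",  // newline
    ['\v'] = @"\v",  // vertical tab
    ['\f'] = @"\f",  // form feed
    ['\r'] = @"\r",  // carriage return
};

// Replace each control character with its short escape if the table has one,
// otherwise with a \uXXXX escape.
string EscapeControlChars(string s) =>
    Regex.Replace(s, @"\p{Cc}", m =>
        shortEscapes.TryGetValue(m.Value[0], out var esc)
            ? esc
            : $@"\u{(int)m.Value[0]:x4}");

Console.WriteLine(EscapeControlChars("Hello\nWorld\t!"));  // Hello\nWorld\t!
```

Note that this sketch leaves existing backslashes alone, so the output cannot be told apart from input that already contained a literal backslash followed by n. If the result has to be converted back reliably, the backslash itself would need escaping too.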
