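As suggested in this question, I use regex_token_iterator<> to get all the matching substrings in a line. However, the code sometimes misses the second matching substring in a line, and the lines where this happens change from run to run. Is this a bug in regex_token_iterator<>, or is something wrong with my code? My compiler is Apple clang version 14.0.0 (clang-1400.0.29.202), and I compiled the code below with -std=c++14.
I also tried the other suggestion from that question, which is to apply regex_search() repeatedly in a while loop, and that version of the code works correctly. I just want to know why the version with regex_token_iterator<> does not work and whether I am using it incorrectly.
Code: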
#include <regex>
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <typeinfo> // bad_cast
using namespace std;

struct bad_from_string : bad_cast {
    const char* what() const noexcept override {
        return "bad cast from string";
    }
};

template<typename T>
T from_string(const string& s) {
    istringstream is{s};
    T t;
    if (!(is >> t))
        throw bad_from_string{};
    return t;
}

int main() {
    regex pat{R"((\d{1,2})/(\d{1,2})/(\d{4}))"}; // e.g. 7/21/2022
    ifstream ifs{"test_regex_token_iterator.txt"};
    ofstream ofs{"test_out_regex_token_iterator.txt"};
    regex_token_iterator<string::iterator> rend; // default constructor is used for indicating the end of the sequence
    for (string line; getline(ifs, line);) {
        smatch matches;
        string replace_pattern;
        int month{0}, day{0}, year{0};
        regex_token_iterator<string::iterator> riter(line.begin(), line.end(), pat);
        // for each matched substring, replace it individually
        while (riter != rend) {
            string matched_substring{(*riter).str()};
            // *riter returns a reference to the sub_match object riter is pointing to.
            // sub_match is not a string. sub_match::str() returns the string of the sub_match.
            // put each matched substring into variable "matches"
            regex_search(matched_substring, matches, pat);
            // get the day, month, and year values in int
            day = from_string<int>(matches.str(2));
            month = from_string<int>(matches.str(1));
            year = from_string<int>(matches.str(3));
            // here make replace_pattern yyyy-mm-dd
            if (month < 10 && day < 10)
                replace_pattern = to_string(year) + "-0" + to_string(month) + "-0" + to_string(day); // both day and month need the front '0'
            else if (month < 10)
                replace_pattern = to_string(year) + "-0" + to_string(month) + "-" + to_string(day);
            else if (day < 10)
                replace_pattern = to_string(year) + "-" + to_string(month) + "-0" + to_string(day);
            else
                replace_pattern = to_string(year) + "-" + to_string(month) + "-" + to_string(day);
            line = regex_replace(line, regex(matched_substring), replace_pattern); // regex_replace() returns a string
            // since I want to replace only 1 matched substring *riter, I use the exact substring
            // in the place of regex pattern
            ++riter; // move to the next matched substring
        }
        ofs << line << endl;
    }
    return 0;
}
test_regex_token_iterator.txt:
12/01/2022 - 12/31/2022
12/01/2022 - 12/31/2022
12/01/2022 - 12/31/2022
12/01/2022 - 12/31/2022
10/01/2022 - 10/31/2022
10/01/2022 - 10/31/2022
10/01/2022 - 10/31/2022
10/01/2022 - 10/31/2022
10/01/2022 - 10/31/2022
Example test_out_regex_token_iterator.txt (the result changes from run to run):
2022-12-01 - 12/31/2022
2022-12-01 - 2022-12-31
2022-12-01 - 12/31/2022
2022-12-01 - 12/31/2022
2022-10-01 - 10/31/2022
2022-10-01 - 2022-10-31
2022-10-01 - 10/31/2022
2022-10-01 - 10/31/2022
2022-10-01 - 10/31/2022
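I expected all matching substrings (including the dates in the second column) to be replaced, but only some of them are replaced correctly. Expected result: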
2022-12-01 - 2022-12-31
2022-12-01 - 2022-12-31
2022-12-01 - 2022-12-31
2022-12-01 - 2022-12-31
2022-10-01 - 2022-10-31
2022-10-01 - 2022-10-31
2022-10-01 - 2022-10-31
2022-10-01 - 2022-10-31
2022-10-01 - 2022-10-31
Answer from a uj5u.com user:
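Enabling AddressSanitizer shows that your code causes undefined behavior: https://godbolt.org/z/n3rnn9nqY
riter holds iterators into line, but at the end of the while loop you reassign line. That invalidates every iterator into line, and therefore invalidates riter as well. When you then try to increment riter, you are in undefined-behavior territory, which is why the results differ from run to run.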
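One way to stay clear of this rule is to copy the matches out while the string is still untouched, and only modify the string afterwards. The sketch below illustrates that idea; collect_matches is a hypothetical helper name, not something from the original code, and it uses sregex_token_iterator (the const string iterator flavor) so it can take the text by const reference.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original post): copies every match of
// `pat` in `text` into owned std::string objects. The returned strings do not
// refer back to `text`, so the caller may reassign `text` afterwards.
std::vector<std::string> collect_matches(const std::string& text, const std::regex& pat) {
    std::vector<std::string> found;
    // A default-constructed token iterator marks the end of the sequence.
    for (std::sregex_token_iterator it(text.begin(), text.end(), pat), end; it != end; ++it)
        found.push_back(it->str()); // sub_match::str() makes an independent copy
    return found;
}
```

With the matches stored as independent strings, a loop over the returned vector can reassign line as often as it likes, because no live iterator points into it any more.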
Adding a separate string for your output fixes the problem: https://godbolt.org/z/Grqe1vv5x
for (string line; getline(ifs, line);) {
    smatch matches;
    string outputLine = line; // all replacements go here, so line is never modified
    string replace_pattern;
    int month{0}, day{0}, year{0};
    regex_token_iterator<string::iterator> riter(line.begin(), line.end(), pat);
    // for each matched substring, replace it individually
    while (riter != rend) {
        string matched_substring{(*riter).str()};
        // *riter returns a reference to the sub_match object riter is pointing to.
        // sub_match is not a string. sub_match::str() returns the string of the sub_match.
        // put each matched substring into variable "matches"
        regex_search(matched_substring, matches, pat);
        // get the day, month, and year values in int
        day = from_string<int>(matches.str(2));
        month = from_string<int>(matches.str(1));
        year = from_string<int>(matches.str(3));
        // here make replace_pattern yyyy-mm-dd
        if (month < 10 && day < 10)
            replace_pattern = to_string(year) + "-0" + to_string(month) + "-0" + to_string(day); // both day and month need the front '0'
        else if (month < 10)
            replace_pattern = to_string(year) + "-0" + to_string(month) + "-" + to_string(day);
        else if (day < 10)
            replace_pattern = to_string(year) + "-" + to_string(month) + "-0" + to_string(day);
        else
            replace_pattern = to_string(year) + "-" + to_string(month) + "-" + to_string(day);
        outputLine = regex_replace(outputLine, regex(matched_substring), replace_pattern); // regex_replace() returns a string
        // since I want to replace only 1 matched substring *riter, I use the exact substring
        // in the place of regex pattern
        ++riter; // move to the next matched substring; still valid because line was not touched
    }
    ofs << outputLine << endl;
}
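As an alternative sketch, the same output can probably be produced in a single pass, without calling regex_replace inside the loop: walk the matches with sregex_iterator over the unmodified input and append the text between matches plus the reformatted date to a new string. The function name reformat_dates and the use of snprintf for zero padding are choices made for this illustration, not something taken from the question or the answer.

```cpp
#include <cstdio>
#include <regex>
#include <string>

// Hypothetical helper (not part of the original post): rewrites every
// m/d/yyyy date in `line` as yyyy-mm-dd. `line` is only read, never modified,
// so the iterators held by the regex iterator stay valid throughout.
std::string reformat_dates(const std::string& line) {
    static const std::regex pat{R"((\d{1,2})/(\d{1,2})/(\d{4}))"};
    std::string out;
    auto last = line.cbegin(); // end of the previous match
    for (std::sregex_iterator it(line.cbegin(), line.cend(), pat), end; it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first); // copy the text between two matches
        int month = std::stoi(m.str(1));
        int day = std::stoi(m.str(2));
        int year = std::stoi(m.str(3));
        char buf[32];
        // %02d pads month and day with a leading zero when needed
        std::snprintf(buf, sizeof buf, "%04d-%02d-%02d", year, month, day);
        out += buf;
        last = m[0].second; // continue after this match
    }
    out.append(last, line.cend()); // copy whatever follows the last match
    return out;
}
```

Inside the getline loop this would be used as ofs << reformat_dates(line) << endl;. Because the output is built in a separate string, the invalidation problem cannot arise, and the four-way if/else for zero padding collapses into one format string.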
