我想捕獲整個子查詢,而不管它們之間是否有 concat 或 substring 函式(即忽略子查詢中的另一個括號打開和關閉。(a)我們不想將“join”捕獲為一個詞( b) "alias2" 并不總是跟在 "join" 后面,它可以是任何東西(單詞邊界、空格或“join”單詞)。
情況 1:select 中沒有 concat 或 sub-string 函式
在:
(select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2 join出去:
(select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2
案例 2:選擇中的 Concat 函式
在:
(select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2 join出去:
(select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2
我嘗試過的:
方法一: re.findall('\(select.*?\)\s[a-zA-Z0-9_] ', input statement)
方法 2:正如@TheFourthBird 所建議的
import re
pat1 = '\(select.*?\)\s[a-zA-Z0-9_] '
pat2 = "\(select [^()]*(?:(\((?>[^()] |(?1))*\)))?[^()]*\)[^()\n] "
string1 = "(select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2"
string2 = "(select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2"
print(re.findall(pat1, string1))
print(re.findall(pat1, string2))
import regex as re
print(re.findall(pat2, string1))
print(re.findall(pat2, string2))
pattern = re.compile(pat2, re.UNICODE)
print([match.group(0) for match in pattern.finditer(string2)])
輸出:
["(select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2"]
['(select concat(t1.col1, t2.col1, t3.col1) as']
['']
['(t1.col1, t2.col1, t3.col1)']
["(select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2 join "]
上述方法有什么問題:
方法 1:適用于案例 1,但不適用于案例 2。
方法二:還是不行!然而,
["(select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2 join "]是最接近預期的。但是,它不應該捕獲 alias2 旁邊的內容。
請幫我!
uj5u.com熱心網友回復:
您可以首先(select 使用PyPi 正則運算式模塊匹配并可選地匹配平衡括號。
在模式的末尾,匹配一個空白字符并使用您的字符類。
\(select [^()]*(?:(\((?>[^()] |(?1))*\)))?[^()]*\)\s[a-zA-Z0-9_]
在部分,模式匹配:
\(select比賽(select[^()]*可選匹配除(and之外的任何字符)(?:非捕獲組(捕獲組 1\((?>[^()] |(?1))*\)匹配(并使用遞回模式遞回第一個子組(捕獲組 1),最后匹配)
)關閉第 1 組
)?關閉非捕獲組并使其可選[^()]*\)可選匹配除括號之外的任何字符,然后匹配)\s[a-zA-Z0-9_]匹配一個空格字符和字符類中列出的 1
查看正則運算式演示和Python 演示
例如,使用 re.finditer (因為 re.findall 回傳捕獲組值):
import regex as re
pattern = r"\(select [^()]*(?:(\((?>[^()] |(?1))*\)))?[^()]*\)\s[a-zA-Z0-9_] "
s = ("In: (select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2\n\n"
"Out: (select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2\n\n"
"In: (select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2\n\n"
"Out: (select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2\n")
matches = re.finditer(pattern, s)
for matchNum, match in enumerate(matches, start=1):
print(match.group())
輸出
["(select t1.col1 as alias1 from db.tb where t1.col1='val1') alias2"]
["(select concat(t1.col1, t2.col1, t3.col1) as alias1 from db.tb where t1.col1='val1') alias2"]
請注意,匹配 SQL 很容易出錯。
轉載請註明出處,本文鏈接:https://www.uj5u.com/ruanti/326726.html
