Why does this regular expression sometimes hang and freeze my program? What alternative could I use?

import re

input_text_to_check = str(input()) #Input

regex_patron_m1 = r"\s*((?:\w+\s*)+) \s*\??(?:would not be what |would not be that |would not be that |would not be the |would not be this |would not be the |would not be some)\s*((?:\w+\s*)+)\s*\??"
m1 = re.search(regex_patron_m1, input_text_to_check, re.IGNORECASE) # Check whether the regex matches, i.e. whether to enter the block below

#Validation
if m1:
    word, association = m1.groups()
    word = word.strip()
    association = association.strip()

    print(repr(word))
    print(repr(association))

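The regex is a bit long, but I would expect checking the 10 or 20 alternatives in (?: | | | | ) to be very little work for a modern PC. That is why I suspect the problem lies in the leading \s*((?:\w+\s*)+) \s* and/or the trailing \s*((?:\w+\s*)+)\s*

Here is an example input that makes the regex hang: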

"the blue skate would not be that product that you want buy now"

Here is an example that does not hang: "the blue skate would not be that product"

It gives me the words I want to extract:

'the blue skate'
'product'

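Is there an alternative way to extract what comes before and after these options, one that does not sometimes hang? And what could be causing the problem in the regex I wrote?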

uj5u.com user reply:

Based on this explanation of "catastrophic backtracking", I think the problem with your regex is the following:

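The thing you are trying to match, ((?:\w+\s*)+), can be matched in many different ways. Suppose you apply ((?:\w+\s*)+) to the input string abc. It can match in several ways:

(a and 0 spaces)(b and 0 spaces)(c and 0 spaces)
(a and 0 spaces)(bc and 0 spaces)
(ab and 0 spaces)(c and 0 spaces)
(abc and 0 spaces)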

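As long as ((?:\w+\s*)+) is all you need to match, that is fine. But once you add something after it (such as the 10 or so alternatives in your case), the regex engine has to do some heavy backtracking whenever the rest of the pattern fails, because it retries each of those ways in turn. See the linked explanation for a better description.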
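To make the growth concrete, the sketch below enumerates every way a run of word characters can be cut into consecutive \w+ chunks, which is the choice the engine faces inside (?:\w+\s*)+. The function name ways_to_split is made up for this illustration; it only models the ambiguity and is not how the re module searches internally.

```python
def ways_to_split(word):
    """Return every way to cut `word` into consecutive non-empty chunks.

    Each chunk stands for one iteration of (?:\\w+\\s*) that matched
    some word characters and zero spaces. Illustrative helper only.
    """
    if not word:
        return [[]]  # one way to split nothing: no chunks at all
    ways = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        for rest in ways_to_split(word[i:]):
            ways.append([head] + rest)
    return ways


for split in ways_to_split("abc"):
    print(split)

# A word of n characters has 2 ** (n - 1) possible splits,
# so the count doubles with every extra character.
for n in (3, 10, 20):
    print(n, 2 ** (n - 1))
```

For abc this prints four splits. A sentence with several words multiplies the counts of its words together, which is why a few extra words after the matched phrase can be enough to turn a fast search into one that appears frozen.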

Removing the + after each of the two \w gives a regex that works for both of the cases you provided:


"\s*((?:\w\s*)+) \s*\??(?:would not be what |would not be that |would not be that |would not be the |would not be this |would not be the |would not be some)\s*((?:\w\s*)+)\s*\??"gm

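A minimal sketch of how that pattern could be tried in Python on the two inputs from the question. The names PATTERN and extract are chosen here for illustration, the duplicated alternatives are listed once, and the gm flags shown above belong to the regex tester rather than to Python, so only re.IGNORECASE is used, as in the question.

```python
import re

# Same pattern as above, split across lines for readability.
# (?:\w\s*)+ consumes exactly one word character per iteration,
# so a given stretch of text can only be matched in one way.
PATTERN = re.compile(
    r"\s*((?:\w\s*)+) \s*\??"
    r"(?:would not be what |would not be that |would not be the "
    r"|would not be this |would not be some)"
    r"\s*((?:\w\s*)+)\s*\??",
    re.IGNORECASE,
)


def extract(text):
    """Return (word, association) or None if the phrase is not found."""
    m = PATTERN.search(text)
    if not m:
        return None
    word, association = m.groups()
    return word.strip(), association.strip()


print(extract("the blue skate would not be that product"))
print(extract("the blue skate would not be that product that you want buy now"))
```

Note that the second capture group is greedy, so for the longer input it takes everything after the phrase, not just the first word.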
Does this work on your machine and for all of your test cases?
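If it does not, another option (not part of the answer above, and only a sketch under assumptions) is to avoid capturing with repeated groups at all and instead split the text once on the phrase itself with re.split. The names SEPARATOR and split_on_phrase are hypothetical, and the sketch assumes the phrase is always surrounded by whitespace. Unlike the original pattern, it does not require the text on either side to consist only of word characters.

```python
import re

# The separator has no nested quantifiers, so there is only one way
# for it to match at any given position.
SEPARATOR = re.compile(
    r"\s+would not be (?:what|that|the|this|some)\s+",
    re.IGNORECASE,
)


def split_on_phrase(text):
    """Split `text` at the first occurrence of the phrase.

    Returns (before, after) with surrounding whitespace removed,
    or None when the phrase does not occur.
    """
    parts = SEPARATOR.split(text, maxsplit=1)
    if len(parts) != 2:
        return None
    return parts[0].strip(), parts[1].strip()


print(split_on_phrase("the blue skate would not be that product that you want buy now"))
```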
