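How can I use a regular expression in Python to capture everything from the start of a string up to each occurrence of a specific string/pattern?
For example, given the string below, I want to capture everything up to each occurrence of "UNTIL":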
txt = "Here's some text UNTIL for the 1st time, then some more text UNTIL for the 2nd time, and finally more text UNTIL the 3rd time."
The output should then look like this:
[
"Here's some text ",
"Here's some text UNTIL for the 1st time, then some more text ",
"Here's some text UNTIL for the 1st time, then some more text UNTIL for the 2nd time, and finally more text ",
]
Here is what I have worked out so far:
import re
re.findall(r'.+?(?=UNTIL)', txt)
# Output
[
"Here's some text ",
"UNTIL for the 1st time, then some more text ",
"UNTIL for the 2nd time, and finally more text ",
]
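But this is not quite the result I need. I know I could solve this programmatically, but I am working with relatively large files, so I would prefer to solve it with a regular expression alone.
Is there a way to do this? If so, how?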
Answer:
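Solution 1
A regular expression you can use with re.split is (?:\b|^)(?=UNTIL(?=.*UNTIL)). It matches the empty position just before each UNTIL that is followed by another UNTIL later in the string.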
import re
txt = "Here's some text UNTIL for the 1st time, then some more text UNTIL for the 2nd time, and finally more text UNTIL the 3rd time."
res = re.split(r"(?:\b|^)(?=UNTIL(?=.*UNTIL))", txt)
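To see what this split produces, here is a small sketch. It assumes Python 3.7 or later, where re.split splits on zero-width matches. Under that assumption the pattern seems to return consecutive pieces rather than the cumulative prefixes from the question, and the last piece still contains the final UNTIL. The second half is a variant, not the pattern above: it splits on the simpler lookahead (?=UNTIL) and rebuilds the prefixes with itertools.accumulate. The names pieces, parts and prefixes are only illustrative.

```python
import re
from itertools import accumulate

txt = "Here's some text UNTIL for the 1st time, then some more text UNTIL for the 2nd time, and finally more text UNTIL the 3rd time."

# Split with the pattern from Solution 1 (zero-width matches need Python 3.7+).
pieces = re.split(r"(?:\b|^)(?=UNTIL(?=.*UNTIL))", txt)
for piece in pieces:
    print(repr(piece))

# Variant (an assumption, not the answerer's pattern): split before every
# UNTIL, drop the trailing piece after the last UNTIL, then concatenate
# the remaining pieces cumulatively to obtain the prefixes.
parts = re.split(r"(?=UNTIL)", txt)
prefixes = list(accumulate(parts[:-1]))
for prefix in prefixes:
    print(repr(prefix))
```

The variant keeps the regex trivial and moves the "everything so far" logic into accumulate, which concatenates each piece onto the running prefix.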
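Solution 2
The best you can do with .+?(?=UNTIL) is to convert the result of re.findall(r'.+?(?=UNTIL)', txt) into the expected format. re.findall only returns non-overlapping matches, so it cannot produce the overlapping prefixes directly; joining the first i + 1 matches rebuilds each prefix.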
import re
txt = "Here's some text UNTIL for the 1st time, then some more text UNTIL for the 2nd time, and finally more text UNTIL the 3rd time."
arr = re.findall(r'.+?(?=UNTIL)', txt)
res = [''.join(arr[:i + 1]) for i in range(len(arr))]
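Since the question mentions large files, another possible approach is to locate each UNTIL with re.finditer and slice the original string up to that offset, which avoids re-joining the matches for every prefix. This is a different technique from the two solutions above, not the answerer's method, and the function name prefixes_until is hypothetical. The result itself still grows with every prefix, so this is only a sketch of the idea.

```python
import re

def prefixes_until(text, marker="UNTIL"):
    """Return text[:start] for every occurrence of marker in text."""
    # re.escape makes the marker match literally, even if it contains
    # characters that are special in a regular expression.
    return [text[:m.start()] for m in re.finditer(re.escape(marker), text)]

txt = "Here's some text UNTIL for the 1st time, then some more text UNTIL for the 2nd time, and finally more text UNTIL the 3rd time."

for prefix in prefixes_until(txt):
    print(repr(prefix))
```

If only the offsets are needed, keeping m.start() instead of the slices would avoid copying the text at all.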
