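I want my regex to capture both "1 dollar" and "2 dollars and 71 cents", so it must also match when the text has no cents. I currently have
\d[\d]*( dollar(s)?)(?:\s*(and )\d[\d]( cent(s)?)\b)?
I tested it at regexr.com/67etd and it seems to work there, but when I run it in Python, what the regex captures is
(' dollars', 's', '', '', '')
Sorry, I am new to regex. Does anyone have any suggestions?
Here is my Python code: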
import re

# Read the whole corpus into one string (the encoding was missing at first).
with open(r"C:\Users\inigo\PycharmProjects\pythonProject\all-OANC.txt", encoding="utf8") as train:
    strain = train.read()

# Earlier attempt: numbers starting with $, possibly with commas and decimals,
# optionally followed by "million" or "billion".
# pattern = re.compile(r'\$\d[\d,.]*\b(?:\s*million\b)?(?:\s*billion\b)?')

# Numbers followed by the keyword "dollar(s)", optionally "and N cent(s)".
pattern2 = re.compile(r'\d[\d]*( dollar(s)?)(?:\s*(and )\d[\d]*( cent(s)?)\b)?')

matches = pattern2.findall(strain)
for match in matches:
    print(match)
uj5u.com helpful user replied:
Try this regex:
(\d+\s+(?:dollars?)(?:\s+and\s+\d+\s+cents?)?)\b
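A minimal sketch of how this pattern behaves with re.findall, assuming the quantifiers in the pattern are `+` (one or more digits or whitespace characters). The sample sentence is made up for illustration and is not from the asker's corpus.

```python
import re

# Assumed reading of the suggested pattern, with `+` quantifiers.
# The only capturing group wraps the whole amount; the inner groups are non-capturing.
pattern = re.compile(r"(\d+\s+(?:dollars?)(?:\s+and\s+\d+\s+cents?)?)\b")

# Hypothetical sample text.
text = "It cost 1 dollar yesterday and 2 dollars and 71 cents today."

# With exactly one capturing group, findall returns a list of strings.
matches = pattern.findall(text)
print(matches)
```

Because the pattern has a single capturing group around everything, findall returns the full amount text for each match rather than a tuple of fragments.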
uj5u.com helpful user replied:
You can use this regex:
(\d+ dollars?)(\s+and\s+\d{1,2} cents?)?
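As a rough illustration, assuming the pattern is meant to read `(\d+ dollars?)(\s+and\s+\d{1,2} cents?)?`: it has two capturing groups, so findall returns one tuple per match. If the whole matched text is wanted instead, finditer with group(0) is one option. The sample sentence is invented.

```python
import re

# Assumed reading of the suggested pattern: two capturing groups,
# one for the dollar part and one for the optional cents part.
pattern = re.compile(r"(\d+ dollars?)(\s+and\s+\d{1,2} cents?)?")

# Hypothetical sample text.
text = "I paid 1 dollar, then 2 dollars and 71 cents."

# findall returns a tuple of groups per match; an unmatched group becomes ''.
tuples = pattern.findall(text)

# group(0) is the entire match, regardless of how many groups exist.
full = [m.group(0) for m in pattern.finditer(text)]

print(tuples)
print(full)
```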
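uj5u.com helpful user replied:
In your regex:
\d[\d]*( dollar(s)?)(?:\s*(and )\d[\d]( cent(s)?)\b)?
the capturing groups are numbered by the position of their opening parenthesis:
(1) ( dollar(s)?)
(2) (s), nested inside group 1
(3) (and )
(4) ( cent(s)?)
(5) (s), nested inside group 4
The (?: ... )? wrapper around the cents part is non-capturing, so it gets no number. That gives five groups, and none of them contains the digits. This explains why, for the input you matched, you only get what you describe: (' dollars', 's', '', '', ''). If you want the numbers, you need to add parentheses around the sub-expressions of interest so that they land in groups of their own, like this:
(\d[\d]*)( dollar(s)?)(?:\s*(and )(\d[\d])( cent(s)?)\b)?
The groups are now:
(1) (\d[\d]*), the dollar digits
(2) ( dollar(s)?)
(3) (s), nested inside group 2
(4) (and )
(5) (\d[\d]), the cent digits
(6) ( cent(s)?)
(7) (s), nested inside group 6
You now have seven capturing groups. Look in group 1 for the dollar amount and in group 5 for the cent amount.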
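A small sketch of reading the amounts out of the regex with the added parentheses. It relies on the fact that Python's re does not assign a number to `(?:...)`, so `Pattern.groups` reports seven capturing groups and the cent digits sit in group 5. The sample sentence and the variable names are made up for illustration.

```python
import re

# The pattern with parentheses added around the digit sub-expressions.
# (?:...) is non-capturing in Python's re and takes no group number.
pattern = re.compile(r"(\d[\d]*)( dollar(s)?)(?:\s*(and )(\d[\d])( cent(s)?)\b)?")

# Hypothetical sample text.
text = "The fee was 2 dollars and 71 cents, plus 1 dollar for parking."

amounts = []
for m in pattern.finditer(text):
    dollars = int(m.group(1))
    # group(5) is None when the optional cents part did not match.
    cents = int(m.group(5)) if m.group(5) else 0
    amounts.append((dollars, cents))

print(pattern.groups)  # number of capturing groups in the pattern
print(amounts)
```

Using finditer with match objects, rather than findall, makes it possible to tell an absent cents part (None) apart from an empty string.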
Please credit the source when reposting. Link to this article: https://www.uj5u.com/yidong/316100.html
