Current code:
import re

txt = "Jeor MORMONT, Lord COMMANDER of the NIGHT'S WATCH."
print(re.findall(r"\w+|\W+", txt))
Output:
['Jeor', ' ', 'MORMONT', ', ', 'Lord', ' ', 'COMMANDER', ' ', 'of', ' ', 'the', ' ', 'NIGHT', "'", 'S', ' ', 'WATCH', '.']
Desired output:
['Jeor', ' ', 'MORMONT', ', ', 'Lord', ' ', 'COMMANDER', ' ', 'of', ' ', 'the', ' ', "NIGHT'S", ' ', 'WATCH', '.']
Reply from a uj5u.com user:
Try this:
import re

txt = "Jeor MORMONT, Lord COMMANDER of the NIGHT'S WATCH."
# The character class keeps the apostrophe together with the word characters around it.
print(re.findall(r"[\w']+|\W+", txt))
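Reply from a uj5u.com user:
You need to use a character set.
You can do this with square brackets [ ]. A character set matches any one of the characters it contains.
When you want either a word character or ', use:
[\w']+|\W+
[ ]: a character set; matches any one of the characters listed inside.
\w: a word character (the same as [a-zA-Z0-9_] for ASCII text).
': the literal apostrophe; it needs no escaping.
print(re.findall(r"[\w']+|\W+", txt))
# ['Jeor', ' ', 'MORMONT', ', ', 'Lord', ' ', 'COMMANDER', ' ', 'of', ' ', 'the', ' ', "NIGHT'S", ' ', 'WATCH', '.']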
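To make the effect of the character set concrete, here is a minimal sketch that runs both patterns side by side on the question's string. The variable names `plain` and `with_apostrophe` are only illustrative.

```python
import re

txt = "Jeor MORMONT, Lord COMMANDER of the NIGHT'S WATCH."

# \w+ stops at the apostrophe, so NIGHT'S is split into three tokens.
plain = re.findall(r"\w+|\W+", txt)

# Adding ' to the character set keeps it inside the word token.
with_apostrophe = re.findall(r"[\w']+|\W+", txt)

print(plain[12:15])         # ['NIGHT', "'", 'S']
print(with_apostrophe[12])  # NIGHT'S

# Both patterns consume every character, so joining the tokens rebuilds the input.
print("".join(with_apostrophe) == txt)  # True
```

Because every character matches one of the two alternatives, no text is lost; only the token boundaries change.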
Reply from a uj5u.com user:
You just need to explore regular expressions a little further.
>>> print(re.findall(r"[a-zA-Z\']+", txt))
['Jeor', 'MORMONT', 'Lord', 'COMMANDER', 'of', 'the', "NIGHT'S", 'WATCH']
>>>
Update:
>>> import re
>>>
>>> txt = "Jeor MORMONT, Lord COMMANDER of the NIGHT'S WATCH."
>>>
>>> required = ['Jeor', ' ', 'MORMONT', ', ', 'Lord', ' ', 'COMMANDER', ' ', 'of', ' ', 'the', ' ', 'NIGHT\'S', ' ', 'WATCH', '.']
>>>
>>> bag = re.findall(r'[a-zA-Z\']+|[\ ,]+|[\.]', txt)
>>>
>>> print(bag)
['Jeor', ' ', 'MORMONT', ', ', 'Lord', ' ', 'COMMANDER', ' ', 'of', ' ', 'the', ' ', "NIGHT'S", ' ', 'WATCH', '.']
>>> print(bag == required)
True
>>>
If I missed anything, please leave a comment here.
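One caveat with putting ' inside the character set is that a quote at the edge of a word (for example in 'hi') also sticks to the word. If that matters, a possible variant, offered only as a sketch, accepts an apostrophe only when word characters sit on both sides of it. The function name `tokenize` is hypothetical and not part of any answer above.

```python
import re

# A word, optionally followed by groups of "'" plus more word characters,
# or else a run of non-word characters.
TOKEN = re.compile(r"\w+(?:'\w+)*|\W+")

def tokenize(text):
    """Split text into word and separator tokens (illustrative helper)."""
    return TOKEN.findall(text)

print(tokenize("the NIGHT'S WATCH."))
# ['the', ' ', "NIGHT'S", ' ', 'WATCH', '.']

print(tokenize("say 'hi'"))
# ['say', " '", 'hi', "'"]
```

For the string in the question this gives the same tokens as [\w']+|\W+; the difference shows up only when quotes surround a word.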
