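I'm trying to remove words that contain any non-alphabetic characters from a list. I've spent several hours trying to understand why my regex attempts fail.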
import re
lst = ["acted", "30-30", "adage", "fatal", "tested", "abcd-ef", "g'day"]
# get rid of words that have any non-alphabet characters
pattern = r"\W"
# pattern = r"[a-z|A-Z]{5}" # tried this as well
for word in lst:
    if re.findall(pattern, word):
        print(word + " not valid")
        lst.remove(word)
    else:
        print(word + " valid")
print(lst)
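Why is adage not printed as valid, yet not removed from the list either? Why is g'day not removed? Ideally I'd like to check for 5-letter words, but just getting the words with special characters out is eluding me, and I don't want to get even more confused.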
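Reply from a uj5u.com user:
You could keep a similar loop, but iterate over a copy of the original list so that removals do not disturb the iteration. Like this: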
import re
lst = ["acted", "30-30", "adage", "fatal", "tested", "abcd-ef", "g'day"]
# lst[:] is a shallow copy, so removing from lst does not shift the items being iterated
for e in lst[:]:
    if re.findall(r'\W+', e):
        lst.remove(e)
print(lst)
Output:
['acted', 'adage', 'fatal', 'tested']
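To see why the copy matters, here is a minimal sketch that reruns the loop from the question and records which words it actually visits. The `visited` list is a hypothetical helper added only for illustration; everything else follows the question's code.

```python
import re

lst = ["acted", "30-30", "adage", "fatal", "tested", "abcd-ef", "g'day"]

visited = []  # hypothetical helper: every word the loop actually inspects
for word in lst:
    visited.append(word)
    if re.search(r"\W", word):
        # Removing shifts every later item one slot to the left,
        # so the loop's next index jumps over the following word.
        lst.remove(word)

print(visited)  # words the loop saw
print(lst)      # words left in the list
```

On this data the loop never visits adage (it slides into the slot of 30-30) and ends before it reaches g'day, which matches the behaviour described in the question.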
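Reply from a uj5u.com user:
The regex pattern is correct. As @JCaeser mentioned, building a new list that stores the valid words works fine. g'day was never checked because of how the loop indexes the list: each remove() shifts the remaining items one position to the left, so the loop skips the word right after every removed one and runs out of indexes before it reaches the last word.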
import re
lst = ["acted", "30-30", "adage", "fatal", "tested", "abcd-ef", "g'day"]
new_lst = []
# get rid of words that have any non-alphabet characters
pattern = r"\W"
for word in lst:
    if re.findall(pattern, word):
        print(word + " not valid")
    else:
        print(word + " valid")
        new_lst.append(word)
print(new_lst)
Output:
['acted', 'adage', 'fatal', 'tested']
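For the 5-letter check mentioned in the question, one possible approach (a sketch, not the only way) is `re.fullmatch` with a list comprehension. Two things to keep in mind about the commented-out pattern `[a-z|A-Z]{5}`: inside a character class the `|` is a literal pipe character rather than "or", and without anchoring, `re.findall` also matches any run of five letters inside a longer word such as tested. The names `five_letters` and `five_letter_words` below are my own.

```python
import re

lst = ["acted", "30-30", "adage", "fatal", "tested", "abcd-ef", "g'day"]

# fullmatch succeeds only if the entire word is exactly five ASCII letters
five_letters = re.compile(r"[A-Za-z]{5}")
five_letter_words = [word for word in lst if five_letters.fullmatch(word)]

print(five_letter_words)  # ['acted', 'adage', 'fatal']
```

Because the comprehension builds a new list, the original `lst` is never modified while it is being iterated.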
Reply from a uj5u.com user:
Hey, it looks like you are making life harder for yourself with regex. There is a simpler solution:
lst = ["acted", "30-30", "adage", "fatal", "tested", "abcd-ef", "g'day"]
fixed_list = list()
for word in lst:
    # str.isalpha() is True only when every character is a letter
    if word.isalpha():
        print(word + " valid")
        fixed_list.append(word)
    else:
        print(word + " not valid")
print(fixed_list)
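One caveat, in case it matters for your data: `str.isalpha()` also accepts non-ASCII letters. If only a-z and A-Z should count, combining it with `str.isascii()` (available since Python 3.7) is one option. In the sketch below, the `words` list and its last entry are hypothetical and were added only to show the difference.

```python
# "na\u00efve" (naive with a diaeresis) is a hypothetical extra entry, not from the question
words = ["acted", "30-30", "adage", "g'day", "na\u00efve"]

# isalpha() alone treats any Unicode letter as alphabetic
unicode_letters = [w for w in words if w.isalpha()]

# isascii() narrows the check to plain a-z / A-Z
ascii_letters = [w for w in words if w.isascii() and w.isalpha()]

print(unicode_letters)
print(ascii_letters)
```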
