I have a list of strings that mix non-English and English words. I want to keep only the English words.
Example:
phrases = [
"S/O ???? ?????, ????? ?.-4??, S/O Ashok Kumar, Block no.-4D.",
"???????-15, ????? 5. ????? ????? Street-15, sector -5, Civic Centre",
"?????, ?????, ?????, ?????????, Bhilai, Durg. Bhilai, Chhattisgarh,",
]
My code so far:
import re
regex = re.compile("[^a-zA-Z0-9!@#$&()\\-`. ,/\"]+")
for i in phrases:
    print(regex.sub(' ', i))
My output:
["S/O , .-4 , S/O Ashok Kumar, Block no.-4D.",
"-15, 5. Street-15, sector -5, Civic Centre",
", , , , Bhilai, Durg. Bhilai, Chhattisgarh",]
My desired output:
["S/O Ashok Kumar, Block no.-4D.",
"Street-15, sector -5, Civic Centre",
"Bhilai, Durg. Bhilai, Chhattisgarh,"]
Answer from a uj5u.com user:
Looking at your data, it seems you could use the following:
import regex as re  # third-party "regex" module, needed for \p{Devanagari}
lst = ["S/O ???? ?????, ????? ?.-4??, S/O Ashok Kumar, Block no.-4D.",
       "???????-15, ????? 5. ????? ????? Street-15, sector -5, Civic Centre",
       "?????, ?????, ?????, ?????????, Bhilai, Durg. Bhilai, Chhattisgarh,",]
for i in lst:
    print(re.sub(r'^.*\p{Devanagari}.+?\b', '', i))
Prints:
S/O Ashok Kumar, Block no.-4D.
Street-15, sector -5, Civic Centre
Bhilai, Durg. Bhilai, Chhattisgarh,
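See the online regex demo. The pattern breaks down as follows:
^ - start-of-string anchor;
.*\p{Devanagari} - 0+ (greedy) characters up to the last Devanagari letter;
.+?\b - 1+ (lazy) characters up to the first word boundary.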
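If installing the third-party regex module is not an option, roughly the same idea can be sketched with the standard library. This assumes that the basic Devanagari block (U+0900 to U+097F) is a good enough stand-in for \p{Devanagari}, and that the ? placeholders in the question stand for Devanagari text. The name strip_devanagari_prefix and the sample string are made up for illustration.

```python
import re

# Assumption: the Unicode block U+0900-U+097F approximates \p{Devanagari}.
DEVANAGARI_PREFIX = re.compile(r'^.*[\u0900-\u097F].+?\b')

def strip_devanagari_prefix(text):
    """Remove everything up to the first word boundary after the last Devanagari character."""
    return DEVANAGARI_PREFIX.sub('', text)

# Made-up sample; the Hindi half is illustrative only.
sample = "सिविक सेंटर Street-15, sector -5, Civic Centre"
print(strip_devanagari_prefix(sample))  # Street-15, sector -5, Civic Centre
```

A string with no Devanagari characters does not match the pattern and is returned unchanged.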
Answer from a uj5u.com user:
If you mean that your characters can only be standard English letters, that your regex already handles that, and that you just want to filter out the problematic ", , , ," values, you can do the following:
def format_output(current_output):
    results = []
    for row in current_output:
        # split on the ","
        sub_elements = row.split(",")
        # blank pieces come back as "" or " "; strip before testing so both are dropped
        filtered = list(filter(lambda x: len(x.strip()) > 0, sub_elements))
        # then join the elements together and append to the final results list
        results.append(",".join(filtered))
    return results
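To see what this kind of filtering does to the intermediate output from the question, here is a compact sketch. The helper name drop_empty_chunks is hypothetical, and it rejoins the pieces with ", " after stripping them, which is a formatting choice of this sketch rather than something the answer specifies.

```python
def drop_empty_chunks(row):
    """Split on commas, discard whitespace-only pieces, and rejoin the rest."""
    chunks = [chunk.strip() for chunk in row.split(",")]
    return ", ".join(chunk for chunk in chunks if chunk)

# Rows taken from the "My output" listing in the question.
rows = [
    "S/O , .-4 , S/O Ashok Kumar, Block no.-4D.",
    ", , , , Bhilai, Durg. Bhilai, Chhattisgarh,",
]
for row in rows:
    print(drop_empty_chunks(row))
```

The second row comes out clean (without its trailing comma), but the first row still carries the leftover "S/O" and ".-4" fragments, so this only helps when the residue is nothing but commas and spaces.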
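Answer from a uj5u.com user:
It looks to me as though the first part of each list element is the Hindi translation of the second part, with a one-to-one correspondence in the number of words.
So, for the examples you provided and any others that follow exactly the same pattern (it will break if they don't), all you have to do is take the second half of each element of the list.
phrases = ["S/O ???? ?????, ????? ?.-4??, S/O Ashok Kumar, Block no.-4D.",
           "???????-15, ????? 5. ????? ????? Street-15, sector -5, Civic Centre",
           "?????, ?????, ?????, ?????????, Bhilai, Durg. Bhilai, Chhattisgarh,",]
mod_list = []
for s in phrases:
    tmp_list = []
    strg = s.split()
    n = len(strg)
    # keep only the second half of the whitespace-separated tokens
    for i in range(n // 2, n):
        tmp_list.append(strg[i])
    mod_list.append(' '.join(tmp_list))
print(mod_list)
Output: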
['S/O Ashok Kumar, Block no.-4D.',
'Street-15, sector -5, Civic Centre',
'Bhilai, Durg. Bhilai, Chhattisgarh,']
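Because the half-split breaks as soon as the two halves differ in word count, a slightly more tolerant variant is sketched here. It is a different technique from the one above: instead of cutting at the midpoint, it looks for the last whitespace-separated token that contains a non-ASCII character and keeps everything after it. It assumes the ? placeholders in the question stand for Devanagari text in the real data, and the function name keep_after_last_non_ascii and the sample string are made up for illustration.

```python
def keep_after_last_non_ascii(text):
    """Return the tokens that follow the last token containing a non-ASCII character."""
    tokens = text.split()
    last = -1  # index of the last non-ASCII token; -1 means none was found
    for index, token in enumerate(tokens):
        if not token.isascii():  # str.isascii() requires Python 3.7+
            last = index
    return ' '.join(tokens[last + 1:])

# Made-up sample; the Hindi half is illustrative only.
sample = "भिलाई, दुर्ग. भिलाई, छत्तीसगढ़, Bhilai, Durg. Bhilai, Chhattisgarh,"
print(keep_after_last_non_ascii(sample))  # Bhilai, Durg. Bhilai, Chhattisgarh,
```

This still depends on the English part coming last in each string; it just no longer requires both halves to have the same number of words.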
