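I have a list sub_strings containing thousands of strings and a list strings containing millions of strings.
For each strings[i], I want to check whether it contains any entry of sub_strings as a substring, or a substring that differs from one of them by at most 1 character.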
sub_strings = ['hello']
strings = ['hell dude whats good',
'hllo',
'hallo',
'hello',
'dude whats good']
is_substring_no_more_then_1_differnce(strings , sub_strings)
Expected output:
[True, True, True, True, False]
Answer from a uj5u.com user:
Here is one way to solve it:
from Levenshtein import distance as lev
sub_strings = ['hello', 'bob']
strings = ['hell dude whats good',
'hllo',
'hallo',
'hello',
'dude whats good',
'bobo wants some food']
distance_list = []
for sentence in strings:
    # Smallest edit distance between any word of the sentence and any sub_string
    distance_list.append(min(lev(word, substring) for word in sentence.split() for substring in sub_strings))
print([x <= 1 for x in distance_list])
This prints [True, True, True, True, False, True].
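However, this becomes very slow as either list grows. Every word of every string in strings has to be compared against every entry in sub_strings, so with millions of strings and thousands of sub_strings the number of distance computations is roughly (total words in strings) x (number of sub_strings). Note also that it compares whole whitespace-separated words, not arbitrary substrings.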
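One possible way to cut down the work, sketched here under some assumptions: keep the same word-level matching as the answer above, but index sub_strings by their "one character deleted" variants so that each word only has to be checked against a handful of candidates instead of every sub_string. Two strings within one edit of each other always share at least one such variant, so the index finds every true match; candidates are then confirmed with a direct check because the index alone can produce false positives (for example 'ab' and 'ba'). The names deletions, within_one_edit, build_index and matches_any are made up for this illustration, and only the standard library is used.

```python
from collections import defaultdict


def deletions(word):
    """Return the word itself plus every string made by deleting one character."""
    return {word} | {word[:i] + word[i + 1:] for i in range(len(word))}


def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion, or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # exactly one substitution
    return a[i:] == b[i + 1:]  # exactly one extra character in b


def build_index(sub_strings):
    """Map each deletion variant to the sub_strings that produce it."""
    index = defaultdict(set)
    for s in sub_strings:
        for key in deletions(s):
            index[key].add(s)
    return index


def matches_any(strings, sub_strings):
    """For each sentence, report whether some word is within one edit of a sub_string."""
    index = build_index(sub_strings)
    cache = {}  # remember the verdict per distinct word
    results = []
    for sentence in strings:
        hit = False
        for word in sentence.split():
            if word not in cache:
                cache[word] = any(
                    within_one_edit(word, cand)
                    for key in deletions(word)
                    for cand in index.get(key, ())
                )
            if cache[word]:
                hit = True
                break
        results.append(hit)
    return results
```

With this layout the cost per word depends on the word's length and the few candidates sharing a variant with it, not on the total number of sub_strings. It would be called as matches_any(strings, sub_strings). If true substring matching (rather than word matching) is needed, the same index could be probed with sliding windows of each string, which is not shown here.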
