I have a string and a list of words:
string ="""Ventilation box with reference VE03 with soundproofed box with inspection door
with the following technical characteristics: Air flow: 250 l s Available static pressure:
200 Pa With voltage regulator With characteristics according to project technical data
sheets Model make: CVB 4 180 180N or equivalent Includes flexible anti-vibration tarpaulins
at the air connections and metal dampers and supports Includes cable."""
list1 = ["CVB","1100","250"]
list2 = ["CVB","4","180","180N","RE","147W"]
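I want to check whether the string contains 2 or more words from the list and whether they are close to each other (for example, within 5 positions/words of one another). With "list1" the result must be False, because "CVB" and "250" are not close together, but with "list2" it should return True ("CVB", "4", "180" and "180N" appear together).
My current function only detects whether the words occur in the string: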
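import re
count = 0
for word in list1:
    # Match the word only when it is delimited by whitespace or the string boundaries.
    if len(re.findall(r"(?<!\S)" + word + r"(?!\S)", string)) > 0:
        count += 1
print(count)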
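A side note on the regex approach above: the word is concatenated straight into the pattern, so a word containing regex metacharacters (a dot, for instance) would be interpreted as a pattern rather than as literal text. The following is only a minimal sketch of the same presence count with `re.escape` applied; the function name `count_present` and the sample sentences are made up for illustration.

```python
import re

def count_present(text, words):
    """Count how many of the words occur as whitespace-delimited tokens."""
    count = 0
    for word in words:
        # re.escape makes characters such as "." match literally.
        pattern = r"(?<!\S)" + re.escape(word) + r"(?!\S)"
        if re.search(pattern, text):
            count += 1
    return count

print(count_present("Model make: CVB 4 180 180N", ["CVB", "1100", "250"]))
print(count_present("size 4x5 mm", ["4.5"]))
```

Without the escaping, the pattern for "4.5" would also match the token "4x5", because an unescaped dot matches any character.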
Reply from a uj5u.com user:
My suggestion:
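from itertools import combinations

def my_function(my_string, list1):
    list_to_search_in = my_string.split()
    indices = []
    for word in list1:
        if word in list_to_search_in:
            # list.index returns the position of the first occurrence only
            indices.append(list_to_search_in.index(word))
    # Compare each pair of distinct words; comparing an index with itself
    # would always give a distance of 0.
    for index1, index2 in combinations(indices, 2):
        if abs(index1 - index2) <= 5:
            return True
    return False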
Also check your naming conventions and avoid using names such as string or list for variables.
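One caveat about this suggestion: `list.index` returns only the first position of a word, so a later occurrence that sits close to another listed word can be missed. The snippet below is a small illustration on a made-up sentence (not the text from the question), comparing the first position with all positions of a repeated word.

```python
tokens = "180 a b c d e f g CVB 180".split()

first_180 = tokens.index("180")  # first occurrence only
all_180 = [i for i, token in enumerate(tokens) if token == "180"]  # every occurrence
cvb = tokens.index("CVB")

print(first_180, all_180, cvb)  # 0 [0, 9] 8
```

Here the first "180" is 8 positions away from "CVB", while the second one is right next to it, so a check based only on the first occurrence would report the two words as far apart.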
Reply from a uj5u.com user:
You could consider using collections.defaultdict:
from collections import defaultdict
from itertools import combinations

def any_close_words(text: str, words: list[str], dist: int) -> bool:
    words = set(words)
    word_to_positions = defaultdict(list)
    for index, word in enumerate(text.split()):  # Could consider using a nltk tokenizer...
        if word in words:
            word_to_positions[word].append(index)
    # Check every pair of distinct words for positions at most dist apart.
    for w1_positions, w2_positions in combinations(word_to_positions.values(), r=2):
        for a in w1_positions:
            for b in w2_positions:
                if abs(a - b) <= dist:
                    return True
    return False

def main() -> None:
    string = """Ventilation box with reference VE03 with soundproofed box with inspection door
with the following technical characteristics: Air flow: 250 l s Available static pressure:
200 Pa With voltage regulator With characteristics according to project technical data
sheets Model make: CVB 4 180 180N or equivalent Includes flexible anti-vibration tarpaulins
at the air connections and metal dampers and supports Includes cable."""
    print(f'{any_close_words(string, ["CVB", "1100", "250"], 5) = }')
    print(f'{any_close_words(string, ["CVB", "4", "180", "180N", "RE", "147W"], 5) = }')

if __name__ == '__main__':
    main()
Output:
any_close_words(string, ["CVB", "1100", "250"], 5) = False
any_close_words(string, ["CVB", "4", "180", "180N", "RE", "147W"], 5) = True
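If the text is long, the pairwise comparison of all positions could probably be replaced by a single pass that remembers where each listed word was last seen. The following is only a sketch of that idea, not part of the answer above; the name `any_close_words_single_pass` and the shortened sample text are assumptions for illustration, and tokenization is still a plain `split()`.

```python
def any_close_words_single_pass(text: str, words: list[str], dist: int) -> bool:
    wanted = set(words)
    last_seen = {}  # listed word -> most recent token index
    for index, token in enumerate(text.split()):
        if token not in wanted:
            continue
        # A different listed word seen at most `dist` tokens ago is a hit.
        for other, position in last_seen.items():
            if other != token and index - position <= dist:
                return True
        last_seen[token] = index
    return False

sample = ("Air flow: 250 l s Available static pressure: 200 Pa "
          "Model make: CVB 4 180 180N or equivalent")
print(any_close_words_single_pass(sample, ["CVB", "1100", "250"], 5))
print(any_close_words_single_pass(sample, ["CVB", "4", "180", "180N", "RE", "147W"], 5))
```

Keeping only the latest position is enough here because positions only increase during the scan, so the most recent occurrence of any other word is always the closest earlier one.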
Reply from a uj5u.com user:
Code:
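def chk(string, lst):
    ls = string.split()  # split on any whitespace, including line breaks
    for i, s in enumerate(ls):
        if s in lst:
            # Look at the word and its immediate neighbours; slicing avoids
            # an IndexError at the last word and wrap-around at the first.
            neighbours = set(ls[max(i - 1, 0):i + 2])
            # At least 2 listed words, as the question requires.
            if len(neighbours.intersection(lst)) >= 2:
                return True
    return False

chk(string, list2)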
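The function above looks only at the words directly before and after a match. A possible generalization, given here as a rough sketch rather than a definitive implementation, makes both the window radius and the required number of distinct words configurable; the name `words_cluster`, its parameters `dist` and `min_words`, and the shortened sample text are all assumptions for illustration.

```python
def words_cluster(text, words, dist=5, min_words=2):
    """Return True if some window of `dist` tokens on either side of a listed
    word contains at least `min_words` distinct listed words."""
    wanted = set(words)
    tokens = text.split()
    for i, token in enumerate(tokens):
        if token not in wanted:
            continue
        # Slicing clamps at the end of the list; max() clamps at the start.
        window = tokens[max(i - dist, 0):i + dist + 1]
        if len(wanted.intersection(window)) >= min_words:
            return True
    return False

sample = ("Air flow: 250 l s Available static pressure: 200 Pa "
          "Model make: CVB 4 180 180N or equivalent")
list1 = ["CVB", "1100", "250"]
list2 = ["CVB", "4", "180", "180N", "RE", "147W"]
print(words_cluster(sample, list1))
print(words_cluster(sample, list2))
print(words_cluster(sample, list2, min_words=4))
```

Raising `min_words` makes the check stricter, which may help when two listed words turning up near each other by coincidence should not count as a match.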
