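I ran into a problem that resembles the longest common substring problem, but with a twist:
I am given a list of strings, lst, and a string, text. text may or may not contain substrings that appear in lst. I need the first longest such substring, where checking starts from the back of text. In other words: iterate from the last word backwards, match the longest substring, and return as soon as a character breaks the match.
For example, if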
lst = ['abcd', 'x', 'xy', 'xyz', 'abcdxyz']
text = 'abcd abcd xyz xyz'
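then the answer is the last xyz in text: checking starts from the back of text, xyz is in lst, and it is a substring of text.
'abcd' is not the answer because xyz comes after it in text. 'abcdxyz' is not the answer because it is not a substring of text.
Also, within text, the substrings can be separated by any character outside [A-Za-z], but they are usually separated by spaces.
I need an algorithm that solves this. Pseudocode or a Python program would be fine.
Some test cases: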
lst = ['brisbane', 'east brisbane', '2 street east']
text = '2 street east brisbane'
# answer is east brisbane
lst = ['brisbane', 'east brisbane', '2 street east']
text = '2 street east east brisbane brisbane xyz'
# answer is brisbane
lst = ['sale', 'yarrabilba']
text = 'sale yarrabilba'
# answer is yarrabilba
lst = ['sale', 'yarrabilba']
text = 'abc fgh xyz'
# answer is None
text = 'A Street Name Some Words Suburb Name'
lst = ['A Street Name', 'Suburb Name']
# answer is 'Suburb Name'
text = 'A Street Name Some Words Name Suburb Name'
lst = ['A Street Name', 'Suburb Name', 'Name']
# answer is 'Suburb Name'
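A uj5u.com user replied:
Let's use the better example from your comment:
In the original problem, text can be a string like
'A Street Name Some Words Suburb Name', while lst can be ['A Street Name', 'Suburb Name']; then I only want to match 'Suburb Name'. 'A Street Name' cannot come after 'Suburb Name'.
If you had to find the first match in the sentence, the task would be easy: you could use a regular expression and re.finditer. So let's rework the input by reversing the word order and do exactly that!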
text = 'A Street Name Some Words Suburb Name'
lst = ['A Street Name', 'Suburb Name']
import re
# define a helper function to reverse words
rev = lambda x: ' '.join(reversed(x.split()))
# invert words in the query
txet = rev(text)
# 'Name Suburb Words Some Name Street A'
# invert words in the searched strings
tsl = [rev(e) for e in sorted(lst, key=len, reverse=True)]
# ['Name Street A', 'Name Suburb']
# find "first" match
# re.escape keeps regex metacharacters in the entries literal
m = re.finditer('|'.join(map(re.escape, tsl)), txet)
try:
    out = rev(next(m).group())
except StopIteration:
    out = None
Output: 'Suburb Name'
Example #2:
lst = ['brisbane', 'east brisbane', '2 street east']
text = '2 street east east brisbane xyz'
Output #2: 'east brisbane'
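The reversed-word idea above can be wrapped in a function and checked against the test cases from the question. This is only a sketch: the name last_longest_match is made up for illustration, and the (?<!\S) / (?!\S) guards are an added assumption so that only whole, whitespace-separated words match (the snippet above matches anywhere in the string).

```python
import re

def rev(s):
    """Reverse the order of the whitespace-separated words in s."""
    return ' '.join(reversed(s.split()))

def last_longest_match(text, lst):
    """Hypothetical helper: first entry of lst met when reading text from the end."""
    if not lst:
        return None
    # Longest entries first, so the alternation prefers them at a given position.
    candidates = sorted(lst, key=len, reverse=True)
    # Assumption: entries must match whole words, so each alternative is guarded
    # by "no non-space character before" and "no non-space character after".
    pattern = '|'.join(
        r'(?<!\S)' + re.escape(rev(c)) + r'(?!\S)' for c in candidates
    )
    m = re.search(pattern, rev(text))
    return rev(m.group()) if m else None

print(last_longest_match('2 street east brisbane',
                         ['brisbane', 'east brisbane', '2 street east']))
# east brisbane
print(last_longest_match('abc fgh xyz', ['sale', 'yarrabilba']))
# None
```

Because text is split on whitespace only, separators other than spaces (the question allows any character outside [A-Za-z]) would need extra preprocessing that this sketch does not attempt.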
A uj5u.com user replied:
def findstem(arr):
    """Return the longest substring common to every string in arr."""
    # Nothing to compare if the list is empty
    if not arr:
        return ""
    # Take the first word from the array as the reference
    s = arr[0]
    l = len(s)
    res = ""
    for i in range(l):
        for j in range(i + 1, l + 1):
            # Generate every substring s[i:j] of the reference string
            stem = s[i:j]
            # Keep the stem only if it is longer than the current result
            # and occurs in all the other words
            if len(stem) > len(res) and all(stem in w for w in arr[1:]):
                res = stem
    return res
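Note that findstem looks for the longest substring shared by every string in arr, which is the classic problem rather than the from-the-end variant in the question. For that variant, a regex-free sketch could compare word windows against the entries of lst, moving the window end from the last word backwards. The function name is hypothetical, and the sketch assumes words are separated by whitespace and that "longest" means the entry with the most words ending at the current position.

```python
def last_longest_match_words(text, lst):
    """Hypothetical helper: scan text word by word from the end and return
    the longest entry of lst ending at the first position where one matches."""
    words = text.split()
    # Pre-split each entry into words; try entries with more words first.
    phrases = sorted((p.split() for p in lst), key=len, reverse=True)
    # end is the exclusive index of the window's last word, moving backwards.
    for end in range(len(words), 0, -1):
        for phrase in phrases:
            start = end - len(phrase)
            if phrase and start >= 0 and words[start:end] == phrase:
                return ' '.join(phrase)
    return None

print(last_longest_match_words('2 street east east brisbane brisbane xyz',
                               ['brisbane', 'east brisbane', '2 street east']))
# brisbane
```

Two entries that both end at the same word must be suffixes of one another, so the one with more words is also the longer string.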
