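I have a string in Python. I want to get all one-word substrings, all two-word substrings, and all three-word substrings. What is the most efficient way to do this?
My current solution looks like this: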
>>> s = "This is the example string of which I want to generate subsequent combinations"
>>> words = s.split()
>>> lengths = [1, 2, 3]
>>> ans = []
>>> for ln in lengths:
...     for i in range(len(words) - ln + 1):
...         ans.append(" ".join(words[i:i + ln]))
...
>>> print(ans)
['This', 'is', 'the', 'example', 'string', 'of', 'which', 'I', 'want', 'to', 'generate', 'subsequent', 'combinations', 'This is', 'is the', 'the example', 'example string', 'string of', 'of which', 'which I', 'I want', 'want to', 'to generate', 'generate subsequent', 'subsequent combinations', 'This is the', 'is the example', 'the example string', 'example string of', 'string of which', 'of which I', 'which I want', 'I want to', 'want to generate', 'to generate subsequent', 'generate subsequent combinations']
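uj5u.com user reply:
I think the approach that is easiest to understand (for me, anyway), and probably the fastest, is to handle the first two words as a special case and then iterate over the remaining words while keeping track of the two preceding words.
It has the side benefit of being the fastest of the approaches timed below.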
words = "This is the example string of which I want to generate subsequent combinations".split()
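# Special case: seed the result with the first two words (assumes len(words) >= 2).
prior_prior_word = words[0]
prior_word = words[1]
ans = [prior_prior_word, prior_word, f"{prior_prior_word} {prior_word}"]
for word in words[2:]:
    # Emit the 1-, 2- and 3-word substrings that end at the current word.
    ans.append(word)
    ans.append(f"{prior_word} {word}")
    ans.append(f"{prior_prior_word} {prior_word} {word}")
    # Slide the window forward by one word.
    prior_prior_word = prior_word
    prior_word = word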
print(ans)
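The snippet above assumes the input has at least two words. As a rough sketch of the same one-pass idea that also copes with shorter input, a bounded `collections.deque` can hold the most recent words. The function name `rolling_ngrams` and the `max_len` parameter are made up for this illustration and were not part of the original answer, and this version has not been timed against the others.

```python
from collections import deque


def rolling_ngrams(words, max_len=3):
    """Return every run of 1..max_len consecutive words, in a single pass."""
    window = deque(maxlen=max_len)  # keeps only the most recent max_len words
    out = []
    for word in words:
        window.append(word)
        recent = list(window)
        # Emit the runs that end at the current word: 1 word, 2 words, ...
        for k in range(1, len(recent) + 1):
            out.append(" ".join(recent[-k:]))
    return out


print(rolling_ngrams("This is the example".split()))
# ['This', 'is', 'This is', 'the', 'is the', 'This is the',
#  'example', 'the example', 'is the example']
```

The items come out in the same interleaved order as the hand-unrolled loop, and an empty or one-word input simply yields fewer items instead of raising an IndexError.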
If you want to time it with timeit, you can try:
import timeit
ruchit = '''
words = "This is the example string of which I want to generate subsequent combinations".split()
def test(words):
    lengths = [1, 2, 3]
    ans = []
    for ln in lengths:
        for i in range(len(words) - ln + 1):
            ans.append(" ".join(words[i:i + ln]))
    return ans
'''
tom = '''
words = "This is the example string of which I want to generate subsequent combinations".split()
def test(words):
    return [' '.join(words[i:i + l]) for l in [1, 2, 3] for i in range(len(words) - l + 1)]
'''
jonsg = '''
words = "This is the example string of which I want to generate subsequent combinations".split()
def test(words):
    prior_prior_word = words[0]
    prior_word = words[1]
    ans = [prior_prior_word, prior_word, f"{prior_prior_word} {prior_word}"]
    for word in words[2:]:
        ans.append(f"{word}")
        ans.append(f"{prior_word} {word}")
        ans.append(f"{prior_prior_word} {prior_word} {word}")
        prior_prior_word = prior_word
        prior_word = word
    return ans
'''
runs = 1_000_000
print("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
print(f"Test: ruchit Time: {timeit.timeit('test(words)', setup=ruchit, number=runs)}")
print("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
print(f"Test: tom Time: {timeit.timeit('test(words)', setup=tom, number=runs)}")
print("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
print(f"Test: jonsg Time: {timeit.timeit('test(words)', setup=jonsg, number=runs)}")
print("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
This gives me:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Test: ruchit Time: 8.692457999999998
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Test: tom Time: 7.512314900000002
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Test: jonsg Time: 3.7232652
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Your mileage may vary.
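One thing worth checking before comparing timings is whether the candidates return the same thing. The sketch below restates two of the benchmarked versions under the made-up names `by_length` and `one_pass` (not names from the thread) and compares their results. They contain the same items but in a different order, which matters if downstream code depends on the grouping by length.

```python
def by_length(words):
    # Version from the question: results grouped by window length.
    return [" ".join(words[i:i + n])
            for n in (1, 2, 3)
            for i in range(len(words) - n + 1)]


def one_pass(words):
    # Rolling version from the answer above; assumes at least two words.
    pp, p = words[0], words[1]
    ans = [pp, p, f"{pp} {p}"]
    for w in words[2:]:
        ans += [w, f"{p} {w}", f"{pp} {p} {w}"]
        pp, p = p, w
    return ans


words = "This is the example string of which I want to generate subsequent combinations".split()
same_items = sorted(by_length(words)) == sorted(one_pass(words))
same_order = by_length(words) == one_pass(words)
print(same_items, same_order)  # True False
```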
uj5u.com user reply:
You can do it like this:
from itertools import chain, combinations

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return list(map(lambda x: " ".join(x), chain.from_iterable(combinations(s, r) for r in range(1, 4))))

s = "This is the example string of which I want to generate subsequent combinations"
print(powerset(s.split()))
For more details, see: https://stackoverflow.com/a/1482316/17073342
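Note that `itertools.combinations` picks words from any positions, not only neighbouring ones, so its result is a superset of the contiguous substrings the question asks for. A small illustration on a three-word input (the variable names `picked` and `adjacent` are just for this example):

```python
from itertools import combinations

words = "a b c".split()

# Every 2-word selection, regardless of adjacency.
picked = [" ".join(c) for c in combinations(words, 2)]

# Only 2-word runs of neighbouring words.
adjacent = [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]

print(picked)    # ['a b', 'a c', 'b c']
print(adjacent)  # ['a b', 'b c']
```

If non-adjacent pairs such as 'a c' are not wanted, the slicing approach from the question is the closer match.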
uj5u.com user reply:
FWIW, you can write what you have as a list comprehension:
[' '.join(words[i:i + l]) for l in [1, 2, 3] for i in range(len(words) - l + 1)]
Is it faster? A little:
%%timeit
ans = []
for ln in [1,2,3]:
    for i in range(len(words) - ln + 1):
        ans.append(" ".join(words[i:i + ln]))
# 8.46 μs ± 89.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%%timeit
[' '.join(words[i:i + l]) for l in [1, 2, 3] for i in range(len(words) - l + 1)]
# 7.03 μs ± 133 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Is it more readable? Probably not. I would probably just stick with what you have.
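If readability is the main concern, another common way to spell the same sliding window is to zip together shifted slices of the word list. This is only a sketch: the helper name `ngrams` is made up here, and it has not been timed against the versions above.

```python
def ngrams(words, n):
    # zip stops at the shortest slice, so incomplete trailing windows are dropped.
    return [" ".join(group) for group in zip(*(words[i:] for i in range(n)))]


words = "This is the example string".split()
ans = [gram for n in (1, 2, 3) for gram in ngrams(words, n)]
print(ans)
```

The result is grouped by length, in the same order as the loop in the question, and inputs shorter than `n` words simply produce an empty list for that length.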
