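If I have some strings (example strings: ["niiiice", "niiiiiiiceee", "nice", "yummy", "shiiinee", "shine", "hello", "print", "priintering", "priinter", "Howdy", "yuup", "yup", "soooouuuuuppppp", "soup", "yeehaw"]), how can I check whether they are similar and differ only by repeated characters, and then work out which of the similar ones should come first (shortest first)? (Example output: ["nice", "niiiice", "niiiiiiiceee", "yummy", "shine", "shiiinee", "hello", "print", "priinter", "priintering", "Howdy", "yup", "yuup", "soup", "soooouuuuuppppp", "yeehaw"])
Note:
If possible, the check should keep everything else in the same order. By that I mean that any strings without a similar counterpart should stay in roughly the same position.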
Reply from a uj5u.com user:
You can squeeze out the repeated characters so that "similar" strings become equal.
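import re
a = ["niiiice", "niiiiiiiceee", "nice", "shiiinee", "shine"]
def squeeze(s):
    # Collapse each run of a repeated character into a single character.
    return re.sub(r'(.)\1+', r'\1', s)
# Sort by the squeezed form first, then by length so the shortest variant comes first.
a.sort(key=lambda s: (squeeze(s), len(s)))
print(a)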
Output:
['nice', 'niiiice', 'niiiiiiiceee', 'shine', 'shiiinee']
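To see why this sort key puts similar strings next to each other, here is a minimal sketch of what squeezing does to a few of the question's strings. It assumes the pattern `(.)\1+` (one character followed by one or more copies of itself); the names `samples` and `squeezed` are introduced here only for illustration.

```python
import re

def squeeze(s):
    # Replace every run of a repeated character with a single copy.
    return re.sub(r'(.)\1+', r'\1', s)

# Sample strings taken from the question.
samples = ["niiiice", "nice", "yummy", "priinter", "priintering"]
squeezed = {s: squeeze(s) for s in samples}

for original, key in squeezed.items():
    print(f"{original!r} -> {key!r}")
```

Note that legitimate double letters are squeezed too ("yummy" becomes "yumy"), and that "priinter" and "priintering" squeeze to different strings, so this approach does not treat them as similar to "print".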
Or, if you only want to sort each group of consecutive "similar" strings:
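from itertools import groupby
import re
a = ["niiiice", "niiiiiiiceee", "nice", "yummy", "shiiinee", "shine", "hello", "print", "priintering", "priinter", "Howdy", "yuup", "yup", "soooouuuuuppppp", "soup", "yeehaw"]
def squeeze(s):
    # Collapse each run of a repeated character into a single character.
    return re.sub(r'(.)\1+', r'\1', s)
# groupby only merges adjacent items with the same squeezed form;
# each such group is sorted by length, and everything else keeps its position.
a = [s for _, g in groupby(a, squeeze) for s in sorted(g, key=len)]
print(a)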
Output:
['nice', 'niiiice', 'niiiiiiiceee', 'yummy', 'shine', 'shiiinee', 'hello', 'print', 'priintering', 'priinter', 'Howdy', 'yup', 'yuup', 'soup', 'soooouuuuuppppp', 'yeehaw']
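The grouping above only merges similar strings that are already adjacent. If they might be scattered through the list, one possible variant (a sketch, not part of the answer above) collects them in a dict keyed by the squeezed form, which preserves first-seen order in Python 3.7+. The function `group_similar` and the list `words` are hypothetical names introduced here.

```python
import re

def squeeze(s):
    # Replace every run of a repeated character with a single copy.
    return re.sub(r'(.)\1+', r'\1', s)

def group_similar(strings):
    # Hypothetical helper: bucket strings by squeezed form, in first-seen order.
    groups = {}
    for s in strings:
        groups.setdefault(squeeze(s), []).append(s)
    # Within each bucket, put the shortest string first.
    return [s for g in groups.values() for s in sorted(g, key=len)]

words = ["yuup", "hello", "nice", "yup", "niiiice"]
result = group_similar(words)
print(result)
```

Each group ends up at the position where its first member appeared, so strings without a similar counterpart keep their relative order.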
Reply from a uj5u.com user:
Another solution, using itertools.groupby:
import itertools
a = ["niiiice", "niiiiiiiceee", "nice", "shiiinee", "shine"]
# Key: the string with each run of equal characters collapsed, then its length.
print(sorted(a, key=lambda s: ([k for k, _ in itertools.groupby(s)], len(s))))
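The key used here works without a regular expression: `itertools.groupby` splits a string into runs of equal consecutive characters, and keeping only each run's key leaves one character per run. A small sketch of that step, where `run_heads` and `words` are names introduced here for illustration:

```python
from itertools import groupby

def run_heads(s):
    # groupby yields one (character, run) pair per run of equal consecutive
    # characters; keeping only the key leaves one character per run.
    return [k for k, _ in groupby(s)]

words = ["niiiice", "niiiiiiiceee", "nice", "shiiinee", "shine"]
for w in words:
    print(w, "->", "".join(run_heads(w)))

# Sort by the collapsed form, then by length so the shortest variant comes first.
result = sorted(words, key=lambda s: (run_heads(s), len(s)))
print(result)
```

Because lists compare element by element, using the list of run keys as the first part of the sort key behaves like comparing the squeezed strings.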
