Intersection/union of the word sets of any two given lists, inside a Python for loop

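I am trying to define a score as the size of the intersection divided by the size of the union of the word sets of any two given lists. I know that union and intersection only work on set-type containers. I have been trying to set this up correctly but cannot get it right. Can anyone help?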

corpus = [
    ["i","did","not","like","the","service"],
    ["the","service","was","ok"],
    ["i","was","ignored","when","i","asked","for","service"]
]
tags = ["a","b","c"]
dct_keys = {
    "a":1,
    "b":2,
    "c":3
}
corpus_tags = dict(zip(tags,corpus))

from itertools import combinations
my_keys = list(combinations(tags, 2))

goal_dct = {}
for i in range(len(my_keys)):
    goal_dct[(my_keys[i])] = {"id_alpha":(dct_keys[my_keys[i][0]]),
                             "id_beta"  :(dct_keys[my_keys[i][1]]),
                             "socore" : (len(set1&set3))/(len(set1|set3))} # THIS IS WHAT I WAS TRYING TO ACHIEVE HERE
print(goal_dct)

This is what I am trying to define as the score, shown here with sets as an example:

set1 = {"i","did","not","like","the","service"}
set2 = {"the","service","was","ok"}
set3 = {"i","was","ignored","when","i","asked","for","service"}
(len(set1&set3))/(len(set1|set3))

A uj5u.com user replied:

This does not do what you think it does:

(len(set1)&len(set3))/(len(set1)|len(set3))

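len returns an int. You can use the & and | operators on integers, but they perform bitwise operations there, which is not what you are looking for. Instead, apply those operators to the sets themselves and then take the len of the resulting sets: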

len(set1 & set3)/len(set1 | set3)
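To make the difference concrete, here is a small sketch that evaluates both expressions on set1 and set3 from the question. The variable names bitwise_result and set_result are only illustrative and do not appear in the original post.

```python
set1 = {"i", "did", "not", "like", "the", "service"}
set3 = {"i", "was", "ignored", "when", "asked", "for", "service"}

# Bitwise operators applied to the two lengths (6 and 7).
# 6 & 7 == 6 and 6 | 7 == 7, so this is 6/7 and says nothing about shared words.
bitwise_result = (len(set1) & len(set3)) / (len(set1) | len(set3))

# Set operators applied first, then len:
# 2 shared words ("i", "service") out of 11 distinct words.
set_result = len(set1 & set3) / len(set1 | set3)

print(bitwise_result)
print(set_result)
```

The first value only depends on how many words each set has, while the second depends on which words they share.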

So a function that produces a score for any two lists of strings (sentences) looks like this:

def score(s1: list[str], s2: list[str]) -> float:
    set1, set2 = set(s1), set(s2)
    return len(set1 & set2) / len(set1 | set2)
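One edge case worth noting: if both lists are empty, the union is empty and the division raises ZeroDivisionError. A possible guard is sketched below. The name safe_score and the choice to return 0.0 for two empty lists are assumptions made for illustration, not part of the original answer.

```python
def safe_score(s1: list[str], s2: list[str]) -> float:
    """Intersection-over-union of two word lists, tolerating empty input."""
    set1, set2 = set(s1), set(s2)
    union = set1 | set2
    if not union:
        # Assumption: two empty lists get a score of 0.0 instead of raising.
        return 0.0
    return len(set1 & set2) / len(union)


print(safe_score([], []))
print(safe_score(["the", "service", "was", "ok"],
                 ["i", "did", "not", "like", "the", "service"]))
```

Whether 0.0 is the right value for two empty lists depends on how the score is used downstream.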

You can use it to build the scores for every pairwise combination in corpus:

from itertools import combinations
from string import ascii_lowercase

corpus = [
    ["i","did","not","like","the","service"],
    ["the","service","was","ok"],
    ["i","was","ignored","when","i","asked","for","service"]
]
tagged_corpus = dict(zip(ascii_lowercase, corpus))

def score(s1: list[str], s2: list[str]) -> float:
    set1, set2 = set(s1), set(s2)
    return len(set1 & set2) / len(set1 | set2)

goal = {
    (a, b): score(tagged_corpus[a], tagged_corpus[b])
    for a, b in combinations(tagged_corpus, 2)
}

print(goal)
# {('a', 'b'): 0.25,
#  ('a', 'c'): 0.18181818181818182,
#  ('b', 'c'): 0.2222222222222222}
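If the nested layout from the question is needed (one dictionary per pair holding id_alpha, id_beta and the score), the same score function can be plugged into a comprehension over the tag pairs. This is a sketch that reuses tags, dct_keys and corpus_tags from the question. It spells the key "score" rather than "socore", on the assumption that the latter was a typo.

```python
from itertools import combinations

corpus = [
    ["i", "did", "not", "like", "the", "service"],
    ["the", "service", "was", "ok"],
    ["i", "was", "ignored", "when", "i", "asked", "for", "service"],
]
tags = ["a", "b", "c"]
dct_keys = {"a": 1, "b": 2, "c": 3}
corpus_tags = dict(zip(tags, corpus))


def score(s1: list[str], s2: list[str]) -> float:
    set1, set2 = set(s1), set(s2)
    return len(set1 & set2) / len(set1 | set2)


# One entry per unordered pair of tags, keyed by the pair itself.
goal_dct = {
    (t1, t2): {
        "id_alpha": dct_keys[t1],
        "id_beta": dct_keys[t2],
        "score": score(corpus_tags[t1], corpus_tags[t2]),
    }
    for t1, t2 in combinations(tags, 2)
}

print(goal_dct)
```

Iterating over the pairs directly avoids the range(len(my_keys)) indexing used in the question.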

A uj5u.com user replied:

Make sets from your lists.

set1 = set(some_list)
set2 = set(other_list)
common_items = set1.intersection(set2)
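As a small extension of this idea, the intersection and union methods accept any iterable, so only one of the two lists has to be converted explicitly. The sketch below fills in some_list and other_list with two sentences from the question, purely as sample data.

```python
# Sample data borrowed from the question's corpus (an assumption for illustration).
some_list = ["the", "service", "was", "ok"]
other_list = ["i", "did", "not", "like", "the", "service"]

set1 = set(some_list)

# The method forms take any iterable, unlike the & and | operators,
# which require both operands to be sets.
common_items = set1.intersection(other_list)
all_items = set1.union(other_list)

score = len(common_items) / len(all_items)
print(common_items, score)
```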
