Changing a tuple in a list of tuples

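I am reading data from multiple Excel files and writing it back to an aggregated Excel file.

So I have this output, which represents the relationships between several entities within my company (entity-ID) and other companies (debitor-name):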

debitor_list = [
    ("1", "X AG"),
    ("1", "X AG"),
    ("1", "Z AG"),
    ("2", "X AG"),
    ("2", "X AG"),
    ("3", "LOL AG"),
    ("1", "Z AG"), 
    ("1", "HS AG"),
    ("2", "hs ag")
]

The tuples in this list are structured as follows:

('entity-ID', 'debitor-name')

In addition, I have a list that contains the real names of the debtors:

real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]

Then I check the similarity between the debtor names in debitor_list and the names in real_file, and replace them with the real names:

import difflib as dif

for deb in debitor_list:
    for cam in real_file:
        if deb[1] != cam:
            sequence = dif.SequenceMatcher(
                isjunk=None,
                a=deb[1].lower(),
                b=cam.lower()
            )
            match = sequence.ratio() * 100
            if (match >= 80):
                print(deb[1], cam, match)
                debitor_list.append((deb[0], cam))

Output:

hs ag HS AG 100.0

How can I remove the ("2", "hs ag") tuple?

A uj5u.com user replied:

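Either replace the whole list, or replace individual elements with some simple logic; see the two options below.

Note that tuples are immutable, but the list that holds them is not...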

import difflib as dif

debitor_list = [
    ("1", "X AG"),
    ("1", "X AG"),
    ("1", "Z AG"),
    ("2", "X AG"),
    ("2", "X AG"),
    ("3", "LOL AG"),
    ("1", "Z AG"),
    ("1", "HS AG"),
    ("2", "hs ag"),
]

real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]


# Option 1: build and return a new list, leaving the input list untouched.
def fix_stuff(d_list, c_list):
    result = []
    for deb in d_list:
        repl_val = None
        for cam in c_list:
            if deb[1] != cam:
                sequence = dif.SequenceMatcher(
                    isjunk=None, a=deb[1].lower(), b=cam.lower()
                )
                match = sequence.ratio() * 100
                if match >= 80:
                    repl_val = cam
        if repl_val:
            result.append((deb[0], repl_val))
        else:
            result.append(deb)
    return result


print(debitor_list)
new_deb_list = fix_stuff(debitor_list, real_file)
print(new_deb_list)


# Option 2: replace matching tuples in place, using the index from enumerate.
for idx, deb in enumerate(debitor_list):
    for cam in real_file:
        if deb[1] != cam:
            sequence = dif.SequenceMatcher(isjunk=None, a=deb[1].lower(), b=cam.lower())
            match = sequence.ratio() * 100
            if match >= 80:
                debitor_list[idx] = (deb[0], cam)
print(debitor_list)

Output

[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'hs ag')]
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'HS AG')]
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'HS AG')]

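if repl_val checks whether the value needs to be replaced. Because repl_val is set to None at the start of each iteration of the outer for loop, if repl_val is only true when the variable was changed during the inner loop.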
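One caveat with the truthiness test: it would also be false for an empty string, so comparing against None explicitly is a slightly more robust way to express the same check. Below is a minimal sketch of that idea. The helper name pick_replacement and its threshold parameter are hypothetical and not part of the answer above. Unlike the answer's loop, which keeps the last matching candidate, this sketch returns the first one.

```python
import difflib


def pick_replacement(name, candidates, threshold=80):
    """Return the first candidate at least `threshold` percent similar to `name`, else None."""
    for cand in candidates:
        if cand == name:
            continue  # already spelled exactly like a real name
        matcher = difflib.SequenceMatcher(None, name.lower(), cand.lower())
        if matcher.ratio() * 100 >= threshold:
            return cand
    return None  # sentinel: nothing similar enough was found


real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]

repl_val = pick_replacement("hs ag", real_file)
if repl_val is not None:  # explicit None check instead of plain truthiness
    print("replace with", repl_val)
```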

As for using result: when we use a function, we do not modify the list that was passed in; instead we build and return a new list, result.
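The same return-a-new-list idea can also be sketched with difflib.get_close_matches from the standard library, which does the best-match search for us. This is an alternative to the answer's loop, not what the answer uses. The function name normalize_names and the shortened sample list are assumptions made for illustration. Since get_close_matches compares strings case-sensitively, the sketch lowercases both sides and maps the hit back to the real spelling.

```python
import difflib


def normalize_names(pairs, real_names, cutoff=0.8):
    """Return a new list of (entity_id, name) tuples with names mapped to the closest real name."""
    # Map lowercased real names back to their original spelling.
    lookup = {name.lower(): name for name in real_names}
    result = []
    for entity_id, name in pairs:
        # n=1 asks for the single best match at or above the cutoff.
        hits = difflib.get_close_matches(name.lower(), list(lookup), n=1, cutoff=cutoff)
        result.append((entity_id, lookup[hits[0]] if hits else name))
    return result


# Shortened sample data, assumed for this sketch.
debitor_list = [("1", "X AG"), ("3", "LOL AG"), ("1", "HS AG"), ("2", "hs ag")]
real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]

new_list = normalize_names(debitor_list, real_file)
print(new_list)      # names normalized in the new list
print(debitor_list)  # the input list is left as it was
```

With a cutoff of 0.8, "LOL AG" is not similar enough to "LOLLIPOP AG" and is kept unchanged, which matches the behaviour of the answer's code.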

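As for the second way of doing this (which is probably the better one): by using enumerate we get the index (idx) of each list element as well as its value, deb. That lets us assign directly to the original list by index, so the original list is modified in place.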
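The difference between editing a tuple and replacing it inside the list can be seen in a short sketch. The variable names pairs and alias are made up for this illustration.

```python
pairs = [("1", "HS AG"), ("2", "hs ag")]
alias = pairs  # a second name for the same list object

try:
    pairs[1][1] = "HS AG"  # tuples do not support item assignment
except TypeError as exc:
    print("cannot edit the tuple:", exc)

# Replace the whole tuple at that index instead.
pairs[1] = (pairs[1][0], "HS AG")

# The change is visible through every reference to the list.
print(alias)
```

Because the assignment goes through the list index, every variable that refers to the same list sees the corrected tuple, which is what makes the second option an in-place modification.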

