Read each line from two files and print the lines that do not exist in the other file

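Team, I have two files with some duplicates. I want to print the unique entries, or build a new list of them. However, my list prints as empty, and I don't know why.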

f1 = open(file1, 'r')
f2 = open(file2, 'r')
unique = []
for lineA in f1.readlines():
        for lineB in f2.readlines():
            if lineA != lineB:
                print("lineA not equal to lineB", lineA, lineB)
            else:
                unique.append(lineB)
print(unique)

Output

lineA not equal to lineB  node789
  node321

lineA not equal to lineB  node789
 node12345

[]

Expected

lineA not equal to lineB  node789
  node321

lineA not equal to lineB  node789
 node12345

[node321,node12345]

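Following the comments, I tried a second approach. The list is now populated, but every entry is an empty string and the actual strings are not picked up.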

 [~] $ cat  ~/backup/2strings.log
restr1
restr2

 [~] $ cat ~/backup/4strings.log 
restr1
restr2
restr3
restr4

file2 = os.environ.get('HOME') + '/backup/2strings.log'
file1 = os.environ.get('HOME') + '/backup/4strings.log'
f1 = open(file1, 'r')
f2 = open(file2, 'r')
unique = []
for lineA in f1.readlines():
        for lineB in f2.readlines():
            # if lineA.rstrip() != lineB.rstrip():
            if lineA.strip() != lineB.strip():
                print("lineA not equal to lineB", lineA, lineB)
            else:
                print("found uniq")
        unique.append(lineB.rstrip())
print(unique)
print(len(unique))

Output

found uniq
lineA not equal to lineB restr1
 restr2

lineA not equal to lineB restr1
 

['', '', '', '', '']
5

A uj5u.com user replied:

I suggest a different but simpler approach: use the set data structure. Link - https://docs.python.org/3/tutorial/datastructures.html#sets

Pseudocode

# file1 and file2 are the paths defined in the question
with open(file1) as f1, open(file2) as f2:
    items01 = {line.strip() for line in f1}
    items02 = {line.strip() for line in f2}

unique = []

# items in file1 that are not present in file2
print(list(items01 - items02))
unique += list(items01 - items02)

# items in file2 that are not present in file1
print(list(items02 - items01))
unique += list(items02 - items01)

# all items that appear in only one of the two files
print(unique)

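Your code only uses file01 as the reference, checking its items against file02; you also need to do the reverse. The second problem is time complexity, which is too high because the nested loops compare every line of one file with every line of the other. Python sets are hash-based internally, which gives better performance, so use sets.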
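As a minimal sketch of the set-based idea, the two directions can be wrapped in one helper that also keeps the original line order (a plain set difference does not guarantee any order). The function name lines_only_in_one and the sample lists are hypothetical; the lists simply mirror the two log files shown in the question.

```python
def lines_only_in_one(lines_a, lines_b):
    """Return stripped lines that appear in exactly one of the two inputs.

    Lines unique to lines_a come first, then lines unique to lines_b,
    each group in its original order.
    """
    a = [line.strip() for line in lines_a]
    b = [line.strip() for line in lines_b]
    # Sets give fast membership tests; the lists keep the order.
    set_a, set_b = set(a), set(b)
    only_a = [line for line in a if line not in set_b]
    only_b = [line for line in b if line not in set_a]
    return only_a + only_b


# Hypothetical sample data mirroring 4strings.log and 2strings.log.
four = ["restr1\n", "restr2\n", "restr3\n", "restr4\n"]
two = ["restr1\n", "restr2\n"]
result = lines_only_in_one(four, two)
print(result)  # ['restr3', 'restr4']
```

The helper accepts any iterable of lines, so an open file object can be passed in place of a list.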

A uj5u.com user replied:

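As far as I can see from what you posted, the only way your expected output deviates from the actual output is that node321 and node12345 are not added to the unique list that is printed at the end. That is not surprising, because in your code you append to unique in the case where lineA and lineB match (the append happens in the else after if lineA != lineB:).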
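The point above can be sketched with a loop that appends only when a line matches nothing in the other file. This is one possible correction, not necessarily what the answerer had in mind. The io.StringIO objects and their contents are hypothetical stand-ins for the real files, which the question does not show in full. The sketch also reads each file only once, because a file handle returns nothing more after readlines() has consumed it.

```python
import io

# Hypothetical in-memory stand-ins for the two files in the question.
f1 = io.StringIO("node789\nnode555\n")
f2 = io.StringIO("node321\nnode12345\nnode789\n")

# Read each file exactly once. Calling f2.readlines() again, as the nested
# loop in the question does, returns an empty list from the second pass on.
lines_a = [line.strip() for line in f1.readlines()]
lines_b = [line.strip() for line in f2.readlines()]
exhausted = f2.readlines()  # nothing left to read

unique = []
for line_b in lines_b:
    # Append only when line_b differs from every line of the first file.
    if all(line_b != line_a for line_a in lines_a):
        unique.append(line_b)

print(exhausted)  # []
print(unique)     # ['node321', 'node12345']
```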


Tags: Python python-3.x grep
