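I am trying to compare two lists in Python. One comes from the response to a REST request, which I store in a list; the other is read from a CSV file. I need to compare them and capture the values in the first list (from the CSV) that do not exist in the second list (my database response). The CSV list is smaller than the database list.
In short, I want to check the values in the CSV to make sure everything was stored correctly and exists in the database.
So far I have built the following to do the comparison, but I get a lot of duplicates in the console: it prints the same value many times.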
import json, requests, urllib.parse, re
from jsonpath_ng import jsonpath, parse
from pandas.io.parsers import read_csv
import pandas as pd
from termcolor import colored
import numpy as np
from glob import glob
# Set Up
dateinplay = "2021-09-27"
cdwenv1 = "cdwu" # Note that it only works with the http version right now
cdwenv2 = "cdwp" # Control Env, usually cdwp
CoreMurexFilesLoc = r"J:\Gerard\Release197\UAT\day1\Files\Core"
# Dev Static
cdwenv = "" # leave empty
files = glob(CoreMurexFilesLoc + "\\*")
# index 1: MHEU_TradeCash_ | 2: TradeCash_
tradeCashfiles = [i for i in files if "TradeCash" in i]
# index 1: MHEU_Trade_ | 2: Trade_
tradeFiles = [i for i in files if "Trade_" in i]
mheu_tradeCash = tradeCashfiles[1]
# tradeCash = tradeCashfiles[2]
# mheu_trade = tradeFiles[1]
# trade = tradeFiles[2]
# filesList = [mheu_trade, trade, mheu_tradeCash, tradeCash]
def read_csv(file):
    df_trade = pd.read_csv(
        file, delimiter="|", index_col=False, low_memory=False, dtype="unicode"
    )
    # Drop any blank fields
    df_trade.dropna(subset=["MurexCounterpartyRef"], inplace=True)
    tradeList = df_trade["MurexCounterpartyRef"]
    # Remove duplicate elements
    l = []
    for i in tradeList:
        if i not in l:
            l.append(i)
    l.sort()
    return l
mheu_tradeCash = read_csv(mheu_tradeCash)
# tradeCash = read_csv(tradeCash)
# mheu_trade = read_csv(mheu_trade)
# trade = read_csv(trade)
# Request to get the accountID related to the date
listAccountId = []
cdwCounterparties = f"http://cdwu/cdw/counterparties/{dateinplay}?limit=999999"
r = requests.get(cdwCounterparties).json()
jsonpath_expression = parse("$..accounts.account[*].identifiers.identifier[*]")
for match in jsonpath_expression.find(r):
    # print(f'match id: {match.value}')
    thisdict = match.value
    if thisdict["accountIdType"] == "ACCOUNTID":
        # print(thisdict["accountId"])
        listAccountId.append(thisdict["accountId"])
# print the quantity of accountID found in the CDW
# print(len(listAccountId))
# Compare the two lists: If the values found in Murex -> csv exist in list of accountId -> cdw
def comparator(murex, cdw):
    for i in murex:
        for j in cdw:
            if i != j:
                print(f"The counterparty {i} does not exist in CDW")
# mheu_tradeCash | listAccountId
comparator(mheu_tradeCash, listAccountId)
Can anyone help me with this?
Answer:
For this specific case, you can do the following:
difference = list(set(mheu_tradeCash) - set(listAccountId))
for i in difference:
    print(f"The counterparty {i} does not exist in CDW")
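The nested loop in the question repeats itself because `i != j` is true for every CDW account ID that differs from `i`, so each counterparty is printed once per non-matching entry. Below is a minimal sketch of a membership test against a set that also keeps the CSV order. The sample lists and the helper name `missing_from_cdw` are assumptions made up for illustration and are not part of the original code.

```python
def missing_from_cdw(murex, cdw):
    """Return the values of murex that are absent from cdw, in murex order."""
    cdw_set = set(cdw)  # set membership tests are O(1) on average
    seen = set()        # guards against reporting the same value twice
    missing = []
    for ref in murex:
        if ref not in cdw_set and ref not in seen:
            seen.add(ref)
            missing.append(ref)
    return missing


# Hypothetical sample data standing in for the CSV and the CDW response
sample_murex = ["A100", "B200", "C300", "B200"]
sample_cdw = ["A100", "D400", "E500"]

for ref in missing_from_cdw(sample_murex, sample_cdw):
    print(f"The counterparty {ref} does not exist in CDW")
```

Unlike a plain set difference, this keeps the order in which the values appear in the CSV, which may make the output easier to trace back to the file.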
Answer:
If you have numpy installed, you can also use the setdiff1d function, which returns the unique values of the first 1D numpy array that are not present in the second array.
import numpy as np
# setdiff1d takes the two arrays as separate arguments
difference = np.setdiff1d(np.array(mheu_tradeCash), np.array(listAccountId))
Converting to numpy arrays is probably not necessary.
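As a short sketch of the numpy variant, the example below passes plain Python lists straight to `np.setdiff1d`, which accepts array-like input and returns a sorted array of unique values. The sample lists are made-up assumptions, not data from the question.

```python
import numpy as np

# Hypothetical sample data standing in for the CSV and the CDW response
sample_murex = ["C300", "A100", "B200", "B200"]
sample_cdw = ["A100", "D400"]

# Values in sample_murex that are not in sample_cdw, deduplicated and sorted
difference = np.setdiff1d(sample_murex, sample_cdw)

for ref in difference:
    print(f"The counterparty {ref} does not exist in CDW")
```

Because the result is sorted, the original CSV order is not preserved.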
