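I have a NumPy array result (final) that I want to reduce: if a value is repeated, I want to delete its first occurrence and keep the second, third, and any later occurrences.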
import base64
import hmac
import numpy as np
import pandas as pd
key="800070FF00FF08012"
key=bytes(key,'utf-8')
collision=[]
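for x in range(1,1000001):
    msg=bytes(f'{x}','utf-8')
    digest = hmac.new(key, msg,"sha256").digest()
    code = base64.b64encode(digest).decode('utf-8')
    code=code[:6]  # keep only the first 6 Base64 characters
    key=key.replace(key,digest)  # same as key = digest: the next HMAC is keyed with this digest
    collision.append(code)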
df=pd.DataFrame(collision)
df=df[df.duplicated(keep=False)]
df_index=df.index.to_numpy()
df=df.values.flatten()
final=np.stack((df_index,df),axis=1)
Results of the variable "final":
I HAVE:
[[14093 'JRp1kX']
[43985 'KGlW7X']
[59212 'pU97Tr']
[90668 'ecTjTB']
[140615 'JRp1kX']
[218480 '25gtjT']
[344174 'dtXg6E']
[380467 'DdHQ3M']
[395699 'vnFw/c']
[503504 'dtXg6E']
[531073 'KGlW7X']
[633091 'ecTjTB']
[671091 'vnFw/c']
[672111 '25gtjT']
[785568 'pU97Tr']
[991540 'DdHQ3M']
[991548 'JRp1kX']]
And I WANT TO HAVE:
[[140615 'JRp1kX']
[503504 'dtXg6E']
[531073 'KGlW7X']
[633091 'ecTjTB']
[671091 'vnFw/c']
[672111 '25gtjT']
[785568 'pU97Tr']
[991540 'DdHQ3M']
[991548 'JRp1kX']]
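That is, remove the first occurrence of every repeated value in the array. Does anyone have code that fits my case?
Put more simply: if you have the list [1,2,3,4,5,1,3,5,5], I want [2,4,1,3,5,5].
Reply from a uj5u.com user: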
import pandas as pd
df = pd.DataFrame([1, 2, 3, 4, 5, 1, 3, 5, 5])
# rows whose value occurs exactly once (these must be kept)
unique_mask = ~df.duplicated(keep=False)
# second and later occurrences of each repeated value (the first one is skipped)
repeated_mask = df.duplicated()
print(df.loc[unique_mask | repeated_mask])
   0
1  2
3  4
5  1
6  3
7  5
8  5
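A side note on the masks above: in the question, `final` is built from rows that were already filtered with `duplicated(keep=False)`, so no single-occurrence values remain and the second mask alone appears to be enough. A minimal sketch, using a few rows copied from the question's output; the names `codes` and `later_occurrences` are made up for illustration:

```python
import pandas as pd

# A few (index, code) rows copied from the question's `final` output.
codes = pd.Series(
    ["JRp1kX", "KGlW7X", "JRp1kX", "KGlW7X", "JRp1kX"],
    index=[14093, 43985, 140615, 531073, 991548],
)

# duplicated() defaults to keep="first": it is False for the first
# occurrence of each value and True for every later occurrence.
later_occurrences = codes[codes.duplicated()]

result = list(zip(later_occurrences.index.tolist(), later_occurrences.tolist()))
print(result)
# [(140615, 'JRp1kX'), (531073, 'KGlW7X'), (991548, 'JRp1kX')]
```

If the data can still contain values that occur only once, the combined mask from the answer is the safer choice.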
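Reply from a uj5u.com user:
final is a NumPy array, so you can use np.unique on the second column to get the index of each value's first occurrence together with its occurrence count; the count lets you avoid deleting values that appear only once.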
_, idx, counts = np.unique(final[:, 1], return_index=True, return_counts=True)
idx = idx[counts > 1]
final = np.delete(final, idx, axis=0)
This works for the 2-D ndarray. For your second example, the one-dimensional list, use
_, idx, counts = np.unique(final, return_index=True, return_counts=True)
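For reference, here is the one-dimensional case written out end to end as a small runnable sketch. The variable names `values`, `first_of_repeated`, and `reduced` are illustrative and not from the original answer:

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 1, 3, 5, 5])

# return_index gives the position of each unique value's first occurrence,
# return_counts gives how many times each unique value appears.
_, first_idx, counts = np.unique(values, return_index=True, return_counts=True)

# Only first occurrences of values that appear more than once are deleted.
first_of_repeated = first_idx[counts > 1]
reduced = np.delete(values, first_of_repeated)

print(reduced.tolist())  # [2, 4, 1, 3, 5, 5]
```

np.delete returns a new array, so the original order of the remaining elements is preserved.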
Reply from a uj5u.com user:
Maybe you can write a for loop.
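to_remove = []
for i in range(len(your_list)):
    # mark i only if it is the first occurrence of a value that appears again later
    if your_list[i] in your_list[i + 1:] and your_list[i] not in your_list[:i]:
        to_remove.append(i)
removed_count = 0
for i in to_remove:
    del your_list[i - removed_count]
    removed_count += 1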
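You cannot del inside the first loop, because i keeps advancing to the next index, so an element would be skipped every time one is deleted.
[i - removed_count] is needed because each time a lower index is deleted, every higher index shifts down by one.
I think this could be written more efficiently, but it should work, perhaps with minor changes.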
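One possible more efficient variant is sketched below. It swaps the repeated list slicing for a `collections.Counter` and a set, and builds a new list instead of deleting in place. The function name `drop_first_of_repeated` is hypothetical, and the sketch assumes the values are hashable:

```python
from collections import Counter


def drop_first_of_repeated(values):
    """Return a new list without the first occurrence of each repeated value."""
    counts = Counter(values)  # how often each value occurs overall
    skipped = set()           # repeated values whose first occurrence was already dropped
    result = []
    for value in values:
        if counts[value] > 1 and value not in skipped:
            skipped.add(value)  # drop this first occurrence only
            continue
        result.append(value)
    return result


print(drop_first_of_repeated([1, 2, 3, 4, 5, 1, 3, 5, 5]))  # [2, 4, 1, 3, 5, 5]
```

Each element is visited a fixed number of times, so the running time grows roughly linearly with the length of the list rather than quadratically.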
Reply from a uj5u.com user:
After df is generated, add the following lines:
df=pd.DataFrame(collision)
# ... your code ends here
removed_already=[]
for idx in df[df.duplicated(keep=False)].index:
    if df.loc[idx][0] not in removed_already:
        removed_already.append(df.loc[idx][0])
        df.drop(index=idx, inplace=True)
# your code continues
df_index=df.index.to_numpy()
df=df.values.flatten()
final=np.stack((df_index,df),axis=1)
