Get all row indices in a NumPy 2D array where every element in the row occurs more than once in the entire array

I'm working with graph data defined as a 2D array of edges, i.e.

[[1, 0],
 [2, 5],
 [1, 5],
 [3, 4],
 [1, 4]]

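defines a graph in which every element is a node id. The graph is directed, has no self-loops, and no value in one column appears in the other column.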
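The two structural properties stated here (disjoint columns, no self-loops) can be checked directly on the array. A minimal sketch, assuming the edge array is named `graph` as in the rest of the thread; the names `shared`, `columns_disjoint` and `has_self_loop` are made up for illustration:

```python
import numpy as np

graph = np.array([[1, 0],
                  [2, 5],
                  [1, 5],
                  [3, 4],
                  [1, 4]])

# Node ids that appear in both the source column and the destination column.
shared = np.intersect1d(graph[:, 0], graph[:, 1])
columns_disjoint = shared.size == 0

# True if any edge starts and ends at the same node.
has_self_loop = bool(np.any(graph[:, 0] == graph[:, 1]))

print(columns_disjoint, has_self_loop)  # True False
```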

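Now the problem: I need to select all edges for which both "nodes" occur more than once in the list. How can I do this quickly? Right now I iterate over every edge and look at its nodes individually, which feels like a really bad approach.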

Current dumb/slow solution:

import numpy as np

# graph is the (n_edges, 2) edge array shown above
edges = []
for edge in graph:
    src, dst = edge[0], edge[1]
    # Count how often src and dst occur anywhere in the array (both columns)
    src_fan = np.count_nonzero(graph == src)
    dst_fan = np.count_nonzero(graph == dst)

    if src_fan >= 2 and dst_fan >= 2:
        # Both nodes occur at least twice, so keep this edge
        edges.append(edge)

I'm also not entirely sure this approach is even correct...
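One way to cross-check the loop is to count the nodes independently with the standard library and compare the selected edges. This is only a sketch for verification, not a fast solution; `edges_list`, `node_counts` and `selected` are names introduced here for illustration:

```python
from collections import Counter

edges_list = [[1, 0], [2, 5], [1, 5], [3, 4], [1, 4]]

# Count every node id across both columns in a single pass.
node_counts = Counter(node for edge in edges_list for node in edge)

# Keep an edge only if both of its nodes occur at least twice overall.
selected = [edge for edge in edges_list
            if node_counts[edge[0]] >= 2 and node_counts[edge[1]] >= 2]

print(selected)  # [[1, 5], [1, 4]]
```

On the sample, node 1 occurs three times and nodes 4 and 5 twice each, so the edges [1, 5] and [1, 4] are the ones that qualify.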

A uj5u.com user replied:

# Obtain the unique nodes in each column and their counts

from_nodes, from_counts = np.unique(graph[:, 0], return_counts=True)
to_nodes, to_counts = np.unique(graph[:, 1], return_counts=True)

# Obtain the nodes that occur more than once

dup_from_nodes = from_nodes[from_counts > 1]
dup_to_nodes = to_nodes[to_counts > 1]

# Obtain the edges whose nodes are both duplicated

graph[np.in1d(graph[:, 0], dup_from_nodes) & np.in1d(graph[:, 1], dup_to_nodes)]
Out[297]: array([[1, 5],
       [1, 4]])
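Note that this answer counts each column separately. That matches counting over the whole array only because the question states that the two columns share no ids. A sketch of the same idea wrapped in a function, with `np.isin` swapped in for `np.in1d` (which NumPy 2.0 deprecates); the function name and the `overlapping` input are made up for illustration:

```python
import numpy as np

def edges_with_repeated_nodes_per_column(graph):
    """Keep edges whose source repeats in column 0 and whose target repeats in column 1."""
    from_nodes, from_counts = np.unique(graph[:, 0], return_counts=True)
    to_nodes, to_counts = np.unique(graph[:, 1], return_counts=True)
    dup_from = from_nodes[from_counts > 1]
    dup_to = to_nodes[to_counts > 1]
    mask = np.isin(graph[:, 0], dup_from) & np.isin(graph[:, 1], dup_to)
    return graph[mask]

# Sample from the question: the columns share no ids.
sample = np.array([[1, 0], [2, 5], [1, 5], [3, 4], [1, 4]])
print(edges_with_repeated_nodes_per_column(sample))  # rows [1, 5] and [1, 4]

# Made-up input that breaks the disjoint-columns assumption:
# ids 1 and 2 each occur twice overall, but only once per column.
overlapping = np.array([[1, 2], [2, 1]])
print(edges_with_repeated_nodes_per_column(overlapping).shape)  # (0, 2)
```

For the second input a whole-array count would keep both edges, while the per-column count keeps none, so this variant is only safe when the columns really are disjoint.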

A uj5u.com user replied:

A solution using networkx:

import networkx as nx

edges = [[1, 0],
 [2, 5],
 [1, 5],
 [3, 4],
 [1, 4]] 

G = nx.DiGraph()
G.add_edges_from(edges)

print([node for node in G.nodes if G.degree[node]>1])

Edit:

# keep the edges whose two end nodes both have degree > 1
print([edge for edge in G.edges if G.degree[edge[0]] > 1 and G.degree[edge[1]] > 1])

A uj5u.com user replied:

import numpy as np
graph = np.array([[1, 0],
 [2, 5],
 [1, 5],
 [3, 4],
 [1, 4]])

# get a 1d array of all nodes
array = graph.reshape(-1)

# count the occurrences of each element by comparing every node with every other node
occurrences = np.sum(np.equal(array, array[:, np.newaxis]), axis=0)

# reshape back to graph shape
occurrences = occurrences.reshape(graph.shape)

# check if both nodes of each edge occur more than once
mask = np.all(occurrences > 1, axis=1)

# select the masked elements
edges = graph[mask]

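In my tests on this small graph, this approach runs about 3 times faster than the accepted answer, going by the timings below.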

Test:

import timeit
import numpy as np

graph = np.array([[1, 0],
    [2, 5],
    [1, 5],
    [3, 4],
    [1, 4]])

# accepted answer
def method1(a):
    # Obtain the unique nodes in each column and their counts

    from_nodes, from_counts = np.unique(a[:, 0], return_counts=True)
    to_nodes, to_counts = np.unique(a[:, 1], return_counts=True)

    # Obtain the duplicated nodes

    dup_from_nodes = from_nodes[from_counts > 1]
    dup_to_nodes = to_nodes[to_counts > 1]

    # Obtain the edges whose nodes are duplicated (index the argument, not the global graph)

    return a[np.in1d(a[:, 0], dup_from_nodes) & np.in1d(a[:, 1], dup_to_nodes)]

# this answer
def method2(graph):
    # get a 1d array of all nodes
    array = graph.reshape(-1)

    # count the occurrences of each element, then reshape back to graph shape
    occurrences = np.sum(np.equal(array, array[:, np.newaxis]), axis=0).reshape(graph.shape)

    # check if both nodes of each edge occur more than once
    mask = np.all(occurrences > 1, axis=1)

    # select the masked elements
    edges = graph[mask]

    return edges

print('method1 (accepted answer): ', timeit.timeit(lambda: method1(graph), number=10000))

print('method2 (this answer): ', timeit.timeit(lambda: method2(graph), number=10000))

Output:

method1 (accepted answer):  0.20238440000000013
method2 (this answer):  0.06534320000000005
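These timings are for a 5-edge graph. The broadcast comparison allocates a boolean array with (2E)² entries for E edges, so its cost grows quadratically, and it is worth re-measuring on larger inputs before relying on the result. A sketch of a harness for that, which also includes a hypothetical alternative (`by_unique`) that counts ids over the flattened array with `np.unique`; `random_graph`, `compare` and the chosen sizes are assumptions made for illustration, and no timings are claimed here:

```python
import timeit
import numpy as np

def by_broadcast(graph):
    """Pairwise comparison as in the answer above; builds an n x n boolean array."""
    flat = graph.reshape(-1)
    occurrences = np.sum(flat == flat[:, np.newaxis], axis=0).reshape(graph.shape)
    return graph[np.all(occurrences > 1, axis=1)]

def by_unique(graph):
    """Alternative sketch: count each node id once over the flattened array."""
    flat = graph.reshape(-1)
    _, inverse, counts = np.unique(flat, return_inverse=True, return_counts=True)
    # counts[inverse] maps every position back to the count of its node id.
    occurrences = counts[inverse].reshape(graph.shape)
    return graph[np.all(occurrences > 1, axis=1)]

def random_graph(n_edges, n_nodes, seed=0):
    """Random edge list whose source and destination ids come from disjoint ranges."""
    rng = np.random.default_rng(seed)
    src = rng.integers(0, n_nodes, size=n_edges)
    dst = rng.integers(n_nodes, 2 * n_nodes, size=n_edges)
    return np.column_stack([src, dst])

def compare(n_edges, number=20):
    """Time both variants on the same random graph after checking they agree."""
    graph = random_graph(n_edges, n_nodes=max(2, n_edges // 2))
    assert np.array_equal(by_broadcast(graph), by_unique(graph))
    return {
        "broadcast": timeit.timeit(lambda: by_broadcast(graph), number=number),
        "unique": timeit.timeit(lambda: by_unique(graph), number=number),
    }

for n in (5, 500):
    print(n, compare(n))
```

Both variants count over the whole array, so unlike the per-column approach they do not depend on the two columns being disjoint.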
