Efficient way to select array elements based on the values of another array in Python?

I have two arrays, for example one holding labels and the other holding distances:

labels= array([3, 1, 0, 1, 3, 2, 3, 2, 1, 1, 3, 1, 2, 1, 3, 2, 2, 3, 3, 3, 2, 3,
        0, 3, 3, 2, 3, 2, 3, 2,...])

distances = array([2.32284095, 0.36254613, 0.95734965, 0.35429638, 2.79098656,
        5.45921793, 2.63795657, 1.34516461, 1.34028463, 1.10808795,
        1.60549826, 1.42531201, 1.16280383, 1.22517273, 4.48511033,
        0.71543217, 0.98840598,...])

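What I want to do is group the values in distances into N arrays, where N is the number of unique label values (N=4 in this example). So all values with label = 3 go into one array, those with label = 2 go into another, and so on.

I can think of a simple brute-force approach with loops and if conditions, but that would cause a serious slowdown on large arrays. I feel there is a better way to do this, using native list comprehensions, numpy, or something else; I'm just not sure what. What is the best, most efficient approach?

A "brute force" example for reference; note that len(labels) == len(distances):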

all_distance_arrays = []
for id in np.unique(labels):
    sorted_distances = []
    for index in range(len(labels)):
        if id == labels[index]:
            sorted_distances.append(distances[index])
    all_distance_arrays.append(sorted_distances)

A uj5u.com user replied:

A simple list comprehension works well and is fast:

groups = [distances[labels == i] for i in np.unique(labels)]

Output:

>>> groups
[array([0.95734965]),
 array([0.36254613, 0.35429638, 1.34028463, 1.10808795, 1.42531201,
        1.22517273]),
 array([5.45921793, 1.34516461, 1.16280383, 0.71543217, 0.98840598]),
 array([2.32284095, 2.79098656, 2.63795657, 1.60549826, 4.48511033])]
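
For readers who want to try the boolean-mask idea end to end, here is a minimal self-contained sketch. The short sample arrays are made up for illustration (they are not the asker's full data), and the dictionary keyed by label is an addition for convenience. Note that each `labels == i` comparison scans the whole array, so the total work grows roughly with the number of unique labels times the array length.

```python
import numpy as np

# Illustrative sample data (assumed; not the asker's full arrays)
labels = np.array([3, 1, 0, 1, 3, 2, 3, 2])
distances = np.array([2.3, 0.4, 1.0, 0.35, 2.8, 5.5, 2.6, 1.3])

# np.unique returns the distinct labels in sorted order
unique_labels = np.unique(labels)

# One boolean mask per unique label selects the matching distances
groups = {int(i): distances[labels == i] for i in unique_labels}

for label, values in groups.items():
    print(label, values)
```

Within each group, the values keep the order in which they appear in the original array.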

A uj5u.com user replied:

Using only NumPy:

_, counts = np.unique(labels, return_counts=True)  # counts[k] is the number of occurrences of the k-th unique label
sor = labels.argsort()
sections = np.cumsum(counts)                       # end index of slices
labels_sor = np.split(labels[sor], sections)[:-1]
distances_sor = np.split(distances[sor], sections)[:-1]
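
A possible variation on the sort-and-split idea, sketched here on made-up sample data. Two details are assumptions on my part rather than part of the answer: passing `kind="stable"` to `np.argsort` so that values keep their original relative order within each group, and dropping the last cumulative count before splitting so that no trailing empty array needs to be sliced off.

```python
import numpy as np

# Illustrative sample data (assumed; not the asker's full arrays)
labels = np.array([3, 1, 0, 1, 3, 2, 3, 2])
distances = np.array([2.3, 0.4, 1.0, 0.35, 2.8, 5.5, 2.6, 1.3])

# Distinct labels (sorted) and how many times each one occurs
unique_labels, counts = np.unique(labels, return_counts=True)

# A stable sort keeps equal labels in their original relative order
order = np.argsort(labels, kind="stable")

# Interior boundaries between groups: every cumulative count except the last
boundaries = np.cumsum(counts)[:-1]

# One array of distances per unique label, in the order of unique_labels
groups = np.split(distances[order], boundaries)

for label, values in zip(unique_labels, groups):
    print(label, values)
```

This sorts once and splits once, so the cost does not depend on how many distinct labels there are beyond the sort itself.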

A uj5u.com user replied:

For a reasonable number of labels, "brute force" seems to be good enough:

from collections import defaultdict

dist_group = defaultdict(list)
for lb, ds in zip(labels, distances):
    dist_group[lb].append(ds)

It is hard to see why this would not serve your purpose.
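
As a rough illustration of the single-pass grouping above, here is a self-contained sketch using plain Python lists and made-up sample values. The final step, which rebuilds the dictionary in sorted label order, is an addition for deterministic output and is not part of the answer.

```python
from collections import defaultdict

# Illustrative sample data (assumed; not the asker's full arrays)
labels = [3, 1, 0, 1, 3, 2, 3, 2]
distances = [2.3, 0.4, 1.0, 0.35, 2.8, 5.5, 2.6, 1.3]

# Single pass over both sequences: each distance is appended to its label's list
dist_group = defaultdict(list)
for lb, ds in zip(labels, distances):
    dist_group[lb].append(ds)

# Rebuild as a regular dict with labels in ascending order
grouped = {lb: dist_group[lb] for lb in sorted(dist_group)}
print(grouped)
```

The loop visits each element exactly once, so its cost depends on the array length rather than on the number of distinct labels; the trade-off is that the iteration runs in Python rather than in NumPy.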

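A uj5u.com user replied:

You can do this using only numpy functions. First sort the two arrays in lockstep (which is what np.unique does behind the scenes anyway), then split them at the positions where the label changes: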

i = np.argsort(labels)
labels = labels[i]
distances = distances[i]
split_points = np.flatnonzero(np.diff(labels)) + 1   # index where each new label starts
result = np.split(distances, split_points)
unique_labels = labels[np.r_[0, split_points]]
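
To make the role of `np.diff` and `np.flatnonzero` concrete, here is a small worked sketch on made-up sample data. The separate `sorted_labels` and `sorted_distances` names (used so the inputs are not overwritten), the stable sort, and the final `by_label` dictionary are illustrative choices, not part of the answer.

```python
import numpy as np

# Illustrative sample data (assumed; not the asker's full arrays)
labels = np.array([3, 1, 0, 1, 3, 2, 3, 2])
distances = np.array([2.3, 0.4, 1.0, 0.35, 2.8, 5.5, 2.6, 1.3])

# Sort both arrays in lockstep by label
order = np.argsort(labels, kind="stable")
sorted_labels = labels[order]        # [0 1 1 2 2 3 3 3]
sorted_distances = distances[order]

# np.diff is nonzero exactly where the label changes; adding 1 gives
# the index at which each new label begins
split_points = np.flatnonzero(np.diff(sorted_labels)) + 1   # [1 3 5]

result = np.split(sorted_distances, split_points)

# The first element of each group carries that group's label
unique_labels = sorted_labels[np.r_[0, split_points]]

by_label = dict(zip(unique_labels.tolist(), result))
print(by_label)
```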

Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/449870.html
