Sorting (or partially sorting) a list of objects by a specific attribute

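Question

I have a list of objects. Each object has two attributes: "score" and "coordinates". I need to find the N objects in the list with the largest score. The main problem I have is sorting the objects using only the score attribute. The sort can be partial; I am only interested in the N largest objects.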

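Current solution

My current approach is neither the most elegant nor the most efficient. The idea is to create a dictionary of object indices and their scores, then sort the list of scores and use the dictionary to look up the objects that produced the largest scores.

These are the steps:

Create a list of scores. Each element of the list corresponds to one object: the first entry is the score of the first object, the second entry is the score of the second object, and so on.
Create a dictionary using the objects' scores as key and the object index as value.
Use heapq to sort the list of scores and get the N largest objects.
Use the dictionary to retrieve the objects that have the largest scores.
Create a new list containing only the N objects with the highest scores.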

Code snippet

This is my sorting function:

import random
import heapq


# Gets the N objects with the largest score:
def getLargest(N, objects):
    # Set output objects:
    outobjects = objects

    # Get the total of objects in list:
    totalobjects = len(objects)

    # Check if the total number of objects is bigger than the N requested
    # largest objects:

    if totalobjects > N:

        # Get the "score" attributes from all the objects:
        objectScores = [o.score for o in objects]

        # Create a dictionary with the index of the objects and their score.
        # I'm using a dictionary to keep track of the largest scores and
        # the objects that produced them:
        objectIndices = range(totalobjects)
        objectDictionary = dict(zip(objectIndices, objectScores))

        # Get the N largest objects based on score:
        largestObjects = heapq.nlargest(N, objectScores)
        print(largestObjects)

        # Prepare the output list of objects:
        outobjects = [None] * N

        # Look for those objects that produced the
        # largest score:
        for k in range(N):
            # Get current largest object:
            currentLargest = largestObjects[k]
            # Get its original position on the keypoint list:
            position = objectScores.index(currentLargest)
            # Index the corresponding keypoint and store it
            # in the output list:
            outobjects[k] = objects[position]

    # Done:
    return outobjects

This snippet generates 100 random objects to test my approach. The final loop prints the N = 3 randomly generated objects with the largest score:

# Create a list with random objects:
totalObjects = 100
randomObjects = []


# Test object class:
class Object(object):
    pass


# Generate a list of random objects
for i in range(totalObjects):
    # Instance of objects:
    tempObject = Object()
    # Set the object's random score
    random.seed()
    tempObject.score = random.random()
    # Set the object's random coordinates:
    tempObject.coordinates = (random.randint(0, 5), random.randint(0, 5))
    # Store object into list:
    randomObjects.append(tempObject)

# Get the 3 largest objects sorted by score:
totalLargestObjects = 3
largestObjects = getLargest(totalLargestObjects, randomObjects)

# Print the filtered objects:
for i in range(len(largestObjects)):
    # Get the current object in the list:
    currentObject = largestObjects[i]
    # Get its score:
    currentScore = currentObject.score
    # Get its coordinates as a tuple (x,y)
    currentCoordinates = currentObject.coordinates
    # Print the info:
    print("object: " + str(i) + " score: " + str(currentScore) + " x: " + str(
        currentCoordinates[0]) + " y: " + str(currentCoordinates[1]))

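My current approach gets the job done, but there must be a more Pythonic (more vectorized) way to achieve the same thing. My background is mostly C++, and I'm still learning Python. Any feedback is welcome.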

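Additional information

Initially, I was looking for something similar to C++'s std::nth_element. It seems this functionality is provided in Python by NumPy's partition. Unfortunately, while std::nth_element supports a predicate for custom ordering, NumPy's partition does not. I ended up using heapq, which does the job well and returns the results in the desired order, but I don't know the best way to sort based on a single attribute.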

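Answer:

Tuples are exactly what you need. Instead of storing the scores in the heap, store a tuple (score, object) in the heap. Tuples are compared by score first, and the result is a list of tuples from which you can retrieve the original objects. Note that if two scores are equal, the comparison falls through to the objects themselves, which raises a TypeError unless the class defines an ordering. This saves you the extra step of retrieving the objects by score: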

heapq.nlargest(3, ((obj.score, obj) for obj in randomObjects))
# [(0.9996643881256989, <__main__.Object object at 0x155f730>), (0.9991398955041872, <__main__.Object object at 0x119e928>), (0.9858047551444177, <__main__.Object object at 0x15e38c0>)]

For a real-world example: https://akuiper.com/console/g6YuNa_1WClp
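The tuple approach can fail when two objects share a score, because the comparison then reaches the objects themselves. A minimal sketch of one way around this, assuming a hypothetical `Item` class standing in for the question's objects and a hypothetical helper `n_largest_by_score`: insert a unique integer between the score and the object so the object is never compared.

```python
import heapq


class Item:
    """Hypothetical stand-in for the question's objects."""

    def __init__(self, score, coordinates):
        self.score = score
        self.coordinates = coordinates


def n_largest_by_score(n, objects):
    # Decorate as (score, -index, obj). The index is unique, so a tie on
    # score is always settled before the comparison can reach obj.
    # Negating the index makes the earlier object win a tie.
    decorated = ((obj.score, -i, obj) for i, obj in enumerate(objects))
    return [obj for _, _, obj in heapq.nlargest(n, decorated)]


items = [Item(0.5, (0, 0)), Item(0.9, (1, 2)), Item(0.9, (3, 4)), Item(0.1, (5, 5))]
top = n_largest_by_score(2, items)
print([(o.score, o.coordinates) for o in top])
```

Here the two objects with score 0.9 are returned in their original order, and no TypeError is raised even though `Item` defines no ordering.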

Or, as @shriakhilc commented, use the key argument of heapq.nlargest to specify that you want to compare by score:

heapq.nlargest(3, randomObjects, key=lambda o: o.score)
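As a self-contained illustration of the key-based call, here is a small sketch that uses `operator.attrgetter` in place of the lambda. The sample data and the use of `types.SimpleNamespace` as a stand-in for the question's objects are assumptions made for the example.

```python
import heapq
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical sample data: (score, coordinates) pairs.
raw = [(0.2, (0, 1)), (0.8, (2, 3)), (0.5, (4, 5)), (0.9, (1, 1)), (0.1, (3, 0))]
points = [SimpleNamespace(score=s, coordinates=c) for s, c in raw]

# attrgetter("score") builds a callable equivalent to lambda o: o.score.
top_three = heapq.nlargest(3, points, key=attrgetter("score"))
top_scores = [p.score for p in top_three]
print(top_scores)
```

With a key function, only the key values are compared, so the objects need no ordering of their own and equal scores do not cause an error.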

Answer:

I suggest you use Python's built-in sorted function with a lambda function. See here: https://docs.python.org/3/howto/sorting.html#sortinghowto

Basically, this is what you can do:

myList = [
  {'score': 32, 'coordinates': [...]},
  {'score': 12, 'coordinates': [...]},
  {'score': 20, 'coordinates': [...]},
  {'score': 8, 'coordinates': [...]},
  {'score': 40, 'coordinates': [...]},
]

# Sort by score DESCENDING
mySortedList = sorted(myList, key=lambda element: element['score'], reverse=True)

# Retrieve top 3 results
myTopResults = mySortedList[0:3]
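The question stores the score as an attribute rather than a dictionary key, so the same idea can be sketched with attribute access. The data below reuses the scores from this answer; the coordinates and the `SimpleNamespace` stand-in are assumptions for illustration. The sketch also checks that a full sort followed by a slice gives the same result as `heapq.nlargest`.

```python
import heapq
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical objects carrying the scores used above; coordinates are made up.
scores = [32, 12, 20, 8, 40]
objects = [SimpleNamespace(score=s, coordinates=(i, i)) for i, s in enumerate(scores)]

# Full sort in descending order, then keep the first three.
by_sort = sorted(objects, key=attrgetter("score"), reverse=True)[:3]

# Partial selection that only tracks the three largest items.
by_heap = heapq.nlargest(3, objects, key=attrgetter("score"))

print([o.score for o in by_sort])
print(by_sort == by_heap)
```

For M objects, the full sort costs O(M log M) while `heapq.nlargest` costs roughly O(M log N), so the heap version tends to pay off when N is small relative to M. For short lists the difference is unlikely to matter.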
