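Problem
I have a list of objects. Each object has two attributes: "score" and "coordinates". I need to find the N largest objects in the list according to the score attribute. The main problem I'm having is sorting the objects using only the score attribute. The sort can be partial; I'm only interested in the N largest objects.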
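Current solution
My current approach is neither the most elegant nor the most efficient. The idea is to create a dictionary of object indices and their scores, then sort the list of scores and use the dictionary to index the objects that produced the largest scores.
These are the steps:
1. Create a list of scores. Each element of the list corresponds to one object: the first entry is the score of the first object, the second entry is the score of the second object, and so on.
2. Create a dictionary using the objects' scores as key and the object index as value.
3. Sort the list of scores with heapq to get the N largest objects.
4. Use the dictionary to retrieve the objects that have the largest scores.
5. Create a new list containing only the N highest-scoring objects.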
Code snippet
This is my sorting function:
import random
import heapq
# Gets the N objects with the largest score:
def getLargest(N, objects):
    # Set output objects:
    outobjects = objects
    # Get the total of objects in list:
    totalobjects = len(objects)
    # Check if the total number of objects is bigger than the N requested
    # largest objects:
    if totalobjects > N:
        # Get the "score" attributes from all the objects:
        objectScores = [o.score for o in objects]
        # Create a dictionary with the index of the objects and their score.
        # I'm using a dictionary to keep track of the largest scores and
        # the objects that produced them:
        objectIndices = range(totalobjects)
        objectDictionary = dict(zip(objectIndices, objectScores))
        # Get the N largest objects based on score:
        largestObjects = heapq.nlargest(N, objectScores)
        print(largestObjects)
        # Prepare the output list of objects:
        outobjects = [None] * N
        # Look for those objects that produced the
        # largest score:
        for k in range(N):
            # Get current largest object:
            currentLargest = largestObjects[k]
            # Get its original position on the keypoint list:
            position = objectScores.index(currentLargest)
            # Index the corresponding keypoint and store it
            # in the output list:
            outobjects[k] = objects[position]
    # Done:
    return outobjects
This snippet generates 100 random objects used to test my approach. The final loop prints the N = 3 randomly generated objects with the largest score:
# Create a list with random objects:
totalObjects = 100
randomObjects = []
# Test object class:
class Object(object):
    pass
# Generate a list of random objects
for i in range(totalObjects):
    # Instance of objects:
    tempObject = Object()
    # Set the object's random score
    random.seed()
    tempObject.score = random.random()
    # Set the object's random coordinates:
    tempObject.coordinates = (random.randint(0, 5), random.randint(0, 5))
    # Store object into list:
    randomObjects.append(tempObject)
# Get the 3 largest objects sorted by score:
totalLargestObjects = 3
largestObjects = getLargest(totalLargestObjects, randomObjects)
# Print the filtered objects:
for i in range(len(largestObjects)):
    # Get the current object in the list:
    currentObject = largestObjects[i]
    # Get its score:
    currentScore = currentObject.score
    # Get its coordinates as a tuple (x,y)
    currentCoordinates = currentObject.coordinates
    # Print the info:
    print("object: " + str(i) + " score: " + str(currentScore) + " x: " + str(
        currentCoordinates[0]) + " y: " + str(currentCoordinates[1]))
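My current approach gets the job done, but there must be a more Pythonic (more vectorized) way to achieve the same thing. My background is mostly C++, and I'm still learning Python. Any feedback is welcome.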
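Additional information
Initially, I was looking for something similar to C++'s std::nth_element. It looks like this functionality is provided in Python by NumPy's partition. Unfortunately, while std::nth_element supports a predicate for custom ordering, NumPy's partition does not. I ended up using heapq, which does the job nicely and sorts in the desired order, but I don't know the best way to sort based on one attribute.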
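Reply from a uj5u.com user:
Tuples are exactly what you need. Instead of storing the scores in the heap, store a tuple (score, object) in the heap. It will try to compare by score and return a list of tuples that you can use to retrieve the original objects. This saves you the extra step of looking up objects by their score: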
heapq.nlargest(3, ((obj.score, obj) for obj in randomObjects))
# [(0.9996643881256989, <__main__.Object object at 0x155f730>), (0.9991398955041872, <__main__.Object object at 0x119e928>), (0.9858047551444177, <__main__.Object object at 0x15e38c0>)]
For a live example: https://akuiper.com/console/g6YuNa_1WClp
Or, as @shriakhilc commented, use the key argument of heapq.nlargest to specify that you want to compare by score:
heapq.nlargest(3, randomObjects, lambda o: o.score)
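To see the two suggestions side by side, here is a minimal self-contained sketch. The `Item` class, the helper `top_n_by_score`, and the sample data are hypothetical stand-ins for the question's objects, not part of the original post. It also illustrates one difference that seems worth knowing: when two scores are equal, the (score, object) tuples fall back to comparing the objects themselves, which raises a TypeError for a class that defines no ordering, whereas the key-based call never compares the objects.

```python
import heapq
from operator import attrgetter


class Item:
    """Hypothetical stand-in for the question's objects."""

    def __init__(self, score, coordinates):
        self.score = score
        self.coordinates = coordinates


def top_n_by_score(n, objects):
    """Return the n objects with the largest score, highest first."""
    return heapq.nlargest(n, objects, key=attrgetter("score"))


# Sample data (assumed for illustration); two items share a score on purpose.
items = [
    Item(0.2, (0, 1)),
    Item(0.9, (3, 4)),
    Item(0.9, (5, 5)),
    Item(0.5, (2, 2)),
]

top = top_n_by_score(2, items)
top_coordinates = [item.coordinates for item in top]

# With (score, object) tuples, a tie makes Python compare the objects
# themselves, which fails for classes that define no ordering.
try:
    heapq.nlargest(1, [(item.score, item) for item in items[1:3]])
    tuple_tie_ok = True
except TypeError:
    tuple_tie_ok = False
```

If ties are possible in your data, the key-based form is probably the safer default; the tuple form is fine when scores are known to be distinct or the objects are comparable.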
Reply from a uj5u.com user:
I suggest using Python's built-in sorted function with a lambda. See here: https://docs.python.org/3/howto/sorting.html#sortinghowto
Basically, this is what you can have:
myList = [
    {'score': 32, 'coordinates': [...]},
    {'score': 12, 'coordinates': [...]},
    {'score': 20, 'coordinates': [...]},
    {'score': 8, 'coordinates': [...]},
    {'score': 40, 'coordinates': [...]},
]
# Sort by score DESCENDING
mySortedList = sorted(myList, key=lambda element: element['score'], reverse=True)
# Retrieve top 3 results
myTopResults = mySortedList[0:3]
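The snippet above uses dictionaries, while the objects in the question expose score as an attribute. A minimal sketch of the same idea with attribute access follows; the sample data and the use of `SimpleNamespace` as a stand-in for the question's objects are assumptions made for illustration.

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical sample data; SimpleNamespace stands in for the question's objects.
objects = [
    SimpleNamespace(score=32, coordinates=(0, 1)),
    SimpleNamespace(score=12, coordinates=(1, 1)),
    SimpleNamespace(score=20, coordinates=(2, 3)),
    SimpleNamespace(score=8, coordinates=(4, 0)),
    SimpleNamespace(score=40, coordinates=(5, 5)),
]

# Full sort by score, descending, then keep the first three.
top_three = sorted(objects, key=attrgetter("score"), reverse=True)[:3]
top_scores = [obj.score for obj in top_three]
```

Note that this sorts the whole list before slicing, so it does more work than a partial selection such as heapq.nlargest when N is much smaller than the list; for short lists the difference is unlikely to matter.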
