Is there a faster way to count the number of top scores per team?
Input
# Random scores per player per round (each player corresponds to an integer from 0 to 99)
scores_per_round = np.random.rand(10_000,100)
# 100,000 random teams of 8
teams = np.array([random.sample(list(range(100)), 8) for _ in range(100_000)])
Expected output
# Count of top scores per team: the key is the index of the team in the teams array and the value is the number of wins.
{
    0: 20,
    1: 12,
    ...
}
Currently, I loop over the rounds, sum the scores of each team, then get the index of the maximum with np.argmax and store the counts in a dictionary.
import random
from collections import defaultdict

import numpy as np

win_count = defaultdict(int)
# Random scores
scores_per_round = np.random.rand(10_000, 100)
# 100,000 random teams of 8
teams = np.array([random.sample(list(range(100)), 8) for _ in range(100_000)])
# Loop through the rounds and keep track of each team's wins
for round in range(10_000):
    win_count[np.argmax(np.sum(np.take(scores_per_round[round], teams), axis=1))] += 1
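To make the loop above easier to follow, here is a minimal sketch of the same approach on toy data. The array values and the names scores, toy_teams and winners are made up for illustration and are not part of the question.

```python
from collections import defaultdict

import numpy as np

# Hypothetical toy data: 2 rounds, 4 players, 3 teams of 2 players each.
scores = np.array([[0.1, 0.9, 0.5, 0.2],
                   [0.8, 0.1, 0.3, 0.7]])
toy_teams = np.array([[0, 1], [2, 3], [0, 3]])

win_count = defaultdict(int)
winners = []
for r in range(scores.shape[0]):
    # np.take with a 2-D index array returns an array shaped like the indices,
    # here (3, 2): the score of every player of every team in round r.
    per_player = np.take(scores[r], toy_teams)
    # One total per team, then the index of the best team.
    best = int(np.argmax(per_player.sum(axis=1)))
    winners.append(best)
    win_count[best] += 1

print(winners, dict(win_count))
```

Team 0 has the highest total in the first round and team 2 in the second, so each of them ends up with one win.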
Reply from a uj5u.com community member:
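The initial code is slow because it allocates fairly large temporary arrays. Iterating over them 10_000 times is expensive, because RAM and the last-level cache are relatively slow (compared to the L1 cache or registers). Numba can solve this problem by computing the values on the fly in a more cache-friendly way.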
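As a rough illustration of the size of those temporaries, the sketch below measures an array with the same shape and dtype as the result of np.take(scores_per_round[round], teams) for one round. The zero-filled array is only a stand-in, and the variable names are hypothetical.

```python
import numpy as np

n_rounds, n_teams, team_size = 10_000, 100_000, 8

# Stand-in for the temporary created by np.take for a single round:
# one float64 score per player of every team.
gathered = np.zeros((n_teams, team_size), dtype=np.float64)
# Stand-in for the second temporary created by np.sum(..., axis=1).
totals = gathered.sum(axis=1)

gathered_mb = gathered.nbytes / 1e6              # size of the first temporary, in MB
totals_mb = totals.nbytes / 1e6                  # size of the second temporary, in MB
written_gb = gathered.nbytes * n_rounds / 1e9    # bytes written for the first temporary over all rounds, in GB

print(gathered_mb, totals_mb, written_gb)
```

With these sizes the first temporary is 6.4 MB per round, far larger than a typical L1 data cache of a few tens of kilobytes, and about 64 GB of it is written over the 10,000 rounds.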
Here is a simple parallel implementation:
import numpy as np
import numba as nb

@nb.njit('int32[::1](float64[:,::1], int32[:,::1])', parallel=True, fastmath=True)
def computeTeamWins(scores_per_round, teams):
    roundCount = scores_per_round.shape[0]
    result = np.empty(roundCount, dtype=np.int32)
    n, m = teams.shape
    assert m == 8  # See the comment below
    for r in nb.prange(roundCount):
        iMax, sMax = -1, -1.0
        for i in range(n):
            s = 0.0
            # Faster if the size is known as the loop can be unrolled
            for j in range(8):
                s += scores_per_round[r, teams[i, j]]
            if s > sMax:
                iMax, sMax = i, s
        result[r] = iMax
    return result
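win_count = defaultdict(int)
# The explicit signature above expects an int32 array for teams; the cast is
# needed on platforms where NumPy's default integer type is 64-bit.
for v in computeTeamWins(scores_per_round, teams.astype(np.int32)):
    win_count[v] += 1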
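As an alternative to the counting loop, the per-round winners can probably also be tallied with np.bincount. This is only a sketch: the winners array below is a made-up stand-in for the array returned by computeTeamWins, and n_teams is a hypothetical name for the number of teams.

```python
import numpy as np

# Hypothetical stand-in for the int32 array returned by computeTeamWins:
# one winning team index per round.
winners = np.array([2, 0, 2, 1, 2], dtype=np.int32)
n_teams = 4

# counts[i] is the number of rounds won by team i (0 for teams that never won).
counts = np.bincount(winners, minlength=n_teams)

# Same dictionary shape as in the question, keeping only teams with at least one win.
win_count = {team: int(c) for team, c in enumerate(counts) if c > 0}

print(counts.tolist(), win_count)
```

Unlike the defaultdict, the counts array also has an explicit zero entry for every team that never won a round.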
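On my 6-core machine, it takes 0.8 seconds, whereas the initial code takes 54.3 seconds. This means the Numba implementation is about 68 times faster. If teams is converted to an np.uint8 array, the computation takes only 0.57 seconds, which makes it 95 times faster than the initial code (thanks to better cache use). Note that this limits the maximum integer value to 255 (inclusive). Note also that the final loop takes only a few milliseconds.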
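A minimal sketch of the np.uint8 conversion mentioned above, with a guard for the 255 limit. It uses a smaller hypothetical teams array (1,000 teams) so that it runs quickly. My reading is that the explicit Numba signature would then have to declare uint8[:,::1] instead of int32[:,::1] for the teams argument; that is an assumption, as the answer does not show the modified signature.

```python
import random

import numpy as np

random.seed(0)
# Smaller hypothetical input: 1,000 random teams of 8 players out of 100.
teams = np.array([random.sample(range(100), 8) for _ in range(1_000)])

# Guard: every player index must fit in an unsigned 8-bit integer (0 to 255).
if teams.min() < 0 or teams.max() > np.iinfo(np.uint8).max:
    raise ValueError("player indices do not fit in np.uint8")

teams_u8 = teams.astype(np.uint8)
teams_i32 = teams.astype(np.int32)

# The uint8 version stores the same indices in a quarter of the memory.
print(teams_u8.nbytes, teams_i32.nbytes)
```

The smaller index array leaves more cache space for the scores, which is consistent with the speed-up reported above.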
