Is there a faster way to get the index of the maximum value and keep a count?

Is there a faster way to count how many times each team has the top score?

Input

# Random scores per player per round (each player corresponds to an integer from 0 to 99)
scores_per_round = np.random.rand(10_000,100)

# 100,000 random teams of 8
teams = np.array([random.sample(list(range(100)), 8) for _ in range(100_000)])

Expected output

# Count of top scores per team: the key is the team's index in the teams array, the value is its number of wins.
{
  0: 20,
  1: 12,
  ...
}

Currently, I loop over the rounds, sum the scores of each team, then use np.argmax to get the index of the maximum and store the count in a dictionary.

import random
from collections import defaultdict

import numpy as np

win_count = defaultdict(int)

# Random scores
scores_per_round = np.random.rand(10_000,100)

# 100,000 random teams of 8
teams = np.array([random.sample(list(range(100)), 8) for _ in range(100_000)])

# Loop through the rounds and keep track of each team's wins
for round in range(10_000):
    win_count[np.argmax(np.sum(np.take(scores_per_round[round], teams), axis=1))] += 1

Answer from a uj5u.com community member:

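The initial code is slow because it allocates fairly large temporary arrays. Iterating over them 10_000 times is expensive because RAM and the last-level cache are relatively slow (compared to the L1 cache or registers). Numba can solve this problem by computing the values on the fly in a more cache-friendly way.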
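To make the size of those temporaries concrete, here is a rough sketch of the arithmetic for a single round. The variable names are illustrative, and the teams are drawn with `rng.integers` for brevity, which (unlike `random.sample` in the question) allows repeated players within a team; that does not affect the array sizes.

```python
import numpy as np

n_rounds, n_players, n_teams, team_size = 10_000, 100, 100_000, 8

rng = np.random.default_rng(0)
scores_one_round = rng.random(n_players)  # float64 scores for one round
# Simplification: players may repeat within a team here
teams = rng.integers(0, n_players, size=(n_teams, team_size))

# np.take materialises one float64 per (team, slot) pair before the sum runs
gathered = np.take(scores_one_round, teams)
bytes_per_round = gathered.nbytes             # 100_000 * 8 * 8 bytes
total_bytes = bytes_per_round * n_rounds      # written once per round

print(gathered.shape, bytes_per_round, total_bytes)
```

Under these assumptions each round writes a temporary of about 6.4 MB, which is larger than typical L1 and L2 caches, and the loop produces roughly 64 GB of such temporaries in total.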

Here is a simple parallel implementation:

import numpy as np
import numba as nb

@nb.njit('int32[::1](float64[:,::1], int32[:,::1])', parallel=True, fastmath=True)
def computeTeamWins(scores_per_round, teams):
    roundCount = scores_per_round.shape[0]
    result = np.empty(roundCount, dtype=np.int32)
    n, m = teams.shape
    assert m == 8 # See the comment below

    for r in nb.prange(roundCount):
        iMax, sMax = -1, -1.0
        for i in range(n):
            s = 0.0
            # Faster if the size is known as the loop can be unrolled
            for j in range(8):
                s += scores_per_round[r, teams[i, j]]
            if s > sMax:
                iMax, sMax = i, s
        result[r] = iMax

    return result

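from collections import defaultdict

# The signature above expects a C-contiguous int32 array for teams
teams = np.ascontiguousarray(teams, dtype=np.int32)

win_count = defaultdict(int)
for v in computeTeamWins(scores_per_round, teams):
    win_count[v] += 1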

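On my 6-core machine, it takes 0.8 seconds, whereas the initial code takes 54.3 seconds. This means the Numba implementation is about 68 times faster. If teams is converted to an np.uint8 array (with the type of the second argument in the Numba signature changed to match), the computation takes only 0.57 seconds, making execution 95 times faster (thanks to better cache usage). Note that this limits the maximum integer value to 255 (inclusive). Note also that the final counting loop takes only a few milliseconds.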
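As a small sketch of the two points above, the snippet below checks the 0 to 255 range before casting the team array to np.uint8, and tallies the winners with np.bincount as one possible alternative to the dictionary loop. The helper names `to_uint8_teams` and `count_wins` are hypothetical, and the tiny arrays stand in for the real inputs.

```python
import numpy as np

def to_uint8_teams(teams):
    """Hypothetical helper: cast player indices to uint8 after a range check."""
    teams = np.asarray(teams)
    if teams.min() < 0 or teams.max() > np.iinfo(np.uint8).max:
        raise ValueError("player indices must be in 0..255 to fit in uint8")
    return np.ascontiguousarray(teams.astype(np.uint8))

def count_wins(winners, n_teams):
    """Hypothetical helper: number of wins per team index."""
    # np.bincount counts occurrences of each non-negative integer
    return np.bincount(winners, minlength=n_teams)

# Tiny stand-in data: 2 teams of 3 players, 5 rounds, 4 possible teams
teams = np.array([[0, 5, 99], [3, 4, 255]])
small = to_uint8_teams(teams)

winners = np.array([2, 0, 2, 1, 2], dtype=np.int32)  # winning team per round
wins = count_wins(winners, 4)

# Same shape as the expected output: only teams with at least one win
win_count = {int(i): int(c) for i, c in enumerate(wins) if c > 0}
print(small.dtype, wins, win_count)
```

Whether np.bincount is measurably faster than the loop here is not something the answer states; since the loop already takes only a few milliseconds, this is mainly a matter of convenience.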
