I have the following code:
for key in test_large_images.keys():
    test_large_images[key]['avg_prob'] = 0
    sum = 0
    for value in test_large_images[key]['pred_probability']:
        print(test_large_images[key]['pred'])
        print(type(test_large_images[key]['pred']))
        if test_large_images[key]['pred'] == 1:
            sum += value
    test_large_images[key]['avg_prob'] = sum/len(test_large_images[key]['pred_probability'])
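It is a dictionary covering 359 large images, each of which can contain 200 to 8000 smaller images that I call patches. test_large_images is a dictionary of inference results for the smaller images; each patch also has a predicted probability, the large image name, the patch name, and so on. My goal is to find the average probability for each large image from the predicted probabilities of the patches inside it. When I ran this loop on a smaller dataset (45K patches), whose inference results I had saved in a pkl file, it ran very fast. However, this script has now been running for more than 130 minutes, as you can see in the remote Jupyter Notebook in VSCode Remote (using a local client on a Mac).
Is there a way to use the 24 CPU cores to speed up this nested dictionary computation?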

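Reply from a uj5u.com user:
- Don't use sum as a variable name, because it is a built-in function.
- The line test_large_images[key]['avg_prob'] = 0 is not needed.
- PeterK is correct: your condition does not need to be evaluated on every pass of the inner for loop.
- Why are we printing these repeatedly, or is that just for testing?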
for key in test_large_images.keys():
    add = 0
    condition = test_large_images[key]['pred'] == 1  # This is what PeterK means by take it out (of the loop).
    for value in test_large_images[key]['pred_probability']:
        # print(test_large_images[key]['pred'])
        # print(type(test_large_images[key]['pred']))
        if condition:
            add += value
    test_large_images[key]['avg_prob'] = add/len(test_large_images[key]['pred_probability'])
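To get a feel for why the print calls matter, here is a minimal sketch that measures how much text the original inner loop would write for a single image. It assumes, which the question does not confirm, that 'pred' is a list with one entry per patch; chars_printed, make_entry and the toy data are hypothetical names made up for this illustration.

```python
import io
from contextlib import redirect_stdout


def chars_printed(entry):
    """Run the two prints from the original inner loop and return the output size."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        for _ in entry['pred_probability']:
            print(entry['pred'])
            print(type(entry['pred']))
    return len(buffer.getvalue())


def make_entry(n_patches):
    # Hypothetical toy image: every patch is predicted 1 with probability 0.5.
    return {'pred': [1] * n_patches, 'pred_probability': [0.5] * n_patches}


small = chars_printed(make_entry(100))
large = chars_printed(make_entry(1000))
# Ten times more patches produces far more than ten times the output,
# because the whole list is printed once per patch.
print(small, large)
```

Under that assumption the amount of printed text grows roughly with the square of the patch count, which would explain a loop that is fast on a small dataset and very slow on images with thousands of patches.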
Your code can be simplified to:
for key in test_large_images.keys():
    condition = test_large_images[key]['pred'] == 1
    num = sum(x for x in test_large_images[key]['pred_probability'] if condition)
    denom = len(test_large_images[key]['pred_probability'])
    test_large_images[key]['avg_prob'] = num/denom
Based on feedback, with some extra optimization:
for key in test_large_images.keys():
    if test_large_images[key]['pred'] != 1:
        test_large_images[key]['avg_prob'] = 0
        continue
    values = test_large_images[key]['pred_probability']
    test_large_images[key]['avg_prob'] = sum(values)/len(values)
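As a small worked example, the optimized loop can be wrapped in a function and run on a toy dictionary. This is only a sketch: it assumes 'pred' is a single value per image, as the comparison with 1 suggests, and it adds a guard for an empty probability list that the snippet above does not have. average_probabilities and the toy keys are hypothetical names.

```python
def average_probabilities(images):
    """Set 'avg_prob' on each entry: the mean probability if pred == 1, else 0."""
    for entry in images.values():
        values = entry['pred_probability']
        # Skip the division when the image is not predicted 1 or has no patches.
        if entry['pred'] != 1 or not values:
            entry['avg_prob'] = 0
            continue
        entry['avg_prob'] = sum(values) / len(values)
    return images


toy = {
    'image_a': {'pred': 1, 'pred_probability': [0.25, 0.75]},
    'image_b': {'pred': 0, 'pred_probability': [0.5, 0.5]},
    'image_c': {'pred': 1, 'pred_probability': []},
}
average_probabilities(toy)
print(toy['image_a']['avg_prob'])  # 0.5
```

Iterating over images.values() avoids repeating the test_large_images[key] lookup on every line, and the empty-list guard prevents a ZeroDivisionError.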
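Here are two different kinds of average. The one I am most interested in averages the probabilities over only the number of entries predicted as 1; I call it avg_prob_pos: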
for key in progress_bar(test_large_images.keys()):
    # Note: if 'pred' is a plain list (as the .count(1) call below implies),
    # this comparison with 1 is always False.
    condition = test_large_images[key]['pred'] == 1
    num = sum(x for x in test_large_images[key]['pred_probability'] if condition)
    denom = len(test_large_images[key]['pred_probability'])
    count = sum(x for x in test_large_images[key]['pred'] if condition)
    if count != 0:
        test_large_images[key]['avg_prob_pos'] = num/count
    test_large_images[key]['avg_prob'] = num/denom
    percentage = test_large_images[key]['pred'].count(1)/len(test_large_images[key]['pred'])
    test_large_images[key]['percentage'] = percentage
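If 'pred' and 'pred_probability' are parallel lists with one entry per patch (an assumption, suggested only by the .count(1) call), the three statistics could be computed per patch roughly as follows. summarize and the toy entry are hypothetical, and returning None for avg_prob_pos when no patch is predicted 1 is a choice made for this sketch, not something the original code does.

```python
def summarize(entry):
    """Return avg_prob, avg_prob_pos and percentage for one image entry."""
    preds = entry['pred']
    probs = entry['pred_probability']
    # Keep only the probabilities of patches whose own prediction is 1.
    positive = [p for label, p in zip(preds, probs) if label == 1]
    total = len(probs)
    return {
        'avg_prob': sum(positive) / total if total else 0,
        'avg_prob_pos': sum(positive) / len(positive) if positive else None,
        'percentage': len(positive) / total if total else 0,
    }


toy_entry = {'pred': [1, 0, 1, 0], 'pred_probability': [0.75, 0.5, 0.25, 0.5]}
stats = summarize(toy_entry)
print(stats)
```

Pairing each label with its probability through zip makes the filter apply per patch instead of once per image.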
