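I'm new to threading. I know we can call threads on functions, but I want to call one on a dictionary.
I have a dictionary in which random numbers are stored under different keys, and I want the sum of all those numbers. What I'd like to do is use one thread per row/key of the dictionary. Each thread would compute the sum of the numbers in its own row, and then the per-thread sums would be added together to get the final result.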
import random
import time

li = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
      "v", "w", "x", "y"]
arr = {}
for k in range(0, 25):
    # One list of a million random digits (1 to 9) per key.
    arr[li[k]] = [random.randrange(1, 10, 1) for i in range(1000000)]
start = time.perf_counter()
total = 0  # named "total" so the built-in sum() is not shadowed
for k, v in arr.items():
    for value in v:
        total += value
end = time.perf_counter()
print(total)
print("Finished in: ", round(end-start, 2), " seconds")
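I did it the simple way first. The whole program took about 86 seconds (mostly spent filling the dictionary with numbers), of which about 5 seconds went to computing the sum.
I'd like to improve on those 5 seconds by creating one thread per key of the dictionary. Can anyone help me with this?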
Reply from a uj5u.com user:
"I know ... we can call threads on functions."
No. You can't call a thread on anything. When you write this:
thread = threading.Thread(target=foobar, args=(x, y, z))
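you are not calling a thread. You are calling the constructor of the Thread class. The constructor creates a new Thread object, and the thread is then the thing that makes the call: once started, the thread calls foobar(x, y, z).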
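To make that distinction concrete, here is a minimal sketch. `foobar` and its arguments are placeholders carried over from the line above, and the thread name "worker" is made up for illustration. The only point is that constructing a `Thread` runs nothing, and the target is called by the new thread after `start()`.

```python
import threading

calls = []  # records which thread actually ran foobar


def foobar(x, y, z):
    # Runs inside the new thread, not in the caller.
    calls.append((threading.current_thread().name, x + y + z))


# Constructing the Thread object does not run foobar yet.
thread = threading.Thread(target=foobar, args=(1, 2, 3), name="worker")
assert calls == []

thread.start()  # the new thread now calls foobar(1, 2, 3)
thread.join()   # wait for it to finish

print(calls)  # [('worker', 6)]
```

Note the `target=` keyword: the first positional parameter of `threading.Thread` is `group`, not the function to run.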
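"What I want to do is basically use one thread per row/index of the dictionary. Each thread would find the sum of all numbers in that particular row, and..."
Threads run code, and you must supply the code that a thread will run in the form of a function. If you want a thread to "find the sum of all numbers in a particular row"*, then you must write a function that computes that sum, and then you must create a new Thread that will call your function.
* Other answers and comments on your question explain how Python's Global Interpreter Lock (GIL) prevents you from using threads to make this program run faster. So the rest of this answer is fantasy in the sense that it won't make your program faster, but it does illustrate how to create threads.
You will probably want to pass the dictionary and the row (key) as arguments to the function. You may also want to pass it some mutable results structure (for example, a list or a dict) into which the function can store its result.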
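import threading

def FindRowSum(dictionary, row, results):
    # Sum the values stored under one key ("row") of the dictionary.
    total = 0
    for value in dictionary[row]:
        total = total + value
    results[row] = total
...
allThreads = []
results = {}  # keyed by row, so each thread writes to its own slot
for row in myDictionary:  # myDictionary stands for your own dictionary
    thread = threading.Thread(target=FindRowSum, args=(myDictionary, row, results))
    thread.start()  # a thread does nothing until it is started
    allThreads.append(thread)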
Then, further down, if you want to wait for all of the threads to finish their work:
for thread in allThreads:
    thread.join()
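Putting the pieces above together, here is a minimal runnable sketch. The small dictionary `my_dictionary` and the name `find_row_sum` are made up for illustration (they stand in for the asker's data and for `FindRowSum`). Each thread writes to its own key of a shared dict, which is why no lock is used here. Because of the GIL, this should not be expected to run faster than a plain loop.

```python
import random
import threading


def find_row_sum(dictionary, row, results):
    # Sum one row and store the result under that row's key.
    results[row] = sum(dictionary[row])


# Hypothetical stand-in for the asker's data: 5 rows of 1000 digits each.
my_dictionary = {key: [random.randrange(1, 10) for _ in range(1000)]
                 for key in "abcde"}

results = {}
all_threads = []
for row in my_dictionary:
    thread = threading.Thread(target=find_row_sum,
                              args=(my_dictionary, row, results))
    thread.start()
    all_threads.append(thread)

# Wait for every thread before reading the results.
for thread in all_threads:
    thread.join()

grand_total = sum(results.values())
# Same answer as the single-threaded sum.
assert grand_total == sum(sum(v) for v in my_dictionary.values())
print(grand_total)
```

The final reduction happens in the main thread only after every `join()` has returned, so `results` is complete when it is read.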
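Reply from a uj5u.com user:
So, here's an example of how to use multiprocessing for a "map-reduce" style summation problem.
It relies heavily on each subproblem (represented by process_key) being independent of the others.
The final reduction (adding all of the per-key results together) is done by the main program.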
import multiprocessing
import os
import string
import time
from typing import Tuple, List


def get_key_data(key: str) -> List[int]:
    # Get data for a given key from a database or wherever;
    # here we just get a big blob of random bytes.
    return list(os.urandom(1_000_000))


def process_key(key: str) -> Tuple[str, int]:
    # This function is run in a separate process,
    # so it can't access global data in the same way a function
    # in the same process could. Program accordingly.
    key_data = get_key_data(key)
    result_for_key = sum(key_data)  # Could be heavier computation here...
    # Returning a tuple makes it easier to work with the keyed data in the main program.
    return (key, result_for_key)


def main():
    start = time.perf_counter()
    keys = list(string.ascii_lowercase)
    with multiprocessing.Pool() as p:
        results = {}
        # Since result order doesn't matter, we can use `imap_unordered` to optimize performance.
        # It would also be worth adding `chunksize=...` to spend less time in serializers.
        for key, result in p.imap_unordered(process_key, keys):  # unpacking result tuples here
            print(f"Got result {result} for key {key}")
            results[key] = result
    grand_total = sum(results.values())
    end = time.perf_counter()
    print(f"Grand total: {grand_total} in {end - start:.2f} seconds")


if __name__ == '__main__':
    main()
This prints something like:
Got result 127439637 for key y
Got result 127521766 for key z
Got result 127410016 for key a
Got result 127618358 for key b
Got result 127510624 for key c
Got result 127525228 for key d
Got result 127471359 for key e
Got result 127535553 for key f
Got result 127457231 for key m
Got result 127547738 for key n
Got result 127567059 for key o
Got result 127470823 for key g
Got result 127465435 for key h
Got result 127497010 for key i
Got result 127432593 for key j
Got result 127555330 for key k
Got result 127402226 for key l
Got result 127534939 for key p
Got result 127558057 for key q
Got result 127474231 for key r
Got result 127491137 for key v
Got result 127520358 for key w
Got result 127490582 for key x
Got result 127489005 for key s
Got result 127485159 for key t
Got result 127503702 for key u
Grand total: 3314975156 in 0.60 seconds
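The comment in the code above suggests passing `chunksize=...`. Here is a minimal sketch of what that could look like, using deterministic stand-in data so the total can be checked. `fake_key_data` and `sum_key` are hypothetical helpers, and the chunk size of 4 is an arbitrary choice, not a tuned value.

```python
import multiprocessing
import string


def fake_key_data(key):
    # Hypothetical stand-in for real data: 1000 copies of the key's code point.
    return [ord(key)] * 1000


def sum_key(key):
    # Runs in a worker process; returns (key, sum) so results can be keyed.
    return key, sum(fake_key_data(key))


def main():
    keys = list(string.ascii_lowercase)
    with multiprocessing.Pool() as pool:
        # chunksize=4 hands keys to the workers in batches of 4
        # instead of one task at a time.
        results = dict(pool.imap_unordered(sum_key, keys, chunksize=4))
    print(f"Grand total: {sum(results.values())}")


if __name__ == "__main__":
    main()
```

Whether a larger chunk size helps depends on how cheap each task is compared with the cost of sending it to a worker, so it is worth measuring on the real workload.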
