I have a function that takes a pair of integers (x, y) and produces a vector of 3000 elements. So I used:
pool_obj=multiprocessing.Pool()
result=np.array(pool_obj.map(f, RANGE))
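where RANGE is the Cartesian product of the two sets of values that x and y can take.
My problem is that all I need is np.sum(result, axis=0), which is only 3000 long. I want to sum over all x and y. There are 1000x1000 pairs (x, y) in total. This approach creates a huge array of size 1000000x3000, which exceeds the memory limit.
How can I solve this?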
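For a sense of scale, here is a rough back-of-the-envelope estimate. It assumes each element is a float64 (8 bytes), which the question does not state; with that assumption the full result array needs about 24 GB, while a single running sum needs only about 24 KB.

```python
# Rough memory estimate; bytes_per_value = 8 assumes float64 elements.
n_pairs = 1000 * 1000      # number of (x, y) pairs
vector_len = 3000          # length of each result vector
bytes_per_value = 8        # assumption: float64

full_array_bytes = n_pairs * vector_len * bytes_per_value
running_sum_bytes = vector_len * bytes_per_value

print(f"full array:  {full_array_bytes / 1e9:.0f} GB")
print(f"running sum: {running_sum_bytes / 1e3:.0f} KB")
```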
Answer from a uj5u.com user:
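An example that uses a generator of x, y pairs to cut down the input size, and uses imap (which yields results lazily) to cut down the output size, so results are reduced as they are returned to the main process: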
import multiprocessing as mp
import numpy as np

class yield_xy:
    """
    Iterable of x, y pairs that prevents all pairs of x and y from being
    generated at the start of the map call. In this example it would only be
    a million floats, so on the order of 4-8 MB of data, but if x and y are
    bigger (or maybe you have a z) this could dramatically reduce input data size.
    """
    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __len__(self):
        # map, map_async, starmap etc. use the input length to pick a chunk size;
        # imap accepts any iterable, so here this is only a convenience
        return len(self._x) * len(self._y)

    def __iter__(self):
        # simple generator: stores the x and y values rather than all x * y pairs
        for x in self._x:
            for y in self._y:
                yield x, y

def task(args):
    x, y = args
    # placeholder workload: a 3000-element vector that depends on x and y
    return (np.zeros(3000) + x) * y

def main():
    x = np.arange(0, 1000)
    y = np.sin(x)
    out = np.zeros(3000)
    with mp.Pool() as pool:
        # imap yields results one at a time, so only one vector is held at once
        for result in pool.imap(task, yield_xy(x, y)):
            out += result  # accumulate results
    return out

if __name__ == "__main__":
    result = main()
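A possible variation on the same idea, offered only as a sketch: let each worker sum over every y for one fixed x, so the main process receives 1000 partial sums instead of a million vectors. The names f, row_sum, and parallel_sum are hypothetical, and f is just a stand-in for the real function. Because addition does not depend on order, imap_unordered can be used in place of imap.

```python
import multiprocessing as mp
import numpy as np

VECTOR_LEN = 3000  # length of each result vector, as in the question


def f(x, y):
    # Hypothetical stand-in for the real function: (x, y) -> 3000-element vector.
    return (np.zeros(VECTOR_LEN) + x) * y


def row_sum(args):
    # Sum f(x, y) over all y for one fixed x inside the worker process,
    # so only a single vector is sent back to the parent per task.
    x, ys = args
    total = np.zeros(VECTOR_LEN)
    for y in ys:
        total += f(x, y)
    return total


def parallel_sum(xs, ys, processes=None):
    out = np.zeros(VECTOR_LEN)
    with mp.Pool(processes) as pool:
        # The order of completion does not matter for a sum.
        for partial in pool.imap_unordered(row_sum, ((x, ys) for x in xs)):
            out += partial
    return out


if __name__ == "__main__":
    # Small inputs so the sketch runs quickly; the question uses 1000 x 1000.
    xs = np.arange(0, 100)
    ys = np.sin(np.arange(0, 100))
    print(parallel_sum(xs, ys)[:3])
```

The trade-off is that every task now carries a copy of ys, which is cheap for 1000 floats but worth reconsidering if the y values are large.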
