I would like to perform the following set of operations more efficiently:
import numpy as np
from time import time
n1 = 5
n2 = 800
n3 = 1000
a = np.random.rand( n1,n2,n3 )
ts = time()
a_new = np.array( [ np.repeat( a[np.newaxis,ll,:,:], n1, axis=0 ) for ll in range(n1)] )
print(time()-ts)
ts = time()
d = n1**2
a_new = np.reshape( a_new, ( d, n2, n3 ) )
a_new = np.repeat( a_new[np.newaxis,:,:,:], n1, axis=0 )
print(time()-ts)
ts = time()
d2 = d*n1
a_new = np.reshape( a_new, ( d2, n2, n3 ))
a_new = np.reshape( a_new, ( d2*n2 , n3 ) )
print(time()-ts)
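This becomes rather inefficient once n1, n2 and n3 get fairly large. What is the best way to do it?
Answer from a uj5u.com user:
Your code, keeping track of the shape of each result: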
In [22]: n1 = 5
...: n2 = 800
...: n3 = 1000
...:
...: a = np.random.rand(n1, n2, n3)
In [23]: a.shape
Out[23]: (5, 800, 1000)
In [24]: a_new = np.array([np.repeat(a[np.newaxis, ll, :, :], n1, axis=0) for ll in range(n1)])
...:
In [25]: a_new.shape
Out[25]: (5, 5, 800, 1000)
In [26]: d = n1**2
...: a1 = np.reshape(a_new, (d, n2, n3))
...: a1 = np.repeat(a1[np.newaxis, :, :, :], n1, axis=0)
...:
In [27]: a1.shape
Out[27]: (5, 25, 800, 1000)
In [28]: d2 = d * n1
...: a2 = np.reshape(a1, (d2, n2, n3))
...: a2 = np.reshape(a2, (d2 * n2, n3))
...:
...:
In [29]: a2.shape
Out[29]: (100000, 1000)
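A minor point, but the double reshape that produces a2 is not needed. You can go straight to the final shape. In any case, reshape is fast (unless it has to force a copy, as it does after a transpose).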
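To illustrate the reshape point, here is a minimal sketch on small stand-in sizes (the sizes and the names two_step, direct, is_view and copied are assumptions for illustration, not part of the original session). It checks that a single reshape matches the two-step one, that reshaping a contiguous array returns a view, and that flattening a transposed array forces a copy.

```python
import numpy as np

# Small stand-in sizes (assumed for illustration; the post uses 5, 800, 1000).
n1, n2, n3 = 2, 3, 4
a1 = np.arange(n1 * n1**2 * n2 * n3).reshape(n1, n1**2, n2, n3)

# Two-step reshape, as in the question, versus going straight to the final shape.
two_step = a1.reshape(n1**3, n2, n3).reshape(n1**3 * n2, n3)
direct = a1.reshape(-1, n3)
same_result = np.array_equal(two_step, direct)

# On a C-contiguous array, reshape returns a view: no data is copied.
is_view = np.shares_memory(a1, direct)

# After a transpose the memory order no longer matches, so flattening must copy.
copied = not np.shares_memory(a1, a1.T.reshape(-1))

print(same_result, is_view, copied)
```

All three flags should print as True, which is why the reshape steps barely register in the timings.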
Some timings:
In [34]: %timeit a_new = np.array([np.repeat(a[np.newaxis, ll, :, :], n1, axis=0) for ll in range(n1)])
263 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [36]: a1 = np.reshape( a_new, ( d, n2, n3 ) )
In [37]: %timeit np.repeat( a_new[np.newaxis,:,:,:], n1, axis=0 )
912 ms ± 134 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
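So yes, the repeat does take most of the time. But it is increasing the size of an already large array by a factor of 5. repeat is relatively fast, for something that builds a large array.
Your a_new list comprehension isn't needed: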
In [41]: np.repeat(a[:, None, :, :], n1, axis=1).shape
Out[41]: (5, 5, 800, 1000)
In [42]: np.allclose(np.repeat(a[:, None, :, :], n1, axis=1), a_new)
Out[42]: True
In [43]: %timeit np.repeat(a[:, None, :, :], n1, axis=1)
210 ms ± 23.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
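Not much of a time saving, since the n1 loop only runs 5 times.
Note that the subsequent repeat (the one that produces a1) takes roughly 5x as long. a_new is 5 times the size of a, and a1 is 25 times the size of a.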
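The size ratios can be worked out from the shapes alone, without allocating anything. This is a rough sketch; it assumes float64 data (what np.random.rand returns), and the dictionary names are made up for illustration.

```python
from math import prod

import numpy as np

n1, n2, n3 = 5, 800, 1000
itemsize = np.dtype(np.float64).itemsize  # 8 bytes per element

# Shapes taken from the session above.
shapes = {
    "a": (n1, n2, n3),
    "a_new": (n1, n1, n2, n3),
    "a1": (n1, n1**2, n2, n3),
}

# Bytes needed for each array, computed from the shape only.
nbytes = {name: prod(shape) * itemsize for name, shape in shapes.items()}
ratios = {name: size // nbytes["a"] for name, size in nbytes.items()}

for name in shapes:
    print(f"{name}: {nbytes[name] / 1e6:.0f} MB, {ratios[name]}x the size of a")
```

With these numbers a is 32 MB, a_new is 160 MB and a1 is 800 MB, so each repeat writes five times as much data as the one before it.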
We can do the whole thing in one line:
In [44]: a4 = np.repeat(np.repeat(a[:,None,:,:], n1, axis=1)[None],n1, axis=0).reshape(-1,n3)
In [45]: a4.shape
Out[45]: (100000, 1000)
In [46]: np.allclose(a2, a4)
Out[46]: True
The timing doesn't change much. Interestingly, the allclose in [46] took the most noticeable time.
The double repeat can be written as a tile:
a5 = np.tile(a[None,:,None,:,:],(n1,1,n1,1,1)).reshape(-1,n3)
It doesn't save time, because tile uses repeat on each dimension.
Or even:
a6 = np.tile(np.repeat(a, n1, axis=0), (n1,1,1)).reshape(-1,n3)
I'm not going to time or test this since creating several arrays of this size was pushing my memory limits.
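Since the full-size arrays strain memory, the one-liners can be checked on small dimensions instead. The sketch below is only that: the small sizes, the arange fill (used so that every block is distinguishable) and the name cycled are assumptions for illustration. It rebuilds the question's result step by step and compares it with a repeat-based one-liner, a tile form with reps (n1, 1, n1, 1, 1), and a repeat-then-tile form.

```python
import numpy as np

# Small stand-in sizes (assumed) so every intermediate fits easily in memory.
n1, n2, n3 = 3, 2, 4
a = np.arange(n1 * n2 * n3, dtype=float).reshape(n1, n2, n3)

# Reference: the step-by-step construction from the question.
a_new = np.array([np.repeat(a[np.newaxis, ll, :, :], n1, axis=0) for ll in range(n1)])
a1 = np.repeat(a_new.reshape(n1**2, n2, n3)[np.newaxis], n1, axis=0)
a2 = a1.reshape(-1, n3)

# Repeat-based one-liner.
a4 = np.repeat(np.repeat(a[:, None, :, :], n1, axis=1)[None], n1, axis=0).reshape(-1, n3)

# tile form: new axes before and after a's first axis, each tiled n1 times.
a5 = np.tile(a[None, :, None, :, :], (n1, 1, n1, 1, 1)).reshape(-1, n3)

# Repeat each block n1 times in place, then tile the whole stack n1 times.
a6 = np.tile(np.repeat(a, n1, axis=0), (n1, 1, 1)).reshape(-1, n3)

# Same shape but different block order: cycles a[0], a[1], ... instead.
cycled = np.repeat(a[None], n1 * n1, axis=0).reshape(-1, n3)

for name, arr in [("a4", a4), ("a5", a5), ("a6", a6), ("cycled", cycled)]:
    print(name, arr.shape, np.array_equal(a2, arr))
```

The first three should match a2 exactly. The last one has the right shape but a different block order, so comparing shapes alone is not enough to confirm a rewrite.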
