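I wrote some code to analyse a large number of stock prices. It uses two technical indicators (MACD and EMA) and creates a technical-analysis flag for each.
The code works, which is good, but it takes far too long to run, most likely because it iterates with iloc. Do you have any suggestions for speeding it up? An example is below: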
import pandas as pd
import numpy as np
import time

df = pd.DataFrame(np.random.uniform(low=2, high=5.5, size=(10000,)), columns=['Close'])
close = df['Close'].astype(float)

def MACD(first, second, signal):
    df['EMA' + str(first)] = close.ewm(span=first).mean()
    df['EMA' + str(second)] = close.ewm(span=second).mean()
    df['MACD'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['signal'] = df.MACD.ewm(span=signal).mean()
    df['MACD_ind'] = 0
    # Flag rows where MACD crosses its signal line (1 = upward, -1 = downward)
    for i in range(second + signal, len(df)):
        if df.MACD.iloc[i] > df.signal.iloc[i] and df.MACD.iloc[i-1] < df.signal.iloc[i-1]:
            df.loc[i, 'MACD_ind'] = 1
        if df.MACD.iloc[i] < df.signal.iloc[i] and df.MACD.iloc[i-1] > df.signal.iloc[i-1]:
            df.loc[i, 'MACD_ind'] = -1

def EMA(first, second):
    # Note: rolling().mean() is a simple moving average, despite the column names
    df['EMA' + str(first)] = close.rolling(window=first).mean()
    df['EMA' + str(second)] = close.rolling(window=second).mean()
    df['EMAdif'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['EMA_ind'] = 0
    # Flag rows where the fast average crosses the slow one
    for i in range(second, len(df)):
        if df.EMAdif.iloc[i] > 0 and df.EMAdif.iloc[i-1] < 0:
            df.loc[i, 'EMA_ind'] = 1
        if df.EMAdif.iloc[i] < 0 and df.EMAdif.iloc[i-1] > 0:
            df.loc[i, 'EMA_ind'] = -1

split_time = time.time()
TA_ind = list()

MACD(12, 26, 9)
TA_ind.append('MACD_ind')
print("MACD--- %s seconds ---" % (time.time() - split_time))
split_time = time.time()

EMA(20, 50)
TA_ind.append('EMA_ind')
print("EMA--- %s seconds ---" % (time.time() - split_time))
split_time = time.time()
Reply from a uj5u.com user:
I found that the DataFrame method shift helped a lot here.
def MACD(first, second, signal):
    df['EMA' + str(first)] = close.ewm(span=first).mean()
    df['EMA' + str(second)] = close.ewm(span=second).mean()
    df['MACD'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['signal'] = df.MACD.ewm(span=signal).mean()
    df['dif'] = df['MACD'] - df['signal']
    df['dif_shift'] = df.dif.shift(1)  # previous row's value
    df['MACD_ind'] = 0
    # 1 where dif turns positive, -1 where it turns negative, otherwise unchanged
    df['MACD_ind'] = np.where((df['dif'] > 0) & (df['dif_shift'] < 0), 1, df['MACD_ind'])
    df['MACD_ind'] = np.where((df['dif'] < 0) & (df['dif_shift'] > 0), -1, df['MACD_ind'])

def EMA(first, second):
    df['EMA' + str(first)] = close.rolling(window=first).mean()
    df['EMA' + str(second)] = close.rolling(window=second).mean()
    df['EMAdif'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['EMAdif_shift'] = df.EMAdif.shift(1)  # previous row's value
    df['EMA_ind'] = 0
    df['EMA_ind'] = np.where((df['EMAdif'] > 0) & (df['EMAdif_shift'] < 0), 1, df['EMA_ind'])
    df['EMA_ind'] = np.where((df['EMAdif'] < 0) & (df['EMAdif_shift'] > 0), -1, df['EMA_ind'])
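One detail worth checking with the shift approach: the loop in the question skips the first `second + signal` rows, while the shifted comparison looks at every row from index 1 onward. Since `ewm` produces values from the very first row, the two versions can disagree near the start of the series. The sketch below is one way to keep the vectorised speed and the same starting row. The helper name `crossover_flags` and its `start` parameter are made up for this illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd


def crossover_flags(diff: pd.Series, start: int = 1) -> pd.Series:
    """Return 1 where diff turns positive, -1 where it turns negative, else 0.

    `start` is a hypothetical parameter: rows before it are left at 0,
    mirroring the lower bound of range() in the loop version.
    """
    prev = diff.shift(1)                        # previous row's value (NaN for row 0)
    up = ((diff > 0) & (prev < 0)).to_numpy()   # crossed from below zero to above
    down = ((diff < 0) & (prev > 0)).to_numpy() # crossed from above zero to below
    flags = np.select([up, down], [1, -1], default=0)
    flags[:start] = 0                           # ignore the warm-up rows
    return pd.Series(flags, index=diff.index)


# Same indicator as in the question, on a smaller random series
rng = np.random.default_rng(0)
close = pd.Series(rng.uniform(2, 5.5, size=2000))
macd = close.ewm(span=12).mean() - close.ewm(span=26).mean()
signal = macd.ewm(span=9).mean()

fast = crossover_flags(macd - signal, start=26 + 9)
print(fast.value_counts())
```

Because comparisons against NaN are always False, the rolling-mean version needs no special handling for its warm-up rows; the `start` argument only matters for the `ewm`-based indicator.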
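Reply from a uj5u.com user:
Looping through a dictionary is much faster than looping through a DataFrame, so you can convert your DataFrame to a dictionary. The catch is that you can then no longer use pandas built-ins such as ewm to compute the MACD, so you would have to write that function yourself. In return you get a faster program. As an example, I did this with your DataFrame: first I loop over the DataFrame itself, then I convert it to a dictionary and loop over that.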
df = pd.DataFrame(np.random.uniform(low=2, high=5.5, size=(10000,2)), columns=['Close', 'Open'])
st = time.time()
for i in range(len(df)):
    a = df.iloc[i]['Close'] - df.iloc[i]['Open']
print(time.time() - st)
This takes 2.005403757095337 seconds to run.
df = pd.DataFrame(np.random.uniform(low=2, high=5.5, size=(10000,2)), columns=['Close', 'Open'])
df_dict = df.to_dict()
st = time.time()
for i in range(len(df)):
    a = df_dict['Close'][i] - df_dict['Open'][i]
print(time.time() - st)
This version, however, takes only 0.0029413700103759766 seconds, which makes the second approach about 680 times faster!
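If you do go the dictionary or list route, the exponential average has to be written by hand, as noted above. The following is a minimal sketch of what that could look like, assuming the input contains no NaN values and that the goal is to match the default `ewm(span=...).mean()` used in the question (pandas' `adjust=True` weighting). The function name `ema_adjusted` is invented for this example.

```python
import numpy as np
import pandas as pd


def ema_adjusted(values, span):
    """Plain-Python EMA meant to mirror Series.ewm(span=span).mean().

    Assumes adjust=True weighting (the pandas default) and no NaN values.
    """
    alpha = 2.0 / (span + 1.0)   # smoothing factor derived from the span
    decay = 1.0 - alpha
    out = []
    num = 0.0                    # running weighted sum of the values
    den = 0.0                    # running sum of the weights
    for x in values:
        num = x + decay * num
        den = 1.0 + decay * den
        out.append(num / den)
    return out


rng = np.random.default_rng(0)
close = pd.Series(rng.uniform(2, 5.5, size=1000))

manual = ema_adjusted(close.tolist(), span=12)
builtin = close.ewm(span=12).mean()

# Largest absolute difference between the hand-written and built-in results
max_gap = max(abs(m - b) for m, b in zip(manual, builtin))
print(max_gap)
```

For simple element-wise arithmetic like the benchmark above, `df['Close'] - df['Open']` computes the whole column in one call without any Python loop, so it is worth timing that option as well before rewriting the indicators by hand.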
