Python iloc loop takes too long to execute (technical analysis calculation)

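I wrote some code to analyze a large number of stock prices. It uses two technical indicators (MACD and EMA) and creates a technical analysis flag for each.

The code runs correctly, which is good, but execution takes too long, most likely because I iterate with iloc. Do you have any suggestions for speeding it up? An example is below: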

import pandas as pd
import numpy as np
import time

df = pd.DataFrame(np.random.uniform(low=2, high=5.5, size=(10000,)), columns=['Close'])
close = df['Close'].astype(float)

def MACD(first, second, signal):
    # Fast and slow exponential moving averages of the closing price
    df['EMA' + str(first)] = close.ewm(span=first).mean()
    df['EMA' + str(second)] = close.ewm(span=second).mean()
    df['MACD'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['signal'] = df.MACD.ewm(span=signal).mean()
    df['MACD_ind'] = 0

    # Flag rows where MACD crosses its signal line (1 = upward, -1 = downward)
    for i in range(second + signal, len(df)):
        if df.MACD.iloc[i] > df.signal.iloc[i] and df.MACD.iloc[i-1] < df.signal.iloc[i-1]:
            df.loc[i, 'MACD_ind'] = 1
        if df.MACD.iloc[i] < df.signal.iloc[i] and df.MACD.iloc[i-1] > df.signal.iloc[i-1]:
            df.loc[i, 'MACD_ind'] = -1
                
def EMA(first, second):
    # Rolling means over the two windows, stored under the EMA column names
    df['EMA' + str(first)] = close.rolling(window=first).mean()
    df['EMA' + str(second)] = close.rolling(window=second).mean()
    df['EMAdif'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['EMA_ind'] = 0
    # Flag rows where the difference changes sign (1 = upward, -1 = downward)
    for i in range(second, len(df)):
        if df.EMAdif.iloc[i] > 0 and df.EMAdif.iloc[i-1] < 0:
            df.loc[i, 'EMA_ind'] = 1
        if df.EMAdif.iloc[i] < 0 and df.EMAdif.iloc[i-1] > 0:
            df.loc[i, 'EMA_ind'] = -1


split_time = time.time()
TA_ind=list()

MACD(12, 26, 9)
TA_ind.append('MACD_ind')
print("MACD--- %s seconds ---" % (time.time() - split_time))
split_time = time.time()    
        
EMA(20,50)
TA_ind.append('EMA_ind')
print("EMA--- %s seconds ---" % (time.time() - split_time))
split_time = time.time()

A uj5u.com user replied:

I found that the DataFrame method called shift helped me a lot here.

def MACD(first, second, signal):
    df['EMA' + str(first)] = close.ewm(span=first).mean()
    df['EMA' + str(second)] = close.ewm(span=second).mean()
    df['MACD'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    df['signal'] = df.MACD.ewm(span=signal).mean()
    df['dif'] = df['MACD'] - df['signal']
    # Previous row's difference, aligned with the current row
    df['dif_shift'] = df.dif.shift(1)
    df['MACD_ind'] = 0
    # Sign change from negative to positive marks 1, the reverse marks -1
    df['MACD_ind'] = np.where((df['dif'] > 0) & (df['dif_shift'] < 0), 1, df['MACD_ind'])
    df['MACD_ind'] = np.where((df['dif'] < 0) & (df['dif_shift'] > 0), -1, df['MACD_ind'])
                
def EMA(first, second):
    df['EMA' + str(first)] = close.rolling(window=first).mean()
    df['EMA' + str(second)] = close.rolling(window=second).mean()
    df['EMAdif'] = df['EMA' + str(first)] - df['EMA' + str(second)]
    # Previous row's difference, aligned with the current row
    df['EMAdif_shift'] = df.EMAdif.shift(1)
    df['EMA_ind'] = 0
    df['EMA_ind'] = np.where((df['EMAdif'] > 0) & (df['EMAdif_shift'] < 0), 1, df['EMA_ind'])
    df['EMA_ind'] = np.where((df['EMAdif'] < 0) & (df['EMAdif_shift'] > 0), -1, df['EMA_ind'])
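
To check that a shift-based version flags the same crossovers as the row-by-row loop, something like the sketch below could be used. The helper names `crossover_loop` and `crossover_vectorized` and the `start` parameter are made up for this illustration. Note that the loop in the question skips the first `second + signal` rows for MACD, while the shift-based MACD above does not, so the sketch zeroes those rows explicitly to make the two comparable.

```python
import numpy as np
import pandas as pd


def crossover_loop(dif: pd.Series, start: int) -> pd.Series:
    """Row-by-row sign-change detection, mirroring the loop in the question."""
    ind = pd.Series(0, index=dif.index)
    for i in range(start, len(dif)):
        if dif.iloc[i] > 0 and dif.iloc[i - 1] < 0:
            ind.iloc[i] = 1
        if dif.iloc[i] < 0 and dif.iloc[i - 1] > 0:
            ind.iloc[i] = -1
    return ind


def crossover_vectorized(dif: pd.Series, start: int) -> pd.Series:
    """Same rule written with shift(1) and np.select, without a Python loop."""
    prev = dif.shift(1)  # previous row's value; NaN in the first row
    up = ((dif > 0) & (prev < 0)).to_numpy()
    down = ((dif < 0) & (prev > 0)).to_numpy()
    ind = pd.Series(np.select([up, down], [1, -1], default=0), index=dif.index)
    ind.iloc[:start] = 0  # the loop never touches the first `start` rows
    return ind


# Illustrative random data, same value range as in the question
rng = np.random.default_rng(0)
close = pd.Series(rng.uniform(2, 5.5, size=2000))
macd = close.ewm(span=12).mean() - close.ewm(span=26).mean()
dif = macd - macd.ewm(span=9).mean()

loop_ind = crossover_loop(dif, start=26 + 9)
vec_ind = crossover_vectorized(dif, start=26 + 9)
same = bool((loop_ind.to_numpy() == vec_ind.to_numpy()).all())
print("loop and vectorized agree:", same)
```

If the warm-up rows do not matter for your use case, the `start` handling can simply be dropped.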

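A uj5u.com user replied:

Looping through a dictionary is much faster than looping through a DataFrame, so you can convert your DataFrame to a dictionary. The catch is that you can then no longer use pandas built-in functions such as ewm to compute the MACD, so you have to write those functions yourself. In return you get a faster program. As an example, I did this with your DataFrame: first I loop over the DataFrame itself, then I convert it to a dictionary and loop over that.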

df = pd.DataFrame(np.random.uniform(low=2, high=5.5, size=(10000,2)), columns=['Close', 'Open'])
st = time.time()
for i in range(len(df)):
    a = df.iloc[i]['Close'] - df.iloc[i]['Open']
print(time.time() - st)

This took 2.005403757095337 seconds.

df = pd.DataFrame(np.random.uniform(low=2, high=5.5, size=(10000,2)), columns=['Close', 'Open'])
df_dict = df.to_dict()
st = time.time()
for i in range(len(df)):
    a = df_dict['Close'][i] - df_dict['Open'][i]
print(time.time() - st)

This one, however, took 0.0029413700103759766 seconds, which means the second approach is nearly 700 times faster!
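
The answer says the pandas built-ins such as ewm have to be rewritten once the data lives in a plain Python container, but does not show how. A minimal sketch of what that might look like for the exponential average follows. The function name `ewm_mean` is hypothetical, and the code assumes the pandas defaults used in the question (`adjust=True`) and input without missing values.

```python
import numpy as np
import pandas as pd


def ewm_mean(values, span):
    """Pure-Python counterpart of Series.ewm(span=span).mean() with adjust=True.

    Assumes `values` is a sequence of floats containing no NaN.
    """
    alpha = 2.0 / (span + 1.0)
    decay = 1.0 - alpha
    out = []
    num = 0.0  # running weighted sum of the observations
    den = 0.0  # running sum of the weights
    for x in values:
        num = x + decay * num
        den = 1.0 + decay * den
        out.append(num / den)
    return out


# Compare against pandas on illustrative random data
rng = np.random.default_rng(0)
close = pd.Series(rng.uniform(2, 5.5, size=1000))
manual = ewm_mean(close.tolist(), span=12)
reference = close.ewm(span=12).mean().to_numpy()
matches = bool(np.allclose(manual, reference))
print("matches pandas ewm:", matches)
```

Whether a hand-written loop like this actually beats the vectorized pandas call would need to be measured; the point is only that the dictionary approach requires reimplementing such helpers.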
