我正在嘗試學習如何使用 RNN 進行時間序列預測,在我看到的所有示例中,他們使用一系列價格來預測以下價格。在示例中,每個目標 ( Y_train[n]) 都與由最后 30 個價格/步驟 ( [X_train[[n-1],[n-2]....,[n-30])組成的序列或矩陣相關聯。
然而,在現實世界中,要準確預測您需要的不僅僅是最后 30 個價格的序列,您還需要其他......我應該說特征嗎?就像交易量的最后 30 個值或情緒指數的最后 30 個值一樣。
所以我的問題是:如何為每個目標(最后 30 個價格和最后 30 個交易量值)使用兩個序列來塑造 RNN 的輸入?這是我使用的示例代碼,只有 1 個序列用作參考:
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
# Dividing Dataset (Test and Train)
train_lim = int(len(df) * 2 / 3)
training_set = df[:train_lim][['Close']]
test_set = df[train_lim:][['Close']]
# Normalizing
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)
# Shaping Input
X_train = []
y_train = []
X_test = []
for i in range(30, training_set_scaled.size):
X_train.append(training_set_scaled[i - 30:i, 0])
y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
for i in range(30, len(test_set)):
X_test.append(test_set.iloc[i - 30:i, 0])
X_test = np.array(X_test)
# Adding extra dimension ???
X_train = np.reshape(X_train, [X_train.shape[0], X_train.shape[1], 1])
X_test = np.reshape(X_test, [X_test.shape[0], X_test.shape[1], 1])
regressor = Sequential()
# LSTM layer 1
regressor.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
# LSTM layer 2,3,4
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
# LSTM layer 5
regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))
# Fully connected layer
regressor.add(Dense(units=1))
# Compiling the RNN
regressor.compile(optimizer='adam', loss='mean_squared_error')
# Fitting the RNN model
regressor.fit(X_train, y_train, epochs=120, batch_size=32)
我使用的資料框是帶有日期時間索引的標準 OHLCV,因此它看起來像這樣:
Datetime Open High Low Close Volume
01/01/2021 102.42 103.33 100.57 101.23 1990
02/01/2021 101.23 105.22 99.45 100.11 1970
... ... ... ... ... ...
01/12/2021 203.22 210.34 199.22 201.11 2600
uj5u.com熱心網友回復:
您可以遵循完全相同的程序,唯一的區別是具有輸入序列 (X_train和X_test)的陣列的最后一維的長度將大于一(因為它將等于外部回歸器的數量加一,其中加一來自目標的過去值也用作輸入這一事實)。
import pandas as pd
import numpy as np
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
pd.options.mode.chained_assignment = None
# define the target and features
target = ['Close']
features = ['Volume', 'High', 'Low']
# download the data
df = yf.download(tickers=['AAPL'], period='1y')
df = df[features target]
# split the data
split = int(df.shape[0] * 2 / 3)
df_train = df.iloc[:split, :].copy()
df_test = df.iloc[split:, :].copy()
# scale the data
target_scaler = MinMaxScaler().fit(df_train[target])
df_train[target] = target_scaler.transform(df_train[target])
df_test[target] = target_scaler.transform(df_test[target])
features_scaler = MinMaxScaler().fit(df_train[features])
df_train[features] = features_scaler.transform(df_train[features])
df_test[features] = features_scaler.transform(df_test[features])
# extract the input sequences and output values
sequence_length = 30
X_train, y_train = [], []
for i in range(sequence_length, df_train.shape[0]):
X_train.append(df_train[features target].iloc[i - sequence_length: i])
y_train.append(df_train[target].iloc[i])
X_train, y_train = np.array(X_train), np.array(y_train)
X_test, y_test = [], []
for i in range(sequence_length, df_test.shape[0]):
X_test.append(df_test[features target].iloc[i - sequence_length: i])
y_test.append(df_test[target].iloc[i])
X_test, y_test = np.array(X_test), np.array(y_test)
print(X_train.shape)
# (138, 30, 4)
print(X_test.shape)
# (55, 30, 4)
# build and train the model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=X_train.shape[1:]))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units=1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=120, batch_size=32)
model.evaluate(X_test, y_test)
# generate the test set predictions
y_pred = model.predict(X_test)
y_pred = target_scaler.inverse_transform(y_pred)
# plot the test set predictions
df['Predicted Close'] = np.nan
df['Predicted Close'].iloc[- y_pred.shape[0]:] = y_pred.flatten()
df[['Close', 'Predicted Close']].plot()
轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/409765.html
標籤:
