In the early years of Bitcoin's development, it behaved more like a substitute for traditional commodities: when one price went up, the other often went down, a kind of inverse movement.
But in recent years, as Bitcoin's market capitalization grew and more participants entered, this relationship began to shift.
More and more, Bitcoin's price now moves in the same direction (in phase) as the real economy and traditional assets. In other words, its price has begun to align positively with macro trends such as U.S. stocks, gold, and even policy changes.
This raises a question: is Bitcoin becoming a kind of complement rather than a substitute?
If so, can we use the early moves of traditional markets to predict Bitcoin's future direction?
That's the core question I want to explore in this project.
When choosing the model, I started with LSTM as the base layer. One reason is that it was relatively mature for time-series tasks (at least two years ago, when I finished most of the work on this project). Another reason, honestly, is that I hadn't really touched Transformer models yet at that time, haha.
If I get more time later, I might extend the model, maybe with Diffusion-based architectures, and see whether they improve prediction performance.
# import
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from scipy.stats import pearsonr, binomtest
print("TensorFlow ver:", tf.__version__)
print("GPU number:", len(tf.config.list_physical_devices("GPU")))
TensorFlow ver: 2.6.0
GPU number: 1
To build this model, I used three market data series, all at daily frequency.
Why daily? Because BTC trades 24/7 while stocks obviously don't, so the 1-day bar is the most stable and consistent choice that avoids losing too much data.
The time span runs roughly from Jan 1, 2014 to Dec 31, 2024, almost 4,000 days, basically covering the full arc from Bitcoin's speculative period to its recent, gradual financialization.
Data sources:
Bitcoin (BTC/USDT): from Investing.com, includes daily price and volume
Gold (GLD): from Investing.com, used as a traditional safe-haven asset
S&P 500 Index (SP500): also from Investing.com, used as a proxy for the overall U.S. stock market
I aligned all series by date and filled missing values with forward fill, to avoid abrupt breaks in the time series, and removed clearly abnormal values (like extreme spikes on zero-volume days). I chose forward fill because it matches a real-world investor's view: during U.S. stock-market holidays, most traders still see the last available price and volume.
For the next steps, all features will be extracted, combined, and normalized based on this cleaned dataset.
# 1) Read three CSV files
btc = pd.read_csv("BTC1.csv", parse_dates=["Date"], index_col="Date")[["Price","Change %","Vol."]]
gld = pd.read_csv("GOLD1.csv", parse_dates=["Date"], index_col="Date")[["Change %","Vol."]]
sp500 = pd.read_csv("SP5001.csv", parse_dates=["Date"], index_col="Date")[["Change %","Vol."]]
# 2) Rename the change and volume columns to avoid name clashes after the merge
gld.rename(columns={"Change %":"GLD_Change %", "Vol.":"GLD_Vol"}, inplace=True)
sp500.rename(columns={"Change %":"SP500_Change %","Vol.":"SP500_Vol"}, inplace=True)
# 3) Merge + forward-fill + drop na
df = pd.concat([
    btc[["Change %", "Vol."]],
    gld[["GLD_Change %", "GLD_Vol"]],
    sp500[["SP500_Change %"]]
], axis=1)
df.ffill(inplace=True)
df.dropna(inplace=True)
print(df.head())
print(df.columns)
df.shape
            Change %   Vol. GLD_Change % GLD_Vol SP500_Change %
Date
2014-01-02     4.69%  0.13K        1.62%   7.57M         -0.89%
2014-01-03     4.78%  0.13K        1.09%   5.88M         -0.03%
2014-01-04    -1.26%  0.01K        1.09%   5.88M         -0.03%
2014-01-05    12.74%  0.02K        1.09%   5.88M         -0.03%
2014-01-06     3.38%  0.10K        0.18%  10.11M         -0.25%
Index(['Change %', 'Vol.', 'GLD_Change %', 'GLD_Vol', 'SP500_Change %'], dtype='object')
(4015, 5)
Here, I chose the percentage changes (daily returns) of BTC, gold, and the SP500 as inputs, instead of their actual price levels. The detailed reasoning behind this choice is discussed in the last-thoughts section.
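As a side note, if the source files had only raw prices rather than a pre-computed "Change %" column, the same return features could be derived directly with pandas. A minimal sketch with made-up numbers, not the project's actual data:
import pandas as pd
# Hypothetical price series indexed by date
price = pd.Series([100.0, 104.7, 103.4],
                  index=pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03"]))
ret = price.pct_change()  # (p_t - p_{t-1}) / p_{t-1}, the same quantity as "Change %" / 100
print(ret)                # first value is NaN since there is no prior day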
# 4) Strip potential leading/trailing spaces
df.columns = df.columns.str.strip()
pct_cols = ['Change %', 'GLD_Change %', 'SP500_Change %']
df[pct_cols] = (
    df[pct_cols]
    .replace({'%': ''}, regex=True)  # Remove '%'
    .astype(float)                   # convert to float
    .div(100)                        # divide by 100
)
# 5) Replace K/M/B suffixes with scientific notation
vol_cols = ['Vol.', 'GLD_Vol']
df[vol_cols] = (
    df[vol_cols]
    .replace({',': ''}, regex=True)                      # Remove commas
    .replace({'K': 'e3', 'M': 'e6', 'B': 'e9'}, regex=True)  # K→e3, M→e6, B→e9
    .astype(float)                                       # convert to float
)
# 6) Split by first appearance into train/test sets (80/20)
train_size = int(len(df) * 0.8)
df_train = df.iloc[:train_size]
df_test = df.iloc[train_size:]
# Please note: steps 7, 8, 9 are close to a no-op here, since most values barely change after scaling. I wrote them back when the raw price was still in df, and kept them.
# 7) Initialize MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
# 8) Fit only on train set to avoid data leakage
scaler.fit(df_train)
# 9) Transform train and test sets
train_scaled = scaler.transform(df_train)
test_scaled = scaler.transform(df_test)
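A quick way to see why these scaling steps barely matter for the return columns, and why test values can still escape [0, 1], is to inspect the fitted ranges. A small check reusing scaler and test_scaled from above:
# The scaler memorizes per-column train-set min/max; the return columns already sit in a narrow band
print(scaler.data_min_)        # per-column minimum observed in the training set
print(scaler.data_max_)        # per-column maximum observed in the training set
# Test-set extremes outside the train range map outside [0, 1]
print(test_scaled.min(axis=0))
print(test_scaled.max(axis=0))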
df_train
| Date | Change % | Vol. | GLD_Change % | GLD_Vol | SP500_Change % |
|---|---|---|---|---|---|
| 2014-01-02 | 0.0469 | 130.0 | 0.0162 | 7570000.0 | -0.0089 |
| 2014-01-03 | 0.0478 | 130.0 | 0.0109 | 5880000.0 | -0.0003 |
| 2014-01-04 | -0.0126 | 10.0 | 0.0109 | 5880000.0 | -0.0003 |
| 2014-01-05 | 0.1274 | 20.0 | 0.0109 | 5880000.0 | -0.0003 |
| 2014-01-06 | 0.0338 | 100.0 | 0.0018 | 10110000.0 | -0.0025 |
| ... | ... | ... | ... | ... | ... |
| 2022-10-16 | 0.0103 | 780.0 | -0.0125 | 5470000.0 | -0.0237 |
| 2022-10-17 | 0.0148 | 1450.0 | 0.0029 | 4350000.0 | 0.0265 |
| 2022-10-18 | -0.0113 | 2120.0 | 0.0022 | 4530000.0 | 0.0114 |
| 2022-10-19 | -0.0104 | 860.0 | -0.0134 | 8260000.0 | -0.0067 |
| 2022-10-20 | -0.0039 | 1270.0 | -0.0016 | 5020000.0 | -0.0080 |
3212 rows × 5 columns
# 10) Sliding window
window = 60    # Use the past 60 days to predict the next day
price_idx = 0  # column 0 of the scaled array: BTC "Change %" (the name is a leftover from when this column held the price)
eps = 1e-11    # avoid division by zero
def make_xy_regression(data, window, price_idx):
    X, y = [], []
    for i in range(window, len(data)):
        X.append(data[i-window:i])
        # Simple return of column 0; since that column is itself a daily return,
        # this is the day-over-day change of the (scaled) return (see the note at steps 7-9)
        ret = (data[i, price_idx] - data[i-1, price_idx]) / (data[i-1, price_idx] + eps)
        ret = np.clip(ret, -1, 1)  # clip to [-1, 1] to tame outliers
        y.append(ret)
    return np.array(X), np.array(y)
X_train, y_train = make_xy_regression(train_scaled, window, price_idx)
X_test, y_test = make_xy_regression(test_scaled, window, price_idx)
print("Train shape:", X_train.shape, y_train.shape)
print("Test shape:", X_test.shape, y_test.shape)
# 11) Build model
model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(window, train_scaled.shape[1])),
    Dropout(0.05),
    LSTM(128, return_sequences=False),
    Dropout(0.01),
    Dense(1, activation='tanh')  # tanh ∈ [-1, 1]; predict a single scalar
])
model.compile(optimizer='adam',
              loss='mse')  # or loss='mae'
# 11b) Training
es = EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True)
ckpt = ModelCheckpoint('best_lstm.h5', save_best_only=True)
history = model.fit(
    X_train, y_train,
    epochs=300,
    batch_size=64,
    validation_split=0.15,
    shuffle=False,
    verbose=1
)
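One thing worth flagging: es and ckpt are defined above but never passed to fit(), so EarlyStopping and ModelCheckpoint are effectively inactive and training runs all 300 epochs (as the log below shows, with val_loss drifting upward late in training). Wiring them in would look like the sketch below, though it would of course change the reported numbers:
# Variant with the callbacks actually applied (not what produced the log below)
history = model.fit(
    X_train, y_train,
    epochs=300,
    batch_size=64,
    validation_split=0.15,
    shuffle=False,
    callbacks=[es, ckpt],  # early stopping + best-weights checkpointing
    verbose=1
)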
y_pred = model.predict(X_test).squeeze()
# 12) DirAcc (directional accuracy: proportion of correct up/down predictions)
sign_true = np.sign(y_test)
sign_pred = np.sign(y_pred)
diracc = np.mean(sign_true == sign_pred)
print(f"DirAcc: {diracc*100:.2f}%")
# 13) Correlation (how well predicted returns align with actual returns)
corr, _ = pearsonr(y_pred, y_test)
print(f"Pearson Correlation: {corr:.3f}")
# 14) Plot
plt.scatter(y_test, y_pred, alpha=0.3)
plt.xlabel("True Return"); plt.ylabel("Predicted")
plt.title("pre vs real")
plt.grid(True); plt.show()
Train shape: (3152, 60, 5) (3152,)
Test shape: (743, 60, 5) (743,)
Epoch 1/300 42/42 [==============================] - 4s 19ms/step - loss: 0.0263 - val_loss: 0.0180
Epoch 2/300 42/42 [==============================] - 0s 9ms/step - loss: 0.0248 - val_loss: 0.0178
... (epochs 3-298 omitted; val_loss bottoms out around 0.0087 near epochs 57-100, then drifts upward while training loss keeps falling) ...
Epoch 299/300 42/42 [==============================] - 1s 16ms/step - loss: 0.0037 - val_loss: 0.0128
Epoch 300/300 42/42 [==============================] - 1s 13ms/step - loss: 0.0034 - val_loss: 0.0136
DirAcc: 67.97%
Pearson Correlation: 0.588
# 15) Distribution
plt.figure(figsize=(6,3))
plt.hist(y_pred, bins=40, alpha=0.7)
plt.xlabel("Predicted value (tanh ∈ [-1, 1])")
plt.ylabel("Count")
plt.title("Distribution of model predictions")
plt.grid(True)
plt.show()
A simple backtest on the test set.
# 16) Get raw daily return in test set
ret_col = 'Change %'
ret_arr = df_test[ret_col].astype(float).values
# 17) Generate position signals (long-only: go long when y_pred > threshold, otherwise stay flat)
threshold = 0.01
position = (y_pred > threshold).astype(float)
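Note that this rule is long-only: the strategy holds BTC when the model is confidently positive and sits in cash otherwise. A symmetric long/short variant would look like the sketch below (not what the results here use):
# Hypothetical long/short signal: +1 long, -1 short, 0 (flat) inside the dead zone
position_ls = np.where(y_pred > threshold, 1.0,
                       np.where(y_pred < -threshold, -1.0, 0.0))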
# 18) Backtest
equity_pred, equity_bh = [1.0], [1.0]
for t in range(len(y_pred)):
    r = ret_arr[t + window]  # y_pred[t] forecasts the day at index t + window in df_test
    equity_pred.append(equity_pred[-1] * (1 + position[t] * r))
    equity_bh.append(equity_bh[-1] * (1 + r))
equity_pred, equity_bh = np.array(equity_pred), np.array(equity_bh)
# 19) Performance metrics
def cagr(eq, ppy=365):  # Compound annual growth rate (365 periods/year since BTC trades every day)
    return (eq[-1] / eq[0]) ** (ppy / len(eq)) - 1
def sharpe(eq, ppy=365):  # Annualized Sharpe ratio (risk-free rate assumed zero)
    r = np.diff(eq) / eq[:-1]
    return r.mean() / r.std() * np.sqrt(ppy)
print("\n=== Strategy vs Buy & Hold ===")
print(f"CAGR : {cagr(equity_pred):6.2%} | {cagr(equity_bh):6.2%}")
print(f"Sharpe : {sharpe(equity_pred):5.2f} | {sharpe(equity_bh):5.2f}")
# 20) Equity curve visualization
plt.figure(figsize=(8, 4))
plt.plot(equity_pred, label='Prediction Strategy')
plt.plot(equity_bh, label='Buy & Hold', linestyle='--')
plt.title("Equity Curve Comparison")
plt.xlabel("Step"); plt.ylabel("Equity"); plt.grid(True)
plt.legend(); plt.show()
# Prediction diagnostics
# 21) Directional accuracy and Pearson correlation
hit = (np.sign(y_pred) == np.sign(y_test))
dir_acc = hit.mean()            # directional accuracy
baseline = (y_test > 0).mean()  # always-long benchmark
pears, p_val = pearsonr(y_pred, y_test)
print(f"DirAcc : {dir_acc*100:5.2f}%")
print(f" ↳ baseline (long): {baseline*100:5.2f}% → Δ = {(dir_acc-baseline)*100:+4.2f}%")
print(f"Pearson corr : {pears:6.3f} (p = {p_val:.3g})")
# 22) Binomial test: is the hit rate significantly above 50%?
p_binom = binomtest(hit.sum(), n=len(hit), p=0.5, alternative='greater').pvalue
print(f"Binom test vs 50% : p = {p_binom:.3g}")
# 23) Rolling stability plots
# Track performance over trailing 100-day windows by plotting the hit rate and the correlation with actual returns.
# At each point, the value summarizes the past 100 days: the fraction of days the model called the direction correctly,
# and how strongly its predictions co-moved with reality.
win = 100  # rolling window: last 100 days
roll_dir = pd.Series(hit).rolling(win).mean()
roll_corr = (pd.Series(y_pred).rolling(win)
             .corr(pd.Series(y_test)))
fig, ax = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
ax[0].plot(roll_dir, label=f'Rolling DirAcc ({win})')
ax[0].axhline(0.5, ls='--', c='grey')
ax[0].set_ylabel('DirAcc'); ax[0].legend(); ax[0].grid(True)
ax[1].plot(roll_corr, label=f'Rolling Pearson ({win})', c='orange')
ax[1].axhline(0, ls='--', c='grey')
ax[1].set_ylabel('Corr'); ax[1].set_xlabel('Test sample index')
ax[1].legend(); ax[1].grid(True)
plt.tight_layout(); plt.show()
=== Strategy vs Buy & Hold ===
CAGR   : 44.45% | 134.94%
Sharpe :   1.26 |   2.02
DirAcc : 67.97%
 ↳ baseline (long): 50.74% → Δ = +17.23%
Pearson corr : 0.588 (p = 2.68e-70)
Binom test vs 50% : p = 2.9e-23
From these two plots, we can see that even in the model's worst stretch, the rolling DirAcc stays above 55%.
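One way to back that claim with a number rather than eyeballing the plot is to print the worst rolling window directly (a one-liner reusing roll_dir from step 23):
# Worst trailing-100-day hit rate over the test period
print(f"Min rolling DirAcc: {roll_dir.min():.2%}")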
About the experiments: as I got closer to a successful run, I tried removing the price column. The reason: when MinMaxScaler is fit only on the training set, values above 1 appear at test time, and I worried this would hurt generalization. I turned out to be right. When I included price as both an input and the target, prediction accuracy dropped significantly. Even with price as an input only, not as the target, the model still underperformed. Only when I removed price from the data entirely did the overall accuracy and precision finally improve.
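To make the scaling concern concrete, here is a tiny synthetic illustration (made-up numbers, not the project data) of how a train-only MinMaxScaler fit sends unseen test values outside [0, 1]:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
train = np.array([[100.0], [200.0], [300.0]])  # training-set "prices"
test = np.array([[350.0], [50.0]])             # test range exceeds the train range
sc = MinMaxScaler().fit(train)                 # learns min=100, max=300
print(sc.transform(test))                      # [[1.25], [-0.25]] → outside [0, 1]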
MSE or MAE? I honestly didn't test this in detail. I know MSE punishes larger errors more heavily, while MAE punishes errors linearly. From what I saw, both gave similar directional predictions (MAE actually worked a bit better, though I kept wondering whether there was some data leakage I hadn't noticed). So I plan to revisit the MSE-vs-MAE choice later, if I ever want to use this in a real market to make a profit.
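For that future revisit, swapping the loss is a one-line change, and Huber loss is a common compromise between the two. A sketch I have not benchmarked here; the delta value is a guess:
# MAE: linear penalty on errors
model.compile(optimizer='adam', loss='mae')
# Huber: quadratic near zero, linear in the tails (delta chosen arbitrarily)
model.compile(optimizer='adam', loss=tf.keras.losses.Huber(delta=0.01))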
Actually, I didn't plan to use percentage change at first. I started with raw daily prices, but the results were pretty bad, so I looked for another representation and brought in the percentage changes of the three assets as the new inputs. My intuition: what we actually care about predicting is the next day's rate of change. The price is like the integral of that rate of change, one step further removed from what we want, and that makes it harder for the LSTM to learn.
Honestly, I didn't realize at first that the rate of change was what I cared about. But later it clicked: the only thing we're really interested in here is the change. After all, even knowing nothing but yesterday's price, you can already make a statistically decent guess of the next day's value just by picking something at random within ±5%.
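That "price is the integral of the rate of change" intuition is easy to show numerically (made-up returns):
import numpy as np
rets = np.array([0.05, -0.02, 0.03])  # three daily rates of change
price = 100 * np.cumprod(1 + rets)    # the price level accumulates every past shock
print(price)                          # [105.0, 102.9, 105.987]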
Alright, to be honest: during this recent BTC bull cycle, simple quant models really couldn't beat just buying and holding. That said, if I had spent a bit more time digging into that ~65% win rate, maybe I could have found a few arbitrage opportunities. But those have nothing to do with AI or LSTMs, and even if I found them I wouldn't include them here, because in financial markets, the more people use the same strategy, the worse it performs. Still, I'm genuinely satisfied with this round of optimization. The version you see now performs far better than the one I submitted as a final course assignment (which floated around 50%). At one point I even suspected data leakage, but after double-checking everything, I felt more at ease.
So yeah, this is my first fully packaged personal project I've ever finished. Not bad at all. 🎉
You probably wouldn't seriously expect an undergrad physics major to pull off something groundbreaking on their first AI side project.
After all, it's your money, and I'm just here satisfying my curiosity. Honestly, I've always felt I'd never beat AI at solving math problems that look like they belong in grade school. So instead, I wanted to see whether code I wrote could invest better than I do. Looking back, it didn't do too badly. At least it doesn't do things like going 100x leverage long on a second-tier crypto token and wiping out the entire portfolio on a tiny price swing. (Yes, I have no idea why I ever made that insane decision either.)
Anyway.