Cross-Asset Information in Bitcoin Daily Price Prediction: Using Gold and the S&P 500¶

1. Introduction: When we look at the relationship between Bitcoin and traditional macro-financial assets, we notice something interesting¶

In the early days of Bitcoin's development, it acted more like a substitute for traditional commodities: when one price went up, the other often went down, showing a kind of inverse movement.

But in recent years, as Bitcoin's market cap grew and more participants entered, this relationship started to shift in the opposite direction.

Now, more and more, Bitcoin's price tends to move in the same direction (in phase) as the real economy and traditional assets. In other words, its price has begun to align positively with macro market trends such as U.S. stocks, gold, and even policy changes.

This makes us wonder: is Bitcoin becoming a kind of complement rather than a substitute?

If so, can we try to use the early moves of traditional markets to predict Bitcoin’s future direction?

That’s the core question I want to explore in this project.

When choosing the model, I started with LSTM as the base layer. One reason is that it was a relatively mature choice for time-series tasks (at least two years ago, when I finished most of the work on this project). Another reason, honestly, is that I hadn't really touched Transformer models yet at that time, haha.

If I get more time later, I might try to extend the model, maybe with diffusion-based architectures, and see if they can improve prediction performance.

In [1]:
# import 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from scipy.stats import pearsonr, binomtest

print("TensorFlow ver:", tf.__version__)
print("GPU number:", len(tf.config.list_physical_devices("GPU")))
TensorFlow ver: 2.6.0
GPU number: 1

2. Data & Preprocessing¶

To build this model, I mainly used three types of market data, all at daily frequency (1 day).

Why daily? BTC trades 24/7, but stocks obviously don't, so to avoid losing too much data when aligning the series, I picked the 1-day bar as the most stable and consistent option.

The time span runs roughly from Jan 1, 2014 to Dec 31, 2024 (about 4,000 daily observations), covering the full arc from Bitcoin's speculative period to its recent, gradual financialization.

Data sources:

Bitcoin (BTC/USDT): from Investing.com, includes daily price and volume

Gold (GLD): from Investing.com, used as a traditional safe-haven asset

S&P 500 Index (SP500): also from Investing.com, used as a proxy for the overall U.S. stock market

I aligned all series by date and handled missing values with forward fill, to avoid abrupt breaks in the time series. I also removed clearly abnormal values (like extreme spikes on days with zero volume). I chose forward fill because it matches a real-world investor's view: during U.S. stock market holidays, most traders still see the last available price and volume.
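The cleaning itself happened before this notebook was exported, so here is only a minimal sketch of the idea, assuming a hypothetical raw frame with already-numeric ret (daily return) and vol columns on a DatetimeIndex (the real files use 'Change %' and 'Vol.', as strings):

import pandas as pd

def clean_asset(raw: pd.DataFrame) -> pd.DataFrame:
    """Align to a full daily calendar, forward-fill holidays, drop broken rows.

    Hypothetical column names `ret` / `vol`; a sketch, not the exact code I ran.
    """
    daily = raw.asfreq("D").ffill()        # weekends/holidays carry the last seen values
    # Drop rows that are clearly broken, e.g. a 20%+ "move" on zero volume.
    bad = (daily["vol"] == 0) & (daily["ret"].abs() > 0.2)
    return daily[~bad]

In the actual pipeline, the ffill()/dropna() in the next cell plays the alignment role; the spike removal was done on the raw files.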

For the next steps, all features will be extracted, combined, and normalized based on this cleaned dataset.

In [2]:
# 1) Read three CSV files
btc   = pd.read_csv("BTC1.csv",   parse_dates=["Date"], index_col="Date")[["Price","Change %","Vol."]]
gld   = pd.read_csv("GOLD1.csv",   parse_dates=["Date"], index_col="Date")[["Change %","Vol."]]
sp500 = pd.read_csv("SP5001.csv", parse_dates=["Date"], index_col="Date")[["Change %","Vol."]]

# 2) Rename the gold and S&P 500 columns so they don't collide after the merge
gld.rename(columns={"Change %":"GLD_Change %",   "Vol.":"GLD_Vol"},   inplace=True)
sp500.rename(columns={"Change %":"SP500_Change %","Vol.":"SP500_Vol"}, inplace=True)

# 3) Merge, forward-fill, and drop remaining NaNs
df = pd.concat([
    btc[["Change %","Vol."]],
    gld[["GLD_Change %","GLD_Vol"]],
    sp500[["SP500_Change %"]]
], axis=1)

df.ffill(inplace=True)
df.dropna(inplace=True)
print(df.head())
print(df.columns)
df.shape
           Change %   Vol. GLD_Change % GLD_Vol SP500_Change %
Date                                                          
2014-01-02    4.69%  0.13K        1.62%   7.57M         -0.89%
2014-01-03    4.78%  0.13K        1.09%   5.88M         -0.03%
2014-01-04   -1.26%  0.01K        1.09%   5.88M         -0.03%
2014-01-05   12.74%  0.02K        1.09%   5.88M         -0.03%
2014-01-06    3.38%  0.10K        0.18%  10.11M         -0.25%
Index(['Change %', 'Vol.', 'GLD_Change %', 'GLD_Vol', 'SP500_Change %'], dtype='object')
Out[2]:
(4015, 5)

Here I use the daily percentage changes (returns) of BTC, gold, and the S&P 500 as inputs, instead of their raw price levels. The detailed reasoning behind this choice is discussed in the final thoughts section.
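For context: Investing.com already ships a ready-made "Change %" column, but if the CSVs had only carried price levels, the same quantity could be derived like this (a toy sketch; price_series is a made-up pd.Series of daily closes):

import pandas as pd

# Derive simple daily returns from a price series: r_t = P_t / P_{t-1} - 1.
price_series = pd.Series([100.0, 104.7, 109.7, 108.3])  # toy daily closes
returns = price_series.pct_change().dropna()
print(returns)  # approx. 0.047, 0.048, -0.013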

In [3]:
# 4) Strip potential leading/trailing spaces
df.columns = df.columns.str.strip()     

pct_cols = ['Change %', 'GLD_Change %', 'SP500_Change %']

df[pct_cols] = (
    df[pct_cols]
      .replace({'%': ''}, regex=True)   # strip the '%' sign
      .astype(float)                    # convert to float
      .div(100)                         # percent → fraction
)

# 5) Expand K/M/B volume suffixes into numbers
vol_cols = ['Vol.', 'GLD_Vol']

df[vol_cols] = (
    df[vol_cols]
      .replace({',': ''}, regex=True)                          # remove thousands separators
      .replace({'K': 'e3', 'M': 'e6', 'B': 'e9'}, regex=True)  # K→e3, M→e6, B→e9
      .astype(float)                                           # parse the scientific notation
)

# 6) Chronological train/test split (80/20), no shuffling
train_size = int(len(df) * 0.8)
df_train = df.iloc[:train_size]
df_test  = df.iloc[train_size:]

# Note: steps 7-9 are mostly redundant now. The return columns barely change scale,
# so min-max scaling does little here; I wrote these steps back when the raw price
# column was still in df.

# 7) Initialize MinMaxScaler 
scaler = MinMaxScaler(feature_range=(0, 1))

# 8) Fit only on train set to avoid data leakage
scaler.fit(df_train)

# 9) Transform train and test sets
train_scaled = scaler.transform(df_train)
test_scaled  = scaler.transform(df_test)

df_train
Out[3]:
Change % Vol. GLD_Change % GLD_Vol SP500_Change %
Date
2014-01-02 0.0469 130.0 0.0162 7570000.0 -0.0089
2014-01-03 0.0478 130.0 0.0109 5880000.0 -0.0003
2014-01-04 -0.0126 10.0 0.0109 5880000.0 -0.0003
2014-01-05 0.1274 20.0 0.0109 5880000.0 -0.0003
2014-01-06 0.0338 100.0 0.0018 10110000.0 -0.0025
... ... ... ... ... ...
2022-10-16 0.0103 780.0 -0.0125 5470000.0 -0.0237
2022-10-17 0.0148 1450.0 0.0029 4350000.0 0.0265
2022-10-18 -0.0113 2120.0 0.0022 4530000.0 0.0114
2022-10-19 -0.0104 860.0 -0.0134 8260000.0 -0.0067
2022-10-20 -0.0039 1270.0 -0.0016 5020000.0 -0.0080

3212 rows × 5 columns

3. Model Build¶

In [4]:
# 10) Sliding window
window = 60      # use the past 60 days to predict the next day
price_idx = 0    # column 0 is BTC 'Change %' (the name is a leftover from the price version)
eps = 1e-11      # guard against division by zero

def make_xy_regression(data, window, price_idx):
    X, y = [], []
    for i in range(window, len(data)):
        X.append(data[i-window:i])
        # Target: day-over-day relative change of the scaled target column,
        # clipped to [-1, 1] to tame outliers and match the tanh output range.
        ret = (data[i, price_idx] - data[i-1, price_idx]) / (data[i-1, price_idx] + eps)
        ret = np.clip(ret, -1, 1)
        y.append(ret)
    return np.array(X), np.array(y)

X_train, y_train = make_xy_regression(train_scaled, window, price_idx)
X_test,  y_test  = make_xy_regression(test_scaled,  window, price_idx)

print("Train shape:", X_train.shape, y_train.shape)
print("Test  shape:", X_test.shape,  y_test.shape)

# 11) Build model
model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(window, train_scaled.shape[1])),
    Dropout(0.05),
    LSTM(128, return_sequences=False),
    Dropout(0.01),
    Dense(1, activation='tanh')    # single scalar output, bounded to [-1, 1]
])

model.compile(optimizer='adam',
              loss='mse')   # or loss='mae'; see the thoughts section

# 12) Training
# Note: es and ckpt are defined but not passed to fit() via callbacks=[es, ckpt],
# so this run goes the full 300 epochs (which is why val_loss drifts up near the end).
es = EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True)
ckpt = ModelCheckpoint('best_lstm.h5', save_best_only=True)

history = model.fit(
    X_train, y_train,
    epochs=300,
    batch_size=64,
    validation_split=0.15,
    shuffle=False,       # keep chronological order for time series
    verbose=1
)

y_pred = model.predict(X_test).squeeze()

# 13) DirAcc (directional accuracy: proportion of correct up/down predictions)
sign_true = np.sign(y_test)
sign_pred = np.sign(y_pred)
diracc = np.mean(sign_true == sign_pred)
print(f"DirAcc: {diracc*100:.2f}%")

# 14) Correlation (how well predicted returns align with actual returns)
corr, _ = pearsonr(y_pred, y_test)
print(f"Pearson Correlation: {corr:.3f}")

# 15) Plot
plt.scatter(y_test, y_pred, alpha=0.3)
plt.xlabel("True Return"); plt.ylabel("Predicted Return")
plt.title("Predicted vs. Actual Returns")
plt.grid(True); plt.show()
Train shape: (3152, 60, 5) (3152,)
Test  shape: (743, 60, 5) (743,)
Epoch 1/300
42/42 [==============================] - 4s 19ms/step - loss: 0.0263 - val_loss: 0.0180
Epoch 2/300
42/42 [==============================] - 0s 9ms/step - loss: 0.0248 - val_loss: 0.0178
Epoch 3/300
42/42 [==============================] - 0s 9ms/step - loss: 0.0245 - val_loss: 0.0176
Epoch 4/300
42/42 [==============================] - 0s 9ms/step - loss: 0.0242 - val_loss: 0.0175
Epoch 5/300
42/42 [==============================] - 0s 9ms/step - loss: 0.0237 - val_loss: 0.0171
Epoch 6/300
42/42 [==============================] - 0s 9ms/step - loss: 0.0219 - val_loss: 0.0141
Epoch 7/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0181 - val_loss: 0.0116
Epoch 8/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0154 - val_loss: 0.0101
Epoch 9/300
42/42 [==============================] - 2s 37ms/step - loss: 0.0143 - val_loss: 0.0102
Epoch 10/300
42/42 [==============================] - 0s 10ms/step - loss: 0.0137 - val_loss: 0.0096
Epoch 11/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0136 - val_loss: 0.0098
Epoch 12/300
42/42 [==============================] - 1s 20ms/step - loss: 0.0134 - val_loss: 0.0098
Epoch 13/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0134 - val_loss: 0.0094
Epoch 14/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0134 - val_loss: 0.0097
Epoch 15/300
42/42 [==============================] - 1s 26ms/step - loss: 0.0130 - val_loss: 0.0093
Epoch 16/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0131 - val_loss: 0.0093
Epoch 17/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0130 - val_loss: 0.0097
Epoch 18/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0131 - val_loss: 0.0095
Epoch 19/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0129 - val_loss: 0.0092
Epoch 20/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0129 - val_loss: 0.0093
Epoch 21/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0128 - val_loss: 0.0092
Epoch 22/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0128 - val_loss: 0.0094
Epoch 23/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0128 - val_loss: 0.0098
Epoch 24/300
42/42 [==============================] - 1s 25ms/step - loss: 0.0129 - val_loss: 0.0103
Epoch 25/300
42/42 [==============================] - 1s 21ms/step - loss: 0.0129 - val_loss: 0.0094
Epoch 26/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0128 - val_loss: 0.0092
Epoch 27/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0127 - val_loss: 0.0098
Epoch 28/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0127 - val_loss: 0.0106
Epoch 29/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0125 - val_loss: 0.0095
Epoch 30/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0126 - val_loss: 0.0098
Epoch 31/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0129 - val_loss: 0.0092
Epoch 32/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0125 - val_loss: 0.0093
Epoch 33/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0127 - val_loss: 0.0092
Epoch 34/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0124 - val_loss: 0.0091
Epoch 35/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0122 - val_loss: 0.0094
Epoch 36/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0124 - val_loss: 0.0095
Epoch 37/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0125 - val_loss: 0.0095
Epoch 38/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0125 - val_loss: 0.0096
Epoch 39/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0128 - val_loss: 0.0097
Epoch 40/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0133 - val_loss: 0.0103
Epoch 41/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0128 - val_loss: 0.0100
Epoch 42/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0124 - val_loss: 0.0093
Epoch 43/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0124 - val_loss: 0.0100
Epoch 44/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0123 - val_loss: 0.0089
Epoch 45/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0121 - val_loss: 0.0094
Epoch 46/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0123 - val_loss: 0.0100
Epoch 47/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0122 - val_loss: 0.0092
Epoch 48/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0119 - val_loss: 0.0088
Epoch 49/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0121 - val_loss: 0.0092
Epoch 50/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0121 - val_loss: 0.0090
Epoch 51/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0122 - val_loss: 0.0098
Epoch 52/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0123 - val_loss: 0.0089
Epoch 53/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0120 - val_loss: 0.0088
Epoch 54/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0119 - val_loss: 0.0088
Epoch 55/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0118 - val_loss: 0.0088
Epoch 56/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0118 - val_loss: 0.0088
Epoch 57/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0117 - val_loss: 0.0087
Epoch 58/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0117 - val_loss: 0.0090
Epoch 59/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0119 - val_loss: 0.0087
Epoch 60/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0116 - val_loss: 0.0088
Epoch 61/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0116 - val_loss: 0.0091
Epoch 62/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0117 - val_loss: 0.0088
Epoch 63/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0117 - val_loss: 0.0088
Epoch 64/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0116 - val_loss: 0.0087
Epoch 65/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0116 - val_loss: 0.0089
Epoch 66/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0122 - val_loss: 0.0124
Epoch 67/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0129 - val_loss: 0.0094
Epoch 68/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0122 - val_loss: 0.0097
Epoch 69/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0117 - val_loss: 0.0090
Epoch 70/300
42/42 [==============================] - 1s 34ms/step - loss: 0.0117 - val_loss: 0.0090
Epoch 71/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0118 - val_loss: 0.0088
Epoch 72/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0117 - val_loss: 0.0088
Epoch 73/300
42/42 [==============================] - 1s 21ms/step - loss: 0.0118 - val_loss: 0.0088
Epoch 74/300
42/42 [==============================] - 1s 23ms/step - loss: 0.0116 - val_loss: 0.0088
Epoch 75/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0117 - val_loss: 0.0089
Epoch 76/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0116 - val_loss: 0.0088
Epoch 77/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0114 - val_loss: 0.0088
Epoch 78/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0113 - val_loss: 0.0088
Epoch 79/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0115 - val_loss: 0.0092
Epoch 80/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0115 - val_loss: 0.0088
Epoch 81/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0115 - val_loss: 0.0088
Epoch 82/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0113 - val_loss: 0.0089
Epoch 83/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0113 - val_loss: 0.0089
Epoch 84/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0113 - val_loss: 0.0088
Epoch 85/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0115 - val_loss: 0.0093
Epoch 86/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0116 - val_loss: 0.0088
Epoch 87/300
42/42 [==============================] - 1s 20ms/step - loss: 0.0112 - val_loss: 0.0088
Epoch 88/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0111 - val_loss: 0.0087
Epoch 89/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0110 - val_loss: 0.0087
Epoch 90/300
42/42 [==============================] - 1s 24ms/step - loss: 0.0111 - val_loss: 0.0092
Epoch 91/300
42/42 [==============================] - 1s 20ms/step - loss: 0.0114 - val_loss: 0.0088
Epoch 92/300
42/42 [==============================] - 1s 23ms/step - loss: 0.0119 - val_loss: 0.0101
Epoch 93/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0119 - val_loss: 0.0090
Epoch 94/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0114 - val_loss: 0.0089
Epoch 95/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0114 - val_loss: 0.0088
Epoch 96/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0112 - val_loss: 0.0089
Epoch 97/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0114 - val_loss: 0.0088
Epoch 98/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0111 - val_loss: 0.0087
Epoch 99/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0111 - val_loss: 0.0089
Epoch 100/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0112 - val_loss: 0.0087
Epoch 101/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0110 - val_loss: 0.0088
Epoch 102/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0110 - val_loss: 0.0088
Epoch 103/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0110 - val_loss: 0.0088
Epoch 104/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0111 - val_loss: 0.0089
Epoch 105/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0114 - val_loss: 0.0092
Epoch 106/300
42/42 [==============================] - 1s 28ms/step - loss: 0.0113 - val_loss: 0.0091
Epoch 107/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0112 - val_loss: 0.0090
Epoch 108/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0113 - val_loss: 0.0089
Epoch 109/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0110 - val_loss: 0.0089
Epoch 110/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0109 - val_loss: 0.0089
Epoch 111/300
42/42 [==============================] - 1s 18ms/step - loss: 0.0110 - val_loss: 0.0089
Epoch 112/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0110 - val_loss: 0.0089
Epoch 113/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0109 - val_loss: 0.0092
Epoch 114/300
42/42 [==============================] - 1s 19ms/step - loss: 0.0111 - val_loss: 0.0090
Epoch 115/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0108 - val_loss: 0.0091
Epoch 116/300
42/42 [==============================] - 1s 27ms/step - loss: 0.0108 - val_loss: 0.0090
Epoch 117/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0108 - val_loss: 0.0093
Epoch 118/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0110 - val_loss: 0.0095
Epoch 119/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0109 - val_loss: 0.0091
Epoch 120/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0108 - val_loss: 0.0089
Epoch 121/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0106 - val_loss: 0.0090
Epoch 122/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0113 - val_loss: 0.0093
Epoch 123/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0116 - val_loss: 0.0094
Epoch 124/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0111 - val_loss: 0.0091
Epoch 125/300
42/42 [==============================] - 1s 29ms/step - loss: 0.0115 - val_loss: 0.0094
Epoch 126/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0111 - val_loss: 0.0093
Epoch 127/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0108 - val_loss: 0.0091
Epoch 128/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0106 - val_loss: 0.0090
Epoch 129/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0107 - val_loss: 0.0090
Epoch 130/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0105 - val_loss: 0.0088
Epoch 131/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0105 - val_loss: 0.0090
Epoch 132/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0111 - val_loss: 0.0091
Epoch 133/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0120 - val_loss: 0.0088
Epoch 134/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0111 - val_loss: 0.0089
Epoch 135/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0106 - val_loss: 0.0088
Epoch 136/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0105 - val_loss: 0.0089
Epoch 137/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0104 - val_loss: 0.0091
Epoch 138/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0106 - val_loss: 0.0090
Epoch 139/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0106 - val_loss: 0.0092
Epoch 140/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0110 - val_loss: 0.0091
Epoch 141/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0106 - val_loss: 0.0090
Epoch 142/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0106 - val_loss: 0.0090
Epoch 143/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0107 - val_loss: 0.0092
Epoch 144/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0105 - val_loss: 0.0092
Epoch 145/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0101 - val_loss: 0.0089
Epoch 146/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0099 - val_loss: 0.0090
Epoch 147/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0099 - val_loss: 0.0092
Epoch 148/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0107 - val_loss: 0.0089
Epoch 149/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0105 - val_loss: 0.0092
Epoch 150/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0100 - val_loss: 0.0091
Epoch 151/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0099 - val_loss: 0.0092
Epoch 152/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0103 - val_loss: 0.0092
Epoch 153/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0104 - val_loss: 0.0091
Epoch 154/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0099 - val_loss: 0.0089
Epoch 155/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0100 - val_loss: 0.0090
Epoch 156/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0099 - val_loss: 0.0090
Epoch 157/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0100 - val_loss: 0.0092
Epoch 158/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0101 - val_loss: 0.0091
Epoch 159/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0098 - val_loss: 0.0092
Epoch 160/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0100 - val_loss: 0.0091
Epoch 161/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0101 - val_loss: 0.0090
Epoch 162/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0105 - val_loss: 0.0093
Epoch 163/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0098 - val_loss: 0.0090
Epoch 164/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0094 - val_loss: 0.0089
Epoch 165/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0098 - val_loss: 0.0091
Epoch 166/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0096 - val_loss: 0.0094
Epoch 167/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0098 - val_loss: 0.0092
Epoch 168/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0094 - val_loss: 0.0094
Epoch 169/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0099 - val_loss: 0.0094
Epoch 170/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0108 - val_loss: 0.0098
Epoch 171/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0100 - val_loss: 0.0094
Epoch 172/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0095 - val_loss: 0.0091
Epoch 173/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0099 - val_loss: 0.0095
Epoch 174/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0096 - val_loss: 0.0094
Epoch 175/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0094 - val_loss: 0.0094
Epoch 176/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0094 - val_loss: 0.0098
Epoch 177/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0093 - val_loss: 0.0095
Epoch 178/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0091 - val_loss: 0.0094
Epoch 179/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0094 - val_loss: 0.0093
Epoch 180/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0097 - val_loss: 0.0091
Epoch 181/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0092 - val_loss: 0.0090
Epoch 182/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0090 - val_loss: 0.0093
Epoch 183/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0095 - val_loss: 0.0093
Epoch 184/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0091 - val_loss: 0.0093
Epoch 185/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0089 - val_loss: 0.0093
Epoch 186/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0088 - val_loss: 0.0092
Epoch 187/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0087 - val_loss: 0.0095
Epoch 188/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0089 - val_loss: 0.0094
Epoch 189/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0086 - val_loss: 0.0094
Epoch 190/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0083 - val_loss: 0.0095
Epoch 191/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0090 - val_loss: 0.0093
Epoch 192/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0086 - val_loss: 0.0094
Epoch 193/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0083 - val_loss: 0.0091
Epoch 194/300
42/42 [==============================] - 1s 25ms/step - loss: 0.0092 - val_loss: 0.0098
Epoch 195/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0098 - val_loss: 0.0099
Epoch 196/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0094 - val_loss: 0.0097
Epoch 197/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0085 - val_loss: 0.0094
Epoch 198/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0085 - val_loss: 0.0097
Epoch 199/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0084 - val_loss: 0.0093
Epoch 200/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0084 - val_loss: 0.0097
Epoch 201/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0083 - val_loss: 0.0099
Epoch 202/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0084 - val_loss: 0.0098
Epoch 203/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0083 - val_loss: 0.0099
Epoch 204/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0083 - val_loss: 0.0094
Epoch 205/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0084 - val_loss: 0.0100
Epoch 206/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0088 - val_loss: 0.0094
Epoch 207/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0087 - val_loss: 0.0098
Epoch 208/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0079 - val_loss: 0.0096
Epoch 209/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0079 - val_loss: 0.0097
Epoch 210/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0078 - val_loss: 0.0097
Epoch 211/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0081 - val_loss: 0.0098
Epoch 212/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0079 - val_loss: 0.0103
Epoch 213/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0079 - val_loss: 0.0093
Epoch 214/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0079 - val_loss: 0.0099
Epoch 215/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0074 - val_loss: 0.0099
Epoch 216/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0077 - val_loss: 0.0099
Epoch 217/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0079 - val_loss: 0.0099
Epoch 218/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0075 - val_loss: 0.0099
Epoch 219/300
42/42 [==============================] - 1s 18ms/step - loss: 0.0074 - val_loss: 0.0098
Epoch 220/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0076 - val_loss: 0.0095
Epoch 221/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0083 - val_loss: 0.0095
Epoch 222/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0076 - val_loss: 0.0097
Epoch 223/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0076 - val_loss: 0.0097
Epoch 224/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0073 - val_loss: 0.0096
Epoch 225/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0071 - val_loss: 0.0100
Epoch 226/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0072 - val_loss: 0.0100
Epoch 227/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0072 - val_loss: 0.0098
Epoch 228/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0075 - val_loss: 0.0099
Epoch 229/300
42/42 [==============================] - 1s 20ms/step - loss: 0.0070 - val_loss: 0.0099
Epoch 230/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0071 - val_loss: 0.0102
Epoch 231/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0068 - val_loss: 0.0098
Epoch 232/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0067 - val_loss: 0.0102
Epoch 233/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0071 - val_loss: 0.0096
Epoch 234/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0073 - val_loss: 0.0098
Epoch 235/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0066 - val_loss: 0.0101
Epoch 236/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0064 - val_loss: 0.0101
Epoch 237/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0063 - val_loss: 0.0102
Epoch 238/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0066 - val_loss: 0.0103
Epoch 239/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0067 - val_loss: 0.0100
Epoch 240/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0064 - val_loss: 0.0101
Epoch 241/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0060 - val_loss: 0.0104
Epoch 242/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0063 - val_loss: 0.0102
Epoch 243/300
42/42 [==============================] - 1s 20ms/step - loss: 0.0071 - val_loss: 0.0105
Epoch 244/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0067 - val_loss: 0.0107
Epoch 245/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0061 - val_loss: 0.0102
Epoch 246/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0059 - val_loss: 0.0107
Epoch 247/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0063 - val_loss: 0.0103
Epoch 248/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0064 - val_loss: 0.0103
Epoch 249/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0064 - val_loss: 0.0099
Epoch 250/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0061 - val_loss: 0.0113
Epoch 251/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0057 - val_loss: 0.0103
Epoch 252/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0058 - val_loss: 0.0098
Epoch 253/300
42/42 [==============================] - 1s 19ms/step - loss: 0.0057 - val_loss: 0.0111
Epoch 254/300
42/42 [==============================] - 1s 19ms/step - loss: 0.0057 - val_loss: 0.0108
Epoch 255/300
42/42 [==============================] - 1s 22ms/step - loss: 0.0057 - val_loss: 0.0108
Epoch 256/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0059 - val_loss: 0.0105
Epoch 257/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0065 - val_loss: 0.0115
Epoch 258/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0058 - val_loss: 0.0111
Epoch 259/300
42/42 [==============================] - 3s 62ms/step - loss: 0.0054 - val_loss: 0.0107
Epoch 260/300
42/42 [==============================] - 2s 46ms/step - loss: 0.0052 - val_loss: 0.0111
Epoch 261/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0052 - val_loss: 0.0121
Epoch 262/300
42/42 [==============================] - 1s 30ms/step - loss: 0.0053 - val_loss: 0.0113
Epoch 263/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0056 - val_loss: 0.0106
Epoch 264/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0051 - val_loss: 0.0109
Epoch 265/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0050 - val_loss: 0.0110
Epoch 266/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0048 - val_loss: 0.0113
Epoch 267/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0050 - val_loss: 0.0118
Epoch 268/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0049 - val_loss: 0.0114
Epoch 269/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0046 - val_loss: 0.0120
Epoch 270/300
42/42 [==============================] - 1s 21ms/step - loss: 0.0046 - val_loss: 0.0128
Epoch 271/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0049 - val_loss: 0.0114
Epoch 272/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0050 - val_loss: 0.0116
Epoch 273/300
42/42 [==============================] - 0s 11ms/step - loss: 0.0053 - val_loss: 0.0113
Epoch 274/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0049 - val_loss: 0.0129
Epoch 275/300
42/42 [==============================] - 1s 24ms/step - loss: 0.0048 - val_loss: 0.0108
Epoch 276/300
42/42 [==============================] - 1s 28ms/step - loss: 0.0047 - val_loss: 0.0123
Epoch 277/300
42/42 [==============================] - 1s 21ms/step - loss: 0.0047 - val_loss: 0.0120
Epoch 278/300
42/42 [==============================] - 1s 12ms/step - loss: 0.0045 - val_loss: 0.0118
Epoch 279/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0045 - val_loss: 0.0117
Epoch 280/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0042 - val_loss: 0.0121
Epoch 281/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0040 - val_loss: 0.0117
Epoch 282/300
42/42 [==============================] - 2s 47ms/step - loss: 0.0043 - val_loss: 0.0115
Epoch 283/300
42/42 [==============================] - 1s 34ms/step - loss: 0.0046 - val_loss: 0.0121
Epoch 284/300
42/42 [==============================] - 1s 17ms/step - loss: 0.0050 - val_loss: 0.0120
Epoch 285/300
42/42 [==============================] - 1s 27ms/step - loss: 0.0043 - val_loss: 0.0117
Epoch 286/300
42/42 [==============================] - 1s 23ms/step - loss: 0.0043 - val_loss: 0.0113
Epoch 287/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0041 - val_loss: 0.0120
Epoch 288/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0040 - val_loss: 0.0131
Epoch 289/300
42/42 [==============================] - 1s 18ms/step - loss: 0.0043 - val_loss: 0.0117
Epoch 290/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0043 - val_loss: 0.0119
Epoch 291/300
42/42 [==============================] - 3s 72ms/step - loss: 0.0040 - val_loss: 0.0126
Epoch 292/300
42/42 [==============================] - 1s 14ms/step - loss: 0.0041 - val_loss: 0.0114
Epoch 293/300
42/42 [==============================] - 1s 15ms/step - loss: 0.0039 - val_loss: 0.0126
Epoch 294/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0038 - val_loss: 0.0119
Epoch 295/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0040 - val_loss: 0.0123
Epoch 296/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0039 - val_loss: 0.0128
Epoch 297/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0036 - val_loss: 0.0127
Epoch 298/300
42/42 [==============================] - 0s 12ms/step - loss: 0.0038 - val_loss: 0.0121
Epoch 299/300
42/42 [==============================] - 1s 16ms/step - loss: 0.0037 - val_loss: 0.0128
Epoch 300/300
42/42 [==============================] - 1s 13ms/step - loss: 0.0034 - val_loss: 0.0136
DirAcc: 67.97%
Pearson Correlation: 0.588
In [5]:
# 16) Distribution of predictions
plt.figure(figsize=(6,3))
plt.hist(y_pred, bins=40, alpha=0.7)
plt.xlabel("Predicted value (tanh ∈ [-1, 1])")
plt.ylabel("Count")
plt.title("Distribution of model predictions")
plt.grid(True)
plt.show()

4. Backtest and Model Performance¶

A simple backtest on the test set.

In [6]:
# 17) Get the raw daily BTC returns for the test set

ret_col = 'Change %'                          
ret_arr = df_test[ret_col].astype(float).values

# 18) Generate position signals (long-only)

threshold = 0.01                             # go long when y_pred > threshold; otherwise stay flat (no shorting)
position  = (y_pred > threshold).astype(float)

# 19) Backtest

equity_pred, equity_bh = [1.0], [1.0]

for t in range(len(y_pred)):
    r = ret_arr[t + window]                  # y_pred[t] uses data up to day t+window-1, so it trades day t+window
    equity_pred.append(equity_pred[-1] * (1 + position[t] * r))
    equity_bh.append(  equity_bh[-1]   * (1 + r))

equity_pred, equity_bh = np.array(equity_pred), np.array(equity_bh)

# 20) Performance metrics

def cagr(eq, ppy=365):   # compound annual growth rate; ppy = periods per year (daily data incl. weekends)
    return (eq[-1] / eq[0]) ** (ppy / len(eq)) - 1

def sharpe(eq, ppy=365): # annualized Sharpe ratio (risk-free rate taken as 0)
    r = np.diff(eq) / eq[:-1]
    return r.mean() / r.std() * np.sqrt(ppy)

print("\n=== Strategy vs Buy & Hold ===")
print(f"CAGR   : {cagr(equity_pred):6.2%}  |  {cagr(equity_bh):6.2%}")
print(f"Sharpe : {sharpe(equity_pred):5.2f}  |  {sharpe(equity_bh):5.2f}")

# 21) Equity curve visualization

plt.figure(figsize=(8, 4))
plt.plot(equity_pred, label='Prediction Strategy')
plt.plot(equity_bh,   label='Buy & Hold', linestyle='--')
plt.title("Equity Curve Comparison")
plt.xlabel("Step"); plt.ylabel("Equity"); plt.grid(True)
plt.legend(); plt.show()

# Prediction diagnostics

# 22) Directional accuracy and Pearson correlation
hit      = (np.sign(y_pred) == np.sign(y_test))
dir_acc  = hit.mean()                             # directional accuracy
baseline = (y_test > 0).mean()                    # always-long benchmark
pears, p_val = pearsonr(y_pred, y_test)

print(f"DirAcc            : {dir_acc*100:5.2f}%")
print(f"  ↳ baseline (long): {baseline*100:5.2f}%  →  Δ = {(dir_acc-baseline)*100:+4.2f}%")
print(f"Pearson corr       : {pears:6.3f}  (p = {p_val:.3g})")

# 23) Binomial test: is the hit rate significantly above 50%?
p_binom = binomtest(hit.sum(), n=len(hit), p=0.5, alternative='greater').pvalue
print(f"Binom test vs 50%  : p = {p_binom:.3g}")

# 24) Rolling stability plots
# Track how stable the model is over time: at each point, the rolling DirAcc is the
# fraction of the past 100 days where the predicted direction was correct, and the
# rolling Pearson correlation measures how closely predictions tracked actual returns.
win = 100                                         # rolling window of the last 100 days
roll_dir  = pd.Series(hit).rolling(win).mean()
roll_corr = (pd.Series(y_pred).rolling(win)
             .corr(pd.Series(y_test)))

fig, ax = plt.subplots(2, 1, figsize=(10, 5), sharex=True)

ax[0].plot(roll_dir, label=f'Rolling DirAcc ({win})')
ax[0].axhline(0.5, ls='--', c='grey')
ax[0].set_ylabel('DirAcc'); ax[0].legend(); ax[0].grid(True)
ax[1].plot(roll_corr, label=f'Rolling Pearson ({win})', c='orange')
ax[1].axhline(0, ls='--', c='grey')
ax[1].set_ylabel('Corr'); ax[1].set_xlabel('Test sample index')
ax[1].legend(); ax[1].grid(True)

plt.tight_layout(); plt.show()
=== Strategy vs Buy & Hold ===
CAGR   : 44.45%  |  134.94%
Sharpe :  1.26  |   2.02
DirAcc            : 67.97%
  ↳ baseline (long): 50.74%  →  Δ = +17.23%
Pearson corr       :  0.588  (p = 2.68e-70)
Binom test vs 50%  : p = 2.9e-23

From these two plots, we can see that even in the model's worst stretch, the rolling DirAcc stays above 55%.

5. Some Thoughts and Other Things¶

About my attempts: as I was getting closer to a successful run, I tried removing the price column entirely. The reason: when fitting MinMaxScaler only on the training set, the test set produced values above 1, and I worried this would hurt generalization. And I was right: when I included price as both an input and the target, prediction accuracy dropped significantly. Even keeping price as an input only (not as the target), the model still didn't perform well. Only when I removed price from the data completely did the overall accuracy and precision finally improve.
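A quick sketch of the effect I mean (toy numbers, not the actual project data): a trending series like price keeps breaking out of [0, 1] on the test side when the scaler is fit on the training range only, while mean-reverting returns mostly stay in range.

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy illustration: a steadily rising "price" breaks out of the fitted range.
price = np.arange(100, 200, dtype=float).reshape(-1, 1)
train, test = price[:80], price[80:]

scaler = MinMaxScaler().fit(train)        # fit on train only (no leakage)
print(scaler.transform(test).max())       # ≈ 1.25: every test value is out of range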

MSE or MAE? I honestly didn't test this in detail. But MSE punishes large errors quadratically, while MAE punishes them linearly. From what I saw, both gave similar directional predictions (MAE actually worked a bit better, though I kept wondering whether there was some data leakage I hadn't noticed). So I plan to revisit the MSE vs. MAE choice later, if I ever want to use this in a real market to make a profit.
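A tiny numeric sketch of the difference (my own toy numbers): a single large residual dominates MSE but not MAE, which matters for fat-tailed return series.

import numpy as np

# One outlier among small residuals: MSE is dominated by it, MAE is not.
errors = np.array([0.01, -0.02, 0.01, 0.50])    # the 0.50 is a "fat tail" day
print("MSE:", np.mean(errors**2))               # ≈ 0.0627; ~99% of it comes from the outlier
print("MAE:", np.mean(np.abs(errors)))          # 0.135; the outlier counts only linearly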

Actually, I didn't plan to use percentage changes at first. I started with raw daily prices, but the results were pretty bad. So I thought I could switch to a different representation, and that's when I brought in the percentage changes of these three assets as the new inputs. My guess is that what we actually care about predicting is the next day's rate of change; price is essentially the integral of that rate of change, which puts it one step further from the quantity we want and makes it harder for the LSTM to learn.
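In code, the "price is the integral of returns" intuition is just a cumulative product (a toy sketch with made-up returns):

import numpy as np

# Price is the running compound of returns: P_t = P_0 * prod(1 + r_i).
# Predicting r targets the quantity we care about directly; predicting P
# forces the model to also carry the whole accumulated history.
r = np.array([0.02, -0.01, 0.03])          # toy daily returns
p0 = 100.0
prices = p0 * np.cumprod(1 + r)            # [102.0, 100.98, 104.0094]
print(prices)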

Honestly, I didn't realize at first that the rate of change was what I cared about. But later it clicked: the only thing we're really interested in here is the change. After all, if you know yesterday's price, you can already make a statistically decent guess at the next day's price just by picking something within ±5% of it, so predicting the price level itself tells you almost nothing.

Alright. To be honest, during this recent BTC bull cycle, simple quant-based models really couldn't beat just buying and holding. If I had spent a bit more time digging into that ~65% win rate, maybe I could have found a few arbitrage opportunities. But those have nothing to do with AI or LSTMs, and even if I found them, I wouldn't include them here, because financial markets are the kind of place where the more people use the same strategy, the worse it performs. That said, I'm genuinely satisfied with this round of optimization. The version you see now performs far better than the one I submitted as a final course assignment (which floated around 50%). At one point I even suspected data leakage, but after double-checking everything, I felt a bit more at ease.

So yeah, this will be the first fully packaged personal project I've ever finished. Not bad at all. 🎉

Disclaimer¶

You probably wouldn’t seriously expect an undergrad physics major to pull off something groundbreaking on their first try at an AI side project.

After all, it's your money, and I'm just here trying to satisfy my curiosity. Honestly, I've always felt I'd never beat AI at solving math problems that look like they belong in grade school. So instead, I just wanted to see whether code I wrote could invest better than I do. Looking back, it didn't do too badly. At least it doesn't do things like going 100x leveraged long on a second-tier crypto token and wiping out the entire portfolio on a tiny price swing. (Yes, I have no idea why I ever made that insane decision either.)

Anyway.

This project is for academic purposes only.¶

Nothing here constitutes investment advice.¶

Please make your own decisions and bear your own risk.¶