The goal of our final deep learning model is to answer one question: can we use deep learning techniques to improve security selection, and thus a portfolio's risk-return profile?
Data is always the starting point. We reviewed the relevant literature, identified the most useful datasets, then merged and cleaned the data before performing exploratory data analysis (EDA). More specifically:
Data
Main Data Source: WRDS (Wharton Research Data Services), specifically the CRSP, Compustat, and JKP databases
Stock Universe: S&P 500
Date Range: 2000-01-01 through the end of 2024, daily frequency
Market-Level Data: Yahoo Finance (S&P 500 returns)
Data Volume: the joined data is about 1 GB, with roughly 3 million rows and 30+ columns
Data checks:
Null values are largely within reasonable expectations
We identified outlier values for the operating-profit-to-book-equity ratio. After validating them, we removed the abnormal values
Following this correction, the overall distributions and time-series patterns of the numerical variables, including newly derived features, appear reasonable
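To illustrate the outlier screen described above, here is a minimal pandas sketch on synthetic data. The column name `op_be` and the percentile cutoffs are hypothetical stand-ins, not the actual cleaning rule:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical frame standing in for the merged panel; 'op_be' is a
# stand-in name for the operating-profit-to-book-equity ratio.
df = pd.DataFrame({"op_be": rng.normal(0.1, 0.2, 10_000)})
# Inject a handful of clearly abnormal values to mimic the bad records.
df.loc[rng.choice(10_000, 20, replace=False), "op_be"] = 1e6

# Flag values outside a wide percentile band for manual validation,
# then drop the confirmed-abnormal rows.
lo, hi = df["op_be"].quantile([0.005, 0.995])
clean = df[df["op_be"].between(lo, hi)]
```

In practice the flagged rows would be inspected against the source filings before being dropped, rather than removed mechanically.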
Preliminary Insights
Pearson correlation coefficients among the numerical variables show immaterial correlations with the potential target variable ('return'), except for return-based metrics. This suggests that a purely linear approach may fail to capture meaningful signal.
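The correlation screen can be reproduced with pandas on a toy panel; the feature names here are illustrative stand-ins for the actual WRDS fields, and 'momentum' is constructed to mimic a return-based metric:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 5_000
# Toy panel: 'ret' plays the role of the target return column; 'size'
# and 'value' are independent noise, so their correlation with 'ret'
# should be immaterial, as observed in the EDA above.
panel = pd.DataFrame({
    "ret": rng.normal(0, 0.02, n),
    "size": rng.normal(0, 1, n),
    "value": rng.normal(0, 1, n),
})
# A return-based metric correlates with 'ret' by construction.
panel["momentum"] = panel["ret"] * 0.9 + rng.normal(0, 0.01, n)

# Pearson correlation of every feature with the target, strongest first.
corr = (
    panel.corr(method="pearson")["ret"]
    .drop("ret")
    .sort_values(key=np.abs, ascending=False)
)
```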
Now, how do we know whether our machine learning model is performing well? We need a benchmark. If we want to claim that our model improves on a 'traditional' method, the most natural benchmark is a traditional statistical model. In fact, this choice effectively raises the bar for our machine learning approach: a well-performing statistical model already beats the market, and we have to beat the statistical model. Great, the challenge is on!
After reviewing prior literature, we adopted an approach based on rolling regressions of returns on extended Fama-French factors (such as size, value, growth, profitability, and momentum), followed by portfolio optimization. This is generally considered a reasonable approach in practice. Not surprisingly, the benchmark model beats the market over the in-sample period.
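The rolling-regression step of the benchmark can be sketched with NumPy least squares on toy data; the factor names, window length, and loadings below are illustrative, not the actual benchmark specification:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
T = 1_000
# Toy daily factor returns with names mirroring part of the extended
# Fama-French set, plus one stock's returns with known loadings.
factors = pd.DataFrame(
    rng.normal(0, 0.01, (T, 3)), columns=["mkt", "size", "value"]
)
true_beta = np.array([1.2, 0.5, -0.3])
stock = factors @ true_beta + rng.normal(0, 0.005, T)

window = 252  # roughly one trading year
betas = []
for end in range(window, T + 1):
    X = factors.iloc[end - window:end].to_numpy()
    X = np.column_stack([np.ones(window), X])  # intercept (alpha) column
    y = stock.iloc[end - window:end].to_numpy()
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas.append(coef[1:])  # drop alpha, keep factor loadings
betas = np.array(betas)  # one row of loadings per rolling window
```

The estimated loadings would then feed a portfolio-optimization step; `statsmodels.regression.rolling.RollingOLS` offers a vectorized alternative to the explicit loop.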
So how do we approach security selection using machine learning? In essence, we aimed to build a non-linear, dynamic framework for predicting expected security returns, an area where neural networks are particularly well suited to capturing complex relationships and interactions. More specifically, we adopted a transformer architecture (the same technology that powers machine translation) to capture both temporal and cross-asset relationships, using two transformer structures applied in sequence, one for each type of relationship. Essentially, our goal was to use input features similar to the benchmark model's, but apply a different information-processing architecture to predict expected returns and ultimately improve security selection.
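To make the two-stage idea concrete, here is a minimal NumPy sketch of applying self-attention first along the time axis and then across assets. All shapes are illustrative, and real transformer blocks add learned Q/K/V projections, multiple heads, feed-forward layers, and residual connections, which are omitted here:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Scaled dot-product self-attention over x's first axis.
    x: (seq_len, d_model); Q = K = V = x for brevity."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ x

rng = np.random.default_rng(3)
n_assets, n_days, d_model = 4, 20, 8
features = rng.normal(size=(n_assets, n_days, d_model))

# Stage 1: temporal attention, applied per asset along the time axis.
temporal = np.stack([self_attention(features[a]) for a in range(n_assets)])

# Stage 2: cross-asset attention, applied per day across assets.
cross = np.stack(
    [self_attention(temporal[:, t, :]) for t in range(n_days)], axis=1
)
```

The output retains one vector per asset per day, now informed by both that asset's history and its peers on the same day, which a final head could map to an expected-return prediction.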
After conducting an extensive parameter search over both architectural and numerical choices, we selected the best-performing in-sample model based on its Sharpe ratio and maximum-drawdown profile during backtesting. Because the training period includes two major crises, we also required the model to perform well over a separate validation period (which also contains a crisis). Our final model demonstrates reasonably strong performance in the out-of-sample period as well.
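The two selection metrics can be computed as follows; this is a generic sketch (assuming daily returns and a zero risk-free rate), not the exact backtest code:

```python
import numpy as np

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a periodic return series,
    assuming a zero risk-free rate."""
    returns = np.asarray(returns)
    return returns.mean() / returns.std(ddof=1) * np.sqrt(periods_per_year)

def max_drawdown(returns):
    """Largest peak-to-trough decline of the cumulative wealth curve,
    returned as a negative fraction (e.g. -0.5 for a 50% drawdown)."""
    wealth = np.cumprod(1 + np.asarray(returns))
    peaks = np.maximum.accumulate(wealth)
    return (wealth / peaks - 1).min()

rng = np.random.default_rng(4)
daily = rng.normal(0.0005, 0.01, 252 * 5)  # toy 5-year daily return series
```

Ranking candidate models jointly on these two numbers rewards risk-adjusted return while penalizing the deep losses that dominate crisis periods.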
As it turns out, our selected deep learning model beats the benchmark model on Sharpe ratio while registering a smaller maximum drawdown.