Carroll School of Management - AI Research in Finance

Nathan Dong, PhD, CFA

Seidner Department of Finance, Carroll School of Management, Boston College

Academic Research Guide to Artificial Intelligence in Finance

Guide to Literature in Financial Economics

Machine Learning

Bali, T.G., Beckmeyer, H., Moerke, M., Weigert, F., 2023. Option Return Predictability with Machine Learning and Big Data. Review of Financial Studies 36(9), 3548–3602.

Bianchi, D., Büchner, M., Tamoni, A., 2021. Bond Risk Premiums with Machine Learning. Review of Financial Studies 34(2), 1046-1089.

Barbaglia, L., Manzan, S., Tosetti, E., 2023. Forecasting Loan Default in Europe with Machine Learning. Journal of Financial Econometrics 21, 569–596.

Bogousslavsky, V., Fos, V., Muravyev, D., 2024. Informed Trading Intensity. Journal of Finance 79(2), 903-948.

Chen, A.Y., McCoy, J., 2024. Missing Values Handling for Machine Learning Portfolios. Journal of Financial Economics 155, 103815.

DeMiguel, V., Gil-Bazo, J., Nogales, F.J., Santos, A.A., 2023. Machine Learning and Fund Characteristics Help to Select Mutual Funds with Positive Alpha. Journal of Financial Economics 150(3), 103737.

De Silva, T., Thesmar, D., 2024. Noise in Expectations: Evidence from Analyst Forecasts. Review of Financial Studies 37(5), 1494-1537.

Duarte, V., Duarte, D., Silva, D.H., 2024. Machine Learning for Continuous-Time Finance. Review of Financial Studies 37(11), 3217–3271.

Erel, I., Stern, L.H., Tan, C., Weisbach, M.S., 2021. Selecting Directors Using Machine Learning. Review of Financial Studies 34, 3226–3264.

Easley, D., López de Prado, M., O’Hara, M., Zhang, Z., 2021. Microstructure in the Machine Age. Review of Financial Studies 34(7), 3316-3363.

Feng, G., Giglio, S., Xiu, D., 2020. Taming the Factor Zoo: A Test of New Factors. Journal of Finance 75(3), 1327-1370.

Fuster, A., Goldsmith‐Pinkham, P., Ramadorai, T., Walther, A., 2022. Predictably Unequal? The Effects of Machine Learning on Credit Markets. Journal of Finance 77(1), 5-47.

Gu, S., Kelly, B., Xiu, D., 2020. Empirical Asset Pricing via Machine Learning. Review of Financial Studies 33(5), 2223-2273.

Iskhakov, F., Rust, J., Schjerning, B., 2020. Machine Learning and Structural Econometrics: Contrasts and Synergies. The Econometrics Journal 23, S81–S124.

Kaniel, R., Lin, Z., Pelger, M., Van Nieuwerburgh, S., 2023. Machine-Learning the Skill of Mutual Fund Managers. Journal of Financial Economics 150(1), 94-138.

Leippold, M., Wang, Q., Zhou, W., 2022. Machine Learning in the Chinese Stock Market. Journal of Financial Economics 145(2), 64–82.

Li, K., Mai, F., Shen, R., Yan, X., 2021. Measuring Corporate Culture using Machine Learning. Review of Financial Studies 34(7), 3265-3315.

Martin, I.W., Nagel, S., 2022. Market Efficiency in the Age of Big Data. Journal of Financial Economics 145(1), 154-177.

Mullainathan, S., Spiess, J., 2017. Machine Learning: an Applied Econometric Approach. Journal of Economic Perspectives 31(2), 87-106.

Murray, S., Xia, Y., Xiao, H., 2024. Charting by Machines. Journal of Financial Economics 153, 103791.

Sautner, Z., Van Lent, L., Vilkov, G., Zhang, R., 2023. Firm‐level Climate Change Exposure. Journal of Finance 78(3), 1449-1498.

Van Binsbergen, J.H., Han, X., Lopez-Lira, A., 2023. Man versus Machine Learning: The Term Structure of Earnings Expectations and Conditional Biases. Review of Financial Studies 36(6), 2361–2396.

AI Deep Learning

Cao, S., Jiang, W., Wang, J.L., Yang, B., 2024. From Man vs. Machine to Man + Machine: The Art and AI of Stock Analyses. Journal of Financial Economics 160, 103910.

Maliar, L., Maliar, S., Winant, P., 2021. Deep Learning for Solving Dynamic Economic Models. Journal of Monetary Economics 122, 76–101.

Sadhwani, A., Giesecke, K., Sirignano, J., 2021. Deep Learning for Mortgage Risk. Journal of Financial Econometrics 19, 313–368.

Dong, G. Nathan, 2024. Can AI Replace Stock Analysts? Evidence from Deep Learning Financial Statements. Boston College Working Paper.

Natural Language Processing (NLP)

Adams, R.B., Ragunathan, V., Tumarkin, R., 2021. Death by Committee? An Analysis of Corporate Board (Sub-) Committees. Journal of Financial Economics 141(3), 1119-1146.

Aleti, S., Bollerslev, T., 2024. News and Asset Pricing: A High-Frequency Anatomy of the SDF. Review of Financial Studies.

Bybee, L., Kelly, B., Manela, A., Xiu, D., 2024. Business News and Business Cycles. Journal of Finance 79(5), 3105-3147.

Cookson, J.A., Lu, R., Mullins, W., Niessner, M., 2024. The Social Signal. Journal of Financial Economics 158, 103870.

Garcia, D., Hu, X., Rohrer, M., 2023. The Colour of Finance Words. Journal of Financial Economics 147(3), 525-549.

Gorodnichenko, Y., Pham, T., Talavera, O., 2023. The Voice of Monetary Policy. American Economic Review 113, 548–584.

Graham, J.R., Grennan, J., Harvey, C.R., Rajgopal, S., 2022. Corporate Culture: Evidence from the Field. Journal of Financial Economics 146(2), 552-593.

Hassan, T.A., Hollander, S., van Lent, L., Schwedeler, M., Tahoun, A., 2023. Firm-Level Exposure to Epidemic Diseases: COVID-19, SARS, and H1N1. Review of Financial Studies 36(12), 4919–4964.

Computer Vision

Khachiyan, A., Thomas, A., Zhou, H., Hanson, G., Cloninger, A., Rosing, T., Khandelwal, A.K., 2022. Using Neural Networks to Predict Microspatial Economic Growth. American Economic Review: Insights 4, 491–506.

Natural Language Generation

TBA

Guide to Software Installation for Neural-Network Training

Operating System: Linux Ubuntu 22.04

The Microsoft Windows operating system must be avoided at all costs for neural-network training.

Install Python PIP, IDLE, and GIT on Ubuntu

sudo apt update

sudo apt install python3-pip

sudo apt install idle3

sudo apt install git

Run Python IDLE on Ubuntu

python3 -m idlelib

Check whether two RTX 2080 Ti cards are connected via NVLink

nvidia-smi topo -m

Install Nvidia driver for Tesla K80 or M40 GPU

ubuntu-drivers devices

sudo apt install nvidia-driver-470

Install Nvidia driver for Tesla P100 GPU

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin

sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600

wget https://developer.download.nvidia.com/compute/cuda/12.2.1/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.1-535.86.10-1_amd64.deb

sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.1-535.86.10-1_amd64.deb

sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/

sudo apt-get update

sudo apt-get -y install cuda

Install PyTorch for Nvidia GeForce RTX 2080 Ti or RTX 3080 on Ubuntu:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Install PyTorch for Nvidia Tesla K80 or M40 on Ubuntu:

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 torchtext==0.13.1 torchdata==0.4.1 --index-url https://download.pytorch.org/whl/cu113

Install TensorRT in PyTorch with CUDA 11.8:

Install CUDA Toolkit 11.8: https://developer.nvidia.com/cuda-11-8-0-download-archive

pip install torch torch-tensorrt tensorrt --extra-index-url https://download.pytorch.org/whl/cu118
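After any of the installs above, it can be worth confirming that Python actually sees PyTorch and the GPU. The following is a minimal sketch, not part of the original guide: the helper name `describe_torch` is hypothetical, and the script is written so that it also runs (and says so) on a machine where PyTorch is missing.

```python
import importlib.util


def describe_torch() -> str:
    """Return a one-line summary of the local PyTorch/CUDA setup."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch  # imported lazily so the script also runs without PyTorch

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} installed, but no CUDA GPU is visible"
    name = torch.cuda.get_device_name(0)  # name of the first GPU
    return f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, GPU 0: {name}"


print(describe_torch())
```

If the summary reports that no CUDA GPU is visible, the driver and the CUDA build of PyTorch are the first things to check.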

Guide to PyTorch Coding in Python

How to choose optimizer: ADAM or SGD?

ADAM (adaptive moment estimation) can converge faster and is more robust to poorly chosen hyperparameters

But it may converge to a less optimal local minimum
ADAM is more suitable for training transformers in natural language processing (NLP)

SGD (stochastic gradient descent) can produce more accurate models, but it may converge slowly and be less stable

SGD applies one learning rate to all parameters together, whereas ADAM adapts the learning rate of each parameter separately
SGD is more suitable for CNNs (convolutional neural networks) in image recognition
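The difference between a shared and a per-parameter learning rate can be seen in a single update step. The sketch below is an illustration in plain Python rather than the PyTorch implementation: it uses the standard Adam formulas with the usual default constants, and the two gradients are hypothetical values chosen to differ in scale by a factor of 10,000.

```python
import math


def sgd_step(params, grads, lr=0.01):
    """Plain SGD: every parameter uses the same global learning rate."""
    return [p - lr * g for p, g in zip(params, grads)]


def adam_first_step(params, grads, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """First Adam update (t = 1), starting from zero moment estimates."""
    new_params = []
    for p, g in zip(params, grads):
        m = (1 - beta1) * g        # first moment: running mean of gradients
        v = (1 - beta2) * g * g    # second moment: running mean of squared gradients
        m_hat = m / (1 - beta1)    # bias correction for t = 1
        v_hat = v / (1 - beta2)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


params = [1.0, 1.0]
grads = [0.01, 100.0]  # hypothetical gradients of very different scale

sgd_moves = [p - q for p, q in zip(params, sgd_step(params, grads))]
adam_moves = [p - q for p, q in zip(params, adam_first_step(params, grads))]
print("SGD step sizes: ", sgd_moves)
print("Adam step sizes:", adam_moves)
```

With SGD the step is proportional to the gradient, so the second parameter moves 10,000 times further than the first. Adam divides each gradient by its own running magnitude, so both parameters move by roughly the learning rate.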

How to choose activation function: ReLU, Sigmoid, Softmax, or Tanh?

The choice depends on the type of application and the range of output values

ReLU: the output values are non-negative and can exceed 1, in the range [0, ∞)
Sigmoid or Tanh: the output values are in the range (0, 1) for Sigmoid or (-1, 1) for Tanh
Softmax in the last layer: classification tasks that predict a probability distribution over mutually exclusive class labels
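The output ranges listed above can be checked directly from the definitions. This is a small sketch in plain Python using the textbook formulas; the input values are arbitrary and only serve to show the range of each function.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    shift = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


inputs = [-3.0, -0.5, 0.0, 0.5, 3.0]  # arbitrary test inputs
relu_out = [relu(x) for x in inputs]
sigmoid_out = [sigmoid(x) for x in inputs]
tanh_out = [math.tanh(x) for x in inputs]
softmax_out = softmax(inputs)

print("ReLU:   ", relu_out)
print("Sigmoid:", sigmoid_out)
print("Tanh:   ", tanh_out)
print("Softmax:", softmax_out, "sum =", sum(softmax_out))
```

ReLU passes positive inputs through unchanged and clips negatives to zero, Sigmoid and Tanh squash every input into a bounded interval, and the Softmax outputs are positive and sum to one.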

How to use regularization technique: L1, L2, or Dropout?

Regularization helps prevent a NN model from becoming too complex or having large parameter values, and hence avoids over-fitting

L1 (Lasso) adds a penalty (the sum of the absolute values of the weights) to the model’s objective function, leading to a sparse model in which some weights are exactly zero

l1 = sum(p.abs().sum() for p in model.parameters())

loss = loss + 0.1*l1

L2 (Ridge) adds a penalty (the sum of the squared weights) to the model's objective function, causing a model to have small, non-zero weights

from torch.optim import Adam

optimizer = Adam(model.parameters(), lr=0.01, weight_decay=0.1)
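One common intuition for why L1 tends to give exact zeros while L2 gives small non-zero weights comes from the gradients of the two penalties. The sketch below computes both in plain Python; the weight values are hypothetical and the function names are only for illustration.

```python
def l1_penalty(weights):
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    return sum(w * w for w in weights)


def l1_grad(weights):
    # Derivative of |w| is sign(w): a constant-size push towards zero
    return [(w > 0) - (w < 0) for w in weights]


def l2_grad(weights):
    # Derivative of w^2 is 2w: the push shrinks as the weight shrinks
    return [2 * w for w in weights]


weights = [2.0, -0.5, 0.001]  # hypothetical weights
print("L1:", l1_penalty(weights), l1_grad(weights))
print("L2:", l2_penalty(weights), l2_grad(weights))
```

The L1 gradient pushes the tiny weight 0.001 as hard as the large weight 2.0, which is what drives small weights all the way to zero. The L2 gradient on the tiny weight is itself tiny, so the weight shrinks but rarely reaches zero. In PyTorch's Adam, `weight_decay` adds `weight_decay * w` to each gradient, which corresponds to an L2 penalty with coefficient `weight_decay / 2`.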

Dropout randomly zeroes the outputs of a fraction of neurons during training, forcing the remaining neurons to learn more features. For example, a dropout rate of 0.1 means that each neuron is dropped with probability 0.1, so about one tenth of the neurons are dropped on each forward pass

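def __init__(self):

    super().__init__()  # required before registering submodules in an nn.Module

    self.dropout = nn.Dropout(p=0.1)  # zero each element with probability 0.1 during training

def forward(self, x):

    return self.sigmoid(self.layer(self.dropout(...)))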
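To make the mechanics concrete, here is a minimal sketch of inverted dropout in plain Python. It mirrors the behaviour of `nn.Dropout` (survivors are rescaled by 1/(1-p) during training, and nothing is dropped in evaluation mode) but it is an illustration, not the library code; the function signature and the list of activations are assumptions made for the example.

```python
import random


def dropout(values, p=0.1, training=True, rng=random):
    """Inverted dropout on a list of activations."""
    if not training or p == 0.0:
        return list(values)           # evaluation mode: pass through unchanged
    keep_scale = 1.0 / (1.0 - p)      # rescale survivors to keep the expected value
    return [0.0 if rng.random() < p else v * keep_scale for v in values]


rng = random.Random(0)                # fixed seed so the run is repeatable
activations = [1.0] * 10000           # hypothetical layer outputs
train_out = dropout(activations, p=0.1, training=True, rng=rng)
eval_out = dropout(activations, p=0.1, training=False)

dropped_fraction = train_out.count(0.0) / len(train_out)
print(f"fraction dropped in training: {dropped_fraction:.3f}")
print("unchanged in evaluation:", eval_out == activations)
```

The mask is redrawn on every call, which is why a different set of neurons is dropped on each forward pass. In PyTorch the switch between the two modes is made with `model.train()` and `model.eval()`.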

In my experience, it works best to start with Dropout and then try L1, L2, or both.

How to use mixed-precision to speed up training?

Using half-precision floating-point numbers (FP16) rather than full-precision floating-point numbers (FP32) can reduce computation time and memory usage. Mixed precision allows for FP16-based training while still preserving much of the FP32-based network accuracy:

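from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler(enabled=True)

for epoch in range(epochs):

    optimizer.zero_grad()  # clear gradients left over from the previous step

    with autocast(enabled=True):  # run the forward pass and loss in mixed precision

        predicted = model(input)

        loss = loss_fn(predicted, realized)

    scaler.scale(loss).backward()  # scale the loss so small FP16 gradients do not underflow

    scaler.step(optimizer)  # unscale the gradients, then update the weights

    scaler.update()  # adjust the scale factor for the next iteration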

However, the gain in efficiency comes at the cost of lower precision. In my experience, training runs about twice as fast while losing less than 5% accuracy, which is not bad at all.
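The reason a gradient scaler is needed with FP16 can be shown with the standard library alone, since `struct` supports the IEEE half-precision format. In this sketch the gradient value is hypothetical, and the scale of 65536 is the default initial scale of PyTorch's `GradScaler`.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


tiny_grad = 1e-8   # hypothetical gradient, too small for FP16
scale = 65536.0    # 2**16, the default initial scale of GradScaler

unscaled = to_fp16(tiny_grad)                    # stored directly in FP16
recovered = to_fp16(tiny_grad * scale) / scale   # scaled first, unscaled in full precision

print("without scaling:", unscaled)
print("with scaling:   ", recovered)
```

Stored directly, the small gradient underflows to zero and the corresponding weight would never be updated. Multiplying the loss by the scale factor shifts the gradients into the range FP16 can represent, and dividing again before the optimizer step restores the original magnitude up to FP16 rounding error.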

How to choose learning-rate scheduler?

The learning rate sets the size of each update to the model weights during backpropagation, that is, how large a step the optimizer takes toward a minimum of the loss function. A learning-rate scheduler adjusts the learning rate between epochs as training progresses

- The exponential decay performs the best:

from torch.optim.lr_scheduler import ExponentialLR, LinearLR, StepLR

scheduler = ExponentialLR(optimizer, gamma=0.99)

- Time-based decay is not too bad:

scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

- The linear and step-based decay often lead to model overfitting:

scheduler = LinearLR(optimizer, start_factor=0.5, total_iters=100)
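To see what the three schedulers above do to the learning rate, their closed-form schedules can be evaluated without PyTorch. The sketch below reproduces the formulas for `ExponentialLR`, `StepLR`, and `LinearLR` with the same arguments as above; the base learning rate of 0.01 is an assumption, and the function names are only for illustration.

```python
def exponential_lr(base_lr, epoch, gamma=0.99):
    # Multiply the rate by gamma after every epoch
    return base_lr * gamma ** epoch


def step_lr(base_lr, epoch, step_size=30, gamma=0.1):
    # Multiply the rate by gamma once every step_size epochs
    return base_lr * gamma ** (epoch // step_size)


def linear_lr(base_lr, epoch, start_factor=0.5, end_factor=1.0, total_iters=100):
    # Interpolate the multiplier linearly from start_factor to end_factor
    progress = min(epoch, total_iters) / total_iters
    return base_lr * (start_factor + (end_factor - start_factor) * progress)


base_lr = 0.01  # hypothetical initial learning rate
for epoch in (0, 29, 30, 60, 100):
    print(epoch,
          exponential_lr(base_lr, epoch),
          step_lr(base_lr, epoch),
          linear_lr(base_lr, epoch))
```

Under these formulas, `LinearLR` with `start_factor=0.5` and the default `end_factor=1.0` raises the rate from half the base value to the full base value over 100 epochs, so it acts as a warm-up. Passing an `end_factor` below `start_factor` would turn it into a decay.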

Install sklearn on Ubuntu:

pip install -U scikit-learn

Guide to Hardware Setup for Training NN Model

DIY Setup vs. cloud GPU

Building your own deep-learning computer makes sense over renting cloud GPUs when you prioritize cost-effectiveness, control, privacy, and performance needs that cloud solutions may not fully meet.

Your own GPU setup gives you complete control over the hardware, software, and entire system, so you can tailor the configuration to your workload. This is especially valuable for specialized tasks or when you need to optimize for specific hardware and software combinations.

You can achieve higher performance with a DIY setup, particularly if you are dealing with large datasets or intensive workloads. Cloud GPU instances may experience performance bottlenecks due to factors like network latency and I/O limitations, whereas a DIY system lets you choose components that minimize those bottlenecks.

While cloud GPUs offer scalability, your own system can be more flexible to scale. You can incrementally add more GPUs or upgrade components as needed, without the constraints of cloud instance types or limits on resource availability.

With a DIY setup, you have full control over the security of your data and computing environment. This is crucial if you are handling sensitive information or require a high level of data protection.

Finally, if you are passionate about GPU programming and hardware, building your own system is a great opportunity to gain in-depth knowledge and experience by experimenting with different hardware and software configurations.

My favorite build

Gigabyte Z590 Aorus Ultra + Intel Core i5-11400T + Samsung DDR4 2666MHz 64GB

Western Digital SN580 NVMe 1TB + Zotac Gaming GeForce RTX 3080 Trinity

Ubuntu 22.04 + Python 3.10 + PyTorch 2.2 + Scikit-learn 1.4

The worst hardware that you should avoid

Dell RTX 2080 Ti OEM

Dell Alienware RTX 3080

Overclock Tesla K80 or M40 GPU to boost performance

nvidia-smi -q -i 0 -d CLOCK

sudo nvidia-smi -pm ENABLED -i 0

sudo nvidia-smi -rac -i 0

nvidia-smi -q -i 0 -d SUPPORTED_CLOCKS

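sudo nvidia-smi -ac 3004,875 -i 0

# The -ac values are <memory clock,graphics clock> in MHz; use a pair listed by the SUPPORTED_CLOCKS query above.

# For K80, repeat these steps for the 2nd GPU with "-i 1" because K80 has two GPU units in one card.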

Ubuntu log overflow causing HD/SSD out-of-space
- Disable PCIe power management (ASPM) for all GPU devices

sudo gedit /etc/default/grub

change to: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pcie_aspm=off"

or try: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer"

also try: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"

sudo update-grub

- Clear log files to save space

sudo su

echo "" > /var/log/kern.log

echo "" > /var/log/syslog

service syslog restart

journalctl --vacuum-size=50M
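Before clearing anything, it can help to see which log files are actually taking the space. The sketch below is a hypothetical helper, not part of the original steps: `largest_files` walks a directory and returns the biggest regular files, and the default of five results is an arbitrary choice.

```python
import os


def largest_files(directory, top_n=5):
    """Return (size_in_bytes, path) for the largest regular files under directory."""
    sizes = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.isfile(path) and not os.path.islink(path):
                    sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # unreadable or vanished file: skip it
    return sorted(sizes, reverse=True)[:top_n]


for size, path in largest_files("/var/log"):
    print(f"{size / 1e6:10.1f} MB  {path}")
```

Some files under /var/log are readable only by root, so running the script with sudo may give a more complete picture.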

Nvidia GPU overheating problem
- Dried-out thermal paste
- Inefficient thermal pad
- Poor ventilation
- Bad GPU x16 extension cable
- GPU is connected to PCIe at a lower generation, such as gen3 -> gen2

nvidia-smi --query-gpu=pcie.link.gen.max,pcie.link.gen.current --format=csv

- GPU is connected to PCIe at a lower width/lane, such as x16 -> x4

nvidia-smi --query-gpu=pcie.link.width.max,pcie.link.width.current --format=csv

- Lower the GPU power limit, for example to 100 W, to prevent overheating damage

nvidia-smi -q -d power

sudo nvidia-smi -i 0 -pm enabled

sudo nvidia-smi -i 0 -pl 100
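The two PCIe link checks above can be combined into one query and scanned automatically when a machine has several GPUs. The sketch below parses the CSV text that `nvidia-smi --query-gpu=pcie.link.gen.max,pcie.link.gen.current,pcie.link.width.max,pcie.link.width.current --format=csv` prints. It assumes the header row repeats the queried field names, and the sample text is hand-written for illustration, not captured from a real machine.

```python
import csv
import io


def parse_link_report(csv_text):
    """Parse the CSV output of the nvidia-smi PCIe link query into dicts."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [row for row in reader if row]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, (int(v) for v in row))) for row in data]


def downgraded(gpu):
    """True if the current PCIe generation or width is below the maximum."""
    return (gpu["pcie.link.gen.current"] < gpu["pcie.link.gen.max"]
            or gpu["pcie.link.width.current"] < gpu["pcie.link.width.max"])


# Hypothetical output for two GPUs, written by hand for illustration
sample = """pcie.link.gen.max, pcie.link.gen.current, pcie.link.width.max, pcie.link.width.current
3, 3, 16, 16
3, 2, 16, 4
"""

for index, gpu in enumerate(parse_link_report(sample)):
    print(index, "DOWNGRADED" if downgraded(gpu) else "ok")
```

On a real machine the text would come from running the command, for example through `subprocess.run`. The current generation may read lower while a GPU is idle because of power saving, so it is probably safer to take the reading while a training job is running.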