AI for Finance Research Track

Overview

This page is for students who want to progress from machine learning, reinforcement learning, and LLM basics to AI-for-finance systems, and eventually work on research topics such as:

Financial Reinforcement Learning and Benchmarking
Financial LLMs and AI Agents
Data-Centric / Infrastructure-Centric AI for Finance

This page is designed for self-study. The goal is not to start from heavy economic theory, but to build enough intuition and engineering experience to understand recent AI-for-finance research and eventually propose practical research ideas that are feasible from an AI engineering perspective.

How to use this page

Start with Core AI Finance (Part V).
Choose one track: RL (Track A), LLMs & Agents (Track B), or Data & Infrastructure (Track C).
Read papers in order.
Reproduce at least one baseline using the provided code.

What I am looking for

I am looking for students who can gradually become capable of reading, reproducing, and eventually extending recent AI-for-finance papers.

Good candidates usually:

know basic PyTorch training
can read experiments carefully
can compare methods instead of only summarizing them
can reproduce at least one public codebase
can propose a small but concrete research question

Part I. What I recommend students do first

Option 1. Easiest and most practical path

Read the AI-for-finance overview paper
Read FinRL
Read FinRL-Meta
Reproduce one small FinRL baseline on a standard market environment
Then move to FinGPT or FinRobot

Option 2. Best path for publishable ideas

Read the AI-for-finance overview paper
Read FinRL-Meta
Read FinGPT
Read FinRobot
Choose one direction: FinRL Contests, FinRobot, or a data-centric evaluation idea

Option 3. Best path for students interested in current trends

FinGPT
FinRobot
FinRL Contests
one benchmark / evaluation paper
one adaptation or agent-design project

Part II. Suggested first mini-projects

Students should not begin with a huge project. Start with a small project that can realistically be finished.

Project idea A — financial RL engineering

Compare:

FinRL
FinRL-Meta
one simple RL baseline in ElegantRL

on a small stock-trading or portfolio-allocation benchmark. Then study reproducibility, environment design, and stability rather than trying to beat the market in a highly realistic setting.
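As a minimal sketch of what studying reproducibility and stability can look like, the snippet below runs the same configuration under several random seeds and reports the spread of the final return. The function run_toy_backtest is a hypothetical stand-in for one full training and evaluation run; it uses a synthetic random walk and a random policy, and it is not part of FinRL, FinRL-Meta, or ElegantRL.

```python
import random
import statistics


def run_toy_backtest(seed: int, n_steps: int = 250) -> float:
    """Hypothetical stand-in for one training + evaluation run.

    Returns the final cumulative return of a random long/flat policy
    on a synthetic random-walk return series.
    """
    rng = random.Random(seed)  # local RNG: same seed gives the same run
    wealth = 1.0
    for _ in range(n_steps):
        position = rng.choice([0, 1])           # 0 = flat, 1 = fully invested
        daily_return = rng.gauss(0.0003, 0.01)  # synthetic market return (assumed)
        wealth *= 1.0 + position * daily_return
    return wealth - 1.0


def summarize_over_seeds(seeds):
    """Run the same configuration under several seeds and report the spread."""
    results = [run_toy_backtest(s) for s in seeds]
    return {
        "mean": statistics.mean(results),
        "std": statistics.stdev(results),
        "min": min(results),
        "max": max(results),
    }


if __name__ == "__main__":
    print(summarize_over_seeds(range(10)))
```

In a real comparison, each method would replace run_toy_backtest, and a difference between two methods would only be reported if it is clearly larger than the seed-to-seed spread.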

Project idea B — financial LLM / agent systems

Compare:

FinGPT
FinRobot
one smaller open-source base model with simple prompting or LoRA

and study whether specialized data curation or tool use improves financial QA, sentiment understanding, or report generation.
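For the sentiment part of such a comparison, a small metric helper is often enough to start. The sketch below computes accuracy and macro-F1 for three sentiment labels; the label names and the example predictions are assumptions made for illustration, not outputs of FinGPT or FinRobot.

```python
LABELS = ("negative", "neutral", "positive")  # assumed label set


def accuracy(y_true, y_pred):
    """Fraction of examples where the predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = ["positive", "negative", "neutral", "positive"]
    # Hypothetical predictions from a model that never outputs "neutral".
    pred = ["positive", "negative", "positive", "positive"]
    print("accuracy:", accuracy(gold, pred))
    print("macro-F1:", macro_f1(gold, pred))
```

Reporting both numbers for every method makes it harder to miss a model that scores well on accuracy while ignoring one class entirely.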

Project idea C — data-centric AI finance

Compare:

FinRL-Meta
FinRL Contests
one custom cleaned dataset pipeline

and study how different data-cleaning or benchmark-design decisions change model evaluation.
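A small illustration of why cleaning decisions matter: the sketch below applies two common ways of handling missing prices (forward-filling versus dropping) to the same toy series and compares the resulting return statistics. The price series is made up for the example, and the helper names are hypothetical.

```python
import statistics


def forward_fill(prices):
    """Replace each missing price (None) with the last observed price."""
    filled, last = [], None
    for p in prices:
        if p is not None:
            last = p
        if last is not None:  # skip leading gaps with nothing to carry forward
            filled.append(last)
    return filled


def drop_missing(prices):
    """Remove missing prices entirely, shortening the series."""
    return [p for p in prices if p is not None]


def simple_returns(prices):
    """Period-over-period simple returns."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


if __name__ == "__main__":
    raw = [100.0, None, None, 103.0, 102.0, None, 105.0]  # toy data
    for name, clean in (("ffill", forward_fill), ("drop", drop_missing)):
        rets = simple_returns(clean(raw))
        print(name, len(rets), statistics.mean(rets), statistics.stdev(rets))
```

Forward-filling inserts zero-return periods, which changes the number of observations, the mean, and the volatility estimate, so any metric built on them (for example a Sharpe ratio) depends on the cleaning rule as well as on the model.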

Part III. What to avoid at the beginning

Do not begin with:

projects that depend on deep economic theory or market microstructure assumptions before you understand the AI pipeline
claims about live trading performance without a reproducible benchmark
large-scale proprietary-data projects before you can run a public baseline
methods with no public code unless you are already experienced
highly realistic high-frequency trading setups as the first project

The goal is to build depth, not to stall because the setup is too large or too domain-specific.

Part IV. When to contact me

Contact me once several of the following statements are true for you.

I understand one of the main AI-for-finance pipelines at a conceptual level.
I have selected one track: RL, LLMs & Agents, or Data & Infrastructure.
I have read at least 3 papers in that track.
I have run at least one public codebase successfully.
I can explain the strengths and weaknesses of two methods.
I can suggest one concrete research question.
I have prepared a 1–2 page memo.

Part V. Core AI Finance

These papers are the minimum background. Read them first.

1. AI-for-Finance Overview (Engineering-Oriented Survey, 2025)

Paper: Advancing Financial Engineering with Foundation Models
Why read it: broad overview of how foundation-model ideas are entering finance
Focus on: which parts are actually AI engineering problems versus finance-specific theory problems

2. FinRL (ACM ICAIF 2021 / arXiv 2020)

Paper: FinRL: Deep Reinforcement Learning Framework to Automate Trading in Quantitative Finance
Project / Code: https://github.com/AI4Finance-Foundation/FinRL
Why read it: the main open-source starting point for financial RL from AI4Finance
Focus on: environment design, state/action/reward formulation, and reproducible experimentation
Good for students because: clear engineering pipeline and public code
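To make the state/action/reward vocabulary concrete before reading the paper, here is a minimal sketch of a single-stock environment. It is an illustration only: the class name ToySingleStockEnv, the one-share action set, and the 0.1% cost rate are assumptions, and this is not FinRL's actual environment class.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToySingleStockEnv:
    """Illustrative single-stock trading environment (hypothetical, not FinRL's)."""
    prices: List[float]
    initial_cash: float = 1000.0
    cost_rate: float = 0.001  # proportional transaction cost (assumed value)
    t: int = field(default=0, init=False)
    cash: float = field(default=0.0, init=False)
    shares: int = field(default=0, init=False)

    def reset(self) -> Tuple[float, float, int]:
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0
        return self._state()

    def _state(self) -> Tuple[float, float, int]:
        # State: current price, cash balance, number of shares held.
        return (self.prices[self.t], self.cash, self.shares)

    def _value(self) -> float:
        # Total portfolio value marked at the current price.
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action: int):
        """Action: +1 buys one share, -1 sells one share, 0 holds."""
        price = self.prices[self.t]
        value_before = self._value()
        if action == 1 and self.cash >= price * (1 + self.cost_rate):
            self.cash -= price * (1 + self.cost_rate)
            self.shares += 1
        elif action == -1 and self.shares > 0:
            self.cash += price * (1 - self.cost_rate)
            self.shares -= 1
        self.t += 1
        done = self.t == len(self.prices) - 1
        # Reward: change in total portfolio value over this step.
        reward = self._value() - value_before
        return self._state(), reward, done
```

While reading FinRL, it helps to note where its environments differ from this toy version, for example in the state features, the action space, and how costs enter the reward.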

3. FinRL-Meta (Machine Learning 2024 / arXiv 2023)

Paper: Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Project / Code: https://github.com/AI4Finance-Foundation/FinRL-Meta
Why read it: very useful if you care about data pipelines and benchmark design
Focus on: DataOps, environment generation, and reproducibility
Good for students because: less economics-heavy and more systems/data-centric

4. FinGPT (arXiv 2023)

Paper: FinGPT: Open-Source Financial Large Language Models
Project / Code: https://github.com/AI4Finance-Foundation/FinGPT
Why read it: the main open-source entry point for financial LLM work
Focus on: data curation, instruction tuning, and low-rank adaptation
Good for students because: highly accessible if they already know LLM fine-tuning basics
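As background for the low-rank adaptation part, the following sketch shows the core LoRA idea on a single weight matrix: the pretrained weight W stays frozen, and only two small matrices A and B are trained, giving an effective weight W + (alpha / r) * B A. It uses plain Python lists to stay dependency-free; the helper names and the matrix sizes are illustrative assumptions, and this is not FinGPT's training code.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, where W is frozen and A, B are trainable."""
    delta = matmul(b, a)  # shape (d_out, d_in)
    scale = alpha / rank
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, rank):
    """Parameter counts for full fine-tuning versus a rank-r adapter on one matrix."""
    return {"full": d_out * d_in, "lora": rank * (d_out + d_in)}


if __name__ == "__main__":
    rng = random.Random(0)
    d_out, d_in, rank = 4, 6, 2
    w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
    a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # small random init
    b = [[0.0] * rank for _ in range(d_out)]  # zero init: the adapter starts as a no-op
    assert lora_effective_weight(w, a, b, alpha=16, rank=rank) == w
    print(trainable_params(4096, 4096, 8))
```

The parameter count is the practical point: for a square 4096-wide matrix, a rank-8 adapter trains a small fraction of the weights that full fine-tuning would update.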

Part VI. Track A — Financial Reinforcement Learning and Benchmarking

This track is the recommended starting point for students who want a concrete engineering entry into AI for finance.

Typical question:

How do we build, benchmark, and stabilize financial RL systems without overclaiming unrealistic market performance?

Why start with this track

This track helps students build intuition for:

why finance is difficult for RL from a data and environment perspective
how benchmark design changes conclusions
how to make experiments reproducible and comparable
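One way to see how benchmark design changes conclusions is to score the same two strategies with and without transaction costs. The sketch below does this on a synthetic alternating return series; the return values, the two hand-written position sequences, and the cost model (a proportional charge on turnover) are all assumptions chosen to make the effect visible.

```python
def net_return(returns, positions, cost_rate):
    """Compound strategy returns, charging cost_rate per unit of turnover."""
    wealth = 1.0
    prev_pos = 0.0
    for r, pos in zip(returns, positions):
        turnover = abs(pos - prev_pos)         # how much the position changed
        wealth *= 1.0 - cost_rate * turnover   # pay the trading cost first
        wealth *= 1.0 + pos * r                # then earn the period return
        prev_pos = pos
    return wealth - 1.0


if __name__ == "__main__":
    returns = [0.02, -0.01] * 10   # synthetic alternating market returns
    buy_and_hold = [1.0] * 20      # trades once, at the start
    switcher = [1.0, 0.0] * 10     # trades every period (toy pattern fitted to the data)
    for cost in (0.0, 0.01):
        print(
            "cost", cost,
            "hold", round(net_return(returns, buy_and_hold, cost), 4),
            "switch", round(net_return(returns, switcher, cost), 4),
        )
```

With zero cost the high-turnover strategy ranks first; with a 1% cost per unit of turnover the ranking reverses. The model did not change, only the benchmark assumption did.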

Recommended order

A1. FinRL (ACM ICAIF 2021 / arXiv 2020)

Paper: FinRL: Deep Reinforcement Learning Framework to Automate Trading in Quantitative Finance
Code: https://github.com/AI4Finance-Foundation/FinRL
Why read it: the standard entry paper for AI4Finance-style financial RL
Main idea: end-to-end open-source RL pipeline for automated trading
Good for students because: simple conceptual structure and many tutorials

A2. FinRL-Meta (Machine Learning 2024 / arXiv 2023)

Paper: Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Code: https://github.com/AI4Finance-Foundation/FinRL-Meta
Why read it: important if you want to understand why environment and data design matter so much
Main idea: dynamic datasets + automatic market-environment construction
Good for students because: practical and reproducibility-oriented

A3. FinRL Contests (arXiv 2025)

Paper: FinRL Contests: Benchmarking Data-driven Financial Reinforcement Learning from 2023 to 2025
Why read it: useful if you want a modern benchmark-centric view rather than a single algorithm paper
Main idea: standardized tasks, starter kits, and reproducible evaluation for financial RL
Good for students because: focuses on benchmarking and engineering rather than economics-heavy modeling
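Reproducible evaluation usually starts with a fixed, shared set of metric definitions. The sketch below implements three metrics that are commonly reported for trading strategies (cumulative return, annualized Sharpe ratio, and maximum drawdown). Treat it as an assumption about what a starter evaluation could contain; the exact metrics and conventions used in the contests should be taken from the paper and its starter kits.

```python
import math
import statistics


def cumulative_return(returns):
    """Total compounded return over the whole period."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth - 1.0


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio; risk_free is an annual rate (assumed convention)."""
    excess = [r - risk_free / periods_per_year for r in returns]
    std = statistics.stdev(excess)
    if std == 0:
        return float("nan")  # undefined when returns never vary
    return statistics.mean(excess) / std * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss, as a negative fraction (0.0 if none)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


if __name__ == "__main__":
    toy_returns = [0.10, -0.50, 0.20]  # made-up period returns
    print(cumulative_return(toy_returns), max_drawdown(toy_returns))
```

Fixing choices such as periods_per_year and the drawdown sign convention in one shared module is a simple way to keep results comparable across methods.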

What students should reproduce first in this track

Choose one:

FinRL
FinRL-Meta

Then use one follow-up reading:

FinRL Contests

Good starter benchmarks for this track

single-stock trading environments
simple portfolio-allocation tasks
contest-style public benchmark settings

Avoid highly realistic high-frequency trading setups at the beginning.

Part VII. Track B — Financial LLMs and AI Agents

This track is for students who are interested in using LLMs, instruction tuning, and agent systems for financial applications.

Typical question:

Can we build finance-specialized LLM or agent systems that are useful, reproducible, and not overly dependent on proprietary infrastructure?

Why this track is good

This track is suitable for students who want:

a modern AI-system angle rather than a market-theory angle
relatively intuitive algorithmic ideas
accessible experimentation with open-source models and data

Recommended order

B1. FinGPT (arXiv 2023)

Paper: FinGPT: Open-Source Financial Large Language Models
Code: https://github.com/AI4Finance-Foundation/FinGPT
Why read it: the clearest open-source entry point for finance-specific LLM work
Main idea: data-centric LLM pipeline for finance with instruction tuning and LoRA
Good for students because: easier to reproduce than building a new domain model from scratch

B2. FinRobot (arXiv 2024)

Paper: FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models
Code: https://github.com/AI4Finance-Foundation/FinRobot
Why read it: useful if you want to move from one-model pipelines to multi-agent financial workflows
Main idea: finance-specialized LLM agents with tool use and chain-of-thought-style task decomposition
Good for students because: highly engineering-oriented and easy to connect to demos
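The tool-use pattern can be understood without any LLM at all. In the sketch below, a rule-based plan function stands in for the LLM planner: it decomposes a question into (tool, argument) steps, and run_agent executes them against a small tool registry. Everything here (the tool names, the toy price data, the keyword-based planner) is hypothetical and is not FinRobot's API.

```python
from typing import Callable, Dict, List, Tuple

# Toy in-memory data so the example needs no network access (made-up values).
TOY_PRICES = {"AAA": [10.0, 10.5, 11.0], "BBB": [20.0, 19.0, 18.5]}


def get_last_price(ticker: str) -> float:
    return TOY_PRICES[ticker][-1]


def get_period_return(ticker: str) -> float:
    series = TOY_PRICES[ticker]
    return series[-1] / series[0] - 1.0


# Tool registry: the agent may only call functions listed here.
TOOLS: Dict[str, Callable[[str], float]] = {
    "last_price": get_last_price,
    "period_return": get_period_return,
}


def plan(question: str) -> List[Tuple[str, str]]:
    """Stand-in for an LLM planner: map a question to (tool, argument) steps."""
    tool = "period_return" if "return" in question.lower() else "last_price"
    return [(tool, ticker) for ticker in TOY_PRICES if ticker in question]


def run_agent(question: str) -> Dict[str, float]:
    """Execute each planned step and collect tool outputs keyed by call."""
    return {f"{tool}({arg})": TOOLS[tool](arg) for tool, arg in plan(question)}


if __name__ == "__main__":
    print(run_agent("What is the return of AAA and BBB?"))
```

In an LLM-based system, plan would be replaced by a model call that outputs structured tool requests; the registry and the execution loop keep the same shape, which is what makes such systems testable.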

B3. FinGPT-Research (GitHub Research Hub)

Project / Code: https://github.com/AI4Finance-Foundation/FinGPT-Research
Why read it: useful if you want practical fine-tuning and replication materials
Main idea: hands-on LoRA, sentiment, and task-specific FinGPT workflows
Good for students because: closer to an engineering lab notebook than a purely theoretical paper

What students should reproduce first in this track

Choose one:

FinGPT
FinRobot

Then use one follow-up resource:

FinGPT-Research

Good starter benchmarks for this track

sentiment classification
financial QA
earnings-call or report summarization
agent-assisted research workflows

Avoid large closed-source agent stacks at the beginning.

Part VIII. Track C — Data-Centric / Infrastructure-Centric AI for Finance

This track is for students interested in the engineering question of how to build better datasets, evaluation pipelines, and reusable infrastructure for AI-finance research.

Typical question:

If we keep the model fixed, how much can data quality, environment design, and evaluation protocol change the result?
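A toy version of this question: keep a trivial model fixed (predict the training mean) and change only the evaluation protocol. The sketch below compares a chronological train/test split with a shuffled split on a synthetic trending series. The series, the split fraction, and the helper names are assumptions for illustration.

```python
import random


def mean_predictor_mae(train, test):
    """Fixed 'model': predict the training mean; score by mean absolute error."""
    prediction = sum(train) / len(train)
    return sum(abs(y - prediction) for y in test) / len(test)


def chronological_split(series, train_frac=0.8):
    """Train on the past, test on the future."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]


def shuffled_split(series, train_frac=0.8, seed=0):
    """Randomly mix time steps, so test points can precede training points."""
    shuffled = list(series)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    series = [100.0 + 0.5 * t for t in range(200)]  # synthetic upward trend
    print("chronological MAE:", mean_predictor_mae(*chronological_split(series)))
    print("shuffled MAE:     ", mean_predictor_mae(*shuffled_split(series)))
```

On a trending series the shuffled split reports a noticeably lower error, because the test points are drawn from the same range as the training points. The model is identical in both cases; only the protocol differs, which is the kind of effect this track asks students to measure.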

Why this track is attractive

directly useful for reproducible research
easier to scale down into manageable student projects
less dependent on beating strong proprietary systems
strongly aligned with AI engineering skills

Recommended order

C1. FinRL-Meta (Machine Learning 2024 / arXiv 2023)

Paper: Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Code: https://github.com/AI4Finance-Foundation/FinRL-Meta
Why read it: the strongest AI4Finance paper for data and benchmark infrastructure
Main idea: automatic curation of dynamic datasets into market environments
Good for students because: highly engineering-oriented and benchmark-friendly

C2. FinRL Contests (arXiv 2025)

Paper: FinRL Contests: Benchmarking Data-driven Financial Reinforcement Learning from 2023 to 2025
Why read it: useful if you care about standardized evaluation and starter kits
Main idea: benchmark design for financial RL across multiple tasks
Good for students because: clearly focused on reproducibility and evaluation

C3. ElegantRL (GitHub Framework)

Project / Code: https://github.com/AI4Finance-Foundation/ElegantRL
Why read it: lightweight RL framework useful for cleaner baselines and smaller experiments
Main idea: structurally clean RL implementation with minimal engineering overhead
Good for students because: suitable for fast baseline testing without too much framework complexity

What students should reproduce first in this track

Choose one:

FinRL-Meta
one simple benchmark in FinRL Contests

Then use one follow-up resource:

ElegantRL

Good starter benchmarks for this track

single-market public datasets
portfolio-allocation toy tasks
simple contest-style starter environments

Avoid overly broad infrastructure projects until you have reproduced at least one benchmark.

Part IX. Recommended code / benchmark libraries

These libraries are useful because they spare students from building everything from scratch.

1. FinRL

GitHub: https://github.com/AI4Finance-Foundation/FinRL
Why use it: the main open-source AI4Finance RL entry point

2. FinRL-Meta

GitHub: https://github.com/AI4Finance-Foundation/FinRL-Meta
Why use it: useful for data-centric benchmark design and environment construction

3. FinGPT

GitHub: https://github.com/AI4Finance-Foundation/FinGPT
Why use it: the main open-source financial LLM entry point

4. FinRobot

GitHub: https://github.com/AI4Finance-Foundation/FinRobot
Why use it: useful for AI-agent-style finance applications

5. ElegantRL

GitHub: https://github.com/AI4Finance-Foundation/ElegantRL
Why use it: lightweight RL framework for smaller and cleaner experiments

6. FinRL-Tutorials

GitHub: https://github.com/AI4Finance-Foundation/FinRL-Tutorials
Why use it: practical demos and entry-level tutorials for the ecosystem