AI for Finance Research Track
Overview
This page is for students who want to progress from machine learning, reinforcement learning, and LLM basics to AI-for-finance systems, and eventually work on research topics such as:
Financial Reinforcement Learning and Benchmarking
Financial LLMs and AI Agents
Data-Centric / Infrastructure-Centric AI for Finance
This page is designed for self-study. The goal is not to start from heavy economic theory, but to build enough intuition and engineering experience to understand recent AI-for-finance research and eventually propose practical research ideas that are feasible from an AI engineering perspective.
How to use this page
Start with Core AI Finance (Part V).
Choose one track among RL / LLM & Agents / Data & Infrastructure.
Read the papers in your chosen track in the recommended order.
Reproduce at least one baseline using the provided code.
What I am looking for
I am looking for students who can grow into reading, reproducing, and eventually extending recent AI-for-finance papers.
Good candidates usually:
know how to train a basic model in PyTorch
can read experimental results carefully
can compare methods instead of only summarizing them
can reproduce at least one public codebase
can propose a small but concrete research question
Part I. What I recommend students do first
Option 1. Easiest and most practical path
Read the AI-for-finance overview paper
Read FinRL
Read FinRL-Meta
Reproduce one small FinRL baseline on a standard market environment
Then move to FinGPT or FinRobot
Option 2. Best path for publishable ideas
Read the AI-for-finance overview paper
Read FinRL-Meta
Read FinGPT
Read FinRobot
Choose one among FinRL Contests / FinRobot / data-centric evaluation ideas
Option 3. Best path for students interested in current trends
FinGPT
FinRobot
FinRL Contests
one benchmark / evaluation paper
one adaptation or agent-design project
Part II. Suggested first mini-projects
Students should not begin with a huge project. Start with a small project that can realistically be finished.
Project idea A — financial RL engineering
Compare:
FinRL
FinRL-Meta
one simple RL baseline in ElegantRL
on a small stock-trading or portfolio-allocation benchmark. Then study reproducibility, environment design, and stability rather than trying to beat the market in a highly realistic setting.
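One way to approach the stability part of this project is to repeat the same experiment under several random seeds and report the spread instead of a single best run. The sketch below is a minimal illustration using only the Python standard library. `run_experiment` is a hypothetical stand-in for one train-and-evaluate run and only simulates a noisy cumulative return, so it is not FinRL or ElegantRL code.

```python
import random
import statistics


def run_experiment(seed: int, n_steps: int = 250) -> float:
    """Hypothetical stand-in for one train-and-evaluate run.

    It simulates a cumulative return from seeded random daily returns,
    so the same seed always produces the same number.
    """
    rng = random.Random(seed)  # local RNG, no hidden global state
    value = 1.0
    for _ in range(n_steps):
        daily_return = rng.gauss(0.0, 0.01)  # assumed toy return model
        value *= 1.0 + daily_return
    return value - 1.0  # cumulative return over the episode


def summarize(seeds: list[int]) -> dict[str, float]:
    """Run every seed and report mean, sample std, min, and max."""
    results = [run_experiment(seed) for seed in seeds]
    return {
        "mean": statistics.mean(results),
        "std": statistics.stdev(results),  # needs at least two seeds
        "min": min(results),
        "max": max(results),
    }


if __name__ == "__main__":
    for name, value in summarize([0, 1, 2, 3, 4]).items():
        print(f"{name}: {value:+.4f}")
```

In a real comparison, `run_experiment` would wrap the actual training script, and two methods would be judged on whether their seed-to-seed ranges overlap, not on their best single run.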
Project idea B — financial LLM / agent systems
Compare:
FinGPT
FinRobot
one smaller open-source base model with simple prompting or LoRA
and study whether specialized data curation or tool use improves financial QA, sentiment understanding, or report generation.
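For the sentiment part of this comparison, it helps to fix the evaluation code before touching any model. The sketch below computes accuracy and macro-F1 for several systems on the same labeled examples. The labels and predictions are made-up placeholders, and the system names are only dictionary keys; nothing here calls FinGPT or FinRobot.

```python
def accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Fraction of examples where the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: list[str], y_pred: list[str], labels: list[str]) -> float:
    """Unweighted mean of per-class F1, using F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def compare(y_true: list[str], predictions: dict[str, list[str]],
            labels: list[str]) -> dict[str, dict[str, float]]:
    """Score every system on the same gold labels."""
    return {
        name: {
            "accuracy": accuracy(y_true, y_pred),
            "macro_f1": macro_f1(y_true, y_pred, labels),
        }
        for name, y_pred in predictions.items()
    }


if __name__ == "__main__":
    labels = ["pos", "neg", "neu"]
    gold = ["pos", "neg", "neu", "pos", "neg"]  # placeholder labels
    predictions = {  # placeholder outputs from two hypothetical systems
        "system_a": ["pos", "neg", "pos", "pos", "neu"],
        "system_b": ["pos", "pos", "pos", "pos", "pos"],
    }
    for name, scores in compare(gold, predictions, labels).items():
        print(name, scores)
```

Reporting macro-F1 next to accuracy matters here because a system that always predicts the majority class can look acceptable on accuracy alone.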
Project idea C — data-centric AI finance
Compare:
FinRL-Meta
FinRL Contests
one custom cleaned dataset pipeline
and study how different data-cleaning or benchmark-design decisions change model evaluation.
Part III. What to avoid at the beginning
Do not begin with:
projects that depend on deep economic theory or market microstructure assumptions before you understand the AI pipeline
claims about live trading performance without a reproducible benchmark
large-scale proprietary-data projects before you can run a public baseline
methods with no public code unless you are already experienced
highly realistic high-frequency trading setups as the first project
The goal is to build depth through a project you can finish, not to stall because the setup is too large or too domain-specific.
Part IV. When to contact me
Contact me once some of the following statements are true for you:
I understand one of the main AI-for-finance pipelines at a conceptual level.
I selected one track among RL / LLM & Agents / Data & Infrastructure.
I read at least 3 papers in that track.
I ran at least one public codebase successfully.
I can explain the strengths and weaknesses of two methods.
I can suggest one concrete research question.
I prepared a 1–2 page memo.
Part V. Core AI Finance
These papers are the minimum background. Read them first.
1. AI-for-Finance Overview (Engineering-Oriented Survey, 2025)
Paper: Advancing Financial Engineering with Foundation Models
Why read it: broad overview of how foundation-model ideas are entering finance
Focus on: which parts are actually AI engineering problems versus finance-specific theory problems
2. FinRL (ACM ICAIF 2021 / arXiv 2020)
Paper: FinRL: Deep Reinforcement Learning Framework to Automate Trading in Quantitative Finance
Project / Code: https://github.com/AI4Finance-Foundation/FinRL
Why read it: the main open-source starting point for financial RL from AI4Finance
Focus on: environment design, state/action/reward formulation, and reproducible experimentation
Good for students because: clear engineering pipeline and public code
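To make the state/action/reward formulation concrete before reading the FinRL code, here is a toy single-stock environment in plain Python. It is an illustrative sketch and not FinRL's API: the class name, the three-way action space, the one-share trade size, and the transaction cost rate are all assumptions chosen for brevity. It expects a price list with at least two entries.

```python
from dataclasses import dataclass, field

State = tuple[float, int, float]  # (current price, shares held, cash)


@dataclass
class SingleStockEnv:
    """Toy single-stock environment (illustrative, not FinRL's API).

    Action: -1 = sell one share, 0 = hold, +1 = buy one share.
    Reward: change in total portfolio value after the price moves.
    """

    prices: list[float]
    initial_cash: float = 1000.0
    cost_rate: float = 0.001  # assumed proportional transaction cost
    t: int = field(default=0, init=False)
    shares: int = field(default=0, init=False)
    cash: float = field(default=0.0, init=False)

    def reset(self) -> State:
        self.t = 0
        self.shares = 0
        self.cash = self.initial_cash
        return self._state()

    def _state(self) -> State:
        return (self.prices[self.t], self.shares, self.cash)

    def _value(self) -> float:
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action: int) -> tuple[State, float, bool]:
        price = self.prices[self.t]
        value_before = self._value()
        cost = price * self.cost_rate
        if action == 1 and self.cash >= price + cost:
            self.shares += 1
            self.cash -= price + cost
        elif action == -1 and self.shares > 0:
            self.shares -= 1
            self.cash += price - cost
        # Hold and infeasible trades leave the portfolio unchanged.
        self.t += 1  # the market moves to the next price
        reward = self._value() - value_before
        done = self.t == len(self.prices) - 1
        return self._state(), reward, done


def rollout(env: SingleStockEnv, policy) -> float:
    """Run one episode and return the summed reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total


if __name__ == "__main__":
    env = SingleStockEnv(prices=[100.0, 102.0, 101.0, 105.0])  # made-up prices
    print(rollout(env, policy=lambda state: 1))  # naive policy: always buy
```

Because each reward is a change in portfolio value, the rewards sum to final value minus initial cash. Changing the reward definition, the cost rate, or what goes into the state changes what an agent learns, which is why the paper's environment choices deserve close reading.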
3. FinRL-Meta (Machine Learning 2024 / arXiv 2023)
Paper: Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Project / Code: https://github.com/AI4Finance-Foundation/FinRL-Meta
Why read it: very useful if you care about data pipelines and benchmark design
Focus on: DataOps, environment generation, and reproducibility
Good for students because: less economics-heavy and more systems/data-centric
4. FinGPT (arXiv 2023)
Project / Code: https://github.com/AI4Finance-Foundation/FinGPT
Why read it: the main open-source entry point for financial LLM work
Focus on: data curation, instruction tuning, and low-rank adaptation
Good for students because: highly accessible if they already know LLM fine-tuning basics
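To see why low-rank adaptation keeps fine-tuning affordable: LoRA freezes a pretrained weight matrix of shape d x k and trains only a low-rank update built from two factors, B (d x r) and A (r x k), so the trainable parameter count for that matrix drops from d*k to r*(d + k). The sketch below only does this arithmetic. The layer size and rank are assumed example values, not FinGPT's configuration.

```python
def full_params(d: int, k: int) -> int:
    """Parameters in a dense d x k weight matrix."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters in the LoRA factors B (d x r) and A (r x k)."""
    return r * (d + k)


if __name__ == "__main__":
    d, k, r = 4096, 4096, 8  # assumed example sizes
    full = full_params(d, k)
    lora = lora_params(d, k, r)
    print(f"full fine-tuning: {full:,} parameters")
    print(f"LoRA with r={r}: {lora:,} parameters ({lora / full:.2%} of full)")
```

For these example sizes the update has 65,536 trainable parameters against 16,777,216 in the full matrix, about 0.39%.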
Part VI. Track A — Financial Reinforcement Learning and Benchmarking
This is the recommended starting track for students who want a concrete engineering entry into AI for finance.
Typical question:
How do we build, benchmark, and stabilize financial RL systems without making unrealistic claims about market performance?
Why start with this track
This track helps students build intuition for:
why finance is difficult for RL from a data and environment perspective
how benchmark design changes conclusions
how to make experiments reproducible and comparable
Recommended order
A1. FinRL (ACM ICAIF 2021 / arXiv 2020)
Paper: FinRL: Deep Reinforcement Learning Framework to Automate Trading in Quantitative Finance
Why read it: the standard entry paper for AI4Finance-style financial RL
Main idea: end-to-end open-source RL pipeline for automated trading
Good for students because: simple conceptual structure and many tutorials
A2. FinRL-Meta (Machine Learning 2024)
Paper: Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Why read it: important if you want to understand why environment and data design matter so much
Main idea: dynamic datasets + automatic market-environment construction
Good for students because: practical and reproducibility-oriented
A3. FinRL Contests (arXiv 2025)
Paper: FinRL Contests: Benchmarking Data-driven Financial Reinforcement Learning from 2023 to 2025
Why read it: useful if you want a modern benchmark-centric view rather than a single algorithm paper
Main idea: standardized tasks, starter kits, and reproducible evaluation for financial RL
Good for students because: focuses on benchmarking and engineering rather than economics-heavy modeling
What students should reproduce first in this track
Choose one:
FinRL
FinRL-Meta
Then read one follow-up paper:
FinRL Contests
Good starter benchmarks for this track
single-stock trading environments
simple portfolio-allocation tasks
contest-style public benchmark settings
Avoid highly realistic high-frequency trading setups at the beginning.
Part VII. Track B — Financial LLMs and AI Agents
This track is for students who are interested in using LLMs, instruction tuning, and agent systems for financial applications.
Typical question:
Can we build finance-specialized LLM or agent systems that are useful, reproducible, and not overly dependent on proprietary infrastructure?
Why this track is good
This track is suitable for students who want:
a modern AI-system angle rather than a market-theory angle
relatively intuitive algorithmic ideas
accessible experimentation with open-source models and data
Recommended order
B1. FinGPT (arXiv 2023)
Why read it: the clearest open-source entry point for finance-specific LLM work
Main idea: data-centric LLM pipeline for finance with instruction tuning and LoRA
Good for students because: easier to reproduce than building a new domain model from scratch
B2. FinRobot (arXiv 2024)
Paper: FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models
Why read it: useful if you want to move from one-model pipelines to multi-agent financial workflows
Main idea: finance-specialized LLM agents with tool use and chain-of-thought-style task decomposition
Good for students because: highly engineering-oriented and easy to connect to demos
B3. FinGPT-Research (GitHub Research Hub)
Project / Code: https://github.com/AI4Finance-Foundation/FinGPT-Research
Why read it: useful if you want practical fine-tuning and replication materials
Main idea: hands-on LoRA, sentiment, and task-specific FinGPT workflows
Good for students because: closer to an engineering lab notebook than a purely theoretical paper
What students should reproduce first in this track
Choose one:
FinGPT
FinRobot
Then use one follow-up resource:
FinGPT-Research
Good starter benchmarks for this track
sentiment classification
financial QA
earnings-call or report summarization
agent-assisted research workflows
Avoid large closed-source agent stacks at the beginning.
Part VIII. Track C — Data-Centric / Infrastructure-Centric AI for Finance
This track is for students interested in the engineering question of how to build better datasets, evaluation pipelines, and reusable infrastructure for AI-finance research.
Typical question:
If we keep the model fixed, how much can data quality, environment design, and evaluation protocol change the result?
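A small illustration of this question: hold the "model" fixed as plain buy-and-hold, and change only the data cleaning. The sketch below uses a made-up price series with one bad tick and an assumed outlier rule (forward-fill any move larger than 50% from the last accepted price). Both the data and the threshold are hypothetical.

```python
def max_drawdown(prices: list[float]) -> float:
    """Largest peak-to-trough decline of a buy-and-hold position, as a fraction."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)
        worst = max(worst, (peak - price) / peak)
    return worst


def forward_fill_outliers(prices: list[float], max_jump: float = 0.5) -> list[float]:
    """Replace any price that moves more than max_jump (as a fraction)
    from the last accepted price with that last accepted price."""
    cleaned = [prices[0]]
    for price in prices[1:]:
        last = cleaned[-1]
        if abs(price - last) / last > max_jump:
            cleaned.append(last)  # treat the move as a bad tick
        else:
            cleaned.append(price)
    return cleaned


if __name__ == "__main__":
    raw = [100.0, 101.0, 10.2, 103.0, 102.0, 104.0]  # made-up, one bad tick
    clean = forward_fill_outliers(raw)
    print(f"max drawdown on raw data:     {max_drawdown(raw):.2%}")
    print(f"max drawdown on cleaned data: {max_drawdown(clean):.2%}")
```

The same strategy shows a drawdown of roughly 90% on the raw series and under 1% on the cleaned one. The cleaning rule is itself a design decision: a threshold like this would also erase a genuine crash, which is the kind of trade-off this track studies.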
Why this track is attractive
directly useful for reproducible research
easier to scale down into manageable student projects
less dependent on beating strong proprietary systems
strongly aligned with AI engineering skills
Recommended order
C1. FinRL-Meta (Machine Learning 2024)
Paper: Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Why read it: the strongest AI4Finance paper for data and benchmark infrastructure
Main idea: automatic curation of dynamic datasets into market environments
Good for students because: highly engineering-oriented and benchmark-friendly
C2. FinRL Contests (arXiv 2025)
Paper: FinRL Contests: Benchmarking Data-driven Financial Reinforcement Learning from 2023 to 2025
Why read it: useful if you care about standardized evaluation and starter kits
Main idea: benchmark design for financial RL across multiple tasks
Good for students because: clearly focused on reproducibility and evaluation
C3. ElegantRL (GitHub Framework)
Project / Code: https://github.com/AI4Finance-Foundation/ElegantRL
Why read it: lightweight RL framework useful for cleaner baselines and smaller experiments
Main idea: structurally clean RL implementation with minimal engineering overhead
Good for students because: suitable for fast baseline testing without too much framework complexity
What students should reproduce first in this track
Choose one:
FinRL-Meta
one simple benchmark in FinRL Contests
Then use one follow-up resource:
ElegantRL
Good starter benchmarks for this track
single-market public datasets
portfolio-allocation toy tasks
simple contest-style starter environments
Avoid broad infrastructure projects until you have reproduced at least one benchmark.
Part IX. Recommended code / benchmark libraries
Use these libraries so that you do not spend your time building everything from scratch.
1. FinRL
Why use it: the main open-source AI4Finance RL entry point
2. FinRL-Meta
Why use it: useful for data-centric benchmark design and environment construction
3. FinGPT
Why use it: the main open-source financial LLM entry point
4. FinRobot
Why use it: useful for AI-agent-style finance applications
5. ElegantRL
Why use it: lightweight RL framework for smaller and cleaner experiments
6. FinRL-Tutorials
GitHub: https://github.com/AI4Finance-Foundation/FinRL-Tutorials
Why use it: practical demos and entry-level tutorials for the ecosystem