Currently I run a quantitative strategies group on Wall Street. Previously I was a PhD student in Computer Science (Machine Learning Department) at Carnegie Mellon, advised by Geoff Gordon. I am also a candidate for the CFA (Chartered Financial Analyst) designation: I completed the Level III exam in June 2007.
This is a list of finance-related papers that I have read. I originally kept these summaries for my own use, so I could quickly refer back to papers or concepts I found interesting; eventually the list grew long enough that I decided to post it. I also added rough scores to papers based on how much I liked and/or understood them.
Feel free to email me with comments if you agree or disagree with my summaries, or if I made a significant mistake (my email address is at the top of this page).
Machine Learning and Finance papers (may require some knowledge of Machine Learning)
Pure Machine Learning (theory and algorithms)
Trading Strategies (things like momentum, value, pairs trading, trading volatility with GARCH models)
Stock and fund returns and betas. These papers require some knowledge of portfolio theory.
Derivatives: Papers on stochastic processes for modelling derivatives
Numerical Methods: Derivative Pricing
Other Finance and Machine Learning Papers (a very random selection)
So far my favorite paper was:
"The Interaction of Value and Momentum Strategies" (Clifford S Asness)
Plots of Option Greeks
These are some articles I wrote and tried, unsuccessfully, to get published in the Wall Street Journal. In both articles I try to use basic economics and common sense to make interesting, if not practical, points.
My opinion pieces are here.
Current research projects I'm working on or have worked on:
My KDD project looks at how buy-and-hold has optimal worst-case regret in the portfolio selection problem. I generalize this result to long-short portfolios and show how to incorporate "feature bets" (like value and momentum) with feature portfolios. Here as a PDF.
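A toy illustration of the underlying guarantee (my own sketch, not the paper's construction): with gross per-period returns, an equal-weighted buy-and-hold portfolio's final wealth is the average of the individual assets' cumulative returns, so its log-wealth regret against the best single asset in hindsight can never exceed log(n).

```python
import math

def cumulative_returns(price_relatives):
    """price_relatives: per-period lists of gross returns, one per asset."""
    cum = [1.0] * len(price_relatives[0])
    for period in price_relatives:
        cum = [c * r for c, r in zip(cum, period)]
    return cum

def buy_and_hold_regret(price_relatives):
    """Log-wealth gap between the best single asset in hindsight and an
    equal-weighted buy-and-hold portfolio.  Because buy-and-hold wealth is
    the average of the assets' cumulative returns, the gap is at most log(n)
    for n assets, no matter how adversarial the return sequence is."""
    cum = cumulative_returns(price_relatives)
    bh_wealth = sum(cum) / len(cum)
    return math.log(max(cum)) - math.log(bh_wealth)
```

For two assets the regret is bounded by log(2) ≈ 0.69 regardless of the return path, which is the worst-case flavor of the result.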
This research project looks at the diversification benefits of pursuing multiple long-short trading strategies, like momentum, long-term reversal, value, size, etc. The goal is to see how these strategies interact, and I find that correlations between the strategies' returns are very low. Project paper is here.
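The correlation finding can be checked with a plain Pearson correlation between any two strategies' return series. This is a generic sketch, not the project's actual code:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Low pairwise correlations are what drive the diversification benefit: the volatility of an equal-weighted combination of strategies falls well below the average volatility of the individual strategies.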
This idea uses graph algorithms (linear programming could also be used) to find the best set of transactions to undertake to buy any given security. It is novel because it takes advantage of synthetic positions using put-call parity, forward contracts, and the like, and it can also exploit relationships such as covered interest rate parity. The algorithm will not only identify risk-free arbitrage opportunities, but also report the cheapest way to buy any asset. For example, if I want yen in six months, should I trade dollars for yen at the spot rate and lend the yen for six months, lend the dollars for six months and exchange them for yen forward, use options, etc.? The algorithm will tell me the optimal set of transactions to undertake.
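One way to sketch the graph formulation (my illustration, not the actual implementation): model assets and positions as nodes, transactions as edges whose weight is the negative log of the conversion rate. Bellman-Ford then finds the cheapest chain of transactions, and a negative cycle corresponds to a risk-free arbitrage. The currencies and rates below are made up.

```python
import math

def best_conversion(edges, source, target):
    """edges: (from_asset, to_asset, rate) -- one unit of from_asset converts
    to `rate` units of to_asset.  Maximizing the product of rates along a path
    equals minimizing the sum of -log(rate), so Bellman-Ford applies.
    Returns (best_rate, path); raises ValueError if a negative cycle
    (risk-free arbitrage) is reachable."""
    nodes = {u for u, v, r in edges} | {v for u, v, r in edges}
    dist = {n: math.inf for n in nodes}
    prev = {n: None for n in nodes}
    dist[source] = 0.0
    for _ in range(len(nodes) - 1):
        for u, v, r in edges:
            w = -math.log(r)
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    # One more relaxation pass: any improvement means a negative cycle.
    for u, v, r in edges:
        if dist[u] - math.log(r) < dist[v] - 1e-12:
            raise ValueError("negative cycle: risk-free arbitrage exists")
    path = [target]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-dist[target]), path
```

Adding forwards, options, and lending as extra edges is what lets the same shortest-path machinery compare spot-then-lend against lend-then-forward and so on.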
Worked with my advisor Geoff Gordon to create a dynamic programming algorithm that finds sets of subgame perfect equilibria in a stochastic game, given one subgame perfect equilibrium to use as a punishment policy. I can't post this paper because it was recently submitted and is under review. A current extension of this work uses linear programming to solve the same problem for stochastic games.
Current Version as a CMU tech report: PDF
Worked with Hao Cen and Christos Faloutsos to create a method to analyze spatio-temporal correlation in disk traces.
Specifically, the goal is to measure the extent to which the trace-generating process favors recently-accessed disk blocks over disk blocks which haven't been accessed recently.
The method properly accounts for the fact that the marginal distribution of accesses to disk blocks may be very non-uniform. In addition, the statistics underlying the method may be used to analyze any temporally-evolving series of events drawn from a finite set. I want to use this technique to analyze periods of high volatility in stock prices (treating moves of more than 1% per day as events).
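A rough sketch of the kind of statistic involved (an assumption on my part, not the project's actual method): compare the observed rate of hits to the most recently accessed blocks against the rate expected if accesses were i.i.d. draws from the empirical marginal distribution. The marginal correction is what keeps a single very popular block from registering as temporal locality.

```python
from collections import Counter, deque

def recency_excess(trace, window=10):
    """Observed minus expected hit rate on the `window` most recent blocks.
    Expected rate assumes i.i.d. draws from the empirical marginal, so a
    positive value indicates recency preference beyond block popularity."""
    counts = Counter(trace)
    total = len(trace)
    marginal = {b: c / total for b, c in counts.items()}
    recent = deque(maxlen=window)
    hits = expected = 0.0
    n = 0
    for b in trace:
        if len(recent) == window:
            n += 1
            if b in recent:
                hits += 1
            expected += sum(marginal[x] for x in set(recent))
        recent.append(b)
    return hits / n - expected / n
```

Note that a strictly alternating trace scores zero: every access hits the recent window, but under the marginal distribution it would be expected to anyway.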
Used data from a UC data mining competition to try to predict credit defaults based on a large list of factors. This project involved working with large data sets (~600 MB learning data) and optimizing decision trees on numerical input features.
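The core numeric-feature optimization in a decision tree is the search for the best threshold split; here is a minimal single-split (Gini impurity) sketch, a stand-in for the full tree learner, with made-up data in the test rather than the competition's.

```python
def gini(labels):
    """Gini impurity of a set of 0/1 labels (1 = default)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Find the threshold on one numeric feature that minimizes the
    size-weighted Gini impurity of the two resulting branches.
    Returns (impurity, threshold); threshold is None if no split exists."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("inf"), None)
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical feature values
        thresh = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[0]:
            best = (score, thresh)
    return best
```

Sorting once and sweeping candidate thresholds is also what makes this tractable on large data sets like the ~600 MB learning set mentioned above.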
Wall Street Journal
Professor Kenneth French's Website
Some info on logistic regression and generalized linear models
Reinforcement Learning Survey (PDF: a paper by Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore)