Programming Languages and Tools
Over the years, I have used many programming languages to solve all sorts of problems. During the PhD, I have mainly used SAS and Matlab. When working for the industry, I used VBA for Excel, Perl and SQL. Additionally, sometimes I integrate multiple languages to solve one problem.
1. SAS with SQL
1. SAS with SQL
SAS is an excellent software to work with large datasets. In particular, I have used SAS to compute realized volatility and realized skewness from high frequency data for the cross-section of stocks on a daily basis. To give an idea of the size of intraday data, the SAS file for September 30, 2009 is 1.4GB. And that's only ONE day! Now imagine computing daily realized variance and realized skewness on a daily basis for 16 years (4,000 days). I have also used SAS to extract options data from Optionmetrics for the cross section of US stocks. Find below the code to extract realized volatility using intraday data (as in Ait-Sahalia, Mykland & Zhang (2005)) and the code to extract options data.
Daily Realized Volatility from High Frequency data (multigrid): SAS program.
Extracting out-of-the-money options for the cross-section: SAS program.
Matlab is a very good software for numerical computation. It is good to test algorithms and does not need to be compiled like C. I have used it in both worlds, academic and practitioners. What I like the most is the be able to write the mathematical formulas in Matlab just as they look on paper. It does not always happens, but I have seen it many times. Moreover, Matlab has thousands of built-in functions that make my life easier. Find below the code of papers I have replicated using Matlab.
"Approximating American Option Prices in the GARCH Framework", 2003. Duan, Gauthier, Sasseville and Simonato - Journal of Futures Markets.
Computing moments of NGARCH: [ MontecarloGarch.m ] [ American_Option_Garch.m ]
Replication: [ example.m ] [ gap.m ] [ third_r2.dll ]
Extension of the paper with Johnson Curves: [ example_jsn.m ] [ johnson_pdf.dll ]
[ American_Option_Garch_Johnson.m ]
"An Explicit Finite Difference Approach to the Pricing of Barrier Options", 1998. Boyle and Tian - Applied Mathematical Finance.
To price barrier options is not trivial. These videos show the price of barrier option using binomial trees and trinomial trees. Depending on the number of steps there might be big price differentials.
[ Binomial Double Barrier.AVI ]
[ Trinomial Double Barrier.AVI ]
Replication (Use main.m file): [ Replication.ZIP ]
Perl is a free software that is very efficient to parse large text files (more than 1GB). Perl uses regular expressions to recognize text patterns so that relevant information can be extracted. While working at the bank, I had to extract CDS quotes for 300 companies (that's before Mark-it Partners existed) from traders' emails. I only had two days to do it and the amount of emails and quotes made this task very challenging. Bloomberg stores emails for the whole bank on a text file on a daily basis. Using Perl, I was able to extract relevant CDS quotes for each company to then price the CDS book at market prices. Find below the Perl code.
Download Perl here.
Perl code to get CDS quotes from text file.
VBA is the programming language of Microsoft Office tools. I have mainly used it to create desktop applications. Find enclosed a tool that extracts correlations for given risk factors, computes Value-at-Risk given the greeks (Delta, Gamma, Vega, Rho) of those factors and performs a principal component analysis of the correlation matrix. Find below the Excel desktop calculator programmed with VBA.
Microsoft support site for VB: MSDN.
Desktop Calculator in Excel (Click on Load Correlations to start)
If you need to speed up your numerical computations, C (or C++) is the solution. Given that the code runs in machine language, its speed is unbeatable. Find below the replication of a paper developed in C.
Replication of "The GARCH Option Pricing Model", 1995 by Duan - Mathematical Finance.
The C code computes the GARCH option price using as control variable to reduce the variance the theoretical Black-Scholes price.
This code uses the GSL library found here.
6. Working simultaneously with two or more languages
SAS, Matlab and MySQL
Each programming language has its own strengths. SAS strength is its ability to work with large datasets (more than 5GB). Matlab, on the other hand, is great to solve specific numerical problems, like numerical integration. However, numerical integration in SAS is not simple to program and Matlab cannot load large datasets.
At some point in my research, I computed the risk neutral moments (as in Bakshi, Kapadia and Madan (2003, RFS)) for the cross section of options on a daily basis. I solved the problem by 1) extracting the data from WRDS using SAS, 2) loading the data from SAS into MySQL, 3) having Matlab read the data from MySQL and compute the risk neutral moments, 4) loading back the data from Matlab into MySQL and 5) getting the risk neutral moments into SAS from MySQL. It took one week of machine time to run this process. Find below a PDF document on how to link these three packages as well as the code I used.
A PDF document from Matlab Central on how to link Matlab and SAS.
Load data from MySQL into SAS and viceversa: SAS program.
Matlab program to compute risk neutral moments in MATLAB, get data from MySQL to Matlab and viceversa. This is the Matlab function to compute the risk neutral moments.
Excel and SQL databases
At banks, people use Excel to solve any numerical problem. Therefore, to have an impact on the end user (i.e. traders, risk managers), the development of computer tools is done in VBA for Excel. Additionally, portfolio positions and greeks are stored on large databases. Hence, an Excel desktop application must be able to access these databases (using ODBC) to retrieve the data. Find below the VBA code to access mainframe databases. (Refer to the PDF document on how to link Matlab and SAS to learn how to set up an ODBC)
VBA code to access an external database.
Matlab and C (Visual C++)
So you have your C++ code and you don't know how to debug it when running Matlab. This link tells you how to do it.