Quantitative Methods
Video Lessons
J. Anthony Cookson
This page is where I am posting links to the full set of asynchronous content I developed in support of my Quantitative Methods in Finance (and Real Estate) course for Fall 2020. This course is an introductory course whose goal is to take students from a minimal understanding of basic statistics and no understanding of R programming to having a solid foundation in both. The course is rigorous, but it replaces the analytic rigor found in most mathematical statistics courses with a set of R simulations that convey the same underlying intuition for how sampling, and thus statistics, works.
I have organized this page to have expandable units that represent (roughly) one week each of my master's level course. Although the material on this page is comprehensive, it is not the full experience of my quantitative methods course. It is better to think of this page as the "video textbook" for my students. I expect students to engage actively with this material weekly, and then, in a flipped class, my class time is spent applying these concepts and clarifying some of the deeper questions that come up in the study of statistics.
That said, in posting these resources publicly, I hope they will be helpful for anyone interested in self-study or in an additional perspective on introductory statistics and/or on how to use R to highlight introductory statistics concepts.
The purpose of this week's video lessons is to get you familiar with R.
Download and Install R & RStudio.
Familiarize yourself with basic navigation.
First exposure to data creation commands (sequences, vectors) and storing in objects.
Set up your class folder for storing scripts.
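As a small taste of the data creation commands listed above, here is a minimal sketch in R. The object names and values are made up for illustration and are not taken from the course script.

```r
# Store a single number in an object using the assignment arrow
rate <- 0.05

# Build a vector by combining values with c()
returns <- c(0.02, -0.01, 0.03, 0.00)

# Build sequences: the integers 1 to 10, and the even numbers up to 10
x <- 1:10
evens <- seq(from = 2, to = 10, by = 2)

# Arithmetic is applied element by element to a vector
growth <- 1 + returns

# Inspect what is stored in each object
print(x)
print(evens)
print(growth)
print(length(returns))
print(mean(returns))
```

Typing the name of an object at the console (for example, evens) prints its contents, which is a quick way to check what a command produced.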
To follow along, you do not need any particular materials this week aside from a computer on which you would like to get up and running with R.
You may need to pause (and rewind) to make sure you follow each step in order.
At the end of the video, you should have produced this script.
R has many "tricks of the trade" that you'll only pick up by continually exposing yourself to it. The two videos above are a good start, but in the interest of giving you more opportunities to learn, below is an older video tutorial that I recorded a while ago to illustrate useful tricks from Tom Short's R Reference Card.
This unit introduces core concepts in probability, using R simulations as a tool to illustrate them.
What is probability?
Telling the difference between continuous and discrete RVs.
Introduction to simulating and plotting RVs in R.
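The contrast between discrete and continuous random variables can be sketched with a short simulation. This is an illustration only: the die and the uniform distribution are assumed examples, not necessarily the ones used in the course scripts.

```r
set.seed(123)  # make the random draws reproducible

# Discrete RV: roll a fair six-sided die 10,000 times
rolls <- sample(1:6, size = 10000, replace = TRUE)

# Relative frequencies approximate the probabilities (each close to 1/6)
roll_freq <- table(rolls) / length(rolls)
print(round(roll_freq, 3))

# Continuous RV: 10,000 draws from a uniform distribution on [0, 1]
u <- runif(10000, min = 0, max = 1)

# For a continuous RV we ask about intervals rather than single values
p_interval <- mean(u > 0.25 & u < 0.75)  # should be close to 0.5
print(p_interval)

# Bar chart for the discrete RV, histogram for the continuous RV
barplot(roll_freq, main = "Simulated die rolls", ylab = "Relative frequency")
hist(u, main = "Simulated uniform draws", xlab = "u")
```

A discrete RV puts probability on a countable set of outcomes, so a bar chart of frequencies is natural; a continuous RV spreads probability over an interval, so a histogram is the usual picture.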
To follow along, you will want to download the R scripts and PowerPoint presentations below.
You may need to pause (and rewind) to make sure you follow each step in order. If you miss something the first time, your best first recourse is to rewind.
I expect you to open the scripts (links below), and follow along by running these commands on your own computer. Pause and rewind to make sure you get everything to run on your end, and make sure you understand what's going on.
Video 1: Probability and Sample Spaces [25 minutes].
Script (.R). Script with ad hoc edits (.R)
Slides (.ppt)
Video 2: Continuous versus Discrete Sample Spaces [25 minutes]
This week's topic is random variables, expected value and standard deviation.
What is a random variable (RV)?
How to compute expected value and standard deviation of a RV.
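For a discrete RV, the expected value is the probability-weighted average of the outcomes, and the standard deviation is the square root of the probability-weighted average squared deviation from that mean. A minimal sketch, assuming a fair six-sided die as the example (not necessarily the example used in the videos):

```r
# Outcomes and probabilities for a fair six-sided die (illustrative)
x <- 1:6
p <- rep(1 / 6, 6)

# Probabilities must sum to one
stopifnot(isTRUE(all.equal(sum(p), 1)))

# Expected value: probability-weighted average of the outcomes
ev <- sum(x * p)

# Variance: probability-weighted average squared deviation from the mean
variance <- sum((x - ev)^2 * p)

# Standard deviation: square root of the variance
std_dev <- sqrt(variance)

print(c(expected_value = ev, std_dev = std_dev))

# Check against a simulation: the sample mean and sd should be close
set.seed(1)
draws <- sample(x, size = 100000, replace = TRUE, prob = p)
print(c(sim_mean = mean(draws), sim_sd = sd(draws)))
```

The same three lines (sum(x * p), the weighted squared deviations, and sqrt()) work for any discrete distribution once x and p are swapped out.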
To follow along, you will want to download the R scripts and PowerPoint presentations below.
You may need to pause (and rewind) to make sure you follow each step in order. If you miss something the first time, your best first recourse is to rewind.
I expect you to open the scripts (links below), and follow along by running these commands on your own computer. Pause and rewind to make sure you get everything to run on your end, and make sure you understand what's going on.
There are about 45 minutes of video on random variables, expected value, and standard deviation in the core part of this unit. I expect that working through the calculations and the details in the script will add about half an hour to this time, including pausing, rewinding, etc.
Video 1: Introduction to Random Variables [14 minutes].
Slides (.ppt)
Video 2: Excel Example of Expected Value and Standard Deviation Calculations [11 minutes]
Video 3: R Example of Expected Value and Standard Deviation [19 minutes]
Script (.R)
This week's lessons are on the normal distribution.
What is the normal distribution?
68-95-99.7 Rule.
Skew and kurtosis as violations of normality.
Some practice with built-in distributions in R.
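The 68-95-99.7 rule and R's built-in distribution functions can be checked directly. The sketch below uses only base R; the helper names within_k, skewness, and excess_kurtosis are defined here for illustration and are not built-in functions.

```r
# Share of a normal distribution within k standard deviations of the mean
within_k <- function(k) pnorm(k) - pnorm(-k)
rule <- sapply(1:3, within_k)
names(rule) <- c("1 sd", "2 sd", "3 sd")
print(round(rule, 4))  # approximately 0.6827, 0.9545, 0.9973

# The d/p/q/r prefixes: density, cumulative probability, quantile, random draws
dnorm(0)        # height of the standard normal density at 0
pnorm(1.96)     # P(Z <= 1.96), about 0.975
qnorm(0.975)    # the value z with P(Z <= z) = 0.975, about 1.96
set.seed(42)
z <- rnorm(100000)

# Moment-based skew and excess kurtosis (both are 0 for a normal distribution)
skewness <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5
excess_kurtosis <- function(x) mean((x - mean(x))^4) / mean((x - mean(x))^2)^2 - 3

print(c(skew = skewness(z), excess_kurtosis = excess_kurtosis(z)))

# A right-skewed comparison: exponentiating normal draws gives lognormal draws
print(skewness(exp(z)))
```

The same prefixes work for other built-in distributions, for example punif(), qt(), and rexp().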
To follow along, you will want to download the R scripts and PowerPoint presentations below.
Watch and follow along in the associated scripts (download them to your class folder before starting) to make sure you can perform the calculations presented here. There are 44 minutes of video (+18 minutes of external video). I expect roughly an extra half hour to an hour of pausing and rewinding, so that you can run everything on your own machine.
Video 1: Skew, kurtosis and the normal distribution [12 minutes].
Slides (.ppt) (begin on slide 17 where we left off)
Video 2: Normal Probabilities and Quantiles [14 minutes + 18 minutes of external video]
Video 3: Built-in Distributions and the 68-95-99.7 Rule [19 minutes]
Script (.R)
Measures of Risk and Sharpe Ratio [38 minutes]
Goal is conceptual, so I haven't posted scripts/slides to follow along.
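Although the video is conceptual, the Sharpe ratio itself is a short calculation: average excess return divided by the standard deviation of excess returns. The sketch below uses made-up monthly returns and a made-up risk-free rate purely for illustration; the square-root-of-12 annualization is a common convention that assumes returns are independent across months.

```r
# Hypothetical monthly returns and a hypothetical monthly risk-free rate
returns <- c(0.021, -0.013, 0.034, 0.008, -0.027, 0.015, 0.040, -0.005)
rf <- 0.002

# Excess return: what the asset earned above the risk-free rate
excess <- returns - rf

# Sharpe ratio: reward (mean excess return) per unit of risk (standard deviation)
sharpe_monthly <- mean(excess) / sd(excess)

# Common annualization convention for monthly data (assumes independence)
sharpe_annual <- sharpe_monthly * sqrt(12)

print(c(monthly = sharpe_monthly, annualized = sharpe_annual))
```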
This week's material is on the properties of the sampling distribution of the sample mean. This is a central concept in statistics, so spend extra time to understand what is going on in this unit before moving on.
Understand conceptually the sampling distribution.
Grok sampling variability.
Understand standard errors.
Central Limit Theorem.
R simulations of these ideas.
For loops and storage vectors are covered in this unit.
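The core simulation idea in this unit (draw many samples, compute the mean of each, and study how those means vary) can be sketched with a for loop and a storage vector. This is a minimal illustration that assumes a normal population with made-up parameters; it does not reproduce the course utility scripts.

```r
set.seed(2020)

# Assumed population: normal with mean 10 and standard deviation 4
mu <- 10
sigma <- 4
n <- 25          # size of each sample
n_sims <- 5000   # number of repeated samples

# Storage vector: one slot for each simulated sample mean
xbar <- numeric(n_sims)

# For loop: draw a sample, compute its mean, and store it in slot i
for (i in seq_len(n_sims)) {
  samp <- rnorm(n, mean = mu, sd = sigma)
  xbar[i] <- mean(samp)
}

# The sample means are centered on the population mean
print(mean(xbar))

# Their spread is the standard error, which should be close to sigma / sqrt(n)
print(sd(xbar))
print(sigma / sqrt(n))
```

Running hist(xbar) afterwards shows the simulated sampling distribution; rerunning with a larger n shows it tightening around mu.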
To follow along, you will want to download the R scripts and PowerPoint presentations below.
As this set of concepts is conceptually difficult, I expect lots of pausing, rewinding, and re-running code to get what is going on.
Video 1: Introduction to Sampling Distributions [11 minutes].
Slides (.ppt)
Video 2: Simulating Sampling Distributions [22 minutes]
Script (.R)
Utility scripts: draw_samp_dist.R and plot_sampd.R
Video 3: Simulating Standard Errors [20 minutes]
Script (.R)
Utility scripts: draw_samp_dist.R and plot_sampd.R
Video 4: Central Limit Theorem [20 minutes]
Script (.R)
Utility scripts: draw_samp_dist.R and plot_sampd.R
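The Central Limit Theorem covered in Video 4 can also be illustrated in a few lines: even when the population is strongly skewed, the distribution of sample means becomes more symmetric as the sample size grows. The sketch below assumes an exponential population and uses two helper functions (sim_means and skewness) that are defined here for illustration only.

```r
set.seed(7)
n_sims <- 5000

# Simulate n_sims sample means, each from a sample of size n drawn from
# an exponential population with rate 1 (a right-skewed distribution)
sim_means <- function(n, n_sims) {
  xbar <- numeric(n_sims)
  for (i in seq_len(n_sims)) {
    xbar[i] <- mean(rexp(n, rate = 1))
  }
  xbar
}

# Moment-based skewness: 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

means_small <- sim_means(n = 2, n_sims = n_sims)
means_large <- sim_means(n = 100, n_sims = n_sims)

# With n = 2 the sample means are still clearly right-skewed;
# with n = 100 the skewness is much closer to 0
print(c(n_2 = skewness(means_small), n_100 = skewness(means_large)))

# The spread also shrinks at the rate 1 / sqrt(n)
print(c(n_2 = sd(means_small), n_100 = sd(means_large)))
```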
This week's lesson concludes our study of sampling distributions. The analytic perspective in this week's lesson complements last week's video lesson, which uses simulation to show properties of sampling distributions.
Introduce linear combinations math & use it to show the standard error formula and show that the sample mean is unbiased.
Apply linear combinations to portfolio math.
Organize the results on the sampling distribution for conducting inference.
Start thinking about statistical inference -- namely, confidence intervals.
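The linear combinations rules are E[aX + bY] = aE[X] + bE[Y] and Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y). A minimal sketch of both applications named above, using made-up inputs for a two-asset portfolio (these numbers are not from the Portfolio Calculator Workbook):

```r
# Hypothetical inputs for two assets
mu_a <- 0.08; mu_b <- 0.03    # expected returns
sd_a <- 0.20; sd_b <- 0.05    # standard deviations
rho <- 0.10                   # correlation between the two returns
w_a <- 0.6; w_b <- 1 - w_a    # portfolio weights

# Expected portfolio return: weighted average of expected returns
port_mean <- w_a * mu_a + w_b * mu_b

# Portfolio variance: includes the covariance term
cov_ab <- rho * sd_a * sd_b
port_var <- w_a^2 * sd_a^2 + w_b^2 * sd_b^2 + 2 * w_a * w_b * cov_ab
port_sd <- sqrt(port_var)

# The same variance in matrix form: w' Sigma w
w <- c(w_a, w_b)
Sigma <- matrix(c(sd_a^2, cov_ab, cov_ab, sd_b^2), nrow = 2)
port_var_matrix <- as.numeric(t(w) %*% Sigma %*% w)

print(c(mean = port_mean, sd = port_sd))

# The sample mean is a linear combination of n independent draws with
# weights 1/n, so Var(xbar) = n * (1/n)^2 * sigma^2 = sigma^2 / n
n <- 25
sigma <- 4
se_xbar <- sqrt(n * (1 / n)^2 * sigma^2)
print(c(se_from_linear_combination = se_xbar, sigma_over_root_n = sigma / sqrt(n)))
```

Because the covariance terms are zero for independent draws, the standard error formula is just the portfolio variance formula with n equally weighted, uncorrelated "assets".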
To follow along, you will want to download the R scripts and PowerPoint presentations below.
Pause and rewind as needed.
Video 1: Linear Combinations & Sampling Distributions [19 minutes].
Slides (.ppt)
Video 2: Finance Application of Linear Combinations: Building a Portfolio Model [7.5 minutes]
Portfolio Calculator Workbook (.xlsx)
Video 3: Synthesizing Lessons from the Sampling Distribution [10 minutes]
Slides (.ppt)
In Fall 2019, we had four snow days (+ days that construction on the building forced us to evacuate) that cancelled my MS Investments class. Thus, I recorded some video lessons for the investments class (without jump cuts, but still relevant). This video on "Capital Allocation to Risky Assets" shows an application of linear combinations in an investments context.
And... if you don't mind some calculus. Recorded *a while* ago with worse technology.
Please don't let the extra technical details from these videos confuse the intuition of the main videos above. Really, this is only for those of you with more robust math backgrounds.
If you just watch the video to see what Tony looked like 10 years ago, great.
Here are the videos:
Expectation and Variance of a Random Variable (11 minutes; whiteboard illustration)
Sampling Distribution of the Sampling Mean (11 minutes; whiteboard illustration with calculus assumed)
This unit introduces the two foundational elements of inference in statistics -- confidence intervals and hypothesis tests.
Confidence interval interpretations.
Hypothesis testing overview.
Mechanics of hypothesis testing & canned function usage (for one sample t-tests).
How to test hypotheses using confidence intervals.
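The mechanics listed above can be sketched by computing a confidence interval and a one-sample t-test by hand and comparing them with the canned t.test() function. The data here are simulated with made-up parameters, so this is an illustration rather than the course example.

```r
set.seed(11)

# Hypothetical sample: 40 simulated monthly returns
x <- rnorm(40, mean = 0.01, sd = 0.05)
n <- length(x)
xbar <- mean(x)
se <- sd(x) / sqrt(n)   # estimated standard error of the sample mean

# 95% confidence interval by hand, using the t distribution
t_crit <- qt(0.975, df = n - 1)
ci_manual <- c(xbar - t_crit * se, xbar + t_crit * se)

# Two-sided test of H0: mu = 0 by hand
t_stat <- (xbar - 0) / se
p_manual <- 2 * pt(-abs(t_stat), df = n - 1)

# The canned function reports the same interval, statistic, and p-value
tt <- t.test(x, mu = 0, conf.level = 0.95)
print(tt)

# Testing with the CI: reject H0 at the 5% level exactly when the
# hypothesized value 0 falls outside the 95% confidence interval
reject_by_ci <- (0 < ci_manual[1]) || (0 > ci_manual[2])
reject_by_p <- p_manual < 0.05
print(c(reject_by_ci = reject_by_ci, reject_by_p = reject_by_p))
```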
To follow along, you will want to download the R scripts and PowerPoint presentations below.
The video lessons below are a full unit on statistical inference -- confidence intervals and hypothesis tests. The amount of material is more than the usual expectation for a week of video lessons (51 minutes of new material + 20 minutes of review). We leave some advanced topics for next week's video lesson.
Pause and rewind as needed.
Video 1: Introduction to Confidence Intervals [20 minutes].
Slides (.ppt)
Script (.R)
Utility scripts: draw_samp_dist.R
Video 2: A Fishing Analogy [6 minutes]
Confidence intervals slides from above.
Hypothesis Testing Videos
Overview [15 minutes], one-sided versus two-sided [4 minutes], Nuts and Bolts in R [20 minutes], Using CIs to test hypotheses [5 minutes]
Slides for all (.ppt)
Script (.R)
Utility scripts: draw_samp_dist.R and plot_sampd.R
This week's video lessons finish the unit on inference with a discussion of error rates, power, and two-sample t-tests.
Understand Type I and Type II Error.
How to conduct a power analysis.
Understand the difference between a matched pairs t-test and a two-sample t-test that assumes independence.
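Type I error rates and power can both be estimated by simulation: generate many samples under a known truth and count how often the test rejects. The sketch below assumes a one-sample t-test with made-up parameters and compares the simulated power with base R's power.t.test(); the helper reject_rate is defined here for illustration.

```r
set.seed(99)
n <- 30          # sample size
sigma <- 1       # population standard deviation
alpha <- 0.05    # significance level
n_sims <- 2000   # number of simulated samples

# Share of simulated samples in which H0: mu = 0 is rejected
reject_rate <- function(true_mu) {
  rejections <- logical(n_sims)
  for (i in seq_len(n_sims)) {
    x <- rnorm(n, mean = true_mu, sd = sigma)
    rejections[i] <- t.test(x, mu = 0)$p.value < alpha
  }
  mean(rejections)
}

# H0 is true: any rejection is a Type I error, so the rate should be near alpha
type1_rate <- reject_rate(true_mu = 0)

# H0 is false: the rejection rate is the power (1 minus the Type II error rate)
power_sim <- reject_rate(true_mu = 0.5)

# Analytic power for the same setup
power_calc <- power.t.test(n = n, delta = 0.5, sd = sigma, sig.level = alpha,
                           type = "one.sample")$power

# Power analysis in reverse: sample size needed for 80% power
n_needed <- power.t.test(delta = 0.5, sd = sigma, sig.level = alpha,
                         power = 0.80, type = "one.sample")$n

print(c(type1 = type1_rate, power_sim = power_sim, power_calc = power_calc))
print(n_needed)
```

In power.t.test(), exactly one of n, delta, sd, sig.level, and power is left unspecified, and the function solves for it.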
To follow along, you will want to download the R scripts below & revisit the Hypothesis Testing slides from last week.
Pause and rewind as needed.
Video 1: Type I Errors, Type II Errors and Power [7 minutes].
Slides for all (.ppt)
Video 2: Statistical Power in R [12 minutes]
Script (.R)
Two Sample T-Tests in R [27 minutes]
Recorded in 2019, so a little clunkier than my 2020 recordings.
Script (.R)
FrenchPortolios data (.txt) [note the misspelling; this comes up in the video]
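The difference between a matched pairs t-test and a two-sample t-test that assumes independence can be sketched with simulated data. The data below are made up (they are not the FrenchPortolios data): each unit has a large unit-specific component that shows up in both measurements, which is exactly the situation where pairing matters.

```r
set.seed(5)
n <- 50

# Hypothetical matched data: a shared unit-level component plus noise,
# with a true average difference of 0.3 between the two measurements
unit_effect <- rnorm(n, mean = 0, sd = 2)
before <- unit_effect + rnorm(n, mean = 0, sd = 0.5)
after <- unit_effect + 0.3 + rnorm(n, mean = 0, sd = 0.5)

# Two-sample t-test that treats the groups as independent (Welch by default)
indep <- t.test(after, before)

# Matched pairs t-test: uses the within-unit differences
paired <- t.test(after, before, paired = TRUE)

# The paired test is the same as a one-sample t-test on the differences
diffs <- t.test(after - before, mu = 0)

print(c(independent = indep$p.value, paired = paired$p.value))
print(c(paired_t = unname(paired$statistic), diffs_t = unname(diffs$statistic)))
```

Differencing removes the shared unit-level variation, so the paired test has a much smaller standard error for the same estimated difference.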
This video lesson introduces regression, correlation and line fitting under the umbrella heading "Regression Mechanics."
Understand relating two variables to one another visually (scatter plots) and numerically (correlation and through regression).
Understand the line fitting process -- minimizing the sum of squared residuals.
Intuition for regression coefficient equations.
An "under the hood" view of the fitted line.
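The regression coefficient equations can be checked against lm() directly: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means. A minimal sketch with simulated data (the data-generating values are made up for illustration):

```r
set.seed(3)
n <- 200
x <- rnorm(n, mean = 5, sd = 2)
y <- 1 + 0.5 * x + rnorm(n, sd = 1)

# Coefficient formulas for single regression
b1 <- cov(x, y) / var(x)          # slope
b0 <- mean(y) - b1 * mean(x)      # intercept

# The canned function gives the same answers
fit <- lm(y ~ x)
print(coef(fit))
print(c(b0 = b0, b1 = b1))

# Under the hood: fitted values, residuals, and the sum of squared residuals
y_hat <- b0 + b1 * x
ssr <- sum((y - y_hat)^2)

# Any other line, such as one with a slightly different slope, has a larger SSR
ssr_alt <- sum((y - (b0 + (b1 + 0.1) * x))^2)
print(c(ssr = ssr, ssr_alt = ssr_alt))

# Variance decomposition: R-squared is the share of total variation explained,
# and in single regression it equals the squared correlation
sst <- sum((y - mean(y))^2)
r2 <- 1 - ssr / sst
print(c(r2 = r2, cor_squared = cor(x, y)^2))
```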
To follow along, you will want to download the R scripts below and associated PowerPoint files.
This asynchronous lesson contains 67 minutes of pre-recorded material. In weeks past, a few of the asynchronous lessons have (hopefully) taken less than the 75 minutes. This one will take longer, but the investment of your time is particularly valuable to gain an intuition for the mechanics of single regression.
There is a lot to digest in this unit, so please pause and rewind as needed.
Intro Video on Relating Variables [13 minutes].
Slides (.ppt)
Regression Mechanics Videos.
Intro to Regression Mechanics [24 minutes], Variance Decomposition and R-squared [4 minutes], Regression Formulas Applied to an Example in R [16 minutes], Excel optimization example [10 minutes]
Regression slides from 2019 (.pdf)
Regression Mechanics Script (.R)
+ vid_views_example.R which uses data in yt_short.csv.
Excel Illustration of Regression (.xlsx)
The video below is one I recorded in 2011 that goes over how to use the lm() command, which is complementary to the more conceptual videos above. I cannot locate the underlying script for you to follow along, but seeing the illustration can nonetheless be helpful. The video goes through some simulated data examples and shows the following additional aspects of fitting regression models.
How to estimate a regression without an intercept (we don't usually want to do this).
How to use mathematical transformations within the formula statement (this is pretty useful; it comes up around minute 8). See also log transformations.
It also shows the extension to multiple regression (minute 9.5). Don't worry so much about multiple regression yet, because we'll get there eventually.
Running a Regression in R (13 minutes)
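Since the script for that video is not available, here is a minimal sketch of the formula features it describes, using simulated data. The variable names and data-generating values are made up and are not a reconstruction of the original script.

```r
set.seed(8)
n <- 100
x <- runif(n, min = 1, max = 10)
z <- rnorm(n)
y <- exp(0.5 + 0.2 * x + 0.3 * z + rnorm(n, sd = 0.1))

# Default: lm() includes an intercept
fit_default <- lm(y ~ x)

# Regression without an intercept (rarely what we want); y ~ 0 + x also works
fit_noint <- lm(y ~ x - 1)

# Transformations can be written directly in the formula
fit_log <- lm(log(y) ~ x)

# Arithmetic on a regressor needs I() so that ^ is read as a power
fit_sq <- lm(y ~ x + I(x^2))

# Multiple regression: add regressors with +
fit_multi <- lm(log(y) ~ x + z)

print(coef(fit_default))
print(coef(fit_noint))
print(coef(fit_multi))
```

Calling summary() on any of these fitted objects reports standard errors, t statistics, and R-squared.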
This set of lessons goes beyond regression mechanics to refine our understanding of regression interpretations and regression inference.
Apply the lessons of statistical inference to a regression setting.
See a Monte Carlo justification that the inferential tools we had before apply to a regression setting.
Get used to interpretations of slope and intercept in a regression.
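A Monte Carlo justification of regression inference works the same way as the earlier sampling distribution simulations: generate many data sets from a known model, estimate the regression on each, and look at how the slope estimates behave. The sketch below is an illustration with made-up parameters, not the course script.

```r
set.seed(21)
n <- 50
n_sims <- 2000
beta0 <- 2       # true intercept
beta1 <- 0.5     # true slope
x <- runif(n, min = 0, max = 10)   # x is held fixed across simulations

# Storage for the slope estimate, its reported standard error, and CI coverage
slopes <- numeric(n_sims)
ses <- numeric(n_sims)
covered <- logical(n_sims)

for (i in seq_len(n_sims)) {
  y <- beta0 + beta1 * x + rnorm(n, sd = 2)
  fit <- lm(y ~ x)
  est <- summary(fit)$coefficients["x", ]
  slopes[i] <- est["Estimate"]
  ses[i] <- est["Std. Error"]
  ci <- confint(fit, "x", level = 0.95)
  covered[i] <- ci[1] <= beta1 && beta1 <= ci[2]
}

# The slope estimates center on the true slope
print(mean(slopes))

# Their spread matches the standard error that lm() reports on average
print(c(sd_of_slopes = sd(slopes), mean_reported_se = mean(ses)))

# About 95% of the 95% confidence intervals contain the true slope
print(mean(covered))
```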
To follow along, you will want to download the R scripts below and relevant PowerPoints.
You may also want to review some aspects of the Regression Mechanics unit from last week.
Pause and rewind as needed.
Introduction to Regression Inference [13 minutes].
Slides (.ppt) [start at slide 22]
Monte Carlo of Single Regression [12 minutes]
Script (.R)
Regression Interpretation Examples in R [25 minutes]
Script (.R)
speed_gender_height.csv (.csv)
Slides (.ppt)
This set of lessons is centered on regression diagnostics. In addition, I'm asking you to preview the introduction to multiple regression. There's a lot to digest in that unit, so it is best to get a head start on that material.
Get used to regression diagnostic analysis.
Identify and understand the influence of outliers.
Understand the log transformation.
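A rough sketch of these diagnostic ideas, using simulated data rather than the golf or Denver data sets: the outcome is generated so that its log is linear in x, which makes the level regression look poorly behaved and the log regression look clean. One artificial outlier is then added to show how influence can be detected with cooks.distance().

```r
set.seed(14)
n <- 200
x <- runif(n, min = 0, max = 5)

# Assumed data-generating process: log(y) is linear in x
y <- exp(1 + 0.4 * x + rnorm(n, sd = 0.3))

fit_level <- lm(y ~ x)
fit_log <- lm(log(y) ~ x)

# Residuals versus fitted values: curvature or fanning signals a problem
plot(fitted(fit_level), resid(fit_level),
     main = "Level regression", xlab = "Fitted", ylab = "Residual")
plot(fitted(fit_log), resid(fit_log),
     main = "Log regression", xlab = "Fitted", ylab = "Residual")

# In the log model the slope is the approximate proportional change in y
# for a one-unit increase in x
print(coef(fit_log)["x"])

# Add one artificial outlier: far from the other x values and far off the line
x_out <- c(x, 10)
log_y_out <- c(log(y), 1)
fit_out <- lm(log_y_out ~ x_out)

# Cook's distance flags the observation with the most influence on the fit
influence <- cooks.distance(fit_out)
print(which.max(influence))

# Compare the slope with and without the outlier
print(c(without = unname(coef(fit_log)["x"]), with = unname(coef(fit_out)["x_out"])))
```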
To follow along, you will want to download the R scripts below and relevant PowerPoints.
You may also want to review some aspects of the Regression Mechanics and Regression Inference units from the last couple of weeks.
Pause and rewind as needed.
This week's video lessons close out our discussion of single regression. I expect you to rewatch these videos (& the previous two weeks) as needed to ensure you get the concepts straight.
Regression Diagnostics Overview & Log Transformation [15 minutes].
Slides (.ppt)
Diagnostics and Outliers Examples [11 minutes]
Slides from above
I didn't run this code explicitly, but this script (diagnostics) with these data sets (golf_dat.csv, denver_nbhds.csv) will produce the plots in the video.
This set of video lessons contains the essentials for multiple regression. This week's lesson is a big one, with about 70 minutes to cover in total. In a typical course, I'll preview some of the introductory material from this week in an earlier week.
Understand multiple regression interpretations.
Holding Constant
Residual variation
Graphical tools for gaining intuition for multiple regression: binscatter.
Multicollinearity.
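One way to make "holding constant" and "residual variation" concrete is the partialling-out result (often called the Frisch-Waugh-Lovell theorem): the multiple regression coefficient on x1 equals the slope from regressing the part of y not explained by x2 on the part of x1 not explained by x2. The sketch below uses simulated data with made-up coefficients and is not taken from the course scripts; it also shows how a nearly redundant regressor inflates standard errors.

```r
set.seed(30)
n <- 500
x2 <- rnorm(n)
x1 <- 0.7 * x2 + rnorm(n)              # x1 is correlated with x2
y <- 1 + 2 * x1 - 1 * x2 + rnorm(n)

# Single regression mixes the effect of x1 with the omitted x2
fit_single <- lm(y ~ x1)

# Multiple regression holds x2 constant
fit_multi <- lm(y ~ x1 + x2)

# Residual variation: strip out what x2 explains from both y and x1
x1_resid <- resid(lm(x1 ~ x2))
y_resid <- resid(lm(y ~ x2))
fit_resid <- lm(y_resid ~ x1_resid)

print(c(single = unname(coef(fit_single)["x1"]),
        multiple = unname(coef(fit_multi)["x1"]),
        residualized = unname(coef(fit_resid)["x1_resid"])))

# Multicollinearity: x3 is almost a copy of x1, leaving little residual
# variation in x1 once x3 is held constant
x3 <- x1 + rnorm(n, sd = 0.05)
fit_collinear <- lm(y ~ x1 + x2 + x3)

se_multi <- summary(fit_multi)$coefficients["x1", "Std. Error"]
se_collinear <- summary(fit_collinear)$coefficients["x1", "Std. Error"]
print(c(se_without_x3 = se_multi, se_with_x3 = se_collinear))
```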
To follow along, you will want to download the R scripts below and relevant PowerPoints.
You may also want to review some aspects of the Regression Mechanics and Regression Inference units from the last couple of weeks.
Pause and rewind as needed.
Introduction to Multiple Regression [28.5 minutes].
Slides (.ppt)
Script (.R)
Data: mlb11.csv
Utility Script: binscatter_function.R (or if you have issues with the tidyverse, or want to avoid ggplot, binscatter_lite.R)
Regression: Significance versus Fit [13 minutes]
Slides (.ppt)
Script (.R)
Data: ames.RData, edudat2.dta
Multicollinearity [3.5 minutes]
Slides from "Regression: Significance versus Fit"
This is the last unit in the course, and it is technically two weeks of material. However, you'll want to have access to all of this as you embark on working with multiple regression, so I have put this all together.
Understand how to interpret a linear regression with dummy variable explanatory variables.
Note the link to t.test().
Understand how to fit and interpret regression models with polynomial explanatory variables.
Understand interactions in regression.
Understand fixed effects.
Understand why to run a joint hypothesis test and how to implement such an F-test in R.
Understand heteroskedasticity and how to adjust for it.
Understand non-independence (clustering) of standard errors and how to adjust for it.
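Several of these ideas can be sketched in one short example with simulated data: the link between a dummy variable regression and t.test(), an interaction, a polynomial term, and a joint F-test. All values below are made up for illustration; the course scripts use the ames data instead.

```r
set.seed(17)
n <- 300
d <- rbinom(n, size = 1, prob = 0.5)   # dummy variable (0 or 1)
x <- runif(n, min = 0, max = 10)
y <- 1 + 2 * d + 0.5 * x + 0.3 * d * x - 0.04 * x^2 + rnorm(n)

# Dummy variable regression: the slope on d is the difference in group means
fit_dummy <- lm(y ~ d)
mean_diff <- mean(y[d == 1]) - mean(y[d == 0])
print(c(coef_on_d = unname(coef(fit_dummy)["d"]), mean_diff = mean_diff))

# Link to t.test(): the equal-variance two-sample t statistic matches
tt <- t.test(y[d == 1], y[d == 0], var.equal = TRUE)
t_lm <- summary(fit_dummy)$coefficients["d", "t value"]
print(c(t_test = unname(tt$statistic), lm = t_lm))

# Interaction: d * x expands to d + x + d:x, so the slope on x differs by group
fit_inter <- lm(y ~ d * x)

# Polynomial: add a squared term, protected by I()
fit_poly <- lm(y ~ d * x + I(x^2))

# Joint hypothesis: the coefficients on d and d:x are both zero
fit_restricted <- lm(y ~ x + I(x^2))
f_test <- anova(fit_restricted, fit_poly)
print(f_test)

# The same F statistic by hand, from the two sums of squared residuals
ssr_r <- sum(resid(fit_restricted)^2)
ssr_u <- sum(resid(fit_poly)^2)
q <- 2   # number of restrictions
f_manual <- ((ssr_r - ssr_u) / q) / (ssr_u / df.residual(fit_poly))
print(f_manual)
```

Passing a restricted and an unrestricted model to anova() is one base R route to a joint test; the videos may use a different function.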
To follow along, you will want to download the R scripts below and relevant PowerPoints.
You may also want to review some aspects of the Regression Mechanics and Regression Inference units from the last couple of weeks.
Pause and rewind as needed.
Multiple Regression with dummy variables [31 minutes]
Multiple regression: interactions [20.5 minutes].
Slides continue from the dummy variable slides above.
Script (.R)
Data: ames.RData
Utility Script: binscatter_function.R (or if you have issues with the tidyverse, or want to avoid ggplot, binscatter_lite.R)
Multiple regression: polynomials [8.5 minutes]
Slides continue from the dummy variable slides above.
Script continues from the interactions script above.
Data: ames.RData (same as above)
Utility Script: binscatter_function.R (or if you have issues with the tidyverse, or want to avoid ggplot, binscatter_lite.R)
Joint Hypothesis Testing [~15 minutes]
Standard Error Issues [25 minutes], and Standard Error Issues and felm(): Deux [22.5 minutes]
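The videos use felm() for standard error adjustments. As a package-free illustration of what a heteroskedasticity adjustment does, the sketch below computes White (HC0) and degrees-of-freedom-scaled (HC1) robust standard errors by hand with base R matrix algebra. It is a swapped-in hand calculation on simulated data, not the method used in the videos, and it does not cover clustering.

```r
set.seed(40)
n <- 500
x <- runif(n, min = 1, max = 10)

# Heteroskedastic errors: the error standard deviation grows with x
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3 * x)

fit <- lm(y ~ x)
X <- model.matrix(fit)   # n x k matrix of regressors, including the intercept
e <- resid(fit)
k <- ncol(X)

# Classical variance: s^2 * (X'X)^{-1}, which assumes constant error variance
XtX_inv <- solve(crossprod(X))
vcov_classical <- sum(e^2) / (n - k) * XtX_inv

# Robust "sandwich" variance: (X'X)^{-1} [X' diag(e^2) X] (X'X)^{-1}
meat <- crossprod(X * e)   # each row of X is scaled by its own residual
vcov_hc0 <- XtX_inv %*% meat %*% XtX_inv

# HC1: small-sample scaling by n / (n - k)
vcov_hc1 <- vcov_hc0 * n / (n - k)

se_classical <- sqrt(diag(vcov_classical))
se_robust <- sqrt(diag(vcov_hc1))
print(rbind(classical = se_classical, robust_hc1 = se_robust))
```

The coefficient estimates are unchanged by this adjustment; only the standard errors, and therefore the t statistics and confidence intervals, are affected.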