Supplementary Statistics Video Lectures for University of Pennsylvania Data Analysis for Life Science (BIOM 610) Course
Lecture 1
What a statistic is, random variables, expected value (including linearity of expectation), independence, set theory operators for probability (intersections and unions, representing "and" and "or" through probability), conditional probability, sample mean, unbiasedness, distributions (including the normal, Poisson, uniform, and binomial distributions), probability density/mass functions, numeric and categorical data, measures of center, skew, population and sample variance, boxplots (IQR, quartiles), and sampling with and without replacement: https://drive.google.com/file/d/15D8VPLRm4DDGucHQqoJhJKPvd0vZj1aT/view?usp=sharing
Lecture 2
Experimental design, correlation, Z-scores, distributions (including the t, lognormal, normal, and exponential distributions, plus a little more on the binomial distribution), central limit theorem, the cumulative distribution function and calculating ranges of probabilities, and a little more on sampling: https://drive.google.com/file/d/15d0ZTIt7qzpzEEAbUzXKcbIlt3xzJ9rt/view?usp=sharing
Lecture 3
Probability functions, subsets, rules of probability theory, mutually exclusive events, the inclusion-exclusion formula, properties of variance and covariance, expected values and variances for samples, Bayes' Theorem, using a normal distribution to approximate the binomial distribution, and geometric random variables: https://drive.google.com/file/d/16kLbXL-4dOgRllZZo-ad8WFrcOa-_ZgH/view?usp=sharing
Lecture 4
General description of regression, supervised vs. unsupervised learning, bias-variance tradeoff, model building steps, linear regression (mathematical notation, assumptions, parameter interpretation), definition of high-dimensional settings, logistic regression (mathematical notation, parameter interpretation, odds; correction: where I define pi/(1-pi) in the video, this is the odds, not the odds ratio), how to use categorical predictors in regression/dummy coding, metrics for assessing linear regression fit (multiple R^2, adjusted R^2, residuals/sum of squared error/residual plot, Q-Q plots), detecting unusual points (including rules of thumb for assessing outliers, leverage, influence), and metrics for assessing logistic regression fit (confusion matrix, accuracy, sensitivity, specificity, ROC curves)
Part 1: https://drive.google.com/file/d/18I0O0VZwtMHusGsstd0NjwfJ4RZQ55D0/view?usp=sharing
Part 2: https://drive.google.com/file/d/18O2XHqeOuruuy3BXVcqAeHUiSQygS825/view?usp=sharing
Part 3: https://drive.google.com/file/d/18OdjYkihrIJsEPANdKZoCjtTdXseTHMB/view?usp=sharing
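The correction noted in the Lecture 4 description (pi/(1-pi) is the odds, not the odds ratio) can be made concrete with a small sketch. The function names and the two probabilities below are hypothetical and chosen only for illustration; they do not come from the videos. The odds describe a single group, while the odds ratio compares the odds of two groups. In logistic regression, exponentiating a slope coefficient gives an odds ratio for a one-unit increase in that predictor, which is why the two terms are easy to mix up.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_ratio(p1, p0):
    """Ratio of the odds in group 1 to the odds in group 0."""
    return odds(p1) / odds(p0)


# Hypothetical event probabilities for two groups (illustrative values only)
p_group1 = 0.5
p_group0 = 0.2

print("odds, group 1:", odds(p_group1))            # a single-group quantity
print("odds, group 0:", odds(p_group0))            # a single-group quantity
print("odds ratio:", odds_ratio(p_group1, p_group0))  # a comparison of two groups
```

With these assumed values, each group has its own odds, and the odds ratio is the quotient of the two.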
Lecture 5
General steps for conducting a hypothesis test (including null/alternative hypotheses and one-tailed vs. two-tailed alternative hypotheses, defining alpha, definitions of beta, power, and Type I/II error, some relationships between these quantities and sample size, test statistic, rejection regions and p-values, and stating a conclusion), dealing with multiple hypothesis tests, confidence intervals (not technically a hypothesis test, but as noted in the videos, they go hand in hand), one-sample and two-sample tests for a difference in means, ANOVA, F-test for equality of variances, one-sample and two-sample tests for proportions, test for association, goodness-of-fit test, and a brief nod to bootstrapping, permutation testing, nonparametrics and semiparametrics, pairwise testing following ANOVA, and checking for increasing/decreasing trends in proportions
Part 1: https://drive.google.com/file/d/1CN5yQ6ftDKP8ZE9ZrRBEltrSZ9oFX-4W/view?usp=sharing
Part 2: https://drive.google.com/file/d/1CNNWuUhzr4NILNsivAe04SV46n8kHbdc/view?usp=sharing
Lecture 6
Survival data, censoring, truncation, parametric vs. nonparametric vs. semiparametric methods, basic notation, functions (pdf, cdf, survival function, hazard function, cumulative hazard function, mean residual life function, mean life, pth quantile of t, median lifetime), exponential and Weibull distributions, Cox model, and the Kaplan-Meier estimator: https://drive.google.com/file/d/1FnWnPX6eRrVAw0YVFKTrSqCNfRjiksde/view?usp=sharing