Machine Learning
1) Machine learning course on Coursera by Andrew Ng (link)
2) UC Irvine Machine Learning dataset repository (link)
3) Introduction to Statistical Learning (ISLR) book and R code (link) (book PDF) (R code) (videos) (playlist) (more advanced book)
Very good books on basic statistics (link)
p-value (video)
effect size (link)
4) Great explanation of the bias-variance tradeoff (link)
ISLR video explanation (link)
https://www.youtube.com/watch?v=VaN1RUDuioQ&list=PLOg0ngHtcqbPTlZzRHA2ocQZqB1D_qZ5V&index=5
http://scott.fortmann-roe.com/docs/BiasVariance.html
VERY GOOD picture
https://github.com/neelsoumya/basic_statistics/blob/master/bias_variance.png
5) Basic statistics (ANOVA, t-test, F-test, etc) (link)
Linear models, ANOVA, mixed effects, fixed effects, random effects and other basics (link) (tutorial 1) (tutorial 2)
Beautiful VERY GOOD tutorial on how most statistical tests are related to linear models (link)
Coursera course on basic statistics (link, github)
6) MIT OCW course on Artificial Intelligence by Patrick Winston (link) (course webpage)
Search (part 4 video): search = choice
Very good lecture on neural networks, autoencoders and softmax (link)
Goal trees and expert systems (link)
7) Area under curve (AUC) and ROC curve explanation (link) (video)
Precision recall curve (link) (link)
VERY GOOD Video tutorial based on ISLR material by Trevor Hastie and Rob Tibshirani (link)
VERY GOOD picture of precision, recall, confusion matrix, false positive, true positive, etc (link)
The number AUC has a probabilistic interpretation.
It is the probability that a randomly chosen positive example is ranked more highly than a randomly chosen negative example (link)
Sensitivity and specificity
https://github.com/neelsoumya/basic_statistics/blob/master/800px-Sensitivity_and_specificity.svg.png
Explanation of AUC (area under curve) (after the model is selected, you can vary the classification threshold applied to the logistic regression prediction) (picture courtesy Chris Penfold)
https://github.com/neelsoumya/basic_statistics/blob/master/auc_explanation.png
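The probabilistic interpretation of AUC noted above can be checked on a toy example. The sketch below is illustrative only: the scores and labels are made-up values, and the function names `auc_by_ranking` and `auc_by_trapezoid` are hypothetical helpers, not taken from any of the linked resources. It computes AUC once as the fraction of (positive, negative) pairs in which the positive example gets the higher score (ties count one half), and once as the trapezoidal area under the ROC curve obtained by sweeping the decision threshold.

```python
def auc_by_ranking(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_trapezoid(scores, labels):
    """Area under the ROC curve, sweeping the threshold over the distinct scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up classifier scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

print(auc_by_ranking(scores, labels))
print(auc_by_trapezoid(scores, labels))
```

On this data, 8 of the 9 positive-negative pairs are ranked correctly, so both functions give 8/9 (up to floating-point rounding). The pairwise version is quadratic in the number of examples and is meant for intuition, not for large datasets.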
8) Techniques for visualizing and thinking about higher dimensions
How to use t-SNE (link)
Explaining and exploring t-SNE visually (link)
Great explanation of PCA (principal components analysis) (link) (link)
t-SNE and PCA in your browser (link)
Eigenvectors and basis (link)
Uniform distribution in high dimensions (link)
Difference between UMAP and t-SNE (link) (link)
Coursera course on mathematics of PCA, dot product (link)
NCERT textbook on matrices and determinants (link) (link)
Mathematics for Machine Learning book by Marc Deisenroth et al. (link)
9) Linear algebra basics
Determinant (link)
Eigenvectors and basis (link)
Linear algebra basics course MIT OCW (link)
NCERT textbook on matrices and determinants (link) (link)
Mathematics for Machine Learning book by Marc Deisenroth et al. (link)
10) About the Wishart distribution (conjugate prior for the precision matrix of a multivariate normal distribution) (link)
The Wishart distribution is often used as a model for the distribution of the sample covariance matrix of multivariate normal random data, after scaling by the sample size. If x is a bivariate normal random vector with mean zero and covariance matrix Sigma, then you can use the Wishart distribution to generate a sample covariance matrix without explicitly generating x itself. Notice how the sampling variability is quite large when the number of degrees of freedom is small (MATLAB example):
Sigma = [1 .5; .5 2];
df = 10;
S1 = wishrnd(Sigma,df)/df
S1 =
    1.7959    0.64107
    0.64107   1.5496
df = 1000;
S2 = wishrnd(Sigma,df)/df
S2 =
    0.9842    0.50158
    0.50158   2.1682
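For readers without MATLAB, the same experiment can be sketched in Python using only the standard library. This is not a port of `wishrnd`: it swaps in the Bartlett decomposition, hard-coded for the 2x2 case, and the function name `wishart_sample_2x2` is a hypothetical helper introduced here for illustration. The seed is arbitrary, so the printed numbers will not match the MATLAB output above.

```python
import math
import random


def wishart_sample_2x2(sigma, df, rng):
    """Draw one 2x2 Wishart(sigma, df) matrix via the Bartlett decomposition."""
    (s11, s12), (_, s22) = sigma
    # Cholesky factor L of sigma (lower triangular), so that sigma = L L^T.
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    # Bartlett factor A: square roots of chi-square draws on the diagonal,
    # a standard normal draw below the diagonal.
    a11 = math.sqrt(rng.gammavariate(df / 2.0, 2.0))        # chi-square, df degrees of freedom
    a22 = math.sqrt(rng.gammavariate((df - 1) / 2.0, 2.0))  # chi-square, df - 1 degrees of freedom
    a21 = rng.gauss(0.0, 1.0)
    # M = L A; the Wishart draw is M M^T.
    m11 = l11 * a11
    m21 = l21 * a11 + l22 * a21
    m22 = l22 * a22
    off = m11 * m21
    return [[m11 * m11, off], [off, m21 * m21 + m22 * m22]]


rng = random.Random(0)  # arbitrary seed, for reproducibility only
Sigma = [[1.0, 0.5], [0.5, 2.0]]
for df in (10, 1000):
    W = wishart_sample_2x2(Sigma, df, rng)
    S = [[w / df for w in row] for row in W]  # scale by df, as in the example above
    print(df, S)
```

As in the MATLAB example, the scaled draw should sit much closer to Sigma for df = 1000 than for df = 10.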
6) Multivariate normal distribution (from link)
7) Excellent description of Mahalanobis distance (link, by Rick Wicklin)
8) Another tutorial on the Mahalanobis distance (link, Tutorial The Mahalanobis distance by R. De Maesschalck, D. Jouan-Rimbaud, D.L. Massart)
9) Great resources and videos on YouTube from mathematicalmonk
10) R and other machine learning tutorials (Roger Peng's channel on YouTube)
11) Great videos on Bayesian regression by mathematicalmonk
12) Great videos on the multivariate Gaussian by mathematicalmonk
13) Derivations of the conditional and marginal distributions of a multivariate Gaussian
http://gbhqed.wordpress.com/2010/02/21/conditional-and-marginal-distributions-of-a-multivariate-gaussian/
14) Tutorials on machine learning by mathematicalmonk
15) Probability primers by mathematicalmonk
16) List of data science skills from Insight Data Science
My list of data science skills and tools
Great article on skills required for being a data scientist
17) Forecasting and time-series resources
Online book on forecasting using machine learning techniques (link) (link)
Forecasting using Facebook tool Prophet (link)
Bayesian structural time-series model from Google (CausalImpact)
Time series in R (link)
Interrupted time series analysis (link)
Difference in differences (link)
Community detection and clustering in time-series (link) (igraph)
Packages and public data on time series (link)
19) Deep learning resources
Deep learning book with each chapter an executable notebook
Deep learning book by Michael Nielsen
Backpropagation lectures by Andrew Ng
Backpropagation lecture by Andrej Karpathy
Deep Learning course on Udacity
Animation of Deep Learning and Visualizing Deep Neural Networks
Visual and interactive guide to neural networks
Hyperparameter optimization in deep learning
Theoretical understanding of autoencoders
Some of my implementations of deep learning algorithms
Some of my notes for a short tutorial on deep learning
Variational autoencoders explained
Basic tutorial on Zachary's karate club
Generative adversarial network (GAN) tutorial
20) Machine learning in C++
21) Machine learning algorithms book in Python
22) Poisson distribution and derivation from first principles from Bernoulli trials
23) Poisson point process (link)
24) Least absolute deviations (link)
26) Cross validation
27) Bayesian logistic regression
Bayesian methods and great tutorial on logistic regression (link)
28) Maximum a posteriori estimation
30) IPython notebook on probability by Peter Norvig (link)
31) Some machine learning tools and projects I have worked on
32) Code and writeup on machine learning techniques (link)
33) Theory and basics of Bayesian techniques by Jordan (link)
34) Generalized linear model (link)
35) Harvard data science course (link)
37) Graphical models Coursera course (link)
38) Machine learning cheatsheet (link)
39) Natural language processing (NLP)
Software (GATE)
Notes on how to use GATE software (link)
Dataset (link)
Test data from a Lewy body dementia paper (link)
Great article on how to apply NLP (link)
Using BERT pre-trained embeddings to perform transfer learning on NLP (link)
Topic modelling tools (link)
Great Coursera course on using TensorFlow for NLP (link)
Link to my github with tutorials and resources and code on NLP (link)
Simple examples of NLP (on bitbucket)
Very good R package for NLP (quanteda)
42) VERY GOOD Logistic regression with mixed effects (fixed and random effects) (link)
43) VERY GOOD repository of excellent statistical algorithms in R (link)
44) Introduction to linear mixed effects models (link)
45) Great tutorial on survival analysis in R (link)
Survival analysis (using survminer) (cheatsheet)
Hazard ratio and survival time to event models (link)
VERY GOOD tutorial on survival models and time to event models (link)
BEST explanation of hazard ratio (link) (link)
Survival analysis code (my code on bitbucket)
Advanced analysis using strata (link) and time-varying models (link)
46) Probabilistic programming and Bayesian inference using Stan, PyMC3
Linear mixed effects model in PyMC3 (link)
Very good tutorial on linear mixed effects models in PyMC3 (link)
Bayesian inference of a dynamical system (Lotka-Volterra model) (link)
Linear mixed effects models in rstanarm (link)
Mixed-effects Bayesian neural networks (link) (link)
Very nice tutorial on using rstanarm (link)
Good tutorial on using RStan for linear mixed effects models (link)
Picking prior distributions in rstanarm (link)
Tidy package for visualizing rstanarm analysis (link)
Shiny interface for exploring posterior distributions (link)
bayesplot package to explore posterior distributions (link)
Simple examples of GLM and GLM mixed effects models in frequentist and Bayesian (using rstanarm) (on bitbucket)
Stan conference talks (link) (Stan code)
47) Non-parametric Bayesian techniques (link)
48) Reinforcement learning
Simple tutorial and explanation (link)
Great Coursera course on reinforcement learning (link, github, colab)
50) Machine learning in TensorFlow and TensorFlow Hub for engineers and developers
Coursera course on TensorFlow
Google Colab CNN horse detector
Some code for a failed butterfly detector
Machine learning course for developers by Google Education
51) Python notebooks covering a number of machine learning techniques (Machine learning for physicists) (link)
52) Widget and game to explain loss function (link) (link)
53) Computational art (link)
56) Mendelian randomization for causality in observational studies (link)
57) Teaching materials for basic statistics and machine learning from a bootcamp (code, tutorials, notes) (link)
58) Machine learning and data science
Machine learning resources (link, link to playlists)
Data science tools and reproducible machine learning (link)
Open source data science projects
Bayesian techniques
Tutorial that I created on Bayesian linear regression
Tutorial that I created on Bayesian LASSO
Deep learning teaching resources
Natural language processing (NLP) teaching resources
59) Machine learning course materials (CS 229 Stanford University)