Machine Learning

1) Machine learning course on Coursera by Andrew Ng (link)

2) UC Irvine Machine Learning dataset repository (link)

3) Introduction to Statistical Learning (ISLR) book and R code (link) (book PDF) (R code) (videos) (playlist) (more advanced book)

        Very good books on basic statistics (link)

        p-value (video)

        effect size (link)

4) Great explanation of the bias-variance tradeoff (link)

                    ISLR video explanation (link)

                    https://www.youtube.com/watch?v=VaN1RUDuioQ&list=PLOg0ngHtcqbPTlZzRHA2ocQZqB1D_qZ5V&index=5

                    http://scott.fortmann-roe.com/docs/BiasVariance.html

    

                   VERY GOOD picture

                         https://github.com/neelsoumya/basic_statistics/blob/master/bias_variance.png

5) Basic statistics (ANOVA, t-test, F-test, etc) (link)

      Linear models, ANOVA, mixed effects, fixed effects, random effects and other basics (link) (tutorial 1) (tutorial 2)

      Beautiful VERY GOOD tutorial on how most statistical tests are related to linear models (link)

      Coursera course on basic statistics (link, GitHub)

6) MIT OCW course on Artificial Intelligence by Patrick Winston (link) (course webpage)

            Search (part 4 video): search = choice

            Very good lecture on neural networks, autoencoders and softmax (link)

            goal trees and expert systems (link)

7) Area under curve (AUC) and ROC curve explanation (link) (video)

      Precision recall curve (link) (link)

      VERY GOOD Video tutorial based on ISLR material by Trevor Hastie and Rob Tibshirani (link)

      VERY GOOD picture of precision, recall, confusion matrix, false positive, true positive, etc (link)

      The AUC has a probabilistic interpretation: it is the probability that a randomly chosen positive example is ranked higher than a randomly chosen negative example (link)

      Sensitivity and specificity

                   https://github.com/neelsoumya/basic_statistics/blob/master/800px-Sensitivity_and_specificity.svg.png 

     Explanation of AUC (area under the curve): after the model is selected, you can vary the classification threshold applied to the logistic regression predictions (picture courtesy of Chris Penfold)

     https://github.com/neelsoumya/basic_statistics/blob/master/auc_explanation.png
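
The probabilistic interpretation of AUC and the effect of moving the threshold can be checked on a toy example. The following is a minimal sketch in plain Python, not a reference implementation: the scores and labels are made up for illustration, and the function names `auc_by_pairs` and `sensitivity_specificity` are hypothetical helpers, not part of any library.

```python
def auc_by_pairs(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(scores, labels, threshold):
    """Predict positive when score >= threshold, then return
    (sensitivity, specificity) from the resulting confusion matrix."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up predicted probabilities (e.g. from a logistic regression) and true labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc = auc_by_pairs(scores, labels)  # 7 of the 9 pairs are ranked correctly
sens_hi, spec_hi = sensitivity_specificity(scores, labels, 0.5)
sens_lo, spec_lo = sensitivity_specificity(scores, labels, 0.3)

print("AUC:", auc)
print("threshold 0.5:", sens_hi, spec_hi)
print("threshold 0.3:", sens_lo, spec_lo)
```

Lowering the threshold raises sensitivity at the cost of specificity, while the AUC does not depend on the threshold because it only uses the ranking of the scores.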

8) Techniques for visualizing and thinking about higher dimensions

        How to use t-SNE (link)

        Explaining and exploring t-SNE visually (link)

        Great explanation of PCA (principal components analysis) (link) (link)

        t-SNE and PCA in your browser (link)

        Eigenvectors and basis (link)

        Uniform distribution in high dimensions (link)

        Difference between UMAP and t-SNE (link) (link)

        Coursera course on mathematics of PCA, dot product (link)

        NCERT textbook on matrices and determinants (link) (link)

        Mathematics of machine learning book by Marc Deisenroth (link)


9) Linear algebra basics

            Determinant (link)

            Eigenvectors and basis (link)

            Linear algebra basics course MIT OCW (link)

            NCERT textbook on matrices and determinants (link) (link)

            Mathematics of machine learning book by Marc Deisenroth (link)


10) About the Wishart distribution (conjugate prior for the precision matrix of a multivariate normal distribution) (link)

            The Wishart distribution is often used as a model for the distribution of the sample covariance matrix of multivariate normal random data, after scaling by the sample size. If x is a bivariate normal random vector with mean zero and covariance matrix Sigma, then you can use the Wishart distribution to generate a sample covariance matrix without explicitly generating x itself. Notice how large the sampling variability is when the number of degrees of freedom is small.

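Sigma = [1 .5; .5 2];      % true covariance matrix
df = 10;                   % small number of degrees of freedom
S1 = wishrnd(Sigma,df)/df  % simulated sample covariance matrix

S1 =
       1.7959      0.64107
      0.64107       1.5496

df = 1000;                 % large number of degrees of freedom
S2 = wishrnd(Sigma,df)/df

S2 =
       0.9842      0.50158
      0.50158       2.1682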
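
For readers without MATLAB, the same experiment can be sketched in plain Python. This is only an illustrative sketch, not how `wishrnd` is implemented: it assumes a 2x2 covariance matrix and an integer number of degrees of freedom greater than 1, and it draws the Wishart matrix via the Bartlett decomposition, so x itself is never generated. The function names `cholesky_2x2` and `wishart_2x2` are hypothetical helpers.

```python
import math
import random


def cholesky_2x2(sigma):
    """Lower-triangular L with L L^T = sigma, for a 2x2 positive-definite matrix."""
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[0][1] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def wishart_2x2(sigma, df, rng):
    """Draw W ~ Wishart(sigma, df) using the Bartlett decomposition (assumes df > 1)."""
    L = cholesky_2x2(sigma)
    # Bartlett factor A: diagonal entries are square roots of chi-square draws,
    # the below-diagonal entry is standard normal.
    a11 = math.sqrt(rng.gammavariate(df / 2.0, 2.0))          # chi-square with df d.o.f.
    a22 = math.sqrt(rng.gammavariate((df - 1.0) / 2.0, 2.0))  # chi-square with df - 1 d.o.f.
    a21 = rng.gauss(0.0, 1.0)
    # M = L A is lower triangular, and W = M M^T.
    m11 = L[0][0] * a11
    m21 = L[1][0] * a11 + L[1][1] * a21
    m22 = L[1][1] * a22
    w12 = m11 * m21
    return [[m11 * m11, w12], [w12, m21 * m21 + m22 * m22]]


sigma = [[1.0, 0.5], [0.5, 2.0]]
rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability

for df in (10, 1000):
    w = wishart_2x2(sigma, df, rng)
    s = [[value / df for value in row] for row in w]  # scale by df, as in the example above
    print("df =", df, "S =", s)
```

With a small number of degrees of freedom the scaled draw can be far from Sigma; with a large number it should be close.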

6) Multivariate normal distribution (from link)

7) Excellent description of Mahalanobis distance (link, by Rick Wicklin)

8) Another tutorial on the Mahalanobis distance (link, Tutorial The Mahalanobis distance by R. De Maesschalck, D. Jouan-Rimbaud, D.L. Massart)

9) Great resources and videos on YouTube from mathematicalmonk

10) R and other machine learning tutorials (Roger Peng's channel on YouTube)

11) Great videos on Bayesian regression by mathematicalmonk

12) Great videos on the multivariate Gaussian by mathematicalmonk

13) Derivations of the conditional and marginal distributions of a multivariate Gaussian

http://gbhqed.wordpress.com/2010/02/21/conditional-and-marginal-distributions-of-a-multivariate-gaussian/

14) Tutorials on machine learning by mathematicalmonk

15) Probability primers by mathematicalmonk

16) List of data science skills from Insight Data Science

      My list of data science skills and tools

      Great article on skills required for being a data scientist 

17) Forecasting and time-series resources

        Online book on forecasting using machine learning techniques (link) (link)

        Forecasting using Facebook's Prophet tool (link)

        Bayesian structural time-series model from Google (CausalImpact)

        Time series in R (link)

        Interrupted time series analysis (link)

        Difference in differences (link)

        Community detection and clustering in time-series (link) (igraph)

        Packages and public data on time series (link)

19) Deep learning resources

Best course on deep learning 

Deep Learning book

Deep learning book with each chapter an executable notebook

Deep learning book by Michael Nielsen

Another deep learning book

Backpropagation lectures by Andrew Ng

Backpropagation lecture by Andrej Karpathy

Google DeepDream

Google TensorFlow

Deep Learning course on Udacity

Neural Network Zoo

Animation of Deep Learning and Visualizing Deep Neural Networks

Visual and interactive guide to neural networks

Hyperparameter optimization in deep learning

Autoencoder in Keras

Theoretical understanding of autoencoders

Great introduction to LSTM

Some of my implementations of deep learning algorithms

Some of my notes for a short tutorial on deep learning

Variational autoencoders explained

Graph deep learning

        tutorials

        basic tutorial on Zachary's karate club

Generative adversarial network (GAN) tutorial

20) Machine learning in C++

dlib 

21) Machine learning algorithms book in Python

22) Poisson distribution and derivation from first principles from Bernoulli trials
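
The usual first-principles route treats a Poisson count as the limit of n independent Bernoulli trials, each with success probability lambda/n, as n grows. The sketch below is a numerical check of that limit rather than a derivation: it compares the Binomial(n, lambda/n) probabilities with the Poisson(lambda) probabilities. The rate 3.0, the range of counts, and the helper names are arbitrary choices for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(K = k) for the number of successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(K = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def max_gap(n, lam, k_max=10):
    """Largest absolute difference between the two pmfs over k = 0..k_max."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


lam = 3.0
for n in (10, 100, 10000):
    # The gap shrinks as the number of Bernoulli trials grows.
    print("n =", n, "max gap =", max_gap(n, lam))
```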

23) Poisson point process (link)

24) Least absolute deviations (link)

25) Likelihood function

26) Cross validation

27) Bayesian logistic regression

Bayesian methods and great tutorial on logistic regression (link)

28) Maximum a posteriori estimation

29) Hidden Markov Model

30) IPython notebook on probability by Peter Norvig (link)

31) Some machine learning tools and projects I have worked on

32) Code and writeup on machine learning techniques (link)

33) Theory and basics of Bayesian techniques by Jordan (link)

34) Generalized linear model (link)

35) Harvard data science course (link)

37) Coursera course on graphical models (link)

38) Machine learning cheatsheet (link)

39) Natural language processing (NLP)

        Software (GATE)

        Notes on how to use GATE software (link)

                Dataset (link)

                Test data from a Lewy body dementia paper (link)

        Great article on how to apply NLP (link)

        Using BERT pre-trained embeddings to perform transfer learning on NLP (link)

   

        Topic modelling tools (link)

        Great Coursera course on using TensorFlow for NLP (link)

        Link to my GitHub with tutorials, resources and code on NLP (link)

        Simple examples of NLP (on Bitbucket)

        Very good R package for NLP (quanteda)

42) VERY GOOD Logistic regression with mixed effects (fixed and random effects) (link)

43) VERY GOOD repository of excellent statistical algorithms in R (link)

44) Introduction to linear mixed effects models (link)

45) Great tutorial on survival analysis in R (link)

        Survival analysis (using survminer) (cheatsheet)

        Hazard ratio and survival time to event models (link)

        VERY GOOD tutorial on survival models and time to event models (link)

       BEST explanation of hazard ratio (link) (link)

       Survival analysis code (my code on Bitbucket)

       Advanced analysis using strata (link) and time-varying models (link)

46) Probabilistic programming and Bayesian inference using Stan, PyMC3

                Linear mixed effects model in PyMC3 (link)

                Very good tutorial on linear mixed effects models in PyMC3 (link)

                Bayesian inference of a dynamical system (Lotka-Volterra model) (link)  

                Linear mixed effects models in rstanarm (link)

                Mixed-effects Bayesian neural networks (link) (link)

                Very nice tutorial on using rstanarm (link)

                Good tutorial on using RStan for linear mixed effects models (link)

                Picking prior distributions in rstanarm (link)

                Tidy package for visualizing rstanarm analysis (link)

                Shiny interface for exploring posterior distributions (link)

                bayesplot package to explore posterior distributions (link)

                Simple examples of GLMs and GLM mixed effects models, both frequentist and Bayesian (using rstanarm) (on Bitbucket)

                Stan conference talks (link) (Stan code)

47) Non-parametric Bayesian techniques (link)

48) Reinforcement learning

            Simple tutorial and explanation (link)

            Great Coursera course on reinforcement learning (link, GitHub, Colab)

50) Machine learning in TensorFlow and TensorFlow Hub for engineers and developers

            Coursera course on TensorFlow

            TensorFlow Hub

            Google Colab seedbank

            Google Colab CNN

            Google Colab CNN horse detector

            Some code for a failed butterfly detector

            Machine learning course for developers by Google Education

            

51) Python notebooks covering a number of machine learning techniques (Machine learning for physicists) (link)

52) Widget and game to explain loss function (link) (link)

53) Computational art (link)

56) Mendelian randomization for causality in observational studies (link)

57) Teaching materials for basic statistics and machine learning from a bootcamp (code, tutorials, notes) (link)

58) Machine learning and data science

          Machine learning resources (link, link to playlists)

          Data science tools and reproducible machine learning (link)

          Open source data science projects

          Bayesian techniques

                Tutorial that I created on Bayesian linear regression

                Tutorial that I created on Bayesian LASSO

          Deep learning teaching resources

          Natural language processing (NLP) teaching resources

          Teaching materials for basic statistics and machine learning from a bootcamp (code, tutorials, notes) (link)

59) Machine learning course materials (CS 229 Stanford University)

60) Gaussian process (link) (link)