# Resources

## Flux

MDST has shared computing resources available from the University's HPC cluster, called *Flux*. All MDST members have access to these resources by request. To gain access, follow the steps in the "Getting Access to Flux" document.

# Data Science Resource List

All resources here were recommended by MDST members and include testimonials! To add your own, fill out this Google form.

### Webpages

**How to Learn Machine Learning**

Covers: Learning how to learn ML!

Source: http://karlrosaen.com/ml/

Endorsed by: @thealex -- “This is from a longtime developer who read "The Master Algorithm" and caught the ML bug. He put his VP of Tech / Product Dev / Software Engineering life on hold and took the summer off to study machine learning. He maintained a learning log during his study and ultimately got a job as a Research Engineer in the Ford Autonomous Vehicles Lab. Former MDST regular.”

**Distill**

Covers: Theory presented in a very approachable way.

Source: http://distill.pub/

Endorsed by: @thealex -- “Incredibly high standards for clarity. Once you know enough about ML to learn fringe concepts, working through these pages can be both enjoyable and enlightening.” @stroud -- “Interactive figures are super useful and intuitive.”

**Understanding LSTM Networks**

Covers: LSTMs and RNNs

Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Endorsed by: stroud -- “Best tutorial on LSTMs I've ever seen. Really cool figures that get the point across well.”

**Neural Network Playground**

Covers: Neural Networks

Source: http://playground.tensorflow.org/

Endorsed by: stroud -- “So much fun”

**An Overview of Gradient Descent Optimization Algorithms**

Covers: SGD et al.

Source: http://ruder.io/optimizing-gradient-descent/

Endorsed by: samtenka -- “Nesterov, Adam, RMSProp... what a mess! This unsystematic but insightful comparison helps us master the menagerie of gradient-based optimizers. It's often clearer than the corresponding papers, too.”

**MSAIL**

Covers: MSAIL, the Michigan ML club

Source: http://msail.github.io

Endorsed by: samtenka -- “Website is terrible. Ugly. Outdated. But the club's pretty fun.”

### MOOCs

**Applied Data Science with Python Specialization**

Covers: Data science, Python, pandas, machine learning, social network analysis, natural language processing

Source: https://www.coursera.org/specializations/data-science-python

Endorsed by: jpgard -- “This is a great intermediate introduction to both Python and its use for solving applied data science problems. Taught by UM professors, this specialization has a fairly high bar but features high-quality video and engaging interactive programming and end-of-course assignments that will push you to fully develop your data science skills.”

**Machine Learning**

Covers: Basic machine learning topics.

Source: https://www.coursera.org/learn/machine-learning

Endorsed by: pgad -- “Andrew Ng teaches it very well and the MOOC comes with programming exercises. A great starter course.”

### Tutorial/Coding Demos

**CUDA C/C++ Basics**

Covers: CUDA

Source: Link

Endorsed by: samtenka -- “Get your hands dirty with low-level GPU computing as quickly as possible by following these slides! I thought they were fun, and so can you.”

**The Unreasonable Effectiveness of Recurrent Neural Networks**

Covers: RNNs

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Endorsed by: samtenka -- “Turned me on to Recurrent Neural Nets.”

jaredaw -- “Karpathy is an expert on recurrent neural networks and put a lot of time into this explanation of them. The visualizations and examples are simple and effective. This post really helped me understand RNNs.”

**Computational Statistics in Python**

Covers: Computational statistics

Source: https://people.duke.edu/~ccc14/sta-663/index.html

Endorsed by: xinyutan -- “A very nice introduction to some "advanced" topics which are rarely seen introduced at such an introductory level.”

**Probabilistic Programming and Bayesian Methods for Hackers**

Covers: Bayesian techniques explored through IPython Notebooks

Source: GitHub

Endorsed by: thealex -- “It's a set of IPython Notebooks! Download them before a long flight and browse at your leisure. I found the section on picking good priors to be especially helpful because I do research on bandit problems.”

**Python Data Science Handbook**

Covers: Core Python data science packages (IPython, NumPy, pandas, Matplotlib, scikit-learn)

Source: GitHub

Endorsed by: pktan -- “Covers most of the Python packages that beginners need to get started. It's actually a book, but the author decided to open-source it, so do ask the members to buy the book and support the author if they like it.”

**MNIST for ML Beginners**

Covers: TensorFlow for beginners, MNIST, Neural Networks

Source: https://www.tensorflow.org/get_started/mnist/beginners

Endorsed by: stroud -- “Eases you into the basics of TensorFlow with good figures and examples. Suitable for total beginners.”

**Deep MNIST for Experts**

Covers: TensorFlow Basics

Source: https://www.tensorflow.org/get_started/mnist/pros

Endorsed by: samtenka -- “If you understand CNNs in theory, this rapid yet clear tutorial will get you started with their practical implementation. Pairs nicely with LeCun's paper introducing CNNs.”

stroud -- “Gentle introduction to CNNs in TensorFlow for people with machine learning experience. Maintained by the TensorFlow team, so it will change as the tools change.”

**Neural Networks**

Covers: PyTorch, CNNs

Source: Link

Endorsed by: stroud -- “Covers all the basics of CNNs in PyTorch.”

**Unsupervised Feature Learning and Deep Learning**

Covers: Unsupervised Learning and Deep Learning

Source: http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial

Endorsed by: pgad -- “A nice tutorial with coding exercises. Covers Autoencoders.”

**Variable Sharing in Tensorflow**

Covers: Variable Sharing in TensorFlow

Source: https://jasdeep06.github.io/posts/variable-sharing-in-tensorflow/

Endorsed by: samtenka -- “Finally! A clear and correct explanation!”

### Textbooks

**Machine Learning: A Probabilistic Perspective**

Covers: General topics in Machine Learning

Source: Amazon

Endorsed by: pgad -- “Murphy explains all the topics he covers really well, and gives great statistical perspective.”

**Learning from Data**

Covers: Statistical learning theory

Source: https://work.caltech.edu/textbook.html

Endorsed by: samtenka -- “Grounds the theoretically oriented beginner in the philosophies and tools of machine learning. I would highly recommend this book to physicists, cows, and those who ask "why?" more than "how?". The book might or might not be available for free online.”

**An Introduction to Statistical Learning (with Applications in R)**

Covers: Supervised and unsupervised learning, data mining, R

Source: http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Sixth%20Printing.pdf

Endorsed by: jpgard -- “This is a great overview of many of the core machine learning techniques, and doubles as a user-friendly introduction to R. The book is well-written and filled with insights. The same authors also have a great series of videos aligned with the book chapters (available on YouTube) and, for more advanced reading or more in-depth coverage, a similar book entitled Elements of Statistical Learning.”

**R Graphics Cookbook: Practical Recipes for Visualizing Data**

Covers: R, ggplot2

Source: Amazon

Endorsed by: jpgard -- “As a hardcore R user, I often find myself looking for a reference to adjust the same things on my visualizations -- how do I remove axis ticks, add text labels to graphical elements, or adjust legends? This is my go-to resource. A "free" PDF is floating around the internet.”

**Python for Data Analysis**

Covers: pandas, numpy

Source: http://www3.canisius.edu/~yany/python/Python4DataAnalysis.pdf

Endorsed by: xinyutan -- “Pandas was very confusing to me at first. This book is written by Wes McKinney, the main author of the pandas library. He explains the logic and reasoning behind the design of pandas, making it easier to remember (at least some core functions) and use pandas.”

**Convex Optimization**

Covers: Convex Optimization

Source: https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf

Endorsed by: pgad -- “Advanced. The reader can learn in great detail about gradient descent, Newton's method, Lagrange duals, and other techniques used in Machine Learning.”

**Statlect**

Covers: Probability and Measure Theory

Source: statlect.com

Endorsed by: samtenka -- “It is formal and rigorous.”

### Papers

**Wasserstein GAN**

Covers: Generative Adversarial Deep Learning without tears

Source: https://arxiv.org/abs/1701.07875

Endorsed by: samtenka -- “WGANs are the future. This seminal paper both explains the technique lucidly and inspires a more general understanding of neural net training. I'd recommend anyone who is excited about deep unsupervised learning to peruse this paper.”

### Lecture Videos

**Learning: Support Vector Machines**

Covers: SVM

Source: YouTube

Endorsed by: samtenka -- “Avuncular and expert, Patrick Winston takes us on a leisurely stroll through the intuition and implementation of SVMs. Let this video (played at 1.5 speed) be your guide to this important class of models.”

### Lecture Slides

**Deep Learning Software**

Covers: Deep Learning Software/Hardware

Source: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture8.pdf

Endorsed by: stroud -- “Very in-depth, covers advantages and disadvantages of deep learning frameworks.”

**Theoretical Foundations of Machine Learning**

Covers: Statistical Learning Theory

Source: http://web.eecs.umich.edu/~jabernet/eecs598course/fall2015/web/

Endorsed by: pgad -- “Advanced. A nice introduction to Statistical Learning Theory. Homeworks available.”

**Data Visualisation**

Covers: Data Visualisation

Source: http://courses.cs.washington.edu/courses/cse512/14wi/

Endorsed by: acell -- “The website is quite comprehensive, complete with assignments and readings by topic and a resource list that has tools, tutorials, data sets, and links to blogs / other courses.”