"Machine Learning in Practice", to be held in the Fall/Monsoon 2021 term, is the debut offering of this course. It will be available to students across all TIFR campuses. Several branches of science, engineering, and technology have witnessed hockey-stick growth with the advent of machine learning, which promises inference and prediction from large observational data. There is hardly a field interacting with data that has not been significantly affected by this revolution, and thousands of domain-specific, machine-learning-inspired startups are springing up around the world. We invite students from various backgrounds to take advantage of this introductory course. It will focus on basic concepts, examples, and case studies, and assumes no prerequisites. Students from TIFR-Mumbai, TIFR-Hyderabad, ICTS, NCBS, CAM, NCRA, etc. are encouraged to join the course.
We have the following criteria for students who wish to credit or audit (sit through) the course:
1. Credit Students: Only students who enrolled as graduate students in the TIFR system in the year 2019 or earlier can credit the course. They will have full access to live participation over Zoom, an online course forum, assignments, and office hours. These students will also have their course performance graded. (The link for joining the course forum will be sent separately to the crediting students.)
2. Audit Students: Students who enrolled in the year 2020 or later, or those who do not wish to credit the course and be graded, can attend live lectures through YouTube Live. They will not be able to attend the live Zoom lectures or have access to office hours or the online course forum.
This course webpage will host course notes, assignments, and links to recorded lectures, and will be accessible to all.
Please note:
-- The last day to credit the course (if applicable) is Monday, August 30, 2021.
-- The last day to drop the course (if credited) is Thursday, September 30, 2021.
(a) Module 0 : Python Primer (~2.5 weeks)
-- Getting data into a program. Rudimentary data exploration
-- Plotting data
-- Basic Linear Algebra/Fitting linear models. Singular Values and Eigenvalues
-- Basic Programming: functions, conditionals, and iteration
(b) Module I : Basics of Probability, Random Variables, Statistics (~3 weeks)
-- Probability: Sample Space, Events, Counting, Probability, Conditional Probability and Independence, Bayes' Theorem. Monty Hall Game (if time permits)
-- Random Variables: Discrete and Continuous Random Variables, and Joint, Conditional, and Marginal Distribution Functions [probability mass functions (pmfs), probability density functions (pdfs), and cumulative distribution functions (cdfs)]
-- Moments: Mean, Variance, Correlation, Covariance, and Covariance Matrix. Empirical Distributions
-- Limit theorems: Law of Large Numbers, Central Limit Theorem, and Applications of Gaussian Distributions
-- Estimation: Method of moments, maximum likelihood estimation, and confidence intervals
-- Confidence intervals (contd.) and Hypothesis Testing
(c) Module II : Regression, Classification, Support Vector Machines, Neural Networks, GANs (~4.5 weeks)
-- Regression
-- Classification
-- Support Vector Machines
-- Introduction to Deep Learning, Neural Networks, and Back-propagation
-- Training Methods and the Neural Network Zoo
-- Variational Autoencoders
-- Generative Adversarial Networks
(d) Module III : Dimensionality Reduction, Clustering, Graphical Models (~4 weeks)
-- Dimensionality reduction (factor analysis and PCA, CCA, ICA)
-- Clustering (k-means, hierarchical, spectral)
-- GMM, EM algorithm
-- Graphical models (HMM & MRF)
-- Course commences on Monday, August 16 with Module 0.
-- Classes will be every Monday and Wednesday, 11:00 am - 12:30 pm.
-- Every Friday, 11:00 am - 12:30 pm, office hours will be held by the TAs. Please note that there will not be any office hours for the first two weeks, and the Friday slot will be utilised for Module 0.
The class is a 4-credit, semester-long course. Grades will be based on:
-- Class participation and in-class quizzes (20%)
-- Assignments (roughly 2 per module) (50%)
-- End-term (30%)
Himanshu Asnani (TIFR, Mumbai) [course coordinator],
Sandeep Juneja (TIFR, Mumbai),
Praneeth Netrapalli (Google Research India, Bengaluru),
Piyush Srivastava (TIFR, Mumbai),
Sreekar Vadlamani (TIFR-CAM, Bengaluru)