AICTE Training and Learning (ATAL) Academy

Online Faculty Development Programme (FDP)

Training Programme on Statistical Foundations of Data Science and Machine Learning

Schedule of FDP (October 27 – 31, 2021)

  • Objectives:

    • To train the participants in the statistical foundations of Machine Learning and Artificial Intelligence.

    • To train the participants in various statistical simulation techniques and their application to probability computations.

    • To train the participants in the ideas of multi-model inference and model selection and their applications in Machine Learning.

    • To train the participants in the basic ideas of Bayesian computation and their application in Machine Learning.

    • To train the participants in a pedagogical approach to teaching machine learning through statistical simulation.

  • FDP outcomes:

    • Participants will be able to understand the concepts of probability and probability distributions using simulation.

    • Participants will be able to understand convergence concepts and their utility in statistics and machine learning.

    • Participants will be able to apply Bayesian computation in solving real-life problems.

    • Participants will be able to apply Monte Carlo methods to compute probabilities with applications in real life.

    • Participants will be able to apply statistical machine learning methods to model selection in regression and classification problems.

T. J. Rao, Ph.D.

Professor

Indian Statistical Institute, Kolkata

Webpage: https://www.isical.ac.in/~tjrao/

Inaugural address: P. C. Mahalanobis: Statistician, Data Scientist or Both?

Inaugural Video: Click Here


This is an introductory and elementary talk as a 'curtain raiser' for the 5-day training programme on Data Science. I shall discuss Prof. P. C. Mahalanobis's innovative contributions which recognized him as the person "who put India at the centre of the Statistical World".

Speakers

Buddhananda Banerjee, Ph.D.

Assistant Professor

Department of Mathematics and Centre of Excellence in AI

IIT Kharagpur

Webpage: https://sites.google.com/site/buddhanandastat/

Session 1A: (11:00 AM – 01:00 PM) Foundations of Probability Theory – I

Session 1C: (04:30 PM – 06:30 PM) Foundations of Probability Theory – II

Session 2A: (09:30 AM – 11:30 AM) Convergence Concepts and Monte Carlo Methods

Lecture Video: Session 1A, Session 1C, Session 2A

These sessions aim to cover the foundations of probability theory, which are essential for understanding various statistical machine learning algorithms. In addition, the sessions will cover the concept of the sampling distribution, the problem of estimation, and the maximum likelihood method. Basic principles of Monte Carlo methods and their application to approximating integrals and computing probabilities of events will also be discussed. The lecture materials are uploaded to the ATAL Portal and are available to approved participants.
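As an illustration of the Monte Carlo idea mentioned in this abstract, the following is a minimal Python sketch; it is not taken from the lecture materials. It approximates an integral by averaging the integrand at uniform random points, and approximates the probability of an event by the fraction of simulated outcomes in which the event occurs. The function names, the two example problems, and the seed are assumptions chosen only for illustration.

```python
import random


def mc_integral(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] as (b - a) times
    the average of f at n points drawn uniformly from [a, b]."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


def mc_probability(event, sampler, n, rng):
    """Estimate P(event) as the fraction of n simulated outcomes
    for which event(outcome) is true."""
    hits = sum(1 for _ in range(n) if event(sampler(rng)))
    return hits / n


def two_dice(rng):
    """Simulate the sum of two fair six-sided dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)


rng = random.Random(2021)  # fixed seed so that the run is reproducible

# Integral of x^2 over [0, 1]; the exact value is 1/3.
integral_estimate = mc_integral(lambda x: x * x, 0.0, 1.0, 100_000, rng)

# P(sum of two fair dice equals 7); the exact value is 1/6.
prob_estimate = mc_probability(lambda s: s == 7, two_dice, 100_000, rng)

print("integral estimate:", integral_estimate)
print("probability estimate:", prob_estimate)
```

Both estimates are sample averages, so by the law of large numbers they approach the exact values as the number of draws grows, with an error that shrinks roughly like one over the square root of the number of draws.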

Ajit Kumar, Ph.D.

Associate Professor

Department of Mathematics

Institute of Chemical Technology, Mumbai

Webpage: https://ajitmathsoft.wordpress.com/

Session 1B: (02:00 PM – 04:00 PM) Data Science tools: Introduction to R programming

Session 2B: (01:30 PM – 03:30 PM) Mathematical Programming using R

Lecture Video: Session 1B, Session 2B


These two sessions are designed to introduce R programming and its applications in statistical computation. Emphasis will be placed on mathematical computations in linear algebra, calculus, and optimization, which are essential for understanding the concepts of Machine Learning.

Arindom Chakraborty, Ph.D.

Assistant Professor

Department of Statistics

Visva-Bharati University, Santiniketan

Webpage: https://www.visvabharati.ac.in/arindam_chakraborty.html

Session 2C: (04:00 PM – 06:00 PM) Foundation of Machine Learning (Bayesian Inference – I)

Session 3A: (01:30 PM – 03:30 PM) Foundation of Machine Learning (Bayesian Inference – II)

Lecture Video: Session 2C, Session 3A

The aim of these sessions is to make the participants comfortable with Bayesian computation, which is an integral component of many Machine Learning algorithms. Starting with Bayes' theorem, the sessions will explore its applications in developing various computational algorithms. In particular, the following will be discussed: an introduction to Bayesian inference, with Bayesian analysis for single- and multi-parameter models; an introduction to MCMC computations for simulating samples from the posterior distribution; and Gibbs sampling and conditional posteriors. The lecture materials are uploaded to the ATAL Portal and are available to approved participants.
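To make the idea of Gibbs sampling from conditional posteriors concrete, here is a minimal Python sketch; it is not taken from the lecture materials. It assumes a normal model with unknown mean mu and precision tau, a normal prior on mu and a gamma prior on tau, so that each conditional posterior has a standard form. The prior settings, the synthetic data, and all names are assumptions for illustration only.

```python
import random
import statistics


def gibbs_normal(y, n_iter, burn_in, rng, m0=0.0, p0=0.01, a0=0.01, b0=0.01):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau) with priors
    mu ~ Normal(m0, 1/p0) and tau ~ Gamma(shape=a0, rate=b0).
    Returns the (mu, tau) draws kept after the burn-in period."""
    n = len(y)
    sum_y = sum(y)
    mu, tau = statistics.fmean(y), 1.0  # starting values
    draws = []
    for it in range(n_iter):
        # Conditional posterior of mu given tau: normal with
        # precision p0 + n*tau and a precision-weighted mean.
        prec = p0 + n * tau
        mean = (p0 * m0 + tau * sum_y) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # Conditional posterior of tau given mu: gamma with
        # shape a0 + n/2 and rate b0 + (sum of squared deviations)/2.
        ss = sum((v - mu) ** 2 for v in y)
        rate = b0 + 0.5 * ss
        # random.gammavariate takes (shape, scale), so pass 1/rate.
        tau = rng.gammavariate(a0 + 0.5 * n, 1.0 / rate)
        if it >= burn_in:
            draws.append((mu, tau))
    return draws


rng = random.Random(0)
# Synthetic data, generated only to exercise the sampler.
y = [rng.gauss(5.0, 2.0) for _ in range(200)]

draws = gibbs_normal(y, n_iter=3000, burn_in=500, rng=rng)
post_mean_mu = statistics.fmean(d[0] for d in draws)
post_mean_sd = statistics.fmean(d[1] ** -0.5 for d in draws)

print("posterior mean of mu:", post_mean_mu)
print("posterior mean of sigma:", post_mean_sd)
```

With such weak priors, the posterior means should sit close to the sample mean and sample standard deviation of the data, which gives a simple sanity check on the sampler.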

Shrikrishna G. Dani, Ph.D.

Professor

Centre for Excellence in Basic Sciences, Mumbai

Webpage: https://www.cbs.ac.in/people/faculty-s-g-dani

Session 3B: (01:30 PM – 03:30 PM) Lecture on Indian Knowledge Systems: Glimpses of Ancient Indian Mathematics

Lecture Video: Click Here


The aim of the talk will be to give an overview of the mathematical traditions in India over various historical periods, highlighting the motivations for the pursuit of various traditions and the key features of the mathematical ideas at different times, together with a discussion of some of the significant achievements.

Venkata Reddy Konasani

Corporate Trainer and Cofounder,

Statinfer Solutions LLP

Webpage: https://github.com/venkatareddykonasani

Session 3C: (04:00 PM – 06:00 PM) Foundations of Machine Learning (Industrial Perspective)

Lecture Video: Click Here


This session is planned to give an understanding of Machine Learning projects from an industrial point of view. The following ideas will be discussed: (a) the various steps in Machine Learning projects, (b) what model building is, (c) industrial case studies on model building in the credit risk domain, (d) data validation and cleaning, (e) model building, and (f) model validation.

Radhendushka Srivastava, Ph.D.

Assistant Professor

Department of Mathematics

IIT Bombay

Webpage: https://rnd.iitb.ac.in/faculty/prof-radhendushka-srivastava

Session 4A: (09:30 AM – 11:30 AM) Machine Learning Foundation – I

Session 4B: (01:30 PM – 03:30 PM) Machine Learning Foundation – II

Session 5B: (01:30 PM – 03:30 PM) Machine Learning Foundation – III

Lecture Video: Session 4A, Session 4B, Session 5B

Regression analysis is a powerful tool for understanding the relationship between response and explanatory variables. We will cover multiple linear regression techniques and their diagnostics, followed by ridge regression and the lasso. We will also learn some statistical classification methods during this course.
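The difference between ridge regression and the lasso can be seen in a small experiment. The following Python sketch, which is not taken from the lecture materials, fits both by cyclic coordinate descent on synthetic data; coordinate descent is one of several possible fitting methods and is used here only because it needs no external libraries. The data, penalty values, and function names are assumptions for illustration.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(X, y, lam, penalty, n_sweeps=200):
    """Minimise 0.5*||y - Xb||^2 plus a penalty by cycling over coordinates.
    penalty="ridge" adds 0.5*lam*||b||_2^2; penalty="lasso" adds lam*||b||_1.
    X is a list of rows; no intercept is fitted."""
    if penalty not in ("ridge", "lasso"):
        raise ValueError("penalty must be 'ridge' or 'lasso'")
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    col_sq = [sum(v * v for v in c) for c in cols]
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    for _ in range(n_sweeps):
        for j in range(p):
            # Inner product of column j with the residual that
            # excludes the current contribution of coordinate j.
            rho = sum(cols[j][i] * resid[i] for i in range(n)) + col_sq[j] * b[j]
            if penalty == "ridge":
                new_bj = rho / (col_sq[j] + lam)
            else:
                new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= cols[j][i] * delta
                b[j] = new_bj
    return b


rng = random.Random(1)
true_beta = [3.0, 0.0, -2.0]  # the second predictor is irrelevant by construction
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 0.5) for row in X]

b_ols = coordinate_descent(X, y, lam=0.0, penalty="ridge")  # lam = 0 gives least squares
b_ridge = coordinate_descent(X, y, lam=50.0, penalty="ridge")
b_lasso = coordinate_descent(X, y, lam=30.0, penalty="lasso")

print("least squares:", b_ols)
print("ridge:", b_ridge)
print("lasso:", b_lasso)
```

Ridge shrinks all coefficients toward zero without removing any, whereas the lasso's soft-thresholding step can set a coefficient exactly to zero, which is why the lasso is also used for variable selection.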

Priyavrat Deshpande, Ph.D.


Assistant Professor

Chennai Mathematical Institute

Webpage: https://www.cmi.ac.in/~pdeshpande/

Session 4C: (04:00 PM – 06:00 PM) Foundations of Topological Data Analysis

Lecture Video: Click Here


Topological Data Analysis (TDA) is an emerging area in the field of Machine Learning, and several real-life problems are now investigated using TDA. The aim of this talk is to give a brief introduction to TDA and discuss some case studies where such methods can be applied. We also plan to discuss some of the available packages in existing software and programming platforms.

Sourish Das, Ph.D.


Associate Professor

Chennai Mathematical Institute

Webpage: https://www.cmi.ac.in/~sourish/

Session 5A: (09:30 AM – 11:30 AM) Bayesian Statistics and Machine Learning: Industrial Applications

Lecture Video: Click Here


The talk will be based on the article: https://link.springer.com/article/10.1007/s00180-020-00970-8

Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers are allowed to discover important features of input data sets which are often very large in size. The very task of feature discovery from data is essentially the meaning of the keyword ‘learning’ in SML. Theoretical justifications for the effectiveness of the SML algorithms are underpinned by sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings particularly justified by statistical inference methods are together termed as a statistical learning theory. This paper provides a review of SML from a Bayesian decision-theoretic point of view, where we argue that many SML techniques are closely connected to making inferences by using the so-called Bayesian paradigm. We discuss many important SML techniques such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes especially in the context of very large data sets where these are often employed. We present a dictionary which maps the key concepts of SML from Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets where we also discuss many practical implementation issues. Thus the review is especially targeted at statisticians and computer scientists who are aspiring to understand and apply SML for moderately large to big data sets. The lecture materials are uploaded to the ATAL Portal and are available to approved participants.

Workshop Coordinator: Dr. Amiya Ranjan Bhowmick

Assistant Professor

Department of Mathematics

Institute of Chemical Technology, Mumbai