Statistics for Data Science I (Course ID: BSMA1002) is a foundational 4-credit course that introduces the core statistical tools and reasoning required for data analysis and interpretation. The course builds a strong conceptual base in descriptive and inferential statistics, probability theory, random variables, and distributions — forming the groundwork for advanced data science and machine learning techniques. Through theory, practical examples, and structured assessments, students learn to analyze, summarize, and draw valid conclusions from data.
Faculty: Prof. Usha Mohan, Department of Management Studies, IIT Madras
Duration: 12 weeks of coursework
Assessments: Weekly online assignments, two in-person invigilated quizzes, and one in-person invigilated end-term examination.
Week 1: Introduction and types of data; Descriptive vs. Inferential statistics; Scales of measurement
Week 2: Describing categorical data — frequency distributions, graphical representation, mode, and median (meaningful only for ordinal variables)
Week 3: Describing numerical data — frequency tables, measures of central tendency (mean, median, mode), dispersion (range, variance, standard deviation, IQR), five-number summary
Week 4: Association between two variables — contingency tables, scatterplots, covariance, Pearson and point-biserial correlation coefficients
Week 5: Basic counting principles — addition and multiplication rules, factorial concepts
Week 6: Permutations and combinations
Week 7: Fundamentals of probability — definitions, events, and properties
Week 8: Conditional probability — multiplication rule, independence, law of total probability, Bayes’ theorem
Week 9: Random variables — discrete and continuous types, PMF, and CDF
Week 10: Expectation and variance of discrete random variables
Week 11: Binomial and Poisson random variables — Bernoulli trials, i.i.d. assumptions, expectations, and variances
Week 12: Introduction to continuous random variables — area under the curve, PDF properties, uniform and exponential distributions
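A few of the listed topics can be sketched in code. The example below is only an illustration using Python's standard library, not course material: the sample values, the function names (bayes_posterior, binomial_pmf, discrete_expectation), and the screening-test numbers are all hypothetical, chosen to touch the Week 3, Week 8, Week 10, and Week 11 topics.

```python
import math
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Week 3: measures of central tendency and dispersion.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
pop_std = statistics.pstdev(data)


def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Week 8: P(A | B) from Bayes' theorem, with P(B) expanded
    by the law of total probability."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


def binomial_pmf(k, n, p):
    """Week 11: P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def discrete_expectation(pmf):
    """Week 10: E[X] for a PMF given as a {value: probability} dict."""
    return sum(x * px for x, px in pmf.items())


# Hypothetical screening test: 1% prevalence, 95% sensitivity,
# 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)

# PMF of the number of heads in 10 fair coin tosses.
coin_pmf = {k: binomial_pmf(k, n=10, p=0.5) for k in range(11)}

print("mean, median, mode:", mean, median, mode)
print("population variance and std:", pop_variance, pop_std)
print("posterior probability:", round(posterior, 4))
print("expected number of heads:", discrete_expectation(coin_pmf))
```

Computing the expectation from the PMF dictionary, rather than from the closed form n * p, keeps the sketch tied to the general definition of expectation for a discrete random variable; for the binomial case the two should agree.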