This is the webpage for the course 'Statistical Applications with R', offered in PGP Term IV, AY 2025 - 26.
Advanced R, by Hadley Wickham.
R for Data Science, by Hadley Wickham.
An Introduction to Statistical Learning, with applications in R, by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani.
Sessions 1 - 2: Intro to R, Handling data types and data structures in R
Session 3: Data Visualization with ggplot2
Session 4: Data Transformation with dplyr
Exercises: Data Transformation [Solutions]
Session 5: Functions and conditional execution
Session 6: Dynamic reporting using RMarkdown and Quarto
Class Assignment - 1
Session 7: Linear Regression models [Advertising dataset]
Sessions 8 - 9: Multiple Linear Regression [Credit dataset]
Session 10: Practice Session [Datasets: Zillow_train, Zillow_test]
Problem Set on Regression
Session 11: Variable Selection in Regression: Ridge and LASSO
Sessions 12 - 13: Classification: Logistic Regression
Class Assignment - 2
Session 14: Classification: Linear and Quadratic Discriminant Analysis
Session 15: Classification: k-NN
Session 16: Principal Component Analysis [Market Basket Analysis dataset]
Session 17: Clustering: k-means, clustering with PCA
Session 18: Hierarchical clustering
Session 19: Class Assignment - 3
Session 20: Poster Presentation
Group Projects
A vital part of this course is the group project component.
Students have to form groups of 5 to 6 members. The group details need to be shared by 01 July, 2025, via Google Sheets.
Students are free to choose any statistical applications project to work on.
All the groups are required to submit a project proposal outlining the problem statement, goals, and related methods, plus data sources and/or other references. I shall review all the proposals and share my feedback.
The last date to submit project proposals is 15 July, 2025.
The student groups need to submit their final project reports via a Google form (to be shared in due time), along with the related R code, no later than the last day of the current term. Projects shall be evaluated based on their relevance, rigor, and approach. If I find it necessary, I shall meet the groups individually for a viva on their work.
Last date for final project submission: 02 September, 2025.
Poster presentations for projects: 25 August, 2025.
Group formation: 01 July, 2025
Proposal submission: 15 July, 2025
Poster presentation: 25 August, 2025
SAR 2024 Midterm [Dataset: world_data.csv]
SAR 2024 Endterm [Dataset: Airlines.csv, Banks.csv, Zillow_train, Zillow_test]