Textbook & References:
1. Probability and Statistics for Computer Science
by David Forsyth.
Springer International Publishing, 2018.
[https://link.springer.com/book/10.1007/978-3-319-64410-3]
2. An Introduction to Statistical Learning
by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, and Jonathan Taylor.
Springer International Publishing, 2023.
[https://link.springer.com/book/10.1007/978-3-031-38747-0]
See also the book's website:
[https://www.statlearning.com/]
3. Foundations of Data Science
by Avrim Blum, John Hopcroft, and Ravindran Kannan.
The 2020 version is accessible via the first author’s website:
[https://home.ttic.edu/~avrim/book.pdf]
Main Topics:
Statistics for Computer Science:
descriptive statistics, sampling methods, populations of data, significance of evidence, probability models from data.
Statistical learning models:
estimation vs. prediction, maximum likelihood method, bootstrap, regularization, maximum entropy method, ...
Foundations of (Massive and High-dimensional) Data Science:
essential properties of high-dimensional space, best-fit subspaces, singular value decomposition, algorithms for massive data.
Evaluation:
Class participation 15%, Homework & Project 50%,
Midterm and Final Exam 35%.