Data Science

Instructor:

Jagannath Aghav

Text Books:

  • Cathy O'Neil and Rachel Schutt, "Doing Data Science, Straight Talk From The Frontline," O'Reilly Publications, 408 pp, 2014
  • Jure Leskovec, Anand Rajaraman, and Jeffrey D Ullman, "Mining of Massive Datasets," Cambridge University Press, 2014
  • Ethem Alpaydin, “Introduction to Machine Learning,” 3rd ed, MIT Press, 640 pp, August 2014

Reference books:

  • Avrim Blum, John Hopcroft and Ravindran Kannan, "Foundations of Data Science," (Note: this is a book currently being written by the three authors. The authors have made the first draft of their notes for the book available online. The material is intended for a modern theoretical course in computer science.)
  • Thomas H. Davenport, Jeanne G. Harris and Robert Morison, “Analytics at Work: Smarter Decisions, Better Results”, Harvard Business Press, 2010
  • Trevor Hastie, Robert Tibshirani, and Jerome Friedman, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction," 2nd ed, Springer Verlag, 2009

Course Outline:

1. Data Science: Introduction

2. Statistical Inference

3. Exploratory Data Analysis and the Data Science Process

4. Naive Bayes Algorithm

5. Extracting Meaning From Data

6. Recommendation Systems

7. Social Networks

8. Spam Filters

9. Data Visualization

10. Machine Learning Algorithms & Pipelining

11. Kaggle: Solutions and Process Participation

(Datasets, Big Data Challenges, Deep Learning, Python & R Packages, Regression)

Course Learning Outcomes:

Students will be able to:

  1. Compare data sets by understanding the importance of data science processes
  2. Analyze and implement the statistical descriptors on a chosen dataset
  3. Demonstrate case studies on social networks, recommender systems, and
  4. Investigate solutions to state-of-the-art problems/competitions.

Additional Links:

1. http://www.kdnuggets.com/2015/06/top-20-r-machine-learning-packages.html Machine learning and data science packages for R

2. http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ A visual introduction to state-of-the-art data science and machine learning