Data science has become an important part of many industries, such as manufacturing, marketing, and finance. This course provides a broad introduction to Python programming for data science. Students will learn and practice implementing data science techniques in Python, including data preprocessing, supervised learning, unsupervised learning, and model selection and evaluation.
* Prerequisites: Python, Linear Algebra, Applied Statistics II, and Data Mining
** You must have taken the prerequisite courses (or equivalent) before taking this course.
Class Time: Mon 12:00~13:15, Wed 13:30~14:45
Location: 26421 (Engineering Building 2)
Language: Korean
Prof. Seokho Kang
Office: 27408B (Engineering Building 2)
E-mail: s.kang@skku.edu
Office Hours: by appointment
Ms. Sunbin Lee (for Exams), Mr. Eugene Yang and Mr. Yoon Hyung Lee (for Assignments)
Office: 27407 - Data Mining Lab. (Engineering Building 2)
E-mails: gozjd1234@naver.com, cneyang@g.skku.edu and lyhyung1013@naver.com
Andreas C. Müller & Sarah Guido, Introduction to Machine Learning with Python: A Guide for Data Scientists, O'Reilly Media, 2016.
Aurélien Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, O'Reilly Media, 2019.
Attendance (10%)
Assignments (20%)
Mid-term Exam (30%)
Final Exam (40%)
Total (100% + α)
Syllabus
Course Introduction
1. Supervised Learning - Part 1 (Overview)
1. Supervised Learning - Part 2 (k-NN)
1. Supervised Learning - Part 3 (Linear Models)
1. Supervised Learning - Part 4 (DT, Ensemble)
1. Supervised Learning - Part 5 (SVM)
1. Supervised Learning - Part 6 (ANN, Misc.)
2. Unsupervised Learning - Part 1 (Overview)
2. Unsupervised Learning - Part 2 (PCA)
2. Unsupervised Learning - Part 3 (t-SNE)
2. Unsupervised Learning - Part 4 (k-Means, Hierarchical)
2. Unsupervised Learning - Part 5 (DBSCAN, Misc.)
3. Representing Data and Engineering Features
4. Model Evaluation and Improvement
5. Algorithm Chains and Pipelines
6. ML Project Checklist
Special Topics in Data Science: Invited Talk by Hongdo Ki (Deep Learning Researcher, Cognex Deep Learning Lab.)
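For orientation only, the following is a minimal sketch of the kind of technique listed under Supervised Learning (k-NN) and Model Evaluation. It is not taken from the lecture notes or assignments. The function names `knn_predict` and `accuracy` and the toy data are hypothetical, and the sketch uses only the Python standard library, whereas the course textbooks work with scikit-learn.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
test_X = [(1.1, 0.9), (4.0, 4.0)]
test_y = ["a", "b"]

print(accuracy(train_X, train_y, test_X, test_y, k=3))
```

Keeping the test points separate from the training points mirrors the hold-out idea behind model evaluation: accuracy is measured on data the classifier has not seen.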
All assignments must be submitted to icampus (https://icampus.skku.edu/) by midnight of the due date. Late submissions will NOT be accepted.
[A0] Self-introduction (due date: 9/14): Introduce yourself in one page or less. Include the following: your name, student ID, personal history, academic interests, what you expect to learn from this course, suggestions or comments for the instructor, and anything else you would like to share.
[A1] Classification with k-NN (due date: 9/26)
[A2] Comparison of Supervised Learning Algorithms (due date: 10/17)
[A3] Hyperparameter Tuning for Neural Networks (due date: 10/31)
[A4] Data Visualization (due date: 11/14)
[A5] Feature Engineering (due date: 11/28)
[A6] AutoML (due date: 12/7)
Students are responsible for maintaining high standards of academic integrity in all of their class activities. Cheating or plagiarism in any form will not be tolerated. Any violation of academic integrity is a serious offense and is therefore subject to an appropriate sanction or penalty.