CIS 419/519 Applied Machine Learning (Spring 2021)
Course Description
Machine learning has been essential to the success of many recent technologies, including autonomous vehicles, search engines, genomics, automated medical diagnosis, image recognition, and social network analysis. This course introduces the fundamental concepts and algorithms that enable computers to learn from experience, with an emphasis on their practical application to real problems. It covers supervised learning (decision trees, logistic regression, support vector machines, neural networks, and deep learning), unsupervised learning (clustering, dimensionality reduction), and reinforcement learning.
Recommended Background
Programming skills. We will use Python throughout the course. While we will help you pick up Python, if you are not confident of your coding skills in any language at all, be warned that homework for this class could be quite difficult.
Introductory probability and statistics, calculus, linear algebra. We will provide primer/refresher documents for these topics as required, but we assume that you already have some familiarity with them.
Class Format
Instructor: Dinesh Jayaraman (dineshj [at] seas.upenn.edu)
We will follow a flipped classroom model for this course. Lectures will be pre-recorded, and optional live sessions (also recorded) will be used for Q&A and problem solving.
Additionally, students will be organized into cohorts that will meet each week for group and one-on-one office hours.
The meeting link will be sent by email if you are already in the class or on the waitlist. Hours will be announced soon.
Student Communication and Class Links
Canvas (for lectures, quizzes): https://canvas.upenn.edu/courses/1570554 (instructor-added)
Piazza (for class and cohort discussion): The system is highly catered to getting you help fast and efficiently from classmates, the TAs, and myself. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza. If you have any problems or feedback for the developers, email team@piazza.com. Find our class signup link at: https://piazza.com/upenn/spring2021/srs_cis5194012021a (access code in class)
Gradescope (for work submission): https://www.gradescope.com/courses/223932 (access code in class)
Zoom meeting links will be communicated on Piazza
Schedule
Grading Scheme
Weekly lecture-quizzes: 10%
Homeworks (random teams of 2, 5 times): 30%*
Piazza discussion participation: 5%
Cohort attendance: 5%
Final Exam: 25%
Project (teams of 2 or 3): 25%*
*: If you're taking the undergraduate version of this course (CIS 419), then you will be evaluated differently on your homeworks and projects, with possibly optional components. Note that since the two versions have different requirements, you cannot complete the course as CIS 419 and petition afterwards to have it changed to CIS 519 for graduate credit.
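As a rough illustration of how the weights above add up, here is a minimal Python sketch. It assumes each component is scored as a percentage (0-100) and that components are combined by a plain weighted average; the syllabus does not state the exact formula, and the function name `weighted_grade` and the sample scores are hypothetical.

```python
# Weights copied from the grading scheme above (percent of the final grade).
WEIGHTS = {
    "quizzes": 10,
    "homeworks": 30,
    "piazza": 5,
    "cohort": 5,
    "final_exam": 25,
    "project": 25,
}


def weighted_grade(scores):
    """Combine per-component percent scores (0-100) into one overall percent.

    Assumption: a plain weighted average; the syllabus does not specify this.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical student scores, for illustration only.
example_scores = {
    "quizzes": 90,
    "homeworks": 80,
    "piazza": 100,
    "cohort": 100,
    "final_exam": 70,
    "project": 85,
}
overall = weighted_grade(example_scores)
print(f"{overall:.2f}")  # prints 81.75
```

Under these assumptions, the final exam and the project each move the overall grade five times as much as Piazza participation or cohort attendance does.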
Comparison to CIS 520
Due to overwhelming demand, Penn CIS offers two different introductory machine learning courses: CIS 419/519 (Applied Machine Learning) and CIS 520 (Machine Learning). This section briefly describes the differences between these courses.
CIS 419/519 Applied Machine Learning (this course!) is an introductory-level course in machine learning (ML) with an emphasis on applying ML techniques. The course is cross-listed between undergraduate (419) and graduate (519) versions; the graduate course 519 has somewhat different requirements as described above. CIS 419/519 is intended for students who are interested in the practical application of existing machine learning methods to real problems, rather than in the statistical foundations and theory of ML covered in CIS 520 Machine Learning. CIS 419/519 will cover some of the foundations of ML, but is intended to be less mathematically rigorous than CIS 520; this does not necessarily mean that it is "easier".
CIS 519 is NOT a prerequisite for CIS 520. However, it makes little sense to take CIS 519 after having already taken CIS 520. It also makes little sense, but is possible, to take CIS 419/519 first and then later take CIS 520.
Essentially, you should take CIS 419/519 if:
You're more interested in applying existing machine learning algorithms to new problems, or
You don't feel that you have the mathematical background to proceed directly into CIS 520.
And, you should take CIS 520 if you're confident in your mathematical background and:
You're interested in pursuing research in machine learning, i.e., developing new ML algorithms, or
You're more interested in the statistical foundations and theory of machine learning methods.
Team
Dinesh Jayaraman (instructor)
Aditya M. Kashyap
Chang Liu
Halley Young
Hanwen Zhang
Hongji Yuan
Jun Wang
Kong Yao Chee
Kyle C Vedder
Nipun Bhanot
Pooja Consul
Shubham Gupta
Siyuan Tian
Ty Nguyen
Yang Yan
Resources
EMAB Tutoring
Penn's Engineering Master’s Advisory Board (EMAB) has announced a Tutoring Program in Machine Learning for master’s students. The program is a resource to help students strengthen their skills in the area. Students can drop in at any session whenever they need help; no commitment is required and it is free of charge. Interested students can learn more at https://pennemab.weebly.com/tutoring.html
Textbook for Mathematics Background (Probability, Calculus, Linear Algebra)
Deisenroth, Marc Peter, A. Aldo Faisal, and Cheng Soon Ong. 2020. Mathematics for Machine Learning. Cambridge University Press.
Probability Resources
Linear Algebra Resources
3Blue1Brown's YouTube series on the Essence of Linear Algebra, with some superb visualizations
Ali Jadbabaie's Linear Algebra refresher with lots of solved examples, for ESE 504
Python Resources
A Couple of Excellent Resources for Hands-On Machine Learning Through Interactive iPython Notebooks
Other Useful Textbooks, Courses, and Lecture Notes
Reinforcement Learning: An Introduction by Sutton and Barto, MIT Press, 1998. (Full text available online; on reserve in Penn library)
Machine Learning by Tom Mitchell, McGraw Hill, 1997. (On reserve in Penn library)
A Course in Machine Learning by Hal Daumé III.
Machine Learning Lecture Notes by Andrew Ng.
Machine Learning for Intelligent Systems by Kilian Weinberger.
For a more advanced treatment of machine learning topics, I would recommend one of the following books (all freely available online):
Pattern Recognition and Machine Learning by Bishop, Springer, 2006.
Machine Learning: A Probabilistic Perspective by Kevin P. Murphy, MIT Press, 2012.
The Elements of Statistical Learning 2nd edition by Hastie, Tibshirani and Friedman, Springer-Verlag, 2008.
Convex Optimization by Stephen Boyd and Lieven Vandenberghe, Cambridge University Press, 2004.
Information Theory, Inference, and Learning Algorithms by David MacKay, Cambridge University Press, 2003.
Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Some Useful Articles
Online Machine Learning Communities
Research Conferences (nearly all freely available proceedings)
Preprint Servers
Code for published papers
Software
We will be using the following software throughout the course:
Python: we'll use Python throughout the course to implement various ML algorithms and run experiments
Google Developer Python Tutorial (highly recommended as a way to master Python in just a few hours!)
NumPy Tutorial (also highly recommended!)
Python tutorial (work at least through section 5; skip sections 2, 3.1.3)
scikit-learn: machine learning in Python
PyTorch: deep learning library