Mathematical Foundations of Machine Learning gives a rigorous introduction to the mathematical foundations of machine learning, including frequently used tools from linear algebra, calculus, and probability, as well as widely applied methods such as linear regression and logistic regression. In addition, this course provides hands-on training on implementing these algorithms in Python. Students will be trained to use popular Python libraries such as NumPy, SciPy, and Matplotlib.
Upon completion of this course, you should be able to:
apply basic concepts in linear algebra to problems in machine learning
apply principal component analysis to analyze high-dimensional data
apply gradient descent to solve general optimization problems
apply linear regression to solve real-world problems
use NumPy to solve machine learning problems such as recommender systems
In machine learning, we represent numerical data as vectors. This week, you will learn how to express vectors in multidimensional spaces, their properties, and basic operations such as addition and scalar multiplication. Then, you will see how to combine these basic operations to create linear combinations of vectors. You will also learn about properties and basic operations of matrices, such as addition and scalar multiplication, and one of the most important operations on matrices: matrix multiplication. Finally, you will learn how to use the popular scientific computing library NumPy to express vectors and matrices and implement their operations in the Python programming language.
Learning Objectives
Perform common operations on vectors like sum, difference, scalar multiplication, magnitude, and dot product.
Represent linear combinations by combining vector addition and scalar multiplication.
Perform common operations on matrices like sum, difference, and scalar multiplication.
Understand the difference between a diagonal, identity, and symmetric matrix.
Multiply a matrix with a vector or with another matrix.
Use the NumPy library to implement vector and matrix operations in Python.
Materials: Slides, Lab, Homework.
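As a small preview of the lab, here is a minimal NumPy sketch of these operations; the vectors and matrices are made up purely for illustration:

```python
import numpy as np

# Vectors: sum, scalar multiplication, magnitude, and dot product
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(u + v)              # element-wise sum
print(2 * u)              # scalar multiplication
print(np.linalg.norm(u))  # magnitude (Euclidean length)
print(np.dot(u, v))       # dot product

# A linear combination: 2u + 3v
print(2 * u + 3 * v)

# Matrices: sum, scalar multiplication, matrix-vector and matrix-matrix products
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.eye(2)                     # the 2x2 identity matrix
print(A + B)
print(A @ np.array([1.0, 1.0]))   # matrix-vector product
print(A @ B)                      # matrix-matrix product
```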
This week you will learn how matrices naturally arise from systems of equations and how to solve a system of linear equations using the elimination method. You will also learn how to compute the determinant of a matrix, its properties, and its applications. Finally, you will learn how to compute the inverse of a matrix and its relationship to the determinant.
Learning Objectives
Form and graphically interpret 2x2 and 3x3 systems of linear equations.
Solve a system of linear equations with multiple unknowns using the elimination method.
Compute the determinant of a square matrix and understand its relation with the concept of invertibility.
Calculate the inverse of a matrix, if it exists.
Use NumPy's linear algebra package to solve a system of linear equations, and to compute the determinant and the inverse of a matrix.
Materials: Slides, Lab, Homework.
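For a taste of the lab, here is a minimal sketch using NumPy's linear algebra package on an illustrative 3x3 system:

```python
import numpy as np

# An illustrative 3x3 system Ax = b
A = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
b = np.array([8.0, -11.0, -3.0])

x = np.linalg.solve(A, b)     # solve the system directly
print(x)                      # -> [ 2.  3. -1.]

d = np.linalg.det(A)          # determinant; nonzero means A is invertible
print(d)                      # -> -1.0 (up to floating-point error)

A_inv = np.linalg.inv(A)      # inverse exists since det(A) != 0
print(A_inv @ b)              # same solution as np.linalg.solve(A, b)
```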
In this module, you will learn about vector spaces and subspaces and how to compute the four fundamental subspaces of any matrix. You will also learn how to derive the complete solution of a system of linear equations, if it exists. Then, you will learn the concept of linear independence and how to use it to compute the rank of a matrix. Finally, you will understand the concept of a basis by combining the ideas of vector spaces, the span of a set of vectors, and linear independence.
Learning Objectives
Compute the four fundamental spaces of a matrix.
Derive the complete solution of a system of linear equations with multiple unknowns, if it exists.
Understand whether a set of vectors is linearly independent.
Understand whether a vector is in the span of a set of vectors.
Change the representation of a vector from one basis to another.
Learn pandas, the most popular Python library for data analysis.
Materials: Slides, Lab, Homework.
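As a preview, here is a minimal sketch of the rank and null-space computations, assuming NumPy and SciPy are available; the matrix is an illustrative rank-deficient example:

```python
import numpy as np
from scipy.linalg import null_space

# An illustrative rank-deficient matrix: the third column equals col1 + col2
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

print(np.linalg.matrix_rank(A))   # -> 2, fewer than the 3 columns

# A basis for the null space: the solutions of Ax = 0
N = null_space(A)
print(N)                          # one basis vector, proportional to [1, 1, -1]

# A particular solution of Ax = b (least squares works even when A is singular);
# the complete solution is x_p plus any multiple of the null-space vector
b = np.array([1.0, 1.0, 2.0])
x_p, *_ = np.linalg.lstsq(A, b, rcond=None)
print(A @ x_p)                    # -> approximately b, so b is in the column space
```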
This week, you will learn about orthogonal vectors and subspaces. You will also learn about projections and how to project vectors onto subspaces spanned by the columns of a matrix. Then, you will learn how the concept of orthogonal unit vectors (orthonormal vectors) makes working with projections significantly less computationally expensive. You will also learn a systematic approach, the Gram-Schmidt process, for converting any set of independent vectors into a set of orthonormal vectors. Finally, you will learn how to use the idea of projections to solve Linear Regression, one of the most fundamental problems in Machine Learning.
Learning Objectives
Check whether two vectors or subspaces are orthogonal.
Compute the projection of a vector onto a subspace spanned by the columns of a matrix.
Analytically solve Linear Regression using the Least Squares Approximation method.
Understand how orthonormal vectors significantly simplify computations while working with projections.
Convert a set of independent vectors into a set of orthonormal vectors using the Gram-Schmidt process.
Materials: Slides, Lab, Homework.
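Here is a minimal NumPy sketch of projection and least squares on a small illustrative matrix; NumPy's QR factorization stands in here for Gram-Schmidt-style orthogonalization:

```python
import numpy as np

# Project b onto the column space of A via the normal equations:
# p = A (A^T A)^{-1} A^T b  (columns of A assumed independent; values illustrative)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)  # least-squares coefficients
p = A @ x_hat                              # projection of b onto C(A)
e = b - p                                  # error, orthogonal to the columns of A
print(x_hat, p)                            # -> [ 5. -3.] and [ 5.  2. -1.]
print(A.T @ e)                             # -> approximately [0, 0]

# With orthonormal columns Q, the projection simplifies to Q Q^T b
Q, R = np.linalg.qr(A)
print(Q @ (Q.T @ b))                       # same projection p
```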
In this module, you will learn about linear transformations and how to represent a system of linear equations as a linear transformation applied to a vector. You will meet special vectors called eigenvectors, which do not change direction under the transformation. You will also see how the concept of eigenvectors is used for dimensionality reduction in machine learning problems such as image compression and recommender systems.
Learning Objectives
Represent a system of linear equations as a linear transformation on a vector.
Compute the eigenvectors and eigenvalues of a matrix.
Represent a matrix in terms of three interpretable matrices using Singular Value Decomposition (SVD).
Understand how to use SVD to build a recommender system.
Implement an image compression program in Python using SVD.
Materials: Slides, Lab, Homework.
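As a preview of the lab, here is a minimal NumPy sketch of eigendecomposition, SVD, and a low-rank approximation; the matrices are small illustrative examples:

```python
import numpy as np

# Eigenvalues and eigenvectors of a square matrix
M = np.array([[4.0, 1.0],
              [2.0, 3.0]])
vals, vecs = np.linalg.eig(M)
print(vals)                                  # eigenvalues 5 and 2
print(M @ vecs[:, 0], vals[0] * vecs[:, 0])  # checks M v = lambda v

# SVD factors any matrix into U, Sigma, V^T
A = np.array([[ 3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U @ np.diag(s) @ Vt)                   # reconstructs A

# Keeping only the top singular value gives the best rank-1 approximation;
# this same idea underlies SVD-based image compression and recommender systems
k = 1
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(A_k)
```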
Nowadays, in machine learning we have to deal with multidimensional data. High-dimensional data is hard to analyze and interpret, and impossible to visualize directly. Moreover, it is very computationally expensive to work with data that lies in high-dimensional spaces. However, most of the time, high-dimensional data lies close to lower-dimensional subspaces. In this module, we will learn a powerful method called Principal Component Analysis that can help us find these lower-dimensional spaces in which our data lies.
Learning Objectives
Compute basic statistics such as mean, variance and covariance.
Perform Principal Component Analysis (PCA) to reduce the dimensionality of high-dimensional data.
Use Python to visualize a multidimensional dataset in a two-dimensional space using PCA.
Materials: Slides, Lab, Homework.
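Here is a minimal sketch of PCA from scratch with NumPy, run on synthetic data generated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 200 points in 3-D that lie close to a 2-D plane
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 3)) \
    + 0.01 * rng.normal(size=(200, 3))

# 1. Center the data
Xc = X - X.mean(axis=0)

# 2. Covariance matrix and its eigendecomposition
C = np.cov(Xc, rowvar=False)
vals, vecs = np.linalg.eigh(C)       # eigh, since C is symmetric
order = np.argsort(vals)[::-1]       # sort by decreasing variance
vals, vecs = vals[order], vecs[:, order]

# 3. Project onto the top-2 principal components
Z = Xc @ vecs[:, :2]
print(Z.shape)                       # -> (200, 2), ready to scatter-plot
print(vals / vals.sum())             # explained-variance ratios; the last is ~0
```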
This week, you will delve into the fundamental concept of calculus: derivatives. This essential mathematical concept will be explored in detail, including how to calculate derivatives for basic mathematical functions like constants, linear functions, quadratic polynomials, exponentials, and logarithms. As we progress, we will also employ various rules such as the sum rule, product rule, chain rule, and scalar multiplication to determine derivatives of more complex functions. Now, you might wonder why derivatives and calculus play a pivotal role in machine learning. One compelling reason is their application in optimizing functions, that is, in maximizing or minimizing them. Optimization is of paramount importance in machine learning: finding the model that best fits your data amounts to defining a loss function and then minimizing it.
Learning Objectives
Compute derivatives of basic mathematical functions such as constants, linear functions, quadratic polynomials, exponentials, and logarithms.
Compute derivatives of more complex functions using basic differentiation rules.
Approximate derivatives of functions using numerical differentiation.
Analytically optimize different types of functions using properties of derivatives.
Solve different real-life optimization problems.
Compute derivatives of different functions in Python using the `SymPy` package.
Materials: Slides, Lab, Homework.
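As a preview, here is a minimal sketch of symbolic differentiation and analytic optimization with SymPy; the functions are illustrative:

```python
import sympy as sp

x = sp.symbols('x')

# Derivatives of basic functions
print(sp.diff(x**2, x))            # -> 2*x
print(sp.diff(sp.exp(x), x))       # -> exp(x)
print(sp.diff(sp.log(x), x))       # -> 1/x

# Product and chain rules are handled automatically
f = x**2 * sp.sin(x)
print(sp.diff(f, x))               # -> x**2*cos(x) + 2*x*sin(x)

# Analytic optimization: critical points of g(x) = x^2 - 4x + 1
g = x**2 - 4*x + 1
critical = sp.solve(sp.diff(g, x), x)
print(critical)                    # -> [2]; g''(2) = 2 > 0, so it is a minimum
```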
We've been exploring ways to solve challenging optimization problems using derivatives and gradients. As we delve into more complex problems, especially those with multiple variables, things can become quite complicated. During this week, you will learn about a highly efficient, step-by-step approach called "gradient descent" that simplifies the process of optimizing functions. Then, you will learn how to implement gradient descent with simple single-variable functions featuring either one or multiple minima in Python. Finally, you will learn how to optimize functions with two variables by applying gradient descent to solve real-world Linear Regression problems with actual data.
Learning Objectives
Compute partial derivatives of functions with more than one variable.
Compute the gradient of a function at any point.
Optimize functions using gradient descent.
Implement Gradient Descent in Python to solve the Linear Regression Problem.
Materials: Slides, Lab, Homework.
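Here is a minimal sketch of gradient descent fitting a one-variable linear regression to synthetic data; the learning rate and iteration count are illustrative choices:

```python
import numpy as np

# Synthetic data: y ~ 3x + 2 plus noise
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=100)

w, b = 0.0, 0.0        # parameters to learn
lr = 0.01              # learning rate (step size)
n = len(x)

for _ in range(2000):
    y_hat = w * x + b
    # Gradients of the mean-squared-error loss with respect to w and b
    grad_w = (2.0 / n) * np.sum((y_hat - y) * x)
    grad_b = (2.0 / n) * np.sum(y_hat - y)
    # Step downhill, against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)            # -> roughly 3 and 2
```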
Probability theory is like a toolbox for understanding chance and randomness in many areas, such as statistics, finance, engineering, and machine learning. In this lecture, we'll start with the basics: what we mean by sample spaces, events, and how we measure probability. Then, we'll learn some rules for figuring out the chances of things happening, and practice solving probability problems in different situations.
Learning Objectives
understand the difference between an experiment, a sample space, and an event.
learn the basic axioms of probability
compute the probability of events in multiple scenarios
apply concepts like conditional probability, the Law of Total Probability, and Bayes' Rule.
Materials: Slides, Lab, Homework.
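To make Bayes' rule concrete, here is a small worked example in Python; all of the probabilities are made up for illustration:

```python
# A hypothetical test for a condition that affects 1% of a population,
# with a 95% true-positive rate and a 5% false-positive rate.
# Question: what is P(condition | positive test)?

p_c = 0.01                 # P(condition)
p_pos_given_c = 0.95       # P(positive | condition)
p_pos_given_not_c = 0.05   # P(positive | no condition)

# Law of Total Probability: P(positive)
p_pos = p_pos_given_c * p_c + p_pos_given_not_c * (1 - p_c)

# Bayes' rule: P(condition | positive)
p_c_given_pos = p_pos_given_c * p_c / p_pos
print(p_c_given_pos)       # -> about 0.16, far lower than many people guess
```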
This week, we're diving into Random Variables and Probability Distributions. We'll look at two kinds of random variables: Discrete and Continuous Random Variables. We'll figure out how to compute the expected value and the variance for each of them. Then, we'll learn about things like independence, conditional probability, and Bayes' rule for pairs of these random variables. Lastly, we'll create probability distributions to understand the chances of different results and explore a famous one called the Normal Distribution, which looks like a bell-shaped curve and is super common in lots of areas.
Learning Objectives
understand the difference between a discrete and continuous random variable
compute the expected value and variance of random variables
apply conditional probability, the law of total probability, and Bayes' rule to a pair of random variables
understand what a probability distribution is
understand the basic properties of Normal Distributions
Materials: Slides, Lab, Homework.
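As a preview, here is a minimal sketch computing the expected value and variance of a discrete random variable, and querying the Normal distribution with SciPy; the die example is illustrative:

```python
import numpy as np
from scipy.stats import norm

# Discrete random variable: a fair six-sided die
values = np.arange(1, 7)
probs = np.full(6, 1 / 6)
mean = np.sum(values * probs)                    # E[X] = 3.5
var = np.sum((values - mean) ** 2 * probs)       # Var(X) ~ 2.92
print(mean, var)

# Continuous random variable: the standard Normal distribution N(0, 1)
print(norm.pdf(0.0))               # density at the mean of the bell curve
print(norm.cdf(1.96))              # P(X <= 1.96), about 0.975
print(norm.cdf(1) - norm.cdf(-1))  # about 0.68: mass within one std. dev.
```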
This week, we're diving into the world of linear regression, one of the most important algorithms in Machine Learning that helps us understand and predict relationships between variables. After understanding the mathematical formulation of linear regression, we're getting hands-on! We'll learn how to build a linear regression model step by step using NumPy and then we'll also figure out how to use scikit-learn, a super handy machine learning package, to implement linear regression.
Learning Objectives
understand how linear regression works.
implement linear regression step by step using `NumPy`.
implement linear regression using `scikit-learn`.
Materials: Slides, Lab, Homework.
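Here is a minimal sketch of both approaches on synthetic data: the normal-equations solution built step by step with NumPy, and the same fit with scikit-learn's `LinearRegression`:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y ~ 3x + 2 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * x[:, 0] + 2.0 + rng.normal(scale=1.0, size=100)

# Step by step with NumPy: solve the normal equations A^T A w = A^T y,
# where A has a column of ones for the intercept
A = np.hstack([np.ones_like(x), x])
w = np.linalg.solve(A.T @ A, A.T @ y)
print(w)                                   # -> roughly [2, 3]

# The same model with scikit-learn
model = LinearRegression().fit(x, y)
print(model.intercept_, model.coef_)       # -> roughly 2 and [3]
```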
This week, we will learn about logistic regression, a pivotal algorithm in Machine Learning used to model the probability of a binary outcome based on one or more predictor variables. Additionally, we'll harness the power of scikit-learn, a versatile machine learning library, to implement logistic regression effectively.
Learning Objectives
understand how logistic regression works.
implement logistic regression using `scikit-learn`.
Materials: Slides, Lab, Homework.
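As a preview, here is a minimal sketch fitting scikit-learn's `LogisticRegression` to synthetic binary data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary data: class 1 becomes more likely as x grows
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
p = 1 / (1 + np.exp(-2 * X[:, 0]))          # "true" sigmoid probabilities
y = (rng.uniform(size=200) < p).astype(int)

model = LogisticRegression().fit(X, y)
print(model.intercept_, model.coef_)        # learned sigmoid parameters
print(model.predict_proba([[1.0]]))         # [P(class 0), P(class 1)] at x = 1
print(model.predict([[1.0]]))               # predicted label at x = 1
```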