Classes

Here is a brief summary of the topics covered in class so far. Slides and code scripts are distributed after each lecture via the Google Groups mailing list. Refer to the home page for instructions on how to subscribe.

Course Classes

20/09/2021 -- Introduction to Fundamentals of Data Science. The first class introduces Data Science, its applications, main data tools, techniques and platforms, as well as pitfalls and promises; furthermore, it introduces machine learning.

23/09/2021 -- Introduction to Machine Learning and Computer Vision. The second class introduces Computer Vision, including basic terminology, digital image formation and the visual cues for 3D vision; furthermore, it introduces terminology for object detection and recognition as well as the related visual challenges.

27/09/2021 -- Linear filtering. Topics include: linear filtering, including convolutions, smoothing and sharpening; box and Gaussian filtering.

30/09/2021 -- Multi-scale Image Representations and Edge Detection. Topics include: separable box and Gaussian filters; template matching; image smoothing and resizing, aliasing, multi-scale Gaussian pyramid; edge detection, first derivative filters for 1D signals.

04/10/2021 -- Image gradients, Laplacians. Topics include: image gradient, magnitude and orientation; thinning, Canny edge detector; second-order image derivatives, the Laplacian operator and pyramid.

07/10/2021 -- Object recognition with color. Topics include: review of first and second derivatives in edge detection and the Laplacian pyramid; challenges in object recognition, image representation with histograms of colors, modes of variations and invariances.

11/10/2021 -- Performance evaluation. Topics include: review of how the image representation, training set and recognition process may address the object appearance variations; precision, recall, true positive rate, false positive rate, F1 measure and overall accuracy.
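
As an illustration of the evaluation measures listed above, the following minimal sketch computes them from binary labels. The function name `classification_metrics` and the toy label lists are hypothetical and are not taken from the course material.

```python
def classification_metrics(y_true, y_pred):
    """Compute basic binary-classification measures (label 1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against empty denominators by returning 0.0 in those cases.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # recall = true positive rate
    fpr = fp / (fp + tn) if fp + tn else 0.0     # false positive rate
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "fpr": fpr,
            "f1": f1, "accuracy": accuracy}


# Toy example (made-up labels): 3 TP, 1 FN, 1 FP, 3 TN.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels, precision, recall, F1 and accuracy all equal 0.75, and the false positive rate is 0.25.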

14/10/2021 -- Linear regression. Topics include: ROC and PR curves and the area under the curves; log-loss, Brier score and model calibration; introduction of univariate and multivariate linear regression; hypothesis and parametrization; cost function and minimization of the sum of squared errors.

18/10/2021 -- Gradient descent. Topics include: gradient descent; optimization of linear regression parameters with gradient descent; batch, mini-batch and stochastic gradient descent variants; gradient descent for multivariate linear regression; feature normalization by the mean and range or standard deviation; debugging the learning rate; features and polynomial regression, including the choice of features.

21/10/2021 -- The normal equation. Topics include: the design matrix and rewriting linear regression with matrix operations; derivation of the normal equation; comparison of the analytical closed-form solution for linear regression vs. gradient descent.
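
The comparison between the closed-form solution and gradient descent can be sketched with NumPy as follows. This is an illustrative example under stated assumptions: the synthetic data, the learning rate `alpha` and the iteration count are arbitrary choices, not values from the lecture.

```python
import numpy as np

# Synthetic, noiseless data from y = 2 + 3x (an assumption for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 2.0 + 3.0 * x

# Design matrix: a column of ones (intercept) followed by the feature.
X = np.column_stack([np.ones_like(x), x])

# Normal equation: solve (X^T X) theta = X^T y without forming an inverse.
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the mean squared error.
theta_gd = np.zeros(2)
alpha = 0.1  # learning rate (hand-picked for this toy problem)
for _ in range(5000):
    grad = X.T @ (X @ theta_gd - y) / len(y)
    theta_gd -= alpha * grad

print("normal equation: ", theta_ne)
print("gradient descent:", theta_gd)
```

Both approaches should recover parameters close to (2, 3) here. The normal equation needs no learning rate or iterations, while gradient descent avoids solving a linear system, which matters when the number of features is large.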

25/10/2021 -- Probability review and probabilistic interpretation of least squares. Topics include: MSE and correlation; review of conditional probability and Bayes' rule, conditioned Bayes' rule and chain rule; locally-weighted regression; probabilistic interpretation of least squares.

28/10/2021 -- Logistic regression. Topics include: more review of probability theory including independence, continuous and discrete random variables, expectation, as well as the case of multiple random variables; introduction of classification, the sigmoid or logistic function, probabilistic interpretation of the hypothesis, decision boundaries, polynomial features.

04/11/2021 -- MLE for logistic regression. Topics include: maximum likelihood estimate and the maximization of the log likelihood of the parameters with gradient ascent, as well as with Newton's method.

08/11/2021 -- Generalized Linear Model. Topics include: general formulation of the exponential family of distributions as well as the specific cases of the Gaussian and Bernoulli distributions, with references to least squares and logistic regression; the generalized linear model and its application to ordinary least squares and logistic regression.

18/11/2021 -- Generalized Linear Models and Multinomial Distributions. Topics include: multinomial distributions in relation to the exponential family of distributions, the softmax function, softmax regression and cross-entropy minimization.
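
A minimal sketch of the softmax function and the cross-entropy of a single example is given below. The helper names `softmax` and `cross_entropy` and the example scores are hypothetical; subtracting the maximum score before exponentiating is a common stabilization trick and is an addition here, not necessarily the formulation used in class.

```python
import numpy as np


def softmax(z):
    """Map a vector of scores to probabilities that sum to 1."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max()  # leaves the result unchanged, avoids overflow in exp
    e = np.exp(shifted)
    return e / e.sum()


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -np.log(probs[label])


p = softmax([1.0, 2.0, 3.0])           # the highest score gets the highest probability
loss = cross_entropy(p, label=2)       # small loss: class 2 is the most likely
p_large = softmax([1000.0, 1000.0])    # would overflow without the shift
print(p, loss, p_large)
```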

22/11/2021 -- Gaussian Discriminant Analysis (GDA). Topics include: introduction to generative learning algorithms; review of multivariate Gaussian distribution; assumptions, model and MLE parameters for GDA; predicting with a generative model and with the GDA model in particular; relation between logistic regression and GDA.

25/11/2021 -- Naïve Bayes Classification. Topics include: representation of emails via a multivariate Bernoulli model; the naïve Bayes assumption; the naïve Bayes classification model and MLE parameters.

29/11/2021 -- More on Naïve Bayes: Multinomial Event Model. Topics include: digit classification with Naïve Bayes; generalization of Naïve Bayes to multinomial event model; Laplace Smoothing.

02/12/2021 -- Review of Naïve Bayes and Numerical Stability. Topics include: review of email representations with Multivariate Bernoulli and Multinomial Naïve Bayes, and Laplace smoothing; sample application of the multinomial event model; notes on underflow and numerical stability.
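
The underflow issue can be illustrated with a short sketch: multiplying many small per-word probabilities collapses to zero in floating point, while summing their logarithms stays representable. The probability values and class names below are made up for illustration.

```python
import math

# Hypothetical per-word likelihoods for a 100-word email under two classes.
probs_spam = [1e-5] * 100
probs_ham = [2e-5] * 100


def naive_product(probs):
    """Multiply probabilities directly (prone to underflow)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def log_score(probs):
    """Sum log-probabilities instead of multiplying probabilities."""
    return sum(math.log(p) for p in probs)


# Both products underflow to 0.0, so the two classes cannot be compared.
product_spam = naive_product(probs_spam)
product_ham = naive_product(probs_ham)

# In log space the scores remain finite and comparable.
log_spam = log_score(probs_spam)
log_ham = log_score(probs_ham)
print(product_spam, product_ham, log_spam, log_ham)
```

Because the logarithm is monotonic, choosing the class with the larger log score gives the same decision as comparing the exact products would.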

06/12/2021 -- Project presentations. Students delivered the first presentations about their projects. The session included discussion and feedback.

09/12/2021 -- Bias/variance. Topics include: forms of bias including selection bias, publication bias, non-response bias, length bias; underfitting and overfitting, high bias and high variance; the data view and the parameter view of bias and variance.

13/12/2021 -- More on bias, variance and regularization. Topics include: generalization and empirical risk, approximation and estimation error, bias-variance tradeoff; regularization and its probabilistic interpretation; hold-out cross-validation.

16/12/2021 -- Final Project Presentations (Part 1). Students delivered their final project presentations.

18/12/2021 -- Final Project Presentations (Part 2). Students delivered their final project presentations.

23/12/2021 -- Regularization and feature selection. Topics include: more on regularization and gradient descent with regularization; regularization of generative and discriminative techniques; discussion on data splits; feature selection.

Lab Classes

24/09/2021 -- Introduction to Python. The first lab lecture introduces Python programming and JupyterLab. Topics from the lecture include: Python as a dynamically-typed and interpreted language; data types (integers, floats, strings, tuples and lists); conditional statements with if-elif-else; for loops; slicing; list comprehension.

01/10/2021 -- More on lists, mutability and vector algebra. Topics include: Python zip(); dictionaries; mutability; user input; while loops; functions, libraries and user-defined modules; vector add, sum and multiply.

08/10/2021 -- Matrices, Vectors and Files. Topics include: review of vector operations of sum, dot product, scalar product, multiplication, magnitude and distance; matrix operations including creation, initialization, row extraction, column extraction; dense and sparse matrices; file handling including reading, writing, seeking, file types and access modes.

15/10/2021 -- Introduction to NumPy. Topics include: differences between NumPy arrays and standard Python lists/arrays; array creation and initialization; definition of the array type; array attributes; array accessing and slicing; multi-dimensional arrays, and accessing/slicing on them; first operations on arrays: reshape, concatenation and splitting.

22/10/2021 -- NumPy array operations and ufuncs. Topics include: vectorization and ufuncs; array arithmetic with ufuncs and other NumPy ufuncs; ufunc features including output, aggregates with reduce and accumulate, and outer; aggregation statistics for NumPy arrays, including nan-safe operations; broadcasting.
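
A brief sketch of the ufunc features, nan-safe aggregation and broadcasting mentioned above follows. The arrays are toy values chosen for illustration and do not come from the lab material.

```python
import numpy as np

a = np.arange(1, 5)  # array([1, 2, 3, 4])

# ufunc aggregates: reduce collapses the array, accumulate keeps partial results.
total = np.add.reduce(a)          # 1 + 2 + 3 + 4
running = np.add.accumulate(a)    # running sums
table = np.multiply.outer(a, a)   # 4x4 multiplication table

# The `out` argument writes the result into an existing array.
doubled = np.empty(4)
np.multiply(a, 2, out=doubled)

# nan-safe aggregation ignores missing values; the plain mean would return nan.
data = np.array([1.0, np.nan, 3.0])
safe_mean = np.nanmean(data)

# Broadcasting: a (2, 2) matrix minus a (2,) vector of column means.
M = np.array([[1.0, 2.0], [3.0, 6.0]])
centered = M - M.mean(axis=0)

print(total, running, safe_mean)
print(table)
print(centered)
```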

29/10/2021 -- Boolean arrays, masks and structured arrays. Topics include: Boolean arrays and their use as masks, Boolean logic; fancy and combined indexing; structured arrays.

05/11/2021 -- Introduction to Pandas. Topics include: general introduction to Pandas; Pandas objects: Series, DataFrame and Index; data indexing and selection.

19/11/2021 -- More on Pandas. Topics include: unary and binary operations in Pandas, ufuncs and index alignment; hierarchical indexing and the Pandas MultiIndex.

26/11/2021 -- Combining datasets in Pandas. Topics include: slicing with MultiIndex; combining datasets with concat, append, merge and joins; one-to-one, one-to-many and many-to-many joins; specification of the merge key.
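
As a hedged illustration of concatenation and a one-to-many join, consider the sketch below. The tables and column names are invented for the example. Note that `DataFrame.append` was removed in pandas 2.0, so the sketch uses `pd.concat` for stacking rows.

```python
import pandas as pd

# Hypothetical tables: each group has one supervisor but several employees.
employees = pd.DataFrame({
    "employee": ["Ann", "Bob", "Cid"],
    "group": ["Eng", "Eng", "HR"],
})
supervisors = pd.DataFrame({
    "group": ["Eng", "HR"],
    "supervisor": ["Guido", "Steve"],
})

# One-to-many join: the supervisor row is repeated for every matching employee.
# The merge key is given explicitly with `on`.
merged = pd.merge(employees, supervisors, on="group")

# Stacking rows from two tables; ignore_index rebuilds a clean 0..n-1 index.
new_hires = pd.DataFrame({"employee": ["Dee"], "group": ["HR"]})
stacked = pd.concat([employees, new_hires], ignore_index=True)

print(merged)
print(stacked)
```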

03/12/2021 -- Visualization with Matplotlib. Topics include: introduction to Matplotlib; the use of plt.show; MATLAB-style vs. object-oriented interface; line plots, scatter plots, bar plots, errorbar plots; plt.annotate; plotting from Python data structures and plotting from Pandas DataFrames.

10/12/2021 -- Machine Learning with Scikit-Learn. Topics include: data representation; estimator API; example applications of supervised and unsupervised machine learning algorithms with Scikit-Learn; hyperparameters and model validation; cross-validation; leave-one-out validation.