Workshop July 2018

This two-week intensive winter program will equip participants with deep theoretical foundations and expose them to the latest developments and cutting-edge research being conducted in the area of Data Science. There will be presentations/lectures on advanced background material not typically covered in undergraduate mathematics and statistics courses.

Date: 2 - 13 July 2018

Venue: AIMS South Africa, Muizenberg


Courses & Instructors:

  • Probability & Statistics for Data Science - Terence van Zyl (Wits)
  • Deep Learning - Emmanuel Darfouq (AIMS South Africa) & Bubacarr Bah (AIMS South Africa)
  • Network Analytics - Franck Kalala Mutombo (AIMS Senegal & University of Lubumbashi) & Bubacarr Bah (AIMS South Africa)


Course Outlines:

Probability & Statistics for Data Science

  • Exploratory Data Analysis:

Introduction to Exploratory Data Analysis and statistical distributions.

  • Discrete Distributions:

Understanding probability mass functions and cumulative distribution functions.

  • Modeling Distributions:

Modeling data using distributions.

  • Continuous Distributions:

Exploring probability density functions.

  • Multivariate Relationships:

Introduction to multivariate data and its visualization.

  • Estimation and Hypothesis Testing:

Theory of estimating parameters and hypothesis testing.

  • Linear Least Squares and Regression:

The use of least squares estimation for linear regression.

The free online textbook we will use is: http://greenteapress.com/thinkstats2/thinkstats2.pdf


Deep Learning (DL)

  • DL basics
  • Interesting applications
  • Intro to TensorFlow
  • Intro to TFLearn and PyTorch
  • NumPy Refresher
  • Training/theory
  • Sentiment analysis
  • CNNs
  • Weight initialisation
  • Autoencoders
  • Project: Dog breed classification
  • RNNs
  • LSTMs
  • Transfer learning
  • Word2Vec and Embeddings
  • Intro to reinforcement learning
  • Intro to generative adversarial networks


Network Analytics

  • The basic conceptual and mathematical formulation of networks
  • Basic metrics of networks (e.g. paths, components, degree distributions, etc.)
  • Centrality measures
  • General properties of real world networks
  • Models of networks
  • Dynamics of, and on, networks (e.g. percolation and resilience, growth, spreading, random walks, etc.)
  • Community detection
  • Social Network Analysis