Linear regression and gradient descent

Outline

  1. Getting started
  2. For loops and lists
  3. Gradient descent
  4. Refactor code to use matrices
  5. Multiple linear regression: predicting molecular solvation energies
  6. Multiple linear regression using Scikit: predicting molecular solubilities

Videos

  • Google Colab
  • Python intro
  • Pandas
  • Bias and weights
  • Loss function and L2 loss
  • Matplotlib
  • For loops
  • Lists
  • Gradient
  • Gradient descent
  • Epochs
  • Learning rate
  • Convergence
  • Refactoring
  • Matrices using Numpy
  • Uncommenting blocks of code
  • Random initialisation of weights
  • Downloading data using wget
  • More Pandas
  • The ESOL/Delaney data set
  • Scikit

Ideas for further coding projects

Linear Regression

  • Create a plot with 5 lines for 5 different choices of b and w
  • Create a plot with 5 lines using b and w obtained after 10, 50, 100, 500, and 1000 iterations
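
One possible starting point for the two plotting ideas above is sketched here. It is not the course's own code: the synthetic data, the learning rate of 0.05, the zero starting values, the mean-squared-error form of the L2 loss, and the function names gradient_descent_snapshots and plot_lines are all assumptions made for illustration.

```python
import numpy as np


def gradient_descent_snapshots(x, y, learning_rate=0.05,
                               checkpoints=(10, 50, 100, 500, 1000)):
    """Fit y ~ b + w*x by gradient descent on the mean squared error.

    Returns a dict mapping each checkpoint iteration to the (b, w) pair
    reached after that many updates.
    """
    b, w = 0.0, 0.0  # assumed starting values
    snapshots = {}
    for iteration in range(1, max(checkpoints) + 1):
        error = (b + w * x) - y
        grad_b = 2.0 * error.mean()        # d(MSE)/db
        grad_w = 2.0 * (error * x).mean()  # d(MSE)/dw
        b -= learning_rate * grad_b
        w -= learning_rate * grad_w
        if iteration in checkpoints:
            snapshots[iteration] = (b, w)
    return snapshots


def plot_lines(x, y, lines):
    """Scatter the data and draw one line per (label -> (b, w)) entry."""
    import matplotlib.pyplot as plt  # imported here so the fit runs without it

    plt.scatter(x, y, color="grey", label="data")
    xs = np.array([x.min(), x.max()])
    for label, (b, w) in lines.items():
        plt.plot(xs, b + w * xs, label=str(label))
    plt.xlabel("x")
    plt.ylabel("y")
    plt.legend()
    plt.show()


# Synthetic data (an assumption): a noisy line with b = 1 and w = 3.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=50)
y = 1.0 + 3.0 * x + rng.normal(0.0, 0.1, size=50)

snapshots = gradient_descent_snapshots(x, y)
```

Calling plot_lines(x, y, snapshots) draws the five lines for the second idea. For the first idea, the same function can be given any hand-picked dictionary, for example {"b=0, w=1": (0.0, 1.0), "b=1, w=3": (1.0, 3.0)}.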

Multiple Linear Regression

  • Try using only some of the molecular descriptors to see if you get a lower error for the predicted solubilities
  • Find the molecules with the 5 largest errors. Google their names to find the structures. Do they have anything in common?