Intro to optimization in deep learning: Gradient Descent (Spring 2020)

Deep learning, to a large extent, is about solving large-scale optimization problems. A neural network is modeled by a very complicated function that may contain millions of parameters.

Training a neural network essentially means minimizing a loss function. The value of the loss function measures how far the network's output is from the target on a given data set.
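For concreteness, a common choice of loss is the mean squared error between the network's output and the target values. The short Python sketch below is a hypothetical illustration of this idea, not code used by the group.

    import numpy as np

    def mse_loss(predictions, targets):
        # Mean squared error: average squared distance between
        # the network's outputs and the desired targets.
        return np.mean((predictions - targets) ** 2)

    # Example: a small batch of outputs compared against targets.
    preds = np.array([0.9, 0.2, 0.4])
    targets = np.array([1.0, 0.0, 0.5])
    print(mse_loss(preds, targets))  # 0.02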

We aim to provide an introduction to gradient descent based optimization algorithms for training deep neural network models. Nowadays, most deep learning model training still relies on the back-propagation algorithm, in which the model parameters are updated iteratively by a gradient descent based optimizer until convergence is observed. Beyond conventional (vanilla) gradient descent, many new algorithms have been proposed recently to improve learning performance. In this research group we study gradient descent variants such as Momentum, Adagrad, Adam, and Gadam.
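To illustrate the kind of update rules involved, the Python sketch below compares vanilla gradient descent with the Momentum variant on a toy quadratic loss; the loss, learning rate, and momentum factor are hypothetical choices for demonstration only. Variants such as Adagrad and Adam follow the same pattern but additionally rescale each coordinate's step using accumulated squared gradients.

    import numpy as np

    def grad(w):
        # Gradient of the toy loss L(w) = ||w||^2 / 2.
        return w

    w_gd = np.array([1.0, -2.0])   # vanilla gradient descent iterate
    w_mom = np.array([1.0, -2.0])  # momentum iterate
    v = np.zeros_like(w_mom)       # momentum "velocity"
    lr, beta = 0.1, 0.9            # learning rate and momentum factor

    for _ in range(100):
        # Vanilla gradient descent: step against the gradient.
        w_gd = w_gd - lr * grad(w_gd)
        # Momentum: accumulate an exponentially decaying sum of past gradients
        # and step along that direction instead of the raw gradient.
        v = beta * v + grad(w_mom)
        w_mom = w_mom - lr * v

    print(w_gd, w_mom)  # both iterates approach the minimizer at the origin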

People:

  • Aaron Bendickson (undergrad)

  • Joshua Kalyanapu (undergrad)

  • Bojun Lin (undergrad)

  • Changxin Qiu (PostDoc)

  • Jacob Riesen (undergrad)

  • Jue Yan (Faculty)

  • Mingming Yue (undergrad)

Pre-requisites:

  • Experience with programming (e.g., Matlab or Python) is desirable, but not necessary

  • Experience with Elementary Differential Equations (Math 266/267) is desirable, but not necessary