Deep Learning

(this site is under construction)

Deep Machine Learning is at once an old and a new paradigm of machine learning. Old, because its basic building blocks are neural networks / multi-layer perceptron structures (among others). New, because learning deep networks with many hidden layers was previously either not feasible or did not provide any benefit. One central challenge was the "vanishing gradient" problem in backpropagation; I refer to the great work of Hochreiter (1991) [1] and Hochreiter et al. (2001) [2].
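The vanishing gradient effect can be illustrated with a few lines of numpy. This is a hypothetical sketch (not from the cited papers): backpropagating through a chain of sigmoid units multiplies the gradient by w * sigma'(z) at every layer, and since the sigmoid derivative is at most 0.25, the product tends to shrink geometrically with depth.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
depth = 30          # number of hypothetical sigmoid layers
grad = 1.0          # gradient arriving at the output layer
for _ in range(depth):
    a = sigmoid(rng.normal())   # activation of a hypothetical unit
    w = rng.normal()            # a hypothetical weight
    grad *= w * a * (1.0 - a)   # chain rule through one sigmoid unit

print(abs(grad))    # vanishingly small after 30 layers
```

Each factor has magnitude at most 0.25 * |w|, so for moderate weights the gradient reaching the early layers is far too small to drive learning there.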

The initial starting point of the deep learning wave was a contribution by Hinton and Salakhutdinov in the well-known Science magazine in 2006 [3]. They demonstrated efficient learning of deep networks by means of a stacked auto-encoder pre-training strategy: instead of training the complete network at once, they divided the network into small encoder stages, trained these stages one after another, stacked the single stages together, and finally applied a fine-tuning stage on the complete net.
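The greedy layer-wise idea can be sketched in a few lines of numpy. This is a deliberately simplified, hypothetical illustration using tied-weight linear autoencoders trained by gradient descent (the original work used RBM-based pre-training of nonlinear units); it only shows the staging pattern: train one autoencoder, take its codes, train the next on them, then stack the encoders.

```python
import numpy as np

def train_autoencoder(X, n_hidden, lr=0.05, steps=500, seed=0):
    """Fit W so that X @ W @ W.T reconstructs X (tied weights)."""
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.standard_normal((X.shape[1], n_hidden))
    for _ in range(steps):
        R = X @ W @ W.T - X                             # reconstruction error
        grad = 2.0 * (X.T @ R + R.T @ X) @ W / len(X)   # gradient of mean squared error
        W -= lr * grad
    return W

rng = np.random.default_rng(42)
X = rng.standard_normal((200, 20))   # hypothetical raw data

# Greedy stage 1: autoencoder on the raw data.
W1 = train_autoencoder(X, n_hidden=10)
H1 = X @ W1                          # codes produced by stage 1

# Greedy stage 2: autoencoder on the stage-1 codes.
W2 = train_autoencoder(H1, n_hidden=5)

# Stacking the encoders yields a 20 -> 10 -> 5 deep encoder; in practice a
# final fine-tuning pass over the whole stack would follow.
codes = X @ W1 @ W2
print(codes.shape)   # (200, 5)
```

The point of the staging is that each shallow autoencoder is easy to train on its own, while the stacked result is a deep encoder that fine-tuning only has to adjust, not learn from scratch.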

There are tons of links and tutorials on the web to get in touch with deep learning. I highly recommend the following for newcomers: before you try to go deeper into deep learning, please get an overview of the other existing machine learning techniques. Great sources are, for example, the massive open online courses (MOOCs) provided by Coursera, Udacity, edX or Stanford Online Courses. An outstanding class on deep Convolutional Neural Networks (CNNs) is provided by Stanford, here.

A few links to tutorials or other link lists:

  • A long list of AI, machine learning and deep learning tutorials on GitHub: LINK
  • List of tutorials from Professor Bernhard Pfahringer, University of Waikato, NZ (one of my thesis supervisors): LINK
  • DeepLearning.org: one of the main web pages for deep learning: LINK

How to start with a deep network?

For sure, it totally depends on the problem you want to solve. The following steps are useful for deep CNNs, which are well established in image and video processing.

  1. Start with an existing network! Caffe provides pre-trained networks.
  2. Fine-tune the network on your problem-specific dataset (in this case, knocking out and replacing the final MLP structure is necessary)
  3. Fine-tune the hyperparameters
  4. If the performance is bad, start your own network design or write me an email :-)
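Steps 1 and 2 can be sketched as follows. This is a hypothetical numpy stand-in, not Caffe code: a frozen random matrix plays the role of the pre-trained network body, and only a newly attached softmax head (replacing the original final layer) is trained on the problem-specific data. All names, shapes, and the toy dataset are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained network body (weights stay frozen).
W_pretrained = rng.standard_normal((32, 16)) / np.sqrt(32)
def features(x):
    return np.maximum(0.0, x @ W_pretrained)   # frozen ReLU features

# Hypothetical problem-specific dataset with 2 classes.
X = rng.standard_normal((100, 32))
y = (X[:, 0] > 0).astype(int)

# New head replacing the original final layer, trained from scratch.
W_head = np.zeros((16, 2))
lr = 0.1
for _ in range(500):
    F = features(X)                             # body is not updated
    logits = F @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)           # softmax probabilities
    onehot = np.eye(2)[y]
    W_head -= lr * F.T @ (p - onehot) / len(X)  # only the head is updated

acc = ((F @ W_head).argmax(axis=1) == y).mean()
print(acc)
```

With Caffe you would instead load a released pre-trained model, replace the final fully connected layers to match your number of classes, and continue training with a small learning rate on your own data.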

A personal view on Deep Machine Learning:

  • Are we done with machine learning, and is deep learning the holy grail? I believe NO. Like other techniques in machine learning or AI in general, deep learning is well suited for specific types of problems. That said, deep learning has demonstrated outstanding performance in different domains, like computer vision or NLP. When you have enough data, deep learning is in the race.
  • When to choose deep learning? You definitely need huge amounts of training data. Additionally, deep learning shows great performance when the data can be processed in stages, in an evolutionary way. For example, in face recognition the early layers of a CNN learn simple low-level geometries and Gabor-like features, intermediate layers learn parts of a face like the nose or eyes, and the highest layers learn complete faces. This is a processing of raw data in stages towards high-level representations.
  • Can I use deep learning as a black box? Definitely NO. You need knowledge of and experience with the design of the architecture and the setup of the hyperparameters to fully exploit the expressive power of deep networks.

Literature:

[1] S. Hochreiter (1991): Untersuchungen zu dynamischen Neuronalen Netzen [Investigations on dynamic neural networks]. Diploma thesis, University of Munich (supervised by Schmidhuber). LINK

[2] S. Hochreiter, Y. Bengio, P. Frasconi, J. Schmidhuber (2001): Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: A Field Guide to Dynamical Recurrent Neural Networks, IEEE Press. LINK

[3] G. Hinton, R. R. Salakhutdinov (2006): Reducing the dimensionality of data with neural networks. Science, 313, pp. 504-507. LINK