ConvNets

Deep Learning reading list: http://deeplearning.net/reading-list/

Alex Krizhevsky Homepage: http://www.cs.toronto.edu/~kriz/

From http://benanne.github.io/2014/04/03/faster-convolutions-in-theano.html

Quite a few libraries offering GPU-accelerated convnet training have sprung up in recent years: cuda-convnet, Caffe and Torch7 are just a few. I’ve been using Theano for anything deep learning-related, because it offers a few advantages:

    • it allows you to specify your models symbolically, and compiles this representation to optimised code for both CPU and GPU. It (almost) eliminates the need to deal with implementation details;
    • it does symbolic differentiation: given an expression, it can compute gradients for you.

The combination of these two advantages makes Theano ideal for rapid prototyping of machine learning models trained with gradient descent. For example, if you want to try a different objective function, just change that one line of code where you define it, and Theano takes care of the rest. No need to recompute the gradients, and no tedious optimisation to get it to run fast enough. This is a huge time saver.
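As a minimal sketch of what this looks like in practice (plain Theano; the variable names, the squared-error objective and the learning rate are made up for illustration), the whole gradient-descent step follows from the one line that defines the cost:

    import numpy as np
    import theano
    import theano.tensor as T

    # Symbolic variables for the data, the targets, and a weight vector.
    X = T.matrix('X')
    y = T.vector('y')
    w = theano.shared(np.zeros(3), name='w')

    # Objective: mean squared error of a linear model.
    # Changing this single line is all it takes to try a different objective.
    cost = T.mean((T.dot(X, w) - y) ** 2)

    # Theano derives the gradient symbolically -- no hand-written backprop.
    grad_w = T.grad(cost, w)

    # One compiled gradient-descent update, runnable on CPU or GPU.
    train = theano.function([X, y], cost, updates=[(w, w - 0.1 * grad_w)])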

OverFeat: http://cilvr.nyu.edu/doku.php?id=software:overfeat:start

*Caffe*:

  • https://gist.github.com/onauparc/dd80907401b26b602885
  • :~/workspace/code/caffe/examples/cifar10$ ./../../build/tools/test_net.bin cifar10_quick_test.prototxt cifar10_quick_iter_5000 100 GPU 0
  • https://registrationcenter.intel.com/RegCenter/AutoGen.aspx?ProductID=1461&AccountID=&EmailID=&ProgramID=&RequestDt=&rm=NCOM&lang=&pass=yes
  • install lmdb-dev
  • use g++-4.6
  • http://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/tutorial_p2_nnets_ranzato_short.pdf
  • https://github.com/jetpacapp/DeepBeliefSDK/tree/master
  • https://github.com/rbgirshick/caffe/blob/master/docs/mnist_prototxt.md
  • http://web.engr.illinois.edu/~slazebni/spring14/lec24_cnn.pdf
  • http://ufldl.stanford.edu/wiki/index.php/Softmax_Regression

Writing a fully connected layer is also simple:

layers {
  name: "ip1"
  type: INNER_PRODUCT
  blobs_lr: 1.
  blobs_lr: 2.
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
  bottom: "pool2"
  top: "ip1"
}

This defines a fully connected layer (for legacy reasons, Caffe calls it an inner product layer) with 500 outputs. All the other lines should look familiar by now.
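As a rough sketch of what the layer actually computes (plain NumPy, not Caffe code; the input size and the "xavier"-style initialisation scale are assumptions for illustration), an inner product layer is just a matrix multiply plus a bias:

    import numpy as np

    n_in, n_out = 800, 500          # e.g. flattened "pool2" output -> 500 units

    # "xavier" weight filler: uniform, scaled by the fan-in of the layer.
    scale = np.sqrt(3.0 / n_in)
    W = np.random.uniform(-scale, scale, size=(n_out, n_in))
    b = np.zeros(n_out)             # "constant" bias filler (defaults to zero)

    x = np.random.randn(n_in)       # one input vector from the previous layer
    ip1 = W.dot(x) + b              # the "inner product": W x + b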