Attention in ML


Attention Mechanism

Attention and Memory in Deep Learning and NLP

Cho, Kyunghyun, Aaron Courville, and Yoshua Bengio. Describing Multimedia Content using Attention-based Encoder–Decoder Networks. arXiv preprint arXiv:1507.01053 (2015).


Stollenga, Marijn F., Jonathan Masci, Faustino Gomez, and Juergen Schmidhuber. Deep Networks with Internal Selective Attention. NIPS 2014.

Noam Shazeer et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Oriol Vinyals et al. Matching Networks for One Shot Learning

Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer

Structured Attention Networks
Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush

Emergence of foveal image sampling from learning to attend in visual scenes
Brian Cheung, Eric Weiss, Bruno Olshausen

Karol Gregor et al. DRAW: A Recurrent Neural Network For Image Generation

Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain
Janarthanan Rajendran, Aravind Lakshminarayanan, Mitesh M. Khapra, Prasanna P, Balaraman Ravindran

A Structured Self-attentive Sentence Embedding
Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio

Recurrent Mixture Density Network for Spatiotemporal Visual Attention
Loris Bazzani, Hugo Larochelle, Lorenzo Torresani

Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu. Using Fast Weights to Attend to the Recent Past.

Cicero dos Santos, Ming Tan, Bing Xiang, Bowen Zhou. Attentive Pooling Networks

Yang, Scott Cheng-Hsin, Daniel M. Wolpert, and Máté Lengyel. Theoretical perspectives on active sensing. Current Opinion in Behavioral Sciences 11 (2016): 100-108.