Scalable Deep Learning: From Theory to Practice
A fundamental task for artificial intelligence is learning. Deep neural networks have proven effective across all major learning paradigms, i.e. supervised, unsupervised, and reinforcement learning. Nevertheless, traditional deep learning approaches rely on cloud computing facilities and do not scale well to autonomous agents with low computational resources. Even in the cloud they suffer from computational and memory limitations, and they cannot properly model large physical worlds for agents that would require networks with billions of neurons. These issues have been addressed in the last few years by the emerging topics of scalable and efficient deep learning. The tutorial covers these topics in two parts, focusing on theoretical advancements, practical applications, and hands-on experience.
Part I - Scalable Deep Learning: from pruning to evolution.
The first part of the tutorial focuses on theory. We first review how many agents make use of deep neural networks nowadays. We then introduce the basic concepts of neural networks and draw a parallel between artificial and biological neural networks from a functional and topological perspective. We continue by introducing the first papers on efficient neural networks from the early 1990s, which make use either of sparsity-enforcing penalties or of weight pruning of fully connected networks based on various saliency criteria. Afterwards, we review some recent works which start from fully connected networks and use prune-retrain cycles to compress deep neural networks and make them more efficient in the inference phase. We then discuss an alternative approach, i.e. NeuroEvolution of Augmenting Topologies (NEAT) and its follow-ups, which grows efficient deep neural networks using evolutionary computing.
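As a small illustration of the pruning idea discussed above, the sketch below uses weight magnitude as the saliency criterion (one of the simplest such criteria); the function name and parameters are our own, not from any particular paper:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest
    magnitudes (illustrative magnitude-based saliency criterion)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.sort(flat)[k - 1]  # magnitude of the k-th smallest weight
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = magnitude_prune(W, 0.5)  # half of the 16 weights set to zero
```

In a prune-retrain cycle, such a pruning step would alternate with further training of the surviving weights to recover accuracy.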
Scalable Deep Learning. Further on, we introduce the topic of Scalable Deep Learning (SDL), which builds on efficient deep learning and puts together all of the above. Herein, we discuss how intrinsically sparse Deep Neural Networks (DNNs) are trained using the recently proposed Sparse Evolutionary Training (SET) algorithm. SET-DNNs start from random sparse networks and use an evolutionary process to adapt their sparse connectivity to the data while learning. SET-DNNs offer benefits in both phases, training and inference, having quadratically lower memory footprints and much faster running times than their fully connected counterparts. This makes them a perfect match for autonomous agents or for the modeling of large physical environments which need millions (or perhaps billions) of neurons.
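The core SET connectivity update, applied after each training epoch, can be sketched as follows. This is a simplified illustration of the procedure described in Mocanu et al. (2018): drop a fraction of the weakest connections, then regrow the same number at random empty positions; the function name, the in-place mutation, and the initialization scale are our own choices:

```python
import numpy as np

def set_evolve(weights, mask, zeta, rng):
    """One SET connectivity update (simplified sketch).

    Drops the `zeta` fraction of active connections with the smallest
    weight magnitudes, then regrows the same number of connections at
    randomly chosen inactive positions. Mutates `weights` and `mask`
    in place and returns them.
    """
    active = np.flatnonzero(mask)
    n_drop = int(zeta * active.size)
    # indices of the smallest-magnitude active weights
    order = np.argsort(np.abs(weights.flat[active]))
    drop = active[order[:n_drop]]
    mask.flat[drop] = False
    weights.flat[drop] = 0.0
    # regrow the same number of connections at random empty positions
    empty = np.flatnonzero(~mask)
    grow = rng.choice(empty, size=n_drop, replace=False)
    mask.flat[grow] = True
    weights.flat[grow] = rng.normal(scale=0.01, size=n_drop)
    return weights, mask
```

Because the number of dropped and regrown connections is equal, the sparsity level stays constant throughout training while the topology evolves toward the data.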
Part II - Scalable Deep Learning: deep reinforcement learning
Up to this point, everything is discussed in the context of supervised and unsupervised learning. Further on, we introduce deep reinforcement learning and pave the way for scalable deep reinforcement learning. We describe some very recent progress in the field of deep reinforcement learning that can be used to improve the performance of reinforcement learning agents confronted with delayed reinforcement signals and with environments that can exhibit sudden changes in their dynamics, as is often the case with energy systems.
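To make the delayed-reward setting concrete, here is a toy tabular Q-learning sketch (purely illustrative, not any specific algorithm from the tutorial): an agent on a short chain receives a reward only upon reaching the rightmost state, so the value of early actions must be learned by propagating the reward backwards:

```python
import numpy as np

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy chain with a delayed reward: the agent
    starts at state 0 and is rewarded only on reaching state n_states-1."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))  # actions: 0 = step left, 1 = step right
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap keeps early random episodes short
            if rng.random() < eps:
                a = int(rng.integers(2))  # explore
            else:
                # greedy action with random tie-breaking
                a = int(np.argmax(Q[s] + rng.normal(scale=1e-9, size=2)))
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0  # delayed reward at goal
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
            if s == n_states - 1:
                break
    return Q

Q = q_learning()
```

Deep reinforcement learning replaces the Q-table with a neural network, which is exactly where scalable, sparse architectures become relevant for large state spaces.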
Applications - SDL agents in smart grids. The last part of the tutorial focuses on practical applications. Distributed generation, demand response, distributed storage, and electric vehicles are bringing new challenges to the power and energy sector. The tutorial addresses the current and envisioned solutions for the management of these distributed energy resources in the context of smart grids. Artificial-intelligence-based approaches open up important new possibilities, enabling efficient individual and aggregated energy management. Such approaches can provide the different players, each pursuing individual and common goals in a market-driven environment, with advanced decision support and automated solutions. Towards the end, we argue that the reinforcement learning paradigm can be very powerful for solving many decision-making problems in the energy sector, such as investment problems, the design of bidding strategies for the intraday electricity market, or the control of microgrids. The last presentation frames resource allocation problems as sequential stochastic decision-making processes tackled by scalable and efficient deep reinforcement learning agents. We investigate how multiple learning agents interact and influence each other in the smart grid context, what kind of global system dynamics arise, and how desired electrical behaviour can be obtained by modifying the learning algorithms used. The settings considered range from one-to-one interactions (e.g. games) to small groups (e.g. multi-agent coordination) and large communities (e.g. interactions in social networks).
After the tutorial, the participants will have: a basic understanding of scalable deep neural networks for learning agents and of agents in the smart grid context; basic hands-on experience with applying these concepts in various practical applications; and some ideas for future research directions.
The detailed program will be made available at a later date.
- Slides will be available here.
- Decebal Constantin Mocanu, Elena Mocanu, Peter Stone, Phuong H. Nguyen, Madeleine Gibescu, and Antonio Liotta. 2018. Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science. Nature Communications 9, https://www.nature.com/articles/s41467-018-04316-3 ; Code: https://github.com/dcmocanu/sparse-evolutionary-artificial-neural-networks
- Decebal Constantin Mocanu and Elena Mocanu. 2018. One-Shot Learning Using Mixture of Variational Autoencoders: A Generalization Learning Approach. In Proceedings of the 17th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS ’18), https://arxiv.org/abs/1804.07645
- Elena Mocanu, Decebal Constantin Mocanu, Phuong. H. Nguyen, Antonio Liotta, Michael E. Webber, Madeleine Gibescu, and J. G. Slootweg. 2018. On-line Building Energy Optimization using Deep Reinforcement Learning. IEEE Transactions on Smart Grid, https://ieeexplore.ieee.org/document/8356086
- Elena Mocanu, Phuong H Nguyen, Madeleine Gibescu and Wil L Kling, 2016. Deep learning for estimating building energy consumption, Sustainable Energy, Grids and Networks
- Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher and Mykola Pechenizkiy, 2019. Learning with Delayed Synaptic Plasticity, The Genetic and Evolutionary Computation Conference (GECCO 2019), Prague, Czech Republic. https://arxiv.org/abs/1903.09393, Video
- Shiwei Liu, Decebal Constantin Mocanu, Amarsagar Reddy Ramapuram Matavalam, Yulong Pei, Mykola Pechenizkiy, 2019, Sparse evolutionary Deep Learning with over one million artificial neurons on commodity hardware, https://arxiv.org/abs/1901.09181 ; Code
- Decebal Constantin Mocanu, Elena Mocanu, Phuong Nguyen, Madeleine Gibescu, Antonio Liotta, 2016. A topological insight into restricted Boltzmann machines, Machine Learning (ECMLPKDD 2016), https://link.springer.com/article/10.1007%2Fs10994-016-5570-z
- Decebal Constantin Mocanu, 2016, On the synergy of network science and artificial intelligence, In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI 2016)
- Decebal Constantin Mocanu, Maria Torres Vega, Eric Eaton, Peter Stone, Antonio Liotta, 2016, Online contrastive divergence with generative replay: Experience replay without storing data, https://arxiv.org/abs/1610.05555.
Organizers - Short Bios
- Elena Mocanu is an Assistant Professor in Machine Learning within the Data Science group at the University of Twente and a researcher at Eindhoven University of Technology. She received her B.Sc. degree in Mathematics and Physics from Transilvania University of Brasov, Romania, in 2004. After four years as a mathematics and physics teacher at high-school level, Elena moved to the university. She was an Assistant Lecturer within the Department of Information Technology, University of Bucharest, Romania, from September 2008 to January 2011. In parallel, in 2009 she started a master's program in Theoretical Physics, and in 2011 she obtained the M.Sc. degree with a specialization in Quantum Transport from the University of Bucharest, Romania. In January 2011, Elena moved from Romania to the Netherlands. She obtained the M.Sc. degree in Operations Research from Maastricht University, the Netherlands, in 2013. In the last year of her master's program, she did a six-month internship at Maastricht University on bioinformatics data analytics research and a six-month graduation project at NXP Semiconductors, Eindhoven. In October 2013, Elena started her PhD research in Machine Learning and Smart Grids at TU/e. In January 2015 she made a short research visit to the Technical University of Denmark and, from January to April 2016, she was a visiting researcher at the University of Texas at Austin, USA. In 2017, Elena received her Doctor of Philosophy degree in Machine Learning and Smart Grids from TU/e.
- Decebal Constantin Mocanu has been an Assistant Professor in Artificial Intelligence and Machine Learning within the Data Mining group, Department of Mathematics and Computer Science, Eindhoven University of Technology (TU/e) since September 2017, and is a member of the TU/e Young Academy of Engineering. His short-term research interest is to conceive scalable deep artificial neural network models and their corresponding learning algorithms using principles from network science, evolutionary computing, optimization, and neuroscience. Such models should have sparse and evolutionary connectivity, make use of previous knowledge, and have strong generalization capabilities, so that they are able to learn, and to reason, from few examples in a continuous and adaptive manner. Most science carried out throughout human history uses the traditional reductionism paradigm, which, even though very successful, still has some limitations. Aristotle wrote in Metaphysics, ``The whole is more than the sum of its parts''. Inspired by this quote, in the long term Decebal would like to follow the alternative complex-systems paradigm and study the synergy between artificial intelligence, neuroscience, and network science for the benefit of science and society. In 2017, Decebal received his PhD in Artificial Intelligence and Network Science from TU/e. During his doctoral studies, he undertook three research visits: the University of Pennsylvania (2014), Julius Maximilian University of Würzburg (2015), and the University of Texas at Austin (2016). Prior to this, in 2013, he obtained his M.Sc. in Artificial Intelligence from Maastricht University. During his master's studies, Decebal also worked as a part-time software developer at We Focus BV in Maastricht; in the last year, he worked as an intern at Philips Research in Eindhoven, where he prepared his internship and master's thesis projects. Decebal obtained his Licensed Engineer degree from University Politehnica of Bucharest.
While in Bucharest, between 2001 and 2010, Decebal started MDC Artdesign SRL (a software house specialized in web development), worked as a computer laboratory assistant at the University Nicolae Titulescu, and as a software engineer at Namedrive LLC.