Scalable Deep Learning: From Theory to Practice
A fundamental task for Artificial Intelligence is learning. Deep Neural Networks have proven effective across all major learning paradigms, i.e. supervised, unsupervised, and reinforcement learning. Nevertheless, traditional deep learning approaches rely on cloud computing facilities and do not scale well to autonomous agents with low computational resources. Even in the cloud they suffer from computational and memory limitations, and they cannot properly model large physical worlds for agents that would require networks with billions of neurons. These issues have been addressed in recent years by the emerging topics of scalable and efficient deep learning. The tutorial covers these topics, focusing on theoretical advancements, practical applications, and hands-on experience, in two parts.
Detailed Program, Location, and Slides
Location: The Venetian Ballroom J at IJCAI 2019
Date: Monday, 12 August 2019, 14:00 - 18:00.
- Tutorial introduction [pdf]
- Part I: Scalable Deep Learning: from pruning to evolution [pdf]
- Part II: Scalable Deep Learning: deep reinforcement learning [pdf]
- Applications: Multi-agent simulation and decision support in power and energy systems [pdf]
Program. Some of the following topics will be covered in depth and some only briefly.
Part I - Scalable Deep Learning: from pruning to evolution.
The first part of the tutorial focuses on theory. We first review how many agents make use of deep neural networks nowadays. We then introduce the basic concepts of neural networks and draw a parallel between artificial and biological neural networks from a functional and topological perspective. We continue by introducing the first papers on efficient neural networks, from the early 1990s, which make use either of sparsity-enforcing penalties or of weight pruning of fully-connected networks based on various saliency criteria. Afterwards, we review some recent works that start from fully-connected networks and use prune-retrain cycles to compress deep neural networks and to make them more efficient in the inference phase. We then discuss an alternative approach, NeuroEvolution of Augmenting Topologies (NEAT) and its follow-ups, which grows efficient neural network topologies using evolutionary computing.
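To make the prune-retrain idea concrete, here is a minimal sketch of one magnitude-based pruning step, in the spirit of the Han et al. (2015) paper listed in the references. The function name and NumPy-based layout are illustrative choices, not the authors' implementation; in a real prune-retrain cycle the returned mask would also be used to zero the gradients of pruned weights during retraining.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Returns the pruned weight matrix and a boolean mask of the surviving
    connections. During retraining, gradients at masked-out positions would
    also be zeroed so that pruned connections stay removed.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    # Threshold = k-th smallest magnitude; everything at or below it is pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Example: prune 80% of a random 4x5 layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 5))
pruned, mask = magnitude_prune(w, 0.8)
print(mask.sum())  # number of surviving connections
```

One such step is typically followed by a few epochs of retraining, and the prune-retrain cycle is repeated until the target compression rate is reached.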
Scalable Deep Learning. Further on, we introduce the topic of Scalable Deep Learning (SDL), which builds on efficient deep learning and puts together all of the above. Herein, we discuss how Deep Neural Networks (DNNs) are trained using the newly proposed Sparse Evolutionary Training (SET) algorithm. SET-DNNs start from random sparse networks and use an evolutionary process to adapt their sparse connectivity to the data while learning. SET-DNNs offer benefits in both phases, training and inference, having a quadratically lower memory footprint and much faster running time than their fully-connected counterparts. Moreover, SET-DNNs are often capable of outperforming their fully-connected counterparts also in terms of performance metrics (e.g. classification accuracy). This makes them a perfect match for autonomous agents, or for modeling large physical environments which need millions (or perhaps billions) of neurons.
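The connectivity evolution at the heart of SET can be sketched as a prune-and-regrow step applied between training epochs. The NumPy sketch below is illustrative only (the function name, the dense-matrix-plus-mask representation, and the default fraction zeta=0.3 are our assumptions for clarity); the reference implementation, which uses truly sparse data structures, is linked in the references.

```python
import numpy as np

def set_evolution_step(weights, mask, zeta=0.3, rng=None):
    """One connectivity-evolution step in the spirit of SET (sketch).

    Prunes the fraction `zeta` of existing connections with the smallest
    magnitude, then grows the same number of new connections at random
    empty positions, initialized to small random values. The total number
    of connections (and hence the memory footprint) stays constant.
    """
    if rng is None:
        rng = np.random.default_rng()
    active = np.flatnonzero(mask)
    n_change = int(zeta * active.size)
    if n_change == 0:
        return weights, mask
    # Prune: remove the weakest active connections.
    magnitudes = np.abs(weights.flat[active])
    weakest = active[np.argsort(magnitudes)[:n_change]]
    mask.flat[weakest] = False
    weights.flat[weakest] = 0.0
    # Grow: add the same number of connections at random empty positions.
    empty = np.flatnonzero(~mask)
    grown = rng.choice(empty, size=n_change, replace=False)
    mask.flat[grown] = True
    weights.flat[grown] = rng.normal(scale=0.01, size=n_change)
    return weights, mask
```

Interleaving such steps with standard gradient-based weight updates lets the sparse topology adapt to the data while the parameter count stays fixed.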
Part II - Scalable Deep Learning: deep reinforcement learning.
Up to this point, everything is discussed in the context of supervised and unsupervised learning. Further on, we introduce deep reinforcement learning and pave the ground for scalable deep reinforcement learning. We describe some very recent progress in the field of deep reinforcement learning that can be used to improve the performance of reinforcement learning agents when confronted with environments that exhibit sudden changes in their dynamics, as is often the case with energy systems.
Applications - SDL agents in smart grids. The last part of the tutorial focuses on practical applications. Distributed generation, demand response, distributed storage, and electric vehicles are bringing new challenges to the power and energy sector. The tutorial addresses current and envisioned solutions for the management of these distributed energy resources in the context of smart grids. Artificial-intelligence-based approaches open important new possibilities, enabling efficient individual and aggregated energy management. Such approaches can provide the different players, aiming to accomplish individual and common goals in a market-driven environment, with advanced decision support and automated solutions. Towards the end, we argue that the reinforcement learning paradigm can be very powerful for solving many decision-making problems in the energy sector, such as investment problems, the design of bidding strategies for the intraday electricity market, or the control of microgrids. The last presentation defines resource allocation problems as sequential stochastic decision-making processes that can be tackled by scalable and efficient deep reinforcement learning agents. We investigate how multiple learning agents interact and influence each other in the smart grid context, what kind of global system dynamics arise, and how desired electrical behaviour can be obtained by modifying the learning algorithms used. The settings considered range from one-to-one interactions (e.g. games) to small groups (e.g. multi-agent coordination) and large communities (e.g. interactions in social networks).
After the tutorial, the participants will have: a basic understanding of scalable deep neural networks for learning agents and of agents in the smart grid context; basic hands-on experience with these concepts in various practical applications; and some good ideas for future research directions.
References
- Decebal Constantin Mocanu, Elena Mocanu, Peter Stone, Phuong H. Nguyen, Madeleine Gibescu, and Antonio Liotta, 2018. Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science. Nature Communications, https://www.nature.com/articles/s41467-018-04316-3; Code: https://github.com/dcmocanu/sparse-evolutionary-artificial-neural-networks
- Decebal Constantin Mocanu and Elena Mocanu. 2018. One-Shot Learning Using Mixture of Variational Autoencoders: A Generalization Learning Approach. In Proceedings of the 17th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS ’18), https://arxiv.org/abs/1804.07645
- Elena Mocanu, Decebal Constantin Mocanu, Phuong. H. Nguyen, Antonio Liotta, Michael E. Webber, Madeleine Gibescu, and J. G. Slootweg. 2018. On-line Building Energy Optimization using Deep Reinforcement Learning. IEEE Transactions on Smart Grid 2018, https://ieeexplore.ieee.org/document/8356086
- Elena Mocanu, Phuong H Nguyen, Madeleine Gibescu and Wil L Kling, 2016. Deep learning for estimating building energy consumption, Sustainable Energy, Grids and Networks 2016
- Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher and Mykola Pechenizkiy, 2019. Learning with Delayed Synaptic Plasticity, The Genetic and Evolutionary Computation Conference (GECCO 2019), Prague, Czech Republic. https://arxiv.org/abs/1903.09393 , Video
- Shiwei Liu, Decebal Constantin Mocanu, Amarsagar Reddy Ramapuram Matavalam, Yulong Pei, Mykola Pechenizkiy, 2019, Sparse evolutionary Deep Learning with over one million artificial neurons on commodity hardware, https://arxiv.org/abs/1901.09181 ; Code
- Decebal Constantin Mocanu, Elena Mocanu, Phuong Nguyen, Madeleine Gibescu, Antonio Liotta, 2016. A topological insight into restricted Boltzmann machines, Machine Learning (ECMLPKDD 2016), https://link.springer.com/article/10.1007%2Fs10994-016-5570-z
- Decebal Constantin Mocanu, 2016, On the synergy of network science and artificial intelligence, In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI 2016)
- Decebal Constantin Mocanu, Maria Torres Vega, Eric Eaton, Peter Stone, Antonio Liotta, 2016, Online contrastive divergence with generative replay: Experience replay without storing data, https://arxiv.org/abs/1610.05555.
- Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel Emer, 2017. Efficient Processing of Deep Neural Networks: A Tutorial and Survey. Proceedings of the IEEE, 2017.
- Yann LeCun, John S. Denker, Sara A. Solla, 1990. Optimal Brain Damage. NIPS 1990.
- Song Han, Jeff Pool, John Tran, William J. Dally, 2015. Learning both Weights and Connections for Efficient Neural Networks. NIPS 2015.
- Hangyu Zhu and Yaochu Jin, 2019. Multi-objective Evolutionary Federated Learning. IEEE Transactions on Neural Networks and Learning Systems, 2019.
- Hesham Mostafa, Xin Wang, 2019. Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization. ICML 2019.
- Tim Dettmers, Luke Zettlemoyer, 2019, Sparse Networks from Scratch: Faster Training without Losing Performance, https://arxiv.org/abs/1907.04840
Organizers - Short Bios
- Decebal Constantin Mocanu is an Assistant Professor in Artificial Intelligence and Machine Learning within the Data Mining group, Department of Mathematics and Computer Science, Eindhoven University of Technology (TU/e) since September 2017, and a member of the TU/e Young Academy of Engineering. His short-term research interest is to conceive scalable deep artificial neural network models and their corresponding learning algorithms using principles from network science, evolutionary computing, optimization, and neuroscience. Such models shall have sparse and evolutionary connectivity, make use of previous knowledge, and have strong generalization capabilities, in order to learn and reason using few examples in a continuous and adaptive manner. Most science carried out throughout human history uses the traditional reductionist paradigm, which, successful as it is, still has some limitations. Aristotle wrote in Metaphysics, ``The whole is more than the sum of its parts''. Inspired by this quote, in the long term Decebal would like to follow the alternative complex-systems paradigm and study the synergy between artificial intelligence, neuroscience, and network science for the benefit of science and society. In 2017, Decebal received his PhD in Artificial Intelligence and Network Science from TU/e. During his doctoral studies, he undertook three research visits: the University of Pennsylvania (2014), Julius Maximilian University of Würzburg (2015), and the University of Texas at Austin (2016). Prior to this, in 2013, he obtained his MSc in Artificial Intelligence from Maastricht University. During his master's studies, Decebal also worked as a part-time software developer at We Focus BV in Maastricht. In the last year of his master's studies, he worked as an intern at Philips Research in Eindhoven, where he prepared his internship and master's thesis projects. Decebal obtained his Licensed Engineer degree from University Politehnica of Bucharest.
While in Bucharest, between 2001 and 2010, Decebal started MDC Artdesign SRL (a software house specialized in web development), worked as a computer laboratory assistant at the University Nicolae Titulescu, and as a software engineer at Namedrive LLC.
- Elena Mocanu is an Assistant Professor in Machine Learning within the Data Science group at the University of Twente and a researcher at Eindhoven University of Technology. She received her B.Sc. degree in Mathematics and Physics from Transilvania University of Brasov, Romania, in 2004. After four years as a mathematics and physics teacher at high-school level, Elena moved to the university. She was an Assistant Lecturer within the Department of Information Technology, University of Bucharest, Romania, from September 2008 to January 2011. In parallel, in 2009 she started a master's program in Theoretical Physics, and in 2011 she obtained the M.Sc. degree with a specialization in Quantum Transport from the University of Bucharest, Romania. In January 2011, Elena moved from Romania to the Netherlands. She obtained the M.Sc. degree in Operations Research from Maastricht University, The Netherlands, in 2013. In the last year of her master's program, she did a six-month internship at Maastricht University on bioinformatics data analytics research and a six-month graduation project at NXP Semiconductors, Eindhoven. In October 2013, Elena started her PhD research in Machine Learning and Smart Grids at TU/e. In January 2015 she made a short research visit to the Technical University of Denmark and, from January to April 2016, she was a visiting researcher at the University of Texas at Austin, USA. In 2017, Elena received her Doctor of Philosophy degree in Machine Learning and Smart Grids from TU/e.
- Zita Vale is a full professor at the Polytechnic Institute of Porto and the director of the Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD). She received her diploma in Electrical Engineering in 1986 and her PhD in 1993, both from the University of Porto. Zita Vale works in the area of Power and Energy Systems, with special interest in the application of Artificial Intelligence techniques. She has been involved in more than 50 funded projects related to the development and use of knowledge-based systems, multi-agent systems, genetic algorithms, neural networks, particle swarm intelligence, constraint logic programming, and data mining. Energy resources management, distributed generation, demand response, and electric vehicles are important topics of her research in the current projects. The main application fields of these projects comprise: (1) Smart Grids, accommodating an intensive use of Renewable Energy Sources, Distributed Energy Resources (DER), and Distributed Generation (DG). She addresses the management of energy resources, the impact of DER on electrical networks, the negotiation of DER in electricity markets, demand response, storage, energy management in buildings, and electric vehicles, including the ones with gridable capability (V2G); (2) Electricity markets, addressing contracts, prices and tariffs, decision support for market participants, aggregation, ancillary services, and wholesale and local market simulation; and (3) Control Center applications, namely intelligent alarm processing, intelligent interfaces, and intelligent tutors. Zita has published over 800 works, including more than 100 papers in international scientific journals and more than 500 papers in international scientific conferences. She has supervised 17 concluded PhD theses and is currently supervising 8 PhD students. She has also supervised 45 concluded MSc theses and is currently supervising 10 MSc students.
- Damien Ernst graduated in 1998 from the University of Liège. He did his master's thesis on electricity networks, focusing on loss-of-synchronism phenomena that can lead to blackouts in a matter of seconds. He changed direction for his doctoral thesis (which he defended in 2003), to focus on reinforcement learning. ``However, I used reinforcement learning techniques to solve control and decision-making problems in electricity networks. After my thesis, I continued on that path and I have published a great deal on the subject.'' This was during his three-year post-doctoral research, funded by the FNRS and spent at CMU and MIT in the USA, and at ETH Zurich. Damien Ernst then headed to France, where he was a professor at Supélec, an engineering school in Rennes. ``2006 was a great year,'' he recalls, ``mainly because of my contacts with French industry.'' He then returned to the FNRS (2007-2011) as a research associate, where he continued his work on reinforcement learning. During the period 2011-2016, he held the EDF-Luminus Chair on Smart Grids at the University of Liège. ``Thanks to this chair, I've learned a great deal from working more closely with industry. It's given me a more complete, and less academic, vision of energy systems. I've also been able to develop great research projects to facilitate the integration of renewable energy into electrical networks.'' Damien Ernst is now a full professor at the University of Liège, doing research in the fields of energy and Artificial Intelligence.