Opening the black-box of recurrent neural networks
Artificial recurrent neural networks (RNNs) are non-autonomous dynamical systems, driven by (time-varying) inputs, that perform computations by exploiting short-term memory mechanisms. RNNs have shown remarkable results in several applications, including natural language generation and various signal processing tasks, e.g., audio and video processing. However, training RNNs is hard as a consequence of the "vanishing/exploding gradient problem". Moreover, their high-dimensional, non-linear structure complicates the interpretability of their internal dynamics, which are characterized by complex, input-dependent spatio-temporal patterns. This constrains the applicability of RNNs, which are usually treated as black boxes, thus preventing the extraction of scientific knowledge (novel scientific insights) from experimental data. Similar issues also affect other architectures, stressing the need to develop general methodologies for explaining the behaviour of machine learning methods when used for decision-making (e.g., credit and insurance risk assessment) and in scientifically relevant applications (e.g., biomarker discovery for genetic diseases).
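As a minimal illustration of this dynamical-systems view (a sketch only: the dimensions, weight scalings, and input below are arbitrary assumptions, not taken from any model in the references), the hidden state of a vanilla RNN evolves as x_{t+1} = tanh(W x_t + W_in u_{t+1}), and the norm of the product of state-to-state Jacobians shows how gradients propagated through time can vanish or explode:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, T = 50, 3, 200                             # state size, input size, sequence length
    W = rng.normal(0, 1.0 / np.sqrt(n), (n, n))      # recurrent weights (scaling chosen arbitrarily)
    W_in = rng.normal(0, 1.0, (n, m))                # input weights
    u = rng.normal(0, 1.0, (T, m))                   # a generic time-varying input sequence

    x = np.zeros(n)
    J_prod = np.eye(n)                               # accumulated Jacobian d x_T / d x_0
    for t in range(T):
        pre = W @ x + W_in @ u[t]
        J_t = np.diag(1.0 - np.tanh(pre) ** 2) @ W   # Jacobian d x_{t+1} / d x_t
        J_prod = J_t @ J_prod
        x = np.tanh(pre)                             # non-autonomous update x_{t+1} = f(x_t, u_{t+1})

    # A norm that shrinks toward zero (or blows up) for large T is the vanishing
    # (exploding) gradient phenomenon mentioned above.
    print(np.linalg.norm(J_prod, 2))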
This project seeks to provide a mechanistic description of the behaviour of RNNs by developing novel theories and attractor models suitable for describing non-autonomous systems, which can operate and perform computations also in a transient regime. This will enable the analytic description of how RNNs solve computational tasks, which, in turn, will allow researchers to extract knowledge from data and formulate novel hypotheses to design experiments.
References
P. Verzelli et al. Learn to Synchronize, Synchronize to Learn. Chaos
P. Verzelli et al. Input-to-State Representation in Linear Reservoirs Dynamics. IEEE-TNNLS
A. Ceni, P. Ashwin, L. Livi, C. Postlethwaite. The Echo Index and Multistability in Input-Driven Recurrent Neural Networks. Physica D
P. Verzelli, C. Alippi, L. Livi. Echo State Networks with Self-Normalizing Activations on the Hyper-Sphere. Scientific Reports
A. Ceni, P. Ashwin, L. Livi. Interpreting RNN Behaviour via Excitable Network Attractors. Cognitive Computation
Analysis of graph sequences and graph-structured data
Today, it is possible to observe natural and man-made complex systems (e.g., protein and metabolic networks, smart grids, brain networks) involving spatio-temporal interactions of many elements on multiple scales. A prominent example is provided by the brain and the possibility to simultaneously record the activity of neurons from multiple electrodes. A first step toward understanding such systems from data requires the use of complex data representations, such as sequences of graphs, for encoding their spatio-temporal behaviour. Accordingly, data-driven procedures, like prediction and change detection methods, need to be designed with the ability to process sequences of graph-structured data.
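As a toy illustration of change detection on a sequence of graphs (a sketch only: the Laplacian-spectrum embedding, the nominal reference window, and the threshold are arbitrary choices made for illustration, not the change-point methods of the papers referenced below), each graph on a shared node set can be embedded through its sorted Laplacian spectrum, and a change is flagged when the embedding drifts away from a nominal reference:

    import numpy as np

    def laplacian_spectrum(A):
        """Sorted eigenvalues of L = D - A, used here as a simple fixed-size graph embedding."""
        L = np.diag(A.sum(axis=1)) - A
        return np.sort(np.linalg.eigvalsh(L))

    def detect_change(graph_seq, n_ref=20, threshold=10.0):
        """Flag time steps whose embedding drifts far from a nominal window (threshold is arbitrary)."""
        emb = np.array([laplacian_spectrum(A) for A in graph_seq])
        ref = emb[:n_ref].mean(axis=0)               # nominal regime estimated from the first n_ref graphs
        return [t for t in range(n_ref, len(emb))
                if np.linalg.norm(emb[t] - ref) > threshold]

    # Example: a stream of random graphs whose edge density jumps at t = 50.
    rng = np.random.default_rng(0)
    def random_graph(n, p):
        A = np.triu((rng.random((n, n)) < p).astype(float), 1)
        return A + A.T
    stream = [random_graph(20, 0.1) for _ in range(50)] + [random_graph(20, 0.4) for _ in range(50)]
    print(detect_change(stream))                     # alarms should appear from around t = 50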
References
D. Zambon et al. Distance-Preserving Graph Embeddings from Random Neural Features. ICML 2020
F. M. Bianchi et al. Graph Neural Networks with Convolutional ARMA Filters. IEEE-TPAMI
F. M. Bianchi et al. Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling. IEEE-TNNLS
D. Zambon et al. Change-Point Methods on a Sequence of Graphs. IEEE-TSP
D. Zambon et al. Concept Drift and Anomaly Detection in Graph Streams. IEEE-TNNLS
D. Grattarola et al. Learning Graph Embeddings on Constant-Curvature Manifolds for Change Detection in Graph Streams. IEEE-TNNLS
D. Zambon et al. Autoregressive Models for Sequences of Graphs. IEEE-IJCNN 2019
Prediction and generation of molecular structures
Understanding the thermodynamics and kinetics of protein-ligand interactions plays a fundamental role in protein science and in the early stages of drug discovery. From a thermodynamic point of view, protein-ligand interactions are described by the binding free-energy surface (BFES). Due to the numerous possible structural configurations assumed by both proteins and ligands, such a surface is high-dimensional and non-linear, and its accurate computation requires significant computational resources. While research on protein-ligand interactions has traditionally focused on computing the most likely binding pose (represented by the lowest-energy minima of the BFES), it has more recently been discovered that other factors contribute to the effectiveness of the interaction over time. These factors are typically described by the so-called dissociation rate, a property quantifying the time-dependent stability of the protein-ligand interaction. Such a property can be investigated by analyzing particular saddle points of the BFES. However, these saddles represent transition states that are elusive to both simulations and experiments and thus difficult to characterize.
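To fix ideas about minima and saddles, consider a toy two-dimensional surface (purely illustrative; an actual BFES is high-dimensional and not available in closed form): critical points, where the gradient vanishes, are classified by the eigenvalues of the Hessian, with an all-positive spectrum indicating a (meta)stable binding pose and a mixed signature indicating a transition-state-like saddle.

    import numpy as np

    # Toy double-well surface with two minima (basins) and one saddle between them.
    f    = lambda x, y: (x**2 - 1)**2 + y**2
    grad = lambda x, y: np.array([4*x*(x**2 - 1), 2*y])
    hess = lambda x, y: np.array([[12*x**2 - 4, 0.0], [0.0, 2.0]])

    for (x, y) in [(-1.0, 0.0), (1.0, 0.0), (0.0, 0.0)]:
        assert np.allclose(grad(x, y), 0.0)          # all three are critical points
        eigs = np.linalg.eigvalsh(hess(x, y))
        kind = "minimum" if np.all(eigs > 0) else "saddle (transition-state-like)"
        print((x, y), eigs, kind)

    # The barrier separating the basins (saddle height minus minimum height) is the kind of
    # quantity that governs kinetic properties such as dissociation rates.
    print("barrier height:", f(0.0, 0.0) - f(1.0, 0.0))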
Recently, machine learning methodologies based on deep neural networks have yielded fundamental breakthroughs in several scientific and technological fields. Of particular interest are graph (convolutional) neural networks, which allow us to process graph data as structured inputs by means of conventional neural processing mechanisms. Combined with the possibility of producing structured outputs through generative models, modern neural networks allow us to explore the high-dimensional conformational space of chemical structures and hence to investigate complex protein-ligand binding mechanisms.
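A minimal sketch of a single graph-convolution step, in the common symmetric-normalisation form (shown only to fix ideas; the architectures cited in the project references are more elaborate): node features are averaged over neighbourhoods and then mixed by a learnable weight matrix.

    import numpy as np

    def gcn_layer(A, X, W):
        """One graph convolution: H = ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
        A_hat = A + np.eye(A.shape[0])                   # add self-loops
        d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
        A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
        return np.maximum(0.0, A_norm @ X @ W)

    # Tiny attributed graph: 4 nodes, 3 input features, 2 output channels.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 1],
                  [0, 1, 0, 1],
                  [0, 1, 1, 0]], dtype=float)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 3))
    W = rng.normal(size=(3, 2))                          # in practice, learned by gradient descent
    print(gcn_layer(A, X, W))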
References
Heydari et al. Transferring Chemical and Energetic Knowledge Between Molecular Systems with Machine Learning. Communications Chemistry
Grattarola et al. Adversarial Autoencoders with Constant-Curvature Latent Manifolds. Applied Soft Computing
Prediction, localization, and control of epileptic seizures in drug-resistant patients
A natural way to model a complex system consists of representing it as an attributed graph, which allows describing both its components and the relations between them. When a system evolves over time, the resulting sequence of graphs can be modeled as a stochastic process in the space of graphs, also called a graph-generating process (GGP). In this setting, several tasks that are well-defined for classical stochastic processes (e.g., anomaly prediction, control) become difficult to tackle due to the complex nature of the domain of graphs.
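Concretely, a realisation of a GGP can be stored as a time-indexed sequence of attributed graphs, each given by an adjacency matrix and a node-attribute matrix (a sketch under the simplifying assumption of a fixed node set; the slowly drifting toy process below is invented purely for illustration):

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class AttributedGraph:
        adjacency: np.ndarray   # (n, n) weighted adjacency: relations between components
        attributes: np.ndarray  # (n, d) node attributes: state of each component

    def sample_ggp(T, n=8, d=4, seed=0):
        """Toy graph-generating process whose edges and attributes drift slowly over time."""
        rng = np.random.default_rng(seed)
        A = rng.random((n, n))
        X = rng.normal(size=(n, d))
        sequence = []
        for _ in range(T):
            A = np.clip(A + 0.05 * rng.normal(size=A.shape), 0.0, 1.0)
            A = (A + A.T) / 2.0                          # keep the graph undirected
            np.fill_diagonal(A, 0.0)                     # no self-loops
            X = X + 0.1 * rng.normal(size=X.shape)
            sequence.append(AttributedGraph(A.copy(), X.copy()))
        return sequence

    graphs = sample_ggp(T=100)   # e.g., one connectivity graph per iEEG time window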
The goal of this research project is to develop a framework for the diagnostics and control of a GGP, leveraging state-of-the-art models for deep learning on graphs to automatically deal with the complexity of the graph space. Although the framework is general, we will focus on the context of epileptic seizures and neuromodulation by analysing intracranial EEGs recorded from drug-resistant patients.
The research project is composed of four different milestones, tightly related to each other but addressing different learning tasks:
M1: The prediction of events occurring in a GGP, developing models able to operate in both supervised and unsupervised learning settings
M2: The localisation of the nodes and edges involved in an event, as well as the identification of the salient time steps in the sequence that led to that event
M3: The explanation of an event through causal inference, using the information extracted in M2 to build a chain of cause-effect relationships that led to a particular event
M4: The control of a GGP to prevent or resolve undesired events, including the identification of the state space, action space, and reward functions of the process, by using control theory and reinforcement learning to learn a behaviour policy for the GGP
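As a heavily simplified sketch of the control loop envisaged in M4 (the discretised state space, the two actions, and the reward below are placeholders invented for illustration, not a model of neuromodulation), an agent could observe an abstraction of the current graph state, choose an action, and update a tabular Q-learning policy from the reward it receives:

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions = 5, 2                 # e.g., discretised "risk level"; actions = {wait, intervene}
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.9, 0.1          # learning rate, discount, exploration rate

    def env_step(s, a):
        """Placeholder dynamics: intervening tends to lower the risk level; high risk is penalised."""
        s_next = max(0, s - 1) if a == 1 else min(n_states - 1, s + int(rng.random() < 0.5))
        reward = -float(s_next == n_states - 1) - 0.1 * a   # penalise undesired events and, mildly, intervention
        return s_next, reward

    s = 0
    for _ in range(5000):
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next, r = env_step(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])   # Q-learning update
        s = s_next

    print(Q)   # the greedy policy argmax_a Q[s, a] is the learned behaviour policy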
References
Grattarola et al. Seizure Localisation with Attention-Based Graph Neural Networks. Expert Systems with Applications
Lopes et al. Recurrence Quantification Analysis of Dynamic Brain Networks. EJN
Collaborators
Dr. Cesare Alippi, Politecnico di Milano, Italy, and Università della Svizzera italiana, Switzerland
Dr. Robert Jenssen and Dr. Filippo Maria Bianchi, UiT the Arctic University of Norway
Dr. Peter Ashwin and Dr. Krasimira Tsaneva-Atanasova, University of Exeter, UK
Dr. Taufik Valiante and Dr. David Groppe, University of Toronto, Canada
Dr. Naoki Masuda, University at Buffalo, USA
Dr. Vittorio Limongelli, Università della Svizzera italiana, Switzerland
Dr. Stanisław Drożdż and Dr. Paweł Oświęcimka, Polish Academy of Sciences, Poland
Dr. Alessandro Giuliani, Istituto Superiore di Sanità, Italy
Dr. Witold Pedrycz, University of Alberta, Canada