AIRSYB: An AI-based Self-adaptive Robot Systems Biologist

Note: We are in the process of updating this website. Please check back regularly for more information. The videos are available at the bottom of this page (more to come).

Project Introduction

AIRSYB is an interdisciplinary research project funded by the Royal Society of Edinburgh through the Scottish Crucible Award. AIRSYB is also supported by Beyond Consulting Ltd.

People

Investigators: Dr Wei Pang (PI and Main Contact, University of Aberdeen), Dr Adrian Saurin (University of Dundee), Dr John Woodward (University of Stirling).

Industrial Partner: Dr Tao You (Beyond Consulting Ltd).

Researchers: Dr Severin Fichtl (Honorary Research Fellow at Aberdeen University), Mr Alistair Gillespie (Undergraduate Student at Aberdeen University).

Associated Researchers: Prof You Zhou (Computational Intelligence Lab at Jilin University, China), Mr Xinliang Tian (MSc student at Jilin University, China), Prof George M. Coghill (Computing, Aberdeen).

Background

The importance of systems biology

Most of our current biological knowledge has been gained using classical reductionist biology: dissecting biological systems into individual parts and studying them one by one. Techniques for doing this are well established, and large-scale screens to knock out individual genes or knock down gene products have been used to identify the essential genes for most biological processes. The proteins encoded by these genes have been further studied to understand how they function within the context of the cell and the whole body. As knowledge about these individual proteins is gained, the connections that exist between them are inevitably discovered (e.g. binding partners and substrates), which ultimately reveals the pathways or networks in which they function. The next big challenge is to understand how these protein networks control the many different functions of the cell and, ultimately, how this control is perturbed to cause disease. This will require a shift in emphasis away from classical reductionist biology towards more holistic approaches that allow us to interrogate and comprehend complex biological networks. That is the realm of molecular systems biology.

The goal of systems biology

The goal is to assemble individually validated molecular connections into an in silico network model that can recapitulate the in vivo behaviour. If this is achieved, then complex network properties can often emerge that would otherwise be unpredictable using biological experiments alone. In this case, studying ‘the whole’ is much more informative than studying each of the individual parts in isolation.

The challenge for systems biologists

Biological systems are incredibly complex and therefore, almost by definition, inherently unpredictable. How does a systems biologist begin to predict behaviour in the face of such unpredictability? The answer is to gather as much information as possible about a particular biological system first, before attempting to use that knowledge to model it. There will undoubtedly be ‘unknowns’, but the hope is that the level of uncertainty will be small enough to have little impact on the output of the model. If this is not the case, and the output is not representative of the real data, then there are two strategies to take:

1) generate new biological data to reduce uncertainty, or

2) generate new models that can better predict the data, by introducing new or modified connections that can ultimately be validated later.

To put this another way, experiments can be used to drive the modelling, or modelling can help to drive the experiments. Whilst the former path is relatively safe, it can be incredibly laborious and time-consuming, with no guarantee that the new data will ultimately help the model. In contrast, developing new ideas in silico can be much quicker, but with the obvious caveat that there is no proof that any of these new models will be correct, and therefore they must eventually be validated by new experiments. An ideal situation would be to efficiently generate new models in silico that could be ranked according to their ‘probability’ of being correct. Put simply, the most probable model is the one that i) best conforms to a series of ‘knowns’, ii) uses the fewest ‘unknowns’, and iii) best approximates the real biological data.

The aim of this project is to use artificial intelligence (AI) computing to generate and categorise new models that are predictive of the biological data.

What Can AIRSYB do?

Our vision is that AIRSYB will be a computational tool to help the systems biologist in their quest to find viable new models. As such, it must be responsive to user input in a way that allows it to be guided to find the most plausible new models in the most efficient way.

Currently, the user can input biological connections, parameters and variables in the form of a model built on ordinary differential equations. Then, for each of these inputs, the user can define a ‘probability’ score for correctness. For example, connections that have been experimentally verified can be scored 1, whereas any unknowns can be given a score between 0 and 1, based on the likelihood that they are correct. Furthermore, any impossible connections that have been eliminated by biological experiments can be added and given a score of 0. This essentially defines a search space that the algorithm uses to find new network structures that best approximate the biological data.
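As a rough illustration of how scored connections can define a search space, the Python sketch below treats connections scored 1 as always present, connections scored 0 as never present, and everything in between as optional. The `Connection` class, the function names and the species A, B and C are hypothetical and are not taken from AIRSYB itself. The exhaustive enumeration is also an assumption made for clarity; a practical search would more likely explore this space heuristically.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Connection:
    """A directed link between two species (hypothetical representation)."""
    source: str
    target: str
    score: float  # user-assigned 'probability' of being correct, in [0, 1]


def candidate_structures(connections):
    """Yield every network structure allowed by the scores.

    Connections scored 1 appear in every structure, connections scored 0
    appear in none, and each remaining connection may be present or absent.
    """
    fixed = [c for c in connections if c.score == 1.0]
    optional = [c for c in connections if 0.0 < c.score < 1.0]
    for k in range(len(optional) + 1):
        for subset in combinations(optional, k):
            yield fixed + list(subset)


# Hypothetical example: one verified, two uncertain, one eliminated connection.
connections = [
    Connection("A", "B", 1.0),  # experimentally verified
    Connection("B", "C", 0.6),  # unknown, thought fairly likely
    Connection("C", "A", 0.3),  # unknown, thought less likely
    Connection("A", "C", 0.0),  # eliminated by experiment
]
structures = list(candidate_structures(connections))
print(len(structures))  # 4 candidate structures (2 optional connections)
```

With n optional connections this enumeration produces 2^n structures, which is why the size of the search space depends so strongly on how many inputs the user leaves uncertain.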

This search can be guided by a defined set of biological data (i.e. a ‘training’ data set), and the resulting network structures can then be tested against a further set of biological outputs (i.e. a ‘testing’ data set). The final ranking considers the performance against the testing data set as well as the deviance from the user-supplied ‘probability’ scores. In summary, therefore, AIRSYB uses AI computing to search for and rank a series of new models whose outputs best fit the real biological data. It is then up to the biologist to select and validate these models experimentally.
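The final ranking step described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the one-variable toy model, the forward-Euler integration, the additive combination of test error and deviance, and the `weight` parameter are all hypothetical choices, not AIRSYB's actual scoring. The sketch covers the ranking against a testing data set only and omits the training stage.

```python
def simulate(rate, steps=50, dt=0.1):
    """Forward-Euler integration of the toy ODE dB/dt = rate * A - B, with A fixed at 1."""
    b = 0.0
    trajectory = []
    for _ in range(steps):
        b += dt * (rate * 1.0 - b)
        trajectory.append(b)
    return trajectory


def mean_squared_error(predicted, observed):
    """Average squared difference between model output and data."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def prior_deviance(included_scores, excluded_scores):
    """Penalty for departing from the user's scores (an assumed form).

    Including a low-scored connection and leaving out a high-scored one
    both increase the penalty.
    """
    return sum(1.0 - s for s in included_scores) + sum(excluded_scores)


def rank(candidates, observed_test, weight=1.0):
    """Return candidate names ordered best first (lowest combined score)."""
    scored = []
    for name, rate, included, excluded in candidates:
        error = mean_squared_error(simulate(rate), observed_test)
        total = error + weight * prior_deviance(included, excluded)
        scored.append((total, name))
    return [name for _, name in sorted(scored)]


# Hypothetical testing data generated from the toy model with rate = 2.0.
observed_test = simulate(2.0)
candidates = [
    # (name, fitted rate, scores of included connections, scores of excluded ones)
    ("poor_fit", 0.5, [1.0, 0.8], [0.1]),
    ("implausible", 2.0, [1.0, 0.1], [0.8]),
    ("good", 2.0, [1.0, 0.8], [0.1]),
]
print(rank(candidates, observed_test)[0])  # good
```

Here the best-ranked candidate both reproduces the testing data and stays close to the user's scores. A candidate that fits equally well but relies on a connection the user scored as unlikely is ranked lower, and the `weight` parameter controls how strongly that penalty counts against the fit.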