The importance of systems biology
Most of our current biological knowledge has been gained using classical reductionist biology: dissecting biological systems into individual parts and studying them one by one. Techniques for doing this are well established, and large-scale screens to knock out individual genes or knock down gene products have been used to identify the essential genes for most biological processes. The proteins encoded by these genes have been further studied to understand how they function within the context of the cell and the whole body. As knowledge about these individual proteins is gained, the connections that exist between them are inevitably discovered (e.g. binding partners, substrates, etc.), which ultimately reveals the pathways or networks in which they function. The next big challenge is to understand how these protein networks control the many different functions of the cell, and ultimately, how they are perturbed to cause disease. This will require a shift in emphasis away from classical reductionist biology, towards more holistic approaches that allow us to interrogate and comprehend complex biological networks. That is the realm of molecular systems biology.
The goal of systems biology
The goal is to assemble individually validated molecular connections into an in silico network model that can recapitulate the in vivo behaviour. If this is achieved, complex network properties can emerge that would be impossible to predict from biological experiments alone. In this case, studying ‘the whole’ is much more informative than studying each of the individual parts in isolation.
The challenge for systems biologists
Biological systems are incredibly complex, and therefore, almost by definition, inherently unpredictable. How does a systems biologist begin to predict behaviour in the face of such unpredictability? The answer is to first gather as much information as possible about a particular biological system, and then use that knowledge to model it. There will undoubtedly be ‘unknowns’, but the hope is that the level of uncertainty will be small enough to have little impact on the output of the model. If this is not the case, and the output is not representative of the real data, then there are two strategies to take:
1) generate new biological data to reduce uncertainty
2) generate new models that better predict the data by introducing new or modified connections that can be validated experimentally later.
Or to put this another way, experiments can be used to drive the modelling, or modelling can help to drive the experiments. Whilst the path of the former is relatively safe, it can be incredibly laborious and time consuming, with no guarantee that the new data will ultimately help the model. In contrast, developing new ideas in silico can be much quicker, but with the obvious caveat that there is no proof that any of these new models will be correct, and therefore they must eventually be validated by new experiments. An ideal situation would be to efficiently generate new models in silico that could be ranked according to their ‘probability’ of being correct. Put simply, the most probable model is the one that i) best conforms to a series of ‘knowns’, ii) invokes the fewest ‘unknowns’, and iii) best approximates the real biological data.
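To make this ranking idea concrete, the following is a minimal sketch in Python of how candidate models might be scored against the three criteria. The CandidateModel fields, the scoring formula, and the unknown_penalty weight are illustrative assumptions for this sketch, not definitions taken from the project.

```python
from dataclasses import dataclass

@dataclass
class CandidateModel:
    name: str
    knowns_satisfied: int    # validated connections the model reproduces
    knowns_total: int        # validated connections it should reproduce
    unknowns_used: int       # hypothesised (unvalidated) connections introduced
    fit_error: float         # normalised error against the real biological data

def score(model: CandidateModel, unknown_penalty: float = 0.1) -> float:
    """Higher is better: reward agreement with 'knowns' and fit to the data,
    and penalise each additional 'unknown' the model relies on."""
    known_fraction = model.knowns_satisfied / model.knowns_total
    data_fit = 1.0 / (1.0 + model.fit_error)   # maps error onto (0, 1]
    penalty = unknown_penalty * model.unknowns_used
    return known_fraction + data_fit - penalty

# Two hypothetical candidates: a conservative model and one that adds
# two unvalidated feedback connections but fits the data more closely.
candidates = [
    CandidateModel("baseline", knowns_satisfied=18, knowns_total=20,
                   unknowns_used=0, fit_error=0.9),
    CandidateModel("extra_feedback", knowns_satisfied=20, knowns_total=20,
                   unknowns_used=2, fit_error=0.3),
]

# Rank candidate models from most to least 'probable' under this scheme.
for m in sorted(candidates, key=score, reverse=True):
    print(f"{m.name}: score = {score(m):.3f}")
```

In practice, the three criteria would need to be weighted and calibrated against real data; the sketch only illustrates the shape of the ranking step.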
The aim of this project is to use artificial intelligence (AI) to generate and rank new models that are predictive of the biological data.