SemSim

SemSim is a multiscale model-description architecture designed to facilitate the sharing, reuse, and modular construction of a wide range of biological models.

SemSim models are currently implemented as OWL (Web Ontology Language) files. A SemSim model file contains rich semantic information about a model's contents in addition to all of its computational aspects.

Motivation: Researchers worldwide use computational models to simulate the dynamics of the biological processes they study. A single model may involve hundreds or thousands of variables and equations. These models serve the immediate research question, but they often languish or are discarded once a project ends, because they were not designed with an afterlife in mind.

As part of our goal to accelerate the way biological and medical research is conducted, we have developed the SemSim architecture so that researchers can more efficiently compose, reuse, and share simulations of biological processes.

For a model to be reused in another research effort, it must first undergo a long, laborious manual review to ensure the model’s integrity. This obstacle strongly discourages reuse. If the components of any model could instead be easily repurposed and reused in other efforts through a machine-readable process, research would be dramatically accelerated.

Our vision is to create a machine-readable model description format that would enable:

  • models that can be interconnected

  • models that are modularized to reuse components

  • models that use common standards to define model content

Implementation: A SemSim model contains all the computational and semantic knowledge for a single biosimulation model. To manipulate SemSim models, we have created SemGen, a tool for creating, annotating, composing, and decomposing them. Our eventual goal is to create a repository of readily reusable SemSim models so that modelers can more efficiently build complex biosimulation models (see figure).

Figure: The SemSim approach to modular model reuse. Existing models are translated into the SemSim format and stored in a searchable repository. These models can then be downloaded, merged with others, or decomposed to produce new, reusable SemSim models that are added to the repository in turn.
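As a rough illustration of what decomposition can mean computationally, the sketch below extracts the part of a model needed to compute a chosen variable by following its dependencies. It is a minimal sketch under stated assumptions, not SemGen's actual algorithm: the `decompose` function, the dictionary representation, and the variable names are all hypothetical.

```python
def decompose(dependencies, targets):
    """Return the sub-model needed to compute `targets`.

    `dependencies` maps each variable name to the names its equation uses.
    (Hypothetical representation, chosen only for this illustration.)
    """
    keep = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        # Follow the variables this one is computed from.
        stack.extend(dependencies.get(name, ()))
    return {name: list(dependencies.get(name, ())) for name in sorted(keep)}


# Toy model: pressure is computed from volume and compliance;
# flow and heart_rate are not needed to compute pressure.
model = {
    "flow": ["pressure", "resistance"],
    "pressure": ["volume", "compliance"],
    "resistance": [],
    "volume": [],
    "compliance": [],
    "heart_rate": [],
}

sub_model = decompose(model, ["pressure"])
print(sorted(sub_model))  # ['compliance', 'pressure', 'volume']
```

The extracted sub-model is itself a complete dependency map, so in this toy setting it could be stored and reused like any other model.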

SemSim models are semantically interoperable, which means a computer can recognize when contents from two models share the same biological meaning. This level of interoperability makes it possible to automate, beyond currently available methods, common model composition and decomposition tasks.

The key to semantic interoperability lies in a model's semantic annotations. For a SemSim model to be semantically interoperable, its codewords and dependencies must be annotated against standardized reference ontologies. Although many ontologies of physical entities (i.e. structural parts) exist, we found no existing ontology for annotating the physical properties that these entities possess. Because it is these physical properties that a model simulates, we are developing the Ontology of Physics for Biology (OPB) to provide a standardized set of physical principles for use in model annotation.
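The idea that annotations let a computer recognize shared biological meaning can be sketched as follows. This is only an illustration, assuming each codeword is annotated with one physical property and one physical entity. The `Annotation` class, the `shared_meanings` function, and all term identifiers are hypothetical placeholders, not the SemSim API or real ontology IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A physical property of a physical entity (hypothetical structure)."""
    physical_property: str  # would reference a property ontology such as the OPB
    physical_entity: str    # would reference an entity ontology such as ChEBI or the FMA


def shared_meanings(model_a, model_b):
    """Pair codewords from two models that carry identical annotations."""
    by_annotation = {}
    for codeword, annotation in model_a.items():
        by_annotation.setdefault(annotation, []).append(codeword)
    matches = []
    for codeword_b, annotation in model_b.items():
        for codeword_a in by_annotation.get(annotation, []):
            matches.append((codeword_a, codeword_b))
    return sorted(matches)


# Placeholder term IDs for illustration; these are NOT real ontology identifiers.
glucose_conc = Annotation("EX_OPB:concentration", "EX_CHEBI:glucose")
blood_pressure = Annotation("EX_OPB:pressure", "EX_FMA:blood")
blood_volume = Annotation("EX_OPB:volume", "EX_FMA:blood")

# Two models that name the same quantity differently.
model_a = {"Cglc": glucose_conc, "Pb": blood_pressure}
model_b = {"glucose": glucose_conc, "Vblood": blood_volume}

print(shared_meanings(model_a, model_b))  # [('Cglc', 'glucose')]
```

The codewords `Cglc` and `glucose` share no text, yet they match because their annotations are identical. `Pb` and `Vblood` refer to the same entity but different properties, so they do not match.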

Because the existing set of biological reference ontologies spans multiple levels of biological organization (e.g. ChEBI contains molecules, while the FMA contains more macroscopic entities), the SemSim architecture is inherently multiscale. Likewise, because the OPB accounts for phenomena across the field of classical physics, the architecture is also inherently multi-domain. The SemSim architecture can therefore accommodate a wide variety of biosimulation models.