IPAVS/CIDMS-PD pathway data

CIDMS is a large-scale data integration project undertaken by the Systems Biology Research Group (SBRC). CIDMS-PD is one of the initiatives under the CIDMS project.

CIDMS-PD was initially started to curate information focused on the cardiovascular system. Many of the lessons we learned while working in that specific context also apply to other biological fields that address a wide range of biological problems. The scope of CIDMS-PD has therefore recently been broadened and generalized, under the IPAVS project, to extend curation efforts to all organs and tissues in human, mouse and rat. In addition, IPAVS pathways are curated for biological contexts that are not the primary focus of other pathway databases.

The main tasks under CIDMS-PD are the collection, management and distribution of high-quality biological pathway information. The pathways curated by CIDMS-PD, along with pathway and interaction information obtained from other public databases, are made available to the public through the iPAVS web application.

The main focus of the IPAVS project is expert curation of pathway information, with emphasis on the following:
  • Context-specific annotations at the molecular and interaction level
    • Contexts considered for curation: mainly phenotype/disease and tissue specificity. In the future we will also include further contexts such as organism, sex, perturbation, physiology and more.
    • Annotations include the following information: evidence supported by an evidence code, experiments performed, conditions, citations to the literature and a summary
  • Annotation of pathways with their biological roles: the goal here is to capture the functions (participation in a biological context) of an entire pathway as a single entity. Such annotations are very valuable for systems biology studies that analyze large networks. They also serve as a data resource for quickly getting an overview of pathways and their importance in a biological context.
  • Curation of pathways with mechanistic details. Several markings, such as text labels and symbols, are included on the pathway maps to help users read the maps and understand the mechanism of action. See the following example pathways:
  • Pathway cross-talk and annotations describing the cross-talk context (level of cross-talk, conditions for cross-talk, etc.)

The main aim of IPAVS/CIDMS-PD is to manually curate pathway data, primarily from the literature (experiment-based research articles), and to organize the information as process maps using systems biology notation. IPAVS/CIDMS-PD follows strict quality checks and uses a systematic, expert-monitored data pipeline (see figure) to ensure high-quality pathway information, establishing CIDMS-PD as one of the important resources providing biological pathway information.

The IPAVS/CIDMS-PD data are made available in several standard formats, including SBML and BioPAX, to maximize compatibility with most existing academic and commercial analysis tools. The data can be downloaded in bulk from the download section of the iPAVS website, or individually for a specific pathway (with or without annotations, and as reports) from the map viewer.
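
As a rough illustration of what working with an SBML download might look like, the sketch below lists the species in an SBML document using only the Python standard library. The sample document, its identifiers and the function name list_species are invented for this example; the SBML level/version and the exact structure of real iPAVS exports are assumptions that should be checked against an actual file.

```python
import xml.etree.ElementTree as ET

# Toy SBML document invented for illustration; not an actual iPAVS export.
SAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_pathway">
    <listOfCompartments>
      <compartment id="cytoplasm"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="s1" name="MEK" compartment="cytoplasm"/>
      <species id="s2" name="ERK" compartment="cytoplasm"/>
    </listOfSpecies>
  </model>
</sbml>"""


def list_species(sbml_text):
    """Return (id, name, compartment) for every <species> element."""
    root = ET.fromstring(sbml_text)
    # ElementTree stores tags as "{namespace}local". Reusing the root's
    # namespace avoids hard-coding a particular SBML level/version.
    namespace = root.tag.split("}")[0] + "}" if root.tag.startswith("{") else ""
    return [
        (sp.get("id"), sp.get("name"), sp.get("compartment"))
        for sp in root.iter(namespace + "species")
    ]


for species_id, name, compartment in list_species(SAMPLE_SBML):
    print(species_id, name, compartment)
```

For anything beyond a quick inspection, a dedicated SBML or BioPAX library is likely a better choice than hand-written XML parsing.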

Data model followed by IPAVS

In IPAVS, a pathway is a network consisting of a set of events (processes/interactions/reactions) together with their participating entities (molecules such as proteins, genes, RNA, antisense RNA, low-molecular-weight compounds/small molecules, ions and supramolecular complexes). These events and entities have spatio-temporal behavior and relationships, and together they are responsible for achieving some biological outcome. Maps are multidimensional interlinkings of several pathways in a specific biological context (tissue, time, perturbation, disease/phenotype, physiology). A pathway can include other pathways as nodes. Pathways hold annotations such as their role, context (e.g. cell types and disease) and category (signaling, metabolic or regulatory).
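
The pathway definition above can be pictured as a small nested data structure. The following is a minimal sketch, not the actual IPAVS schema: the class names Event and Pathway, their fields and the example pathway contents are all hypothetical and chosen only to mirror the description (events with participants, pathways as nodes of other pathways, and role/context/category annotations).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Union


@dataclass
class Event:
    """A process/interaction/reaction with its participating entities."""
    name: str
    participants: List[str]


@dataclass
class Pathway:
    """A network of events; a node may also be another (nested) pathway."""
    name: str
    category: str                       # "signaling", "metabolic" or "regulatory"
    role: str = ""                      # biological role annotation
    context: Dict[str, str] = field(default_factory=dict)  # e.g. tissue, disease
    nodes: List[Union[Event, "Pathway"]] = field(default_factory=list)

    def entities(self) -> Set[str]:
        """Collect all entities, descending into nested pathways."""
        found: Set[str] = set()
        for node in self.nodes:
            if isinstance(node, Pathway):
                found |= node.entities()
            else:
                found.update(node.participants)
        return found


# Illustrative content only; not taken from the IPAVS database.
mapk = Pathway("MAPK cascade", "signaling",
               nodes=[Event("RAF activates MEK", ["RAF", "MEK"])])
growth = Pathway("Growth factor signaling", "signaling",
                 context={"tissue": "heart"},
                 nodes=[Event("EGF binds EGFR", ["EGF", "EGFR"]), mapk])

print(sorted(growth.entities()))  # ['EGF', 'EGFR', 'MEK', 'RAF']
```

Treating a nested pathway as just another node keeps the traversal recursive and lets a map reuse the same pathway in several contexts.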

An interaction is an event/process in which one or more participating entities undergo a transformation from one state to another. Such a transformation can occur spontaneously (conversion) or be regulated (controlled). In IPAVS the conversion process and the control process are treated as distinct reactions. Examples include state transition, transcription, translation, dissociation/complex formation (binding), transport and phenotype. The regulatory/control effects on these transformations include catalysis, inhibition, trigger, physical stimulus and modulation. All of the interactions mentioned above belong to the category of direct interactions. To represent indirect interactions, IPAVS uses the CellDesigner/SBGN notations known-transition-omitted/omitted-process (interactions that are known but omitted for the sake of clarity) and unknown-transition-omitted/uncertain-process (interactions that may not exist). Interactions become interlinked when they share entities. An interaction event includes annotations about its type, the nature of the reaction (reversible or irreversible), subcellular location, species and, most importantly, experimental evidence, typically recorded as literature citations.
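
To make the split between conversion and control concrete, here is a minimal sketch in which the two are modeled as separate record types and two interactions count as interlinked when they share an entity. The class names, field names, the share_entity helper and the example reactions are hypothetical; the kind lists are a subset copied from the paragraph above, and the real IPAVS representation may differ.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Subsets of the interaction kinds named in the text.
CONVERSION_KINDS = {"state transition", "transcription", "translation",
                    "dissociation", "complex formation", "transport"}
CONTROL_KINDS = {"catalysis", "inhibition", "trigger",
                 "physical stimulus", "modulation"}


@dataclass
class Conversion:
    kind: str
    reactants: List[str]
    products: List[str]
    reversible: bool = False
    location: str = ""                                   # subcellular location
    citations: List[str] = field(default_factory=list)   # literature evidence

    def __post_init__(self):
        if self.kind not in CONVERSION_KINDS:
            raise ValueError(f"unknown conversion kind: {self.kind}")

    def entities(self) -> Set[str]:
        return set(self.reactants) | set(self.products)


@dataclass
class Control:
    kind: str
    controller: str
    target: Conversion            # the conversion being regulated
    citations: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.kind not in CONTROL_KINDS:
            raise ValueError(f"unknown control kind: {self.kind}")

    def entities(self) -> Set[str]:
        return {self.controller}


def share_entity(a, b) -> bool:
    """Two interactions are interlinked when they share at least one entity."""
    return bool(a.entities() & b.entities())


# Illustrative reactions only; citations are left empty on purpose.
mek_phos = Conversion("state transition", ["MEK"], ["MEK-P"], location="cytoplasm")
raf_on_mek = Control("catalysis", "RAF", mek_phos)
erk_phos = Conversion("state transition", ["ERK"], ["ERK-P"], location="cytoplasm")
mekp_on_erk = Control("catalysis", "MEK-P", erk_phos)

print(share_entity(mek_phos, mekp_on_erk))  # True: both involve "MEK-P"
print(share_entity(raf_on_mek, erk_phos))   # False: no entity in common
```

Keeping Control as its own record, pointing at the Conversion it regulates, lets each carry separate evidence and citations, which matches the statement that the two are treated as distinct reactions.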

Entities are molecules, represented as in CellDesigner (species and species aliases). They are proteins, genes/DNA, RNA, antisense RNA, small molecules, ions and drugs. Each of these entity types is treated as distinct. Any modification to an entity (e.g. a post-translational modification (PTM)) is considered a separate entity species from the original unmodified entity. Phosphorylation is the most frequently used modification type in pathways curated in IPAVS, but more than 11 types of PTM can currently be recorded. Similarly, macromolecular complexes (entities bound and interacting closely) are treated as entities distinct from their components. Functionally equivalent entities are grouped together and represented by a single generalized entity called an 'Entity-group' (for example, the symbol PI3K represents all the isoforms of PI3K). In any given situation, any individual entity belonging to the generalized Entity-group can fulfill the same role (it inherits all the behavior of the generalized entity). Further, each individual molecule covered by the Entity-group is explicitly traceable. For example, multiple isoforms mapped to an Entity-group can be visualized as multi-boxed color/shape overlays, multi-chart overlays and multi-dimensional heat maps. The IPAVS data model can also capture additional cellular details such as subcellular location, activity state and cleaved/whole protein for all participating pathway components. It also supports abstract entities such as Target Genes or tRNA, which serve as placeholders for their individual members.
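
The entity rules above (a modified form is a separate species, and an Entity-group stands in for traceable members) can be sketched as follows. This is an illustration under stated assumptions rather than the IPAVS implementation: the Entity and EntityGroup classes, their fields and the can_fill and member_names methods are hypothetical, and the PI3K member list is only an example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str                                 # e.g. "protein", "gene", "RNA"
    modifications: Tuple[str, ...] = ()       # e.g. ("phosphorylation",)
    location: str = ""                        # subcellular location


@dataclass(frozen=True)
class EntityGroup:
    """A generalized entity standing in for functionally equivalent members."""
    name: str
    members: FrozenSet[Entity]

    def can_fill(self, entity: Entity) -> bool:
        """Any member can play the role of the generalized entity."""
        return entity in self.members

    def member_names(self) -> List[str]:
        """Members stay traceable, e.g. for per-member data overlays."""
        return sorted(member.name for member in self.members)


# A modified form compares unequal to the unmodified one: separate species.
akt = Entity("AKT1", "protein")
akt_p = Entity("AKT1", "protein", modifications=("phosphorylation",))
print(akt == akt_p)  # False

# Example group; the member list is illustrative, not the IPAVS mapping.
pi3k = EntityGroup("PI3K", frozenset({
    Entity("PIK3CA", "protein"),
    Entity("PIK3CB", "protein"),
    Entity("PIK3CD", "protein"),
}))
print(pi3k.can_fill(Entity("PIK3CA", "protein")))  # True
print(pi3k.member_names())  # ['PIK3CA', 'PIK3CB', 'PIK3CD']
```

Making the records immutable and comparing them by value means that the modification state is part of an entity's identity, so a phosphorylated protein never collides with its unmodified form.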