Datasets

The design of L-HetNetAligner is general, so users can adapt it to analyse heterogeneous networks of their own. As a proof of principle, we apply L-HetNetAligner to two sets of networks: 1) synthetic networks, and 2) a heterogeneous network with multiple types of nodes and relationships.


Dataset: Synthetic Networks

The input dataset consists of 12 synthetic networks built with a scale-free (SF) graph generator.

We set all model network instances to the same size of 950 nodes and vary the number of edges. We then assign each node a colour out of n possible colours, varying n from 1 to 4 to build four heterogeneous versions of each synthetic network.
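The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the generator used for the experiments: it assumes a Barabási-Albert style preferential-attachment model as the scale-free generator and a uniformly random colour assignment, neither of which is specified in the text. The function names, the parameter m (edges added per new node) and the seeds are hypothetical.

```python
import random

def scale_free_graph(n_nodes, m, seed=None):
    """Grow a graph by preferential attachment (assumed model): each new
    node links to m distinct existing nodes, chosen with probability
    proportional to their current degree."""
    rng = random.Random(seed)
    edges = set()
    pool = []  # every node appears once per incident edge
    # Seed graph: a star on m + 1 nodes, so every node has nonzero degree.
    for v in range(1, m + 1):
        edges.add((0, v))
        pool += [0, v]
    for new in range(m + 1, n_nodes):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))  # degree-proportional sampling
        for target in chosen:
            edges.add((target, new))
            pool += [target, new]
    return edges

def colour_nodes(n_nodes, n_colours, seed=None):
    """Assign each node one of n_colours colours uniformly at random
    (assumption: the text does not state how colours are distributed)."""
    rng = random.Random(seed)
    return {v: rng.randrange(n_colours) for v in range(n_nodes)}

N_NODES = 950  # network size stated in the text
edges = scale_free_graph(N_NODES, m=3, seed=0)  # m=3 is illustrative only
# Four heterogeneous versions of the same network, with n = 1..4 colours.
versions = {n: colour_nodes(N_NODES, n, seed=n) for n in range(1, 5)}
```

Changing m changes the number of edges while the number of nodes stays at 950, which is one simple way to obtain networks of equal size and different density.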

Dataset: Hetionet Network

Hetionet is a heterogeneous network integrating data of medical relevance extracted from public resources. Hetionet consists of 47031 nodes of 11 types (genes, compounds, diseases, anatomies, pathways, biological processes, molecular functions, cellular components, pharmacologic classes, side effects, and symptoms) and 2250197 relationships of 24 types.

Starting from Hetionet, we create a sub-network composed of genes, diseases, GO annotations (biological processes, molecular functions, and cellular components), and anatomy data.

We use these data to impose colours on the nodes of the Hetionet network. We build four coloured versions of Hetionet, so as to cover each type of node, as follows:

  • 1-coloured version, where all nodes have the same colour;
  • 2-coloured version, where we assign one colour to nodes related to GO annotations and another colour to nodes that are not related to GO annotations. We obtain 15656 “GO annotation related” and 21486 “non-GO annotation related” nodes;
  • 3-coloured version, where we assign one colour to nodes related to disease information, one colour to nodes related to GO annotations, and one colour to nodes related to neither disease information nor GO annotations. We obtain 136 disease related, 15656 GO annotation related, and 21350 non-disease related and non-GO annotation related nodes;
  • 4-coloured version, where we assign one colour to nodes related to genes, one colour to nodes related to anatomy data, one colour to nodes related to disease information, and one colour to nodes related to GO annotations. We obtain 20945 gene related, 405 anatomy related, 136 disease related, and 15656 GO annotation related nodes.
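The four colourings listed above can be written as a mapping from node type to colour class, which also makes it easy to check that the reported class sizes are consistent with one another. The sketch below uses only the node counts given in the list; the type keys and colour-class labels are hypothetical names chosen for illustration, and "go" stands for the three GO annotation types taken together.

```python
# Node counts per type in the sub-network, as reported for the 4-coloured version.
COUNTS = {"gene": 20945, "anatomy": 405, "disease": 136, "go": 15656}

# For each k-coloured version: node type -> colour class (labels are illustrative).
SCHEMES = {
    1: {"gene": "all", "anatomy": "all", "disease": "all", "go": "all"},
    2: {"gene": "non-GO", "anatomy": "non-GO", "disease": "non-GO", "go": "GO"},
    3: {"gene": "other", "anatomy": "other", "disease": "disease", "go": "GO"},
    4: {"gene": "gene", "anatomy": "anatomy", "disease": "disease", "go": "GO"},
}

def class_sizes(scheme, counts):
    """Sum the node counts of all types that share a colour class."""
    sizes = {}
    for node_type, colour in scheme.items():
        sizes[colour] = sizes.get(colour, 0) + counts[node_type]
    return sizes

sizes = {k: class_sizes(scheme, COUNTS) for k, scheme in SCHEMES.items()}
for k in sorted(sizes):
    print(k, sizes[k])
```

Under this mapping, the 2-coloured version yields 21486 non-GO nodes (20945 + 405 + 136) and the 3-coloured version yields 21350 nodes in the remaining class (20945 + 405), matching the figures in the list; every version covers the same 37142 nodes.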

