MetScape App for Cytoscape: Creating and Viewing Correlation Networks

NARRATOR: Hello, my name is Marci Brandenburg, and I am the Bioinformationist at the University of Michigan Taubman Health Sciences Library. Today, we will learn about building correlation networks using MetScape, a Cytoscape app. MetScape can be used to visualize and interpret metabolomics and expression profiling data in the context of human metabolic networks. It can be used to visualize compound networks and display related information about reactions, enzymes, and pathways. Although not covered in this tutorial, you can also use this app to view pathway-based networks. MetScape uses data from the Edinburgh Human Metabolic Network and the KEGG Compound Database. This image shows you the various workflows included in this app. This tutorial will focus on the correlation calculator and correlation networks. Currently, between 40% and 60% of experimentally measured compounds can be mapped to canonical metabolic pathways when using untargeted assays. For those compounds that do not map to a pathway, correlation-based networks are useful. For this tutorial, I will be using Cytoscape version 3.2.1 and MetScape version 3.1.1.

To install the MetScape app, you can find MetScape on the Cytoscape App Store webpage and click the Install button, or use the App Manager directly in the Cytoscape software. For this tutorial, I will use the App Manager in the Cytoscape software. First, open Cytoscape and go to the Apps Menu. Choose the first option, “App Manager”. An App Manager window should now appear. In the search box, enter “MetScape”. MetScape should now appear in the second column. Click on MetScape and then click on “Install” at the bottom of the window. Installation of the app will now occur. Once the app is successfully installed, it will appear in the Apps Menu.

The first time you open MetScape, a registration page will appear. This is a one-time, free registration.

Correlations measure the strength of association between pairs of metabolites. The Correlation Calculator is a standalone Java application that provides methods of calculating pairwise correlations among repeatedly measured entities. It is designed for use with quantitative metabolite measurements, such as mass spectrometry data, on a set of samples. The workflow allows inspection and/or saving of results at various stages, and the final correlation results can be dynamically imported into version 3.1 or higher of MetScape as a correlation network. This chapter will cover the Correlation Calculator.

The Correlation Calculator can be downloaded from the MetScape website. The input data file is a CSV file that contains a table of measurements across multiple samples. Although metabolites must be labeled, sample labels are optional. Samples may be in rows or columns.
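As a rough illustration of the input layout just described, the sketch below reads a small table with labeled samples in rows and labeled metabolites in columns, then transposes it to the samples-in-columns orientation. The metabolite names, sample labels, and values are made up for the example, and the helper names are hypothetical; this is not code from the Correlation Calculator.

```python
import csv
import io

# Hypothetical example data: labeled samples in rows, metabolites in columns.
EXAMPLE_CSV = """sample,alanine,glucose,lactate
S1,10.2,55.1,3.4
S2,11.0,60.3,3.9
S3,9.7,52.8,3.1
"""


def read_samples_in_rows(text):
    """Return (metabolite_names, sample_labels, values) from a labeled table."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    metabolites = header[1:]           # first cell is the sample-label heading
    samples = [row[0] for row in body]
    values = [[float(cell) for cell in row[1:]] for row in body]
    return metabolites, samples, values


def transpose(values):
    """Switch between the samples-in-rows and samples-in-columns layouts."""
    return [list(column) for column in zip(*values)]


metabolites, samples, values = read_samples_in_rows(EXAMPLE_CSV)
by_metabolite = transpose(values)      # one list of measurements per metabolite
print(metabolites, samples)
print(by_metabolite[0])
```

Either orientation carries the same information, which is why the calculator only needs to be told which one the file uses.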

After launching the calculator, click the Browse button. Select the appropriate data file and click Open. This data file includes labeled samples in rows, so I will make sure that Samples Labeled is checked and Samples in Rows is selected. Next, under Data Normalization, select Log2-Transform Data and Autoscale Data. Under Normalize Data, click Run. Click View Normalized Data to view the results. To save the data, click the Save button. If the data were already normalized before being loaded into the calculator, this normalization step can be skipped. Now, I will use Pearson’s Correlations to filter out metabolites; this step is optional. To use Pearson’s Correlations, click Run under Calculate Pearson’s Correlations. You can click View Histogram to view a histogram of the maximum Pearson’s Correlation by metabolite. You can click View Heatmap to view the results as a heatmap. Use the View CSV File button to view the correlations as a CSV file and the Save button to save them.
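To make the normalization and Pearson steps more concrete, here is a minimal sketch using only the Python standard library. It assumes that autoscaling means mean-centering each metabolite and dividing by its sample standard deviation, and that the "maximum Pearson's Correlation by metabolite" is taken over absolute values; both are assumptions about the calculator's internals, and the data are invented.

```python
import math
import statistics


def log2_transform(column):
    """Log2-transform one metabolite's measurements (values must be > 0)."""
    return [math.log2(v) for v in column]


def autoscale(column):
    """Mean-center and scale to unit variance (sample SD is an assumption)."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_abs_correlation(name, table):
    """Largest absolute correlation between one metabolite and any other."""
    return max(
        abs(pearson(table[name], column))
        for other, column in table.items()
        if other != name
    )


# Hypothetical measurements: one list per metabolite, one value per sample.
data = {
    "alanine": [8.0, 16.0, 32.0, 64.0],
    "glucose": [4.0, 8.0, 16.0, 32.0],
    "lactate": [64.0, 16.0, 32.0, 8.0],
}

normalized = {name: autoscale(log2_transform(col)) for name, col in data.items()}
r_ala_glc = pearson(normalized["alanine"], normalized["glucose"])
r_ala_lac = pearson(normalized["alanine"], normalized["lactate"])
print(round(r_ala_glc, 3), round(r_ala_lac, 3))
```

Because Pearson's correlation is unchanged by centering and scaling, autoscaling affects how the normalized data look but not the pairwise coefficients computed from them.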

The slider and text fields can be used to filter metabolites to those with correlation coefficients within a specified range. The last step is to use a Partial Correlation Method, either Debiased Sparse Partial Correlation or Basic Partial Correlation. I will select Debiased Sparse Partial Correlation, or DSPC, and then click Run. The Correlation Calculator calculates the partial correlation values, p-values, and q-values for each compound pair. Click View CSV File to view these results as a CSV file, or click Save to save them. You can click View in MetScape to view the correlation network in MetScape, where interactive visualization and exploration can be performed.
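For intuition about what a partial correlation adds over a plain Pearson correlation, the sketch below applies the textbook first-order formula to three metabolites. It is not DSPC and not the calculator's implementation; it only shows how a correlation between two metabolites can vanish once a third is accounted for. The correlation values are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical Pearson correlations among metabolites A, B and C.
# A and B look correlated, but only because both follow C.
pcor_indirect = partial_correlation(r_xy=0.6, r_xz=0.8, r_yz=0.75)

# Here A and B stay strongly correlated even after accounting for C.
pcor_direct = partial_correlation(r_xy=0.9, r_xz=0.2, r_yz=0.1)

print(round(pcor_indirect, 3), round(pcor_direct, 3))
```

In a network built from partial correlations, the first pair would get little or no edge weight while the second would keep a strong edge.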

To learn more about the correlation network in MetScape, please refer to Chapter 6.

This portion of the tutorial will cover the data file formats for building a correlation network in MetScape. Two types of data file formats are accepted. The first data file format is column-based; this is the recommended format. The first row of the column-based file must have column headings of the user’s choosing. The first two columns must contain metabolite names or IDs. Additional columns contain values, such as p-values and correlation values.
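A column-based file of this kind can be produced with a few lines of Python. In this sketch the headings "metab1" and "metab2" are arbitrary, "pcor" and "pval" are just example headings for a correlation column and a p-value column, the comma delimiter is an assumption, and the values are invented.

```python
import csv
import io

# Hypothetical metabolite pairs with a correlation value and a p-value each.
edges = [
    ("alanine", "glucose", 0.62, 0.001),
    ("alanine", "lactate", -0.35, 0.040),
    ("glucose", "lactate", 0.10, 0.550),
]


def write_column_based(rows, header=("metab1", "metab2", "pcor", "pval")):
    """Return column-based text: a heading row, then one metabolite pair per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)      # first row holds user-chosen column headings
    writer.writerows(rows)       # first two columns identify the pair
    return buffer.getvalue()


column_text = write_column_based(edges)
print(column_text)
```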

The second data file format is a matrix format, where the first row and first column contain metabolite names, and the rest of the rows and columns contain correlation values.
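Since the column-based format is the recommended one, it can be handy to flatten a matrix into metabolite pairs. The sketch below does this for a small symmetric matrix, keeping each pair once and skipping the diagonal; the names and values are hypothetical and the function is illustrative only.

```python
def matrix_to_pairs(names, matrix):
    """Flatten a symmetric correlation matrix into (name_a, name_b, value) rows."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):   # upper triangle only, no diagonal
            pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


# Hypothetical matrix: row and column order both follow `names`.
names = ["alanine", "glucose", "lactate"]
matrix = [
    [1.00, 0.62, -0.35],
    [0.62, 1.00, 0.10],
    [-0.35, 0.10, 1.00],
]

pairs = matrix_to_pairs(names, matrix)
for row in pairs:
    print(row)
```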

The next chapter will discuss building a correlation network in MetScape.

To build a correlation network using MetScape, go to the Apps menu and click on MetScape. You will get a menu of options. Select “Build Network” and then “Correlation-based.” Now a MetScape tab displays on the left side of your screen, in the Control Panel. Under the Input section, click the Select button. Select the location of the correlation file and click Open. A new window will appear showing potential matches, found in the MetScape database, for each compound in the input file. Use the dropdown arrows for each compound to choose the best match. If the compound is not found in the system, it will say “Not Found.” Your mapping selection will be saved, so that it will appear as the default option in the future. Next, select OK.

Under Edge Mapping, use the dropdown menu next to “Base Edges on” and select the appropriate column from your data file. For today’s example, I will select the correlation values column, labeled pcor. I will then use the dropdown menu next to Tooltip Labels and select the p-values column, labeled pval. This will allow me to see the p-values by simply mousing over an edge in the built network.

Loading a Group Definition file is optional. This is a simple 2-column file with metabolite names in the first column and group names in the second column. Group names can be anything that you choose. For today’s example, I will not load a Group Definition file.
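As a small illustration of the two-column layout, the sketch below parses group definitions into a lookup from metabolite name to group name. The comma delimiter, the absence of a heading row, and the names themselves are all assumptions made for the example.

```python
import csv
import io

# Hypothetical group definitions: metabolite name, then group name.
GROUP_FILE = """alanine,amino acids
glucose,carbohydrates
lactate,organic acids
"""


def read_groups(text):
    """Map each metabolite name to the group name given in the second column."""
    groups = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:                    # ignore blank or incomplete lines
            groups[row[0].strip()] = row[1].strip()
    return groups


groups = read_groups(GROUP_FILE)
print(groups)
```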

Under Range for Edges, I can drag the blue arrows to filter the range. For this example, I will use the full range of values, negative 1 to 1. Below the edge range is the number of edges and nodes that will be in the built network. Next, I will click “Build Network”.
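The effect of the edge range on the edge and node counts can be sketched as follows. This is an illustration of the filtering idea, not MetScape's own code, and the edge list is hypothetical.

```python
def filter_edges(edges, low, high):
    """Keep edges whose value lies in [low, high]; return them with their nodes."""
    kept = [edge for edge in edges if low <= edge[2] <= high]
    nodes = {name for a, b, _ in kept for name in (a, b)}
    return kept, nodes


# Hypothetical edges: (metabolite, metabolite, correlation value).
edges = [
    ("alanine", "glucose", 0.62),
    ("alanine", "lactate", -0.35),
    ("glucose", "lactate", 0.10),
    ("lactate", "pyruvate", 0.91),
]

full_edges, full_nodes = filter_edges(edges, -1.0, 1.0)      # full range
strong_edges, strong_nodes = filter_edges(edges, 0.5, 1.0)   # narrowed range
print(len(full_edges), len(full_nodes))
print(len(strong_edges), len(strong_nodes))
```

Narrowing the range drops edges first; a node disappears only when none of its edges remain within the range.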

To learn more about the correlation network in MetScape, please refer to Chapter 6.

The MetScape correlation network consists of nodes of varying colors, and edges of varying colors and widths. To view a legend, go to the Cytoscape Apps Menu, go down to MetScape, and select Show Legend. The purple hexagons represent compounds that mapped to a known compound in the MetScape database. The white hexagons represent compounds that did not map to a known compound in the MetScape database. Pink edges represent a positive correlation, while blue edges represent a negative correlation. The thicker the edge, the stronger the correlation.
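The edge styling described above amounts to a simple rule: the sign of the correlation picks the color and its magnitude sets the width. A minimal sketch of that rule follows; the width scale (max_width) and the handling of a correlation of exactly zero are assumptions, not MetScape's actual settings.

```python
def edge_style(correlation, max_width=8.0):
    """Return (color, width) for an edge: sign picks color, magnitude picks width."""
    # Zero is treated as non-positive here; that choice is arbitrary.
    color = "pink" if correlation > 0 else "blue"
    width = max_width * abs(correlation)     # stronger correlation, thicker edge
    return color, width


print(edge_style(0.5))
print(edge_style(-0.25))
```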

For more information, here are two citations for articles published on MetScape. In addition, there is a MetScape webpage that contains a user manual and additional videos on using MetScape. I would like to acknowledge the following people and organizations for their contributions to this tutorial.