Finding Friends in Molecular Interaction Networks

1. Summary

Cells respond to external signals through protein-protein interactions. These interactions are often represented as a graph, and algorithms from graph theory can be used to generate hypotheses about protein regulation. In this module, I will introduce the computational problem of identifying candidate regulators of a specific protein of interest using molecular interaction networks. As a case study, we will aim to predict novel regulators of Fog signaling, which is involved in changing the shape of a cell. Students will try their hand at identifying candidate regulators in a newly established Drosophila interactome and will visualize their results using graph visualization software.

2. Presentation Materials

Click here.

3. Hands-on Exercise(s)

https://repl.it/classroom/invite/W7aIQQe You will need to create a username and password.

4. Associated Materials/Files

NetworkX Cheat Sheet: click here.

GraphSpace Cheat Sheet: click here.

GraphSpace: https://graphspace.org/

CompBioWorkshop GraphSpace Group: https://www.graphspace.org/groups/1067

HTML Color Picker: https://htmlcolorcodes.com/color-picker/

5. Program/Software requirements

None. Hands-on activity is done through a web browser.

6. Advanced Material

Further Reading

Scale-Free Networks. Barabasi and Bonabeau, Scientific American 2003. http://barabasi.com/f/124.pdf

Data Science of the Facebook World. Stephen Wolfram's blog, 2013. http://blog.stephenwolfram.com/2013/04/data-science-of-the-facebook-world/

Interactome-based approaches to human disease. Caldera et al., Current Opinion in Systems Biology 2017. https://www.sciencedirect.com/science/article/pii/S2452310016300154

A Cell-based Assay to Investigate Non-muscle Myosin II Contractility via the Folded-gastrulation Signaling Pathway in Drosophila S2R+ Cells. Peters et al., Journal of Visualized Experiments 2018. https://www.jove.com/video/58325/a-cell-based-assay-to-investigate-non-muscle-myosin-ii-contractility

GraphSpace: stimulating interdisciplinary collaborations in network biology. Bharadwaj et al., Bioinformatics 2017. https://academic.oup.com/bioinformatics/article-abstract/33/19/3134/3867143

Ten Simple Rules for Developing Public Biological Databases. Helmy et al., PLoS Computational Biology 2016. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005128

Code/Activity Materials

Full NetworkX 2.1 Documentation: https://networkx.github.io/documentation/stable

Full GraphSpace Python Client Documentation: http://manual.graphspace.org/projects/graphspace-python/en/latest/reference/index.html

Fly Interactome: https://github.com/annaritz/fly-interactome

7. Instructor Notes

This module focuses on visualizing network data and, depending on the students' experience level, begins to explore applying graph algorithms to a fly interactome. This type of module can also be the jumping-off point for more detailed discussions of different graph algorithms. When my students implement a graph algorithm in class, they usually also visualize the output on a small network, as this module does. For this module, I would typically have all the algorithms "vote" for the best candidates, which makes relatively simple solutions to this problem relevant.
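The "voting" idea can be sketched as follows. This is only an illustration under stated assumptions, not the module's actual solution: the toy network, the placeholder protein names (A through F), and the three scoring methods (distance to the target, shared neighbors, and degree) are all invented for the example. Each method nominates its top two candidates, and candidates are ranked by how many methods nominated them.

```python
from collections import Counter

import networkx as nx

# Toy undirected interaction network. The protein names other than the
# target are placeholders, not real Drosophila interactions.
edges = [
    ("Fog", "A"), ("Fog", "B"),
    ("A", "C"), ("B", "C"),
    ("C", "D"), ("C", "E"), ("C", "F"),
]
G = nx.Graph(edges)
target = "Fog"
candidates = [n for n in G if n != target]


def top_k(scores, k=2):
    """Return the k highest-scoring candidates (ties broken by name)."""
    return sorted(candidates, key=lambda n: (-scores[n], n))[:k]


# Method 1 (assumed): proteins closer to the target score higher.
dist = nx.single_source_shortest_path_length(G, target)
by_distance = {n: -dist[n] for n in candidates}

# Method 2 (assumed): proteins sharing more neighbors with the target score higher.
by_shared = {n: len(set(G[target]) & set(G[n])) for n in candidates}

# Method 3 (assumed): proteins with more interaction partners score higher.
by_degree = {n: G.degree(n) for n in candidates}

# Each method casts one vote for each of its top-k candidates.
votes = Counter()
for scores in (by_distance, by_shared, by_degree):
    votes.update(top_k(scores))

# Rank by number of votes, then by name for a stable order.
ranking = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking)
```

Because each method only casts votes, a simple method such as degree counts as much as a more elaborate one, which is why simple solutions remain useful in this setup.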

Using Repl.it

Repl.it is a free web-based programming environment that supports dozens of languages. Within the past two years they have piloted Repl.it Classrooms, which this module uses. A Classroom is organized as a series of assignments, each of which lets instructors write an assignment description, starter code, unit tests, and a "model solution" that becomes available to students after they have passed the tests. The environment also allows teachers to provide customized student feedback. Students can enroll in a class through an enroll link; however, the class is publicly viewable, and anyone with a Repl.it account can take the course as a "self-learner." Classrooms can be made private for a fee.

Repl.it has an active developer base, and in my experience questions and feature requests are acknowledged within 24 hours. There are some drawbacks. There is no version history for in-progress assignments, so students may accidentally delete their work. While any PyPI module is available, detecting and installing the module can make the software sluggish (especially the first time something is run). Some students may have to refresh their browser to kill a process, such as an infinite loop or a computation on a very large dataset. However, my intro compbio class of ~30 students used it with relative ease when I tried it out last semester with a paid plan.

Repl.it lets you program in your browser. TechCrunch, March 2018. https://techcrunch.com/2018/03/15/repl-it-lets-you-program-in-your-browser/

Using GraphSpace

GraphSpace is a web server built upon Cytoscape JS (http://js.cytoscape.org/) that is designed to encourage collaboration among interdisciplinary groups. I have used it successfully in a classroom setting in two ways. In my introductory class (and in this module), I give everyone the same username and password and allow students to post to that account; the graph title is important here to distinguish different students' graphs.

For more involved projects, more advanced students can make their own username and password, establish their own groups to organize graphs, and "invite" members to their groups. Graphs within a group can be seen by all members of the group, and users can make layouts of these graphs viewable to group members. Graphs can also be shared with the public, and images can be exported as PNGs. GraphSpace graphs have filtering and animation abilities, similar to the features of Cytoscape JS; these features are beyond the scope of the workshop.

GraphSpace has been extensively developed; the Python GraphSpace client has also been actively worked on (though not as thoroughly as the web server). Errors when posting a graph to GraphSpace cause raw HTML to be printed to the console. If an attribute is named incorrectly, it may be silently skipped. You can always export the JSON representation of the graph through the user interface to see what information was posted to GraphSpace.
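One way to check an exported file for silently skipped attributes is sketched below. It assumes the export follows the Cytoscape JS layout, with an "elements" object holding "nodes" and "edges" lists whose entries each carry a "data" dictionary; verify this against a real export before relying on it. The JSON text and the helper name missing_attribute are hypothetical stand-ins for illustration.

```python
import json

# Stand-in for the text of a JSON file exported from GraphSpace.
# Assumed layout: Cytoscape JS style "elements" with "nodes" and "edges".
exported = """
{
  "elements": {
    "nodes": [
      {"data": {"id": "Fog", "label": "Fog"}},
      {"data": {"id": "A"}}
    ],
    "edges": [
      {"data": {"source": "Fog", "target": "A"}}
    ]
  }
}
"""


def missing_attribute(graph_json, group, attribute):
    """Return the data of elements in `group` ('nodes' or 'edges') lacking `attribute`."""
    elements = graph_json.get("elements", {}).get(group, [])
    return [
        el.get("data", {})
        for el in elements
        if attribute not in el.get("data", {})
    ]


graph_json = json.loads(exported)

# Nodes whose "label" never made it into the posted graph.
unlabeled = missing_attribute(graph_json, "nodes", "label")
print(unlabeled)
```

For a real check, replace the stand-in string with the contents of the exported file, for example by reading it with json.load on an open file handle.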

Software Requirements for those not using Repl.it

This module requires the following software:

Python3: https://www.python.org/downloads/

NetworkX 2.1: https://networkx.github.io/documentation/stable/install.html (though NetworkX 1.11 may also work)

graphspace_python: http://manual.graphspace.org/projects/graphspace-python/en/latest/Installing.html

NetworkX has limited graph visualization capabilities for those who wish to avoid dealing with GraphSpace (https://networkx.github.io/documentation/stable/reference/drawing.html).

Cytoscape (http://www.cytoscape.org/) is also a popular network visualization tool; GraphSpace supports export to Cytoscape.