How Galaxy started at Minnesota

Galaxy is the informatics component of the HAITI (High-throughput sequence Analysis Infrastructure Technology Investigation) project, which is sponsored by a Minnesota partnership.

 

The HAITI project's goal is to develop best practices and standard operating procedures for Next-Generation Sequencing (NGS) processes and applications, from raw data generation to data management and data analysis.  Informatics infrastructure and analytical tools figure prominently in the project.

Faced with an urgent need for an integrated informatics environment that would support different types of users and applications, the University of Minnesota HAITI team evaluated several open-source analytical platforms and chose to adopt and locally implement the analytical framework named Galaxy.  Much as appliances plug into an electric grid, applications plug into the Galaxy framework, which greatly facilitates information capture and data flow.  Galaxy will provide the University’s researchers with an integrated environment in which to access data, run analytical workflows or pipelines, and share information.


Importantly, Galaxy is not restricted to the management and analysis of NGS data.  As a framework, it can be extended to other types of data and their associated analytical tools, which makes it both extensible and scalable.



Choosing Galaxy

Galaxy has been developed by Anton Nekrutenko and his group at the Center for Comparative Genomics and Bioinformatics at Penn State.  The University adopted Galaxy after evaluation by expert informatics analysts who work at the translational layer between lab research and computation (UMN Masonic Cancer Center Bioinformatics, MSI bioinformatics analysts) and by software developers (UMN Masonic Cancer Center Bioinformatics, MSI), and after considering feedback from the international user and developer community.

 

A few integrated tools exist, such as the Broad Institute’s GATK (Genome Analysis Toolkit) and the commercial CLC Bio, and others will likely be created in the future.  Given equal functionality and quality, the large and very active open-source developer community supporting Galaxy was one of the deciding factors, because it minimizes the longer-term burden and cost of software maintenance and active development.




Galaxy Resources & Communities


  • OpenHelix Galaxy tutorial (professionally recorded, highly recommended):

http://www.openhelix.com/galaxy

OpenHelix video tutorial on how to use Galaxy: import, prepare, and analyze data; review histories; create workflows

Provides online training exercises for users to practice

Target audience: Biologists

  • Penn State Galaxy screencasts (recorded by the Galaxy team):

http://galaxy.psu.edu/screencasts.html

http://main.g2.bx.psu.edu/screencast

  • Penn State Galaxy public server:

Open to anyone for running analyses.  Note that because this is a public server, computation may be slower.

 

http://galaxy.psu.edu

http://main.g2.bx.psu.edu/

http://usegalaxy.org

  • Documentation:

Galaxy: A Web-Based Genome Analysis Tool for Experimentalists.  Current Protocols in Molecular Biology, Unit 19.10.  DOI: 10.1002/0471142727.mb1910s89


Look under Documentation for additional publications on Galaxy applications.