
Supercomputing (Nov. 16, 2009)

2/4/10: This site has moved. The new URL is http://systemsbiologyknowledgebase.org.

Login to new site at: http://login.systemsbiologyknowledgebase.org.



Draft Workshop Report now available - click here to download report 

Please post your comments below, send them to kbasewiki@ornl.gov, or track changes in the Word document and send that.

Supercomputing 2009 (SC09) was held in Portland, Oregon, for its 21st annual conference. Recognized globally as the premier international conference on high-performance computing (HPC), networking, storage, and analysis, SC09 featured interesting and innovative HPC scientific and technical applications from around the world.

Website: http://sc09.supercomputing.org/index.php

Using Clouds for Parallel Computations in Systems Biology
Held at SC09 on Monday, November 16
The aim of this workshop is to bring together researchers in the computing, systems biology, and computational biology fields. The workshop will focus particularly on applications of cloud computing.

Modern genomics studies use many high-throughput instruments that generate prodigious amounts of data. For example, a single run on a current sequencing instrument generates 30-40 GB of sequence data, or one-third of a genomics sequence space (current archives of complete genomic data comprise 51 GB). The situation is further complicated by the democratization of sequencing; many small centers can now independently create large sequence datasets. Moreover, the immense amount and variety of 'omics data that must be integrated together with genomics data in order to model and study organisms at a systems level create unique opportunities in computational biology. Consequently, the rate of sequence and related data production is growing faster than our ability to analyze these data.

Cloud computing provides an appealing possibility for on-demand access to computing resources. Many computations fall under the "embarrassingly parallel" header and should be ideally suited for cloud computing. However, challenging issues remain, including data transfer and local data availability on the cloud nodes. This workshop aims to bring together computer scientists, bioinformaticists, and computational biologists to discuss the feasibility of using cloud computing.

Website: http://www.mcs.anl.gov/events/workshops/sc09-sysbio/
 
Workshop Organizers
  1. Susan Gregurick, U.S. Department of Energy
  2. Folker Meyer, Argonne National Laboratory
  3. Bob Cottingham, Oak Ridge National Laboratory
  4. Peg Folta, Lawrence Livermore National Laboratory
  5. Elizabeth Glass, Argonne National Laboratory
Charge Questions
(To post, read, or comment on responses to these questions, see the Charge Questions subpage)
  1. What are the characteristics of applications that would be appropriate for effective utilization of cloud architecture?
  2. What are the hardware bottlenecks that prohibit cloud architectures from being easily adopted by high-throughput biological data analytics?
  3. What are specific tools that need to be developed or enhanced in order to make cloud architectures easily adopted for biological data and bioinformatics algorithms?

Agenda and Location of Meeting Room

Click here to view the Agenda.

The workshop will be held in Room A104. Click here to view a map of the convention center.
                      

List of Abstracts

(To post, read, or comment on abstracts, see the Abstracts subpage)

  1. Clouds in Systems Biochemistry, Christopher H. Chang
  2. MapReduced Complete Composition Vectors for Genotyping, Marc Colosimo, Matthew Peterson, Scott Mardis, and Lynette Hirschman
  3. A Distributed Terabyte-size Parallel Processing Systems Biology Platform Based on the Hadoop/MapReduce/HBase Framework, Ronald Taylor
  4. Using Clouds For Data-Intensive Computing In Proteomics, Ananth Kalyanaraman, Douglas Baxter, and William Cannon
  5. Architectures for the Next Generation of Systems Biology Informatics, Thomas Brettin
  6. Commodity Computing in Genomics Research, Michael Schatz, Mihai Pop, Dan Sommer, and Ben Langmead
  7. A Performance Comparison of Massively Parallel Sequence Matching Computations on Cloud Computing Platforms and HPC Clusters Using Hadoop, Shane Canon, Shreyas Cholia, John Shalf, Keith Jackson, Lavanya Ramakrishnan, and Victor Markowitz
  8. Using MapReduce Technologies in Bioinformatics and Medical Informatics, Xiaohong Qiu, Jaliya Ekanayake, Thilina Gunarathne, Jong Youl Choi, Seung-Hee Bae, Scott Beason, and Geoffrey Fox
List of Presentations

(To read or comment on the presentations, see the Presentations subpage)

  1. Architectures for the Next Generation of Systems Biology Informatics, Tom Brettin
  2. AWE: Pipelines for Cloud Scale Bioinformatics, Narayan Desai, Folker Meyer, Jared Wilkening, James Yang, and Andreas Wilke

  3. Challenges: How to Cope with an Explosion of Fascinating Data, Dawn Field

  4. Cloud-Based Services for Large Scale Analysis of Sequence and Expression Data: Lessons from Cistrack, Robert Grossman

  5. Cloud Computing with Nimbus, Kate Keahey

  6. CloVR: A Virtual Appliance for Automated Analysis of Sequence Data, Sam Angiuoli

  7. Commodity Computing in Genomics Research, Michael Schatz, Ben Langmead, Dan Sommer, and Mihai Pop

  8. Future Directions for Cloud Computing in Systems and Computational Biology, Susan Gregurick and Bob Cottingham

  9. Genomes in the Clouds: The UCSC Genomics Browsers and Distributed Biocomputation, David Haussler and Daniel Carlin

  10. M5 and Multi OMICS, Eugene Kolker

  11. Magellan at NERSC, Katherine Yelick

  12. Magellan Cloud at ALCF, Pete Beckman

  13. A Performance Comparison of Parallel Sequence Matching on Cloud Computing Platforms Using BLAST and Hadoop, Lavanya Ramakrishnan, Victor Markowitz, Shane Canon, Shreyas Cholia, Keith Jackson and John Shalf

  14. Cloud Computing in Systems and Computational Biology, Folker Meyer, Susan Gregurick and Peg Folta

  15. Towards a Consensus Annotation System, Owen White

  16. Using Clouds for Data-Intensive Computing in Proteomics, Ananth Kalyanaraman, Douglas Baxter, and William Cannon

  17. Using MapReduce Technologies in Bioinformatics and Medical Informatics, Judy Qiu

  18. ViPDAC, A Stand-Alone Proteomics Analysis Suite in the Cloud, Simon Twigger

  19. The Role of Cloud Computing in Big Biology, Deepak Singh

Kbase Workshop Summary Report
Will be posted January 25, 2010.

 

Bob Cottingham,
Jan 22, 2010 9:48 AM