Current Content


April 18 - May 1, 2011 (Julie Baker Phillips)

posted May 8, 2011 11:08 PM by UCmerced CompBioJournalClub   [ updated May 9, 2011 12:37 AM ]




Distinct response of yeast ribosomes to a miscoding event during translation


Howard Hughes Medical Institute, Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA

Published in Advance March 17, 2011, doi: 10.1261/rna.2623711
RNA 2011. 17: 925-932

Abstract

Numerous mechanisms have evolved to control the accuracy of translation, including a recently discovered retrospective quality control mechanism in bacteria. This quality control mechanism is sensitive to perturbations in the codon:anticodon interaction in the P site of the ribosome that trigger a dramatic loss of fidelity in subsequent tRNA and release factor selection events in the A site. These events ultimately lead to premature termination of translation in response to an initial miscoding error. In this work, we extend our investigations of this mechanism to an in vitro reconstituted Saccharomyces cerevisiae translation system. We report that yeast ribosomes do not respond to mismatches in the P site by loss of fidelity in subsequent substrate recognition events. We conclude that retrospective editing, as initially characterized in Escherichia coli, does not occur in S. cerevisiae. These results highlight potential mechanistic differences in the functional core of highly conserved ribosomes.


The EMBO Journal

The EMBO Journal 30, 1497 - 1507 (4 March 2011) | doi:10.1038/emboj.2011.58

Structural insights into cognate versus near-cognate discrimination during decoding

Xabier Agirrezabala, Eduard Schreiner, Leonardo G Trabuco, Jianlin Lei, Rodrigo F Ortiz-Meoz, Klaus Schulten, Rachel Green and Joachim Frank

Abstract

The structural basis of the tRNA selection process is investigated by cryo-electron microscopy of ribosomes programmed with UGA codons and incubated with ternary complex (TC) containing the near-cognate Trp-tRNA(Trp) in the presence of kirromycin. Going through more than 350 000 images and employing image classification procedures, we find ~8% in which the TC is bound to the ribosome. The reconstructed 3D map provides a means to characterize the arrangement of the near-cognate aa-tRNA with respect to elongation factor Tu (EF-Tu) and the ribosome, as well as the domain movements of the ribosome. One of the interesting findings is that near-cognate tRNA's acceptor stem region is flexible and CCA end becomes disordered. The data bring direct structural insights into the induced-fit mechanism of decoding by the ribosome, as the analysis of the interactions between small and large ribosomal subunit, aa-tRNA and EF-Tu and comparison with the cognate case (UGG codon) offers clues on how the conformational signals conveyed to the GTPase differ in the two cases.






Nature, Volume 472 Number 7343
 

http://www.nature.com/news/specials/phdfuture/index.html?WT.mc_id=TWT_NatureNews

Published online 20 April 2011 | Nature 472, 280-282 (2011) | doi:10.1038/472280a

News Feature

Education: Rethinking PhDs

Alison McCook

Fix it, overhaul it or skip it completely — institutions and individuals are taking innovative approaches to postgraduate science training.


Published online 20 April 2011 | Nature 472, 276-279 (2011) | doi:10.1038/472276a

News Feature

Education: The PhD factory

David Cyranoski, Natasha Gilbert, Heidi Ledford, Anjali Nayar & Mohammed Yahia

The world is producing more PhDs than ever before. Is it time to stop?


Reform the PhD system or close it down

Published online 20 April 2011 | Nature 472, 261 (2011) | doi:10.1038/472261a

There are too many doctoral programmes, producing too many PhDs for the job market. Shut some and change the rest, says Mark C. Taylor.

Fix the PhD

Nature 472, 259–260 (21 April 2011) doi:10.1038/472259b

No longer a guaranteed ticket to an academic career, the PhD system needs a serious rethink.



Faculty of 1000

Structural insights into cognate versus near-cognate discrimination during decoding

The EMBO Journal 30, 1497 - 1507 (4 March 2011) | doi:10.1038/emboj.2011.58





Proceedings of the National Academy of Sciences

Misacylation of specific nonmethionyl tRNAs by a bacterial methionyl-tRNA synthetase

Thomas E. Jones (a), Rebecca W. Alexander (b), and Tao Pan (a,1)

(a) Department of Biochemistry and Molecular Biology and Institute of Biophysical Dynamics, University of Chicago, Chicago, IL 60637; and (b) Department of Chemistry, Wake Forest University, Winston-Salem, NC 27109

Edited by Paul Schimmel, The Skaggs Institute for Chemical Biology, La Jolla, CA, and approved March 10, 2011 (received for review December 20, 2010)

Published online before print April 11, 2011, doi: 10.1073/pnas.1019033108
PNAS April 26, 2011 vol. 108 no. 17 6933-6938

Abstract

Aminoacyl-tRNA synthetases perform a critical step in translation by aminoacylating tRNAs with their cognate amino acids. Although high fidelity of aminoacyl-tRNA synthetases is often thought to be essential for cell biology, recent studies indicate that cells tolerate and may even benefit from tRNA misacylation under certain conditions. For example, mammalian cells selectively induce mismethionylation of nonmethionyl tRNAs, and this type of misacylation contributes to a cell’s response to oxidative stress. However, the enzyme responsible for tRNA mismethionylation and the mechanism by which specific tRNAs are mismethionylated have not been elucidated. Here we show by tRNA microarrays and filter retention that the methionyl-tRNA synthetase enzyme from Escherichia coli (EcMRS) is sufficient to mismethionylate two tRNA species, Graphic and Graphic, indicating that tRNA mismethionylation is also present in the bacterial domain of life. We demonstrate that the anticodon nucleotides of these misacylated tRNAs play a critical role in conferring mismethionylation identity. We also show that a certain low level of mismethionylation is maintained for these tRNAs, suggesting that mismethionylation levels may have evolved to confer benefits to the cell while still preserving sufficient translational fidelity to ensure cell viability. EcMRS mutants show distinct effects on mismethionylation, indicating that many regions in this synthetase enzyme influence mismethionylation. Our results show that tRNA mismethionylation can be carried out by a single protein enzyme, mismethionylation also requires identity elements in the tRNA, and EcMRS has a defined structure-function relationship for tRNA mismethionylation.


Genome sequencing of environmental Escherichia coli expands understanding of the ecology and speciation of the model bacterial species


(a) Center for Bioinformatics and Computational Genomics, (b) School of Biology, and (g) School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332; (c) Division of Infectious Diseases, Department of Internal Medicine, University of Michigan Health System, Ann Arbor, MI 48109; (d) Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia; (e) The Broad Institute, Cambridge, MA 02142; and (f) Center for Microbial Ecology, Michigan State University, East Lansing, MI 48824

Edited by W. Ford Doolittle, Dalhousie University, Halifax, Canada, and approved March 18, 2011 (received for review October 18, 2010)

Published online before print April 11, 2011, doi: 10.1073/pnas.1015622108
PNAS April 26, 2011 vol. 108 no. 17 7200-7205

Abstract

Defining bacterial species remains a challenging problem even for the model bacterium Escherichia coli and has major practical consequences for reliable diagnosis of infectious disease agents and regulations for transport and possession of organisms of economic importance. E. coli traditionally is thought to live within the gastrointestinal tract of humans and other warm-blooded animals and not to survive for extended periods outside its host; this understanding is the basis for its widespread use as a fecal contamination indicator. Here, we report the genome sequences of nine environmentally adapted strains that are phenotypically and taxonomically indistinguishable from typical E. coli (commensal or pathogenic). We find, however, that the commensal genomes encode for more functions that are important for fitness in the human gut, do not exchange genetic material with their environmental counterparts, and hence do not evolve according to the recently proposed fragmented speciation model. These findings are consistent with a more stringent and ecologic definition for bacterial species than the current definition and provide means to start replacing traditional approaches of defining distinctive phenotypes for new species with omics-based procedures. They also have important implications for reliable diagnosis and regulation of pathogenic E. coli and for the coliform cell-counting test.


A mathematical model for adaptive prediction of environmental changes by microorganisms

(a) Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel; and (b) Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158

Edited by Susan Lindquist, Whitehead Institute for Biomedical Research, Cambridge, MA, and approved March 17, 2011 (received for review January 3, 2011)

Published online before print April 12, 2011, doi: 10.1073/pnas.1019754108
PNAS April 26, 2011 vol. 108 no. 17 7271-7276

Abstract

Survival in natural habitats selects for microorganisms that are well-adapted to a wide range of conditions. Recent studies revealed that cells evolved innovative response strategies that extend beyond merely sensing a given stimulus and responding to it on encounter. A diversity of microorganisms, including Escherichia coli, Vibrio cholerae, and several yeast species, were shown to use a predictive regulation strategy that uses the appearance of one stimulus as a cue for the likely arrival of a subsequent one. A better understanding of such a predictive strategy requires elucidating the interplay between key biological and environmental forces. Here, we describe a mathematical framework to address this challenge. We base this framework on experimental systems featuring early preparation to either a stress or an exposure to improvement in the growth medium. Our model calculates the fitness advantage originating under each regulation strategy in a given habitat. We conclude that, although a predictive response strategy might be advantageous under some ecologies, its costs might exceed the benefit in others. The combined theoretical–experimental treatment presented here helps assess the potential of natural ecologies to support a predictive behavior.
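
Journal club note: the cost-benefit trade-off described in this abstract can be illustrated with a toy calculation. This is only a sketch and not the authors' model. The function name and the parameters p_follow (probability that the second stimulus really follows the cue), benefit (fitness gain from being prepared) and cost (fitness price of preparing) are hypothetical.

```python
def predictive_advantage(p_follow, benefit, cost):
    """Expected net fitness change from preparing for a second stimulus
    every time the cue (first stimulus) appears.

    Toy assumption: preparation always costs `cost`, and pays off with
    `benefit` only when the second stimulus actually follows the cue.
    """
    return p_follow * benefit - cost


def prediction_pays_off(p_follow, benefit, cost):
    """True when the expected gain from anticipating exceeds its cost."""
    return predictive_advantage(p_follow, benefit, cost) > 0


# Two hypothetical habitats that differ only in how reliable the cue is.
reliable_habitat = prediction_pays_off(p_follow=0.9, benefit=0.2, cost=0.05)
unreliable_habitat = prediction_pays_off(p_follow=0.1, benefit=0.2, cost=0.05)
```

Under these assumptions, anticipation is favored only where the cue is a reliable predictor, which matches the qualitative conclusion that the strategy helps in some ecologies and costs more than it returns in others.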


Nature Genetics

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

Nature Genetics 43, 476–481 (2011) doi:10.1038/ng.807
Received 13 September 2010
Accepted 18 March 2011
Published online 10 April 2011
We report the 207-Mb genome sequence of the North American Arabidopsis lyrata strain MN47 based on 8.3× dideoxy sequence coverage. We predict 32,670 genes in this outcrossing species compared to the 27,025 genes in the selfing species Arabidopsis thaliana. The much smaller 125-Mb genome of A. thaliana, which diverged from A. lyrata 10 million years ago, likely constitutes the derived state for the family. We found evidence for DNA loss from large-scale rearrangements, but most of the difference in genome size can be attributed to hundreds of thousands of small deletions, mostly in noncoding DNA and transposons. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome. The high-quality reference genome sequence for A. lyrata will be an important resource for functional, evolutionary and ecological studies in the genus Arabidopsis.


No second thoughts about data access

Nature Genetics 43, 389 (2011) doi:10.1038/ng.827
Published online 27 April 2011

More data than we can handle is no excuse to give up our efforts to promote data access, but it may make us think about new ways to make it sustainable.


PLoS Genetics: a peer-reviewed open-access journal published by the Public Library of Science


Fluctuations in spo0A Transcription Control Rare Developmental Transitions in Bacillus subtilis

Nicolas Mirouze1, Peter Prepiak1, David Dubnau1,2*

1 Public Health Research Center, New Jersey Medical School, Newark, New Jersey, United States of America, 2 Department of Microbiology and Molecular Genetics, New Jersey Medical School, Newark, New Jersey, United States of America

Citation: Mirouze N, Prepiak P, Dubnau D (2011) Fluctuations in spo0A Transcription Control Rare Developmental Transitions in Bacillus subtilis. PLoS Genet 7(4): e1002048. doi:10.1371/journal.pgen.1002048

Abstract

Phosphorylated Spo0A is a master regulator of stationary phase development in the model bacterium Bacillus subtilis, controlling the formation of spores, biofilms, and cells competent for transformation. We have monitored the rate of transcription of the spo0A gene during growth in sporulation medium using promoter fusions to firefly luciferase. This rate increases sharply during transient diauxie-like pauses in growth rate and then declines as growth resumes. In contrast, the rate of transcription of an rRNA gene decreases and increases in parallel with the growth rate, as expected for stable RNA synthesis. The growth pause-dependent bursts of spo0A transcription, which reflect the activity of the spo0A vegetative promoter, are largely independent of all known regulators of spo0A transcription. Evidence is offered in support of a “passive regulation” model in which RNA polymerase stops transcribing rRNA genes during growth pauses, thus becoming available for the transcription of spo0A. We show that the bursts are followed by the production of phosphorylated Spo0A, and we propose that they represent initial responses to stress that bring the average cell closer to the thresholds for transition to bimodally expressed developmental responses. Measurement of the numbers of cells expressing a competence marker before and after the bursts supports this hypothesis. In the absence of ppGpp, the increase in spo0A transcription that accompanies the entrance to stationary phase is delayed and sporulation is markedly diminished. In spite of this, our data contradicts the hypothesis that sporulation is initiated when a ppGpp-induced depression of the GTP pool relieves repression by CodY. We suggest that, while the programmed induction of sporulation that occurs in stationary phase is apparently provoked by increased flux through the phosphorelay, bet-hedging stochastic transitions to at least competence are induced by bursts in transcription.

April 4 -- April 17 (2011) by Emily Wilson

posted Apr 15, 2011 1:53 PM by UCmerced CompBioJournalClub   [ updated Apr 22, 2011 8:41 PM ]

PNAS, 5 April 2011; Vol. 108, No. 14

Statistical image analysis reveals features affecting fates of Myxococcus xanthus developmental aggregates

Chunyan Xie (a), Haiyang Zhang (a), Lawrence J. Shimkets (b), and Oleg A. Igoshin (a,1)

(a) Department of Bioengineering, Rice University, Houston, TX 77005; and (b) Department of Microbiology, University of Georgia, Athens, GA 30602

Edited by Armin Dale Kaiser, Stanford University School of Medicine, Stanford, CA, and approved February 23, 2011 (received for review December 8, 2010)

Published online before print March 21, 2011, doi: 10.1073/pnas.1018383108
PNAS April 5, 2011 vol. 108 no. 14 5915-5920

http://www.pnas.org/content/108/14/5915.full

Abstract

Starving Myxococcus xanthus bacteria use their motility systems to self-organize into multicellular fruiting bodies, large mounds in which cells differentiate into metabolically inert spores. Despite the identification of the genetic pathways required for aggregation and the use of microcinematography to observe aggregation dynamics in WT and mutant strains, a mechanistic understanding of aggregation is still incomplete. For example, it is not clear why some of the initial aggregates mature into fruiting bodies, whereas others disperse, merge, or split into two. Here, we develop high-throughput image quantification and statistical analysis methods to gain insight into M. xanthus developmental aggregation dynamics. A quantitative metric of features characterizing each aggregate is used to deduce the properties of the aggregates that are correlated with each fate. The analysis shows that small aggregate size but not neighbor-related parameters correlate with aggregate dispersal. Furthermore, close proximity is necessary but not sufficient for aggregate merging. Finally, splitting occurs for those aggregates that are unusually large and elongated. These observations place severe constraints on the underlying aggregation mechanisms and present strong evidence against the role of long-range morphogenic gradients or biased cell exchange in the dispersal, merging, or splitting of transient aggregates. This approach can be expanded and adapted to study self-organization in other cellular systems.


PLoS Genetics: a peer-reviewed open-access journal published by the Public Library of Science

Published April 07, 2011

Genome-Wide Meta-Analysis Identifies Regions on 7p21 (AHR) and 15q24 (CYP1A2) As Determinants of Habitual Caffeine Consumption

Marilyn C. Cornelis1#, Keri L. Monda2#, Kai Yu3#, Nina Paynter4#, Elizabeth M. Azzato3, Siiri N. Bennett5, Sonja I. Berndt3, Eric Boerwinkle6, Stephen Chanock3, Nilanjan Chatterjee3, David Couper7, Gary Curhan8, Gerardo Heiss2, Frank B. Hu1, David J. Hunter1, Kevin Jacobs3, Majken K. Jensen1, Peter Kraft9, Maria Teresa Landi3, Jennifer A. Nettleton6, Mark P. Purdue3, Preetha Rajaraman3, Eric B. Rimm1, Lynda M. Rose4, Nathaniel Rothman3, Debra Silverman3, Rachael Stolzenberg-Solomon3, Amy Subar3, Meredith Yeager3, Daniel I. Chasman4*, Rob M. van Dam10*, Neil E. Caporaso3*

1 Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts, United States of America, 2 Department of Epidemiology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America, 3 Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America, 4 Brigham and Women's Hospital, Boston, Massachusetts, United States of America, 5 Collaborative Health Studies Coordinating Center, University of Washington, Seattle, Washington, United States of America, 6 Division of Epidemiology, Human Genetics, and Environmental Sciences, The University of Texas Health Science Center at Houston, Houston, Texas, United States of America, 7 Department of Biostatistics, Collaborative Studies Coordinating Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America, 8 Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America, 9 Program in Molecular and Genetic Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America, 10 Department of Epidemiology and Public Health and Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore

Editor: Greg Gibson, Georgia Institute of Technology, United States of America

Received: November 17, 2010; Accepted: February 6, 2011; Published: April 7, 2011

http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1002033#abstract0

Abstract

We report the first genome-wide association study of habitual caffeine intake. We included 47,341 individuals of European descent based on five population-based studies within the United States. In a meta-analysis adjusted for age, sex, smoking, and eigenvectors of population variation, two loci achieved genome-wide significance: 7p21 (P = 2.4×10^−19), near AHR, and 15q24 (P = 5.2×10^−14), between CYP1A1 and CYP1A2. Both the AHR and CYP1A2 genes are biologically plausible candidates as CYP1A2 metabolizes caffeine and AHR regulates CYP1A2.

Author Summary

Caffeine is the most widely consumed psychoactive substance in the world. Although demographic and social factors have been linked to habitual caffeine consumption, twin studies report a large heritable component. Through a comprehensive search of the human genome involving over 40,000 participants, we discovered two loci associated with habitual caffeine consumption: the first near AHR and the second between CYP1A1 and CYP1A2. Both the AHR and CYP1A2 genes are biologically plausible candidates, as CYP1A2 metabolizes caffeine and AHR regulates CYP1A2. Caffeine intake has been associated with manifold physiologic effects and both detrimental and beneficial health outcomes. Knowledge of the genetic determinants of caffeine intake may provide insight into underlying mechanisms and may provide ways to study the potential health effects of caffeine more comprehensively.


PLoS Genetics: a peer-reviewed open-access journal published by the Public Library of Science

Incorporating Biological Pathways via a Markov Random Field Model in Genome-Wide Association Studies

Min Chen1, Judy Cho2, Hongyu Zhao3*

1 Division of Biostatistics, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America, 2 Internal Medicine, Yale University, New Haven, Connecticut, United States of America, 3 Center for Statistical Genomics and Proteomics, Department of Epidemiology and Public Health, Yale University, New Haven, Connecticut, United States of America

Editor: David B. Allison, University of Alabama at Birmingham, United States of America

Received: June 17, 2010; Accepted: February 24, 2011; Published: April 7, 2011

http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1001353

Abstract

Genome-wide association studies (GWAS) examine a large number of markers across the genome to identify associations between genetic variants and disease. Most published studies examine only single markers, which may be less informative than considering multiple markers and multiple genes jointly because genes may interact with each other to affect disease risk. Much knowledge has been accumulated in the literature on biological pathways and interactions. It is conceivable that appropriate incorporation of such prior knowledge may improve the likelihood of making genuine discoveries. Although a number of methods have been developed recently to prioritize genes using prior biological knowledge, such as pathways, most methods treat genes in a specific pathway as an exchangeable set without considering the topological structure of a pathway. However, how genes are related with each other in a pathway may be very informative to identify association signals. To make use of the connectivity information among genes in a pathway in GWAS analysis, we propose a Markov Random Field (MRF) model to incorporate pathway topology for association analysis. We show that the conditional distribution of our MRF model takes on a simple logistic regression form, and we propose an iterated conditional modes algorithm as well as a decision theoretic approach for statistical inference of each gene's association with disease. Simulation studies show that our proposed framework is more effective to identify genes associated with disease than a single gene–based method. We also illustrate the usefulness of our approach through its applications to a real data example.

Author Summary

Statistical methods used in most GWAS are based on the analysis of single markers. Prior biological information about markers, genes, and pathways is not commonly incorporated in the detection of associated disease loci. Recently a number of methods have been developed to incorporate such information, and it has been shown that they may make use of prior biological knowledge in association analysis. However, most of these methods ignore the regulatory relationships and functional interactions among genes. In this article, we propose a statistical method that can explicitly model the interactions of genes in a neighborhood defined by the topology of a pathway. Simulation studies and a real data example show that the proposed method can improve the power of identifying associated genes when they are in the neighborhood of other genes whose association has been firmly established in previous studies.
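
Journal club note: the idea of letting pathway neighbors inform each gene's association status can be sketched with iterated conditional modes (ICM) on a small graph. This is a minimal illustration under assumed parameters, not the model or code from the paper. The names icm_gene_labels, log_lr (per-gene log likelihood ratio of the data, associated versus not), h (baseline log odds) and tau (neighbor weight) are hypothetical, and the neighbor term used here is a simple Ising-style count.

```python
def icm_gene_labels(neighbors, log_lr, h=-1.0, tau=1.0, max_sweeps=50):
    """Assign each gene a label (1 = associated, 0 = not) with ICM.

    neighbors: dict mapping gene -> list of neighboring genes in the pathway
    log_lr:    dict mapping gene -> log likelihood ratio from its own data
    h, tau:    assumed baseline log odds and neighbor-coupling weight
    """
    # Start from the labels each gene would get from its own evidence alone.
    labels = {g: 1 if h + log_lr[g] > 0 else 0 for g in neighbors}

    for _ in range(max_sweeps):
        changed = False
        for g, nbrs in neighbors.items():
            # Neighbors labeled 1 push toward association, neighbors labeled 0 push away.
            support = sum(1 if labels[n] == 1 else -1 for n in nbrs)
            # Conditional log odds in a logistic-regression-like form.
            logit = h + tau * support + log_lr[g]
            new_label = 1 if logit > 0 else 0
            if new_label != labels[g]:
                labels[g] = new_label
                changed = True
        if not changed:  # a full sweep with no updates: local mode reached
            break
    return labels


# Hypothetical pathway: A - B - C in a chain, plus an isolated gene D.
pathway = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
evidence = {"A": 3.0, "B": 0.5, "C": 3.0, "D": 0.5}

with_topology = icm_gene_labels(pathway, evidence, tau=1.0)
without_topology = icm_gene_labels(pathway, evidence, tau=0.0)
```

In this toy input, B and D carry the same weak evidence. With tau set to 1.0, B is labeled as associated because both of its neighbors are strongly supported, while D, which has no neighbors, is not. With tau set to 0.0 the graph is ignored and neither is selected.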


PLoS Computational Biology: a peer-reviewed open-access journal published by the Public Library of Science

Published April 07, 2011

Modification of Gene Duplicability during the Evolution of Protein Interaction Network

Matteo D'Antonio, Francesca D. Ciccarelli*

Department of Experimental Oncology, European Institute of Oncology, Milan, Italy

Editor: Christos A. Ouzounis, The Centre for Research and Technology, Hellas, Greece

Received: September 29, 2010; Accepted: February 24, 2011; Published: April 7, 2011

http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002029

Abstract

Duplications of genes encoding highly connected and essential proteins are selected against in several species but not in human, where duplicated genes encode highly connected proteins. To understand when and how gene duplicability changed in evolution, we compare gene and network properties in four species (Escherichia coli, yeast, fly, and human) that are representative of the increase in evolutionary complexity, defined as progressive growth in the number of genes, cells, and cell types. We find that the origin and conservation of a gene significantly correlates with the properties of the encoded protein in the protein-protein interaction network. All four species preserve a core of singleton and central hubs that originated early in evolution, are highly conserved, and accomplish basic biological functions. Another group of hubs appeared in metazoans and duplicated in vertebrates, mostly through vertebrate-specific whole genome duplication. Such recent and duplicated hubs are frequently targets of microRNAs and show tissue-selective expression, suggesting that these are alternative mechanisms to control their dosage. Our study shows how networks modified during evolution and contributes to explaining the occurrence of somatic genetic diseases, such as cancer, in terms of network perturbations.

Author Summary

Gene copy number is often tightly controlled because it directly affects the gene dosage. In several species, including yeast, worm, and fly, genes that have a single gene copy (singleton genes) encode proteins with several connections in the protein interaction network (hubs) as well as essential proteins. Surprisingly, in mouse and human essential proteins and hubs are encoded by genes with more than one copy in the genome (duplicated genes). Here we show that these two distinct groups of hubs were acquired at different times during the evolution of protein interaction network and contribute in different ways to the cell life. Singleton hubs are ancestral genes that are conserved from prokaryotes to vertebrates and accomplish basic functions that deal with the cell survival. Duplicated hubs were acquired mostly within metazoans and duplicated through vertebrate-specific whole genome duplication. These genes are involved in processes that are crucial for the organization of multicellularity. Although duplicated, also recent hubs are subject to gene dosage control through microRNAs and tissue-selective expression. The clarification of how the protein interaction network evolves enables us to understand the adaptation to the progressive increase in complexity and to better characterize the genes involved in diseases such as cancer.


PLoS Computational Biology: a peer-reviewed open-access journal published by the Public Library of Science

Multi-Scaled Explorations of Binding-Induced Folding of Intrinsically Disordered Protein Inhibitor IA3 to its Target Enzyme

Jin Wang1,2,3*, Yong Wang1, Xiakun Chu2, Stephen J. Hagen4, Wei Han2, Erkang Wang1*

1 State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, People's Republic of China, 2 College of Physics, Jilin University, Changchun, Jilin, People's Republic of China, 3 Department of Chemistry, Physics and Applied Mathematics, State University of New York at Stony Brook, Stony Brook, New York, United States of America, 4 Department of Physics, University of Florida, Gainesville, Florida, United States of America

Editor: Gerhard Hummer, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, United States of America

Received: August 17, 2010; Accepted: March 7, 2011; Published: April 7, 2011

http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1001118

Abstract

Biomolecular function is realized by recognition, and increasing evidence shows that recognition is determined not only by structure but also by flexibility and dynamics. We explored a biomolecular recognition process that involves a major conformational change – protein folding. In particular, we explore the binding-induced folding of IA3, an intrinsically disordered protein that blocks the active site cleft of the yeast aspartic proteinase saccharopepsin (YPrA) by folding its own N-terminal residues into an amphipathic alpha helix. We developed a multi-scaled approach that explores the underlying mechanism by combining structure-based molecular dynamics simulations at the residue level with a stochastic path method at the atomic level. Both the free energy profile and the associated kinetic paths reveal a common scheme whereby IA3 binds to its target enzyme prior to folding itself into a helix. This theoretical result is consistent with recent time-resolved experiments. Furthermore, exploration of the detailed trajectories reveals the important roles of non-native interactions in the initial binding that occurs prior to IA3 folding. In contrast to the common view that non-native interactions contribute only to the roughness of landscapes and impede binding, the non-native interactions here facilitate binding by reducing significantly the entropic search space in the landscape. The information gained from multi-scaled simulations of the folding of this intrinsically disordered protein in the presence of its binding target may prove useful in the design of novel inhibitors of aspartic proteinases.

Author Summary

The intrinsically disordered peptide IA3 is the endogenous inhibitor for the enzyme named yeast aspartic proteinase saccharopepsin (YPrA). In the presence of YPrA, IA3 folds itself into an amphipathic helix that blocks the active site cleft of the enzyme. We developed a multi-scaled approach to explore the underlying mechanism of this binding-induced ordering transition. Our approach combines a structure-based molecular dynamics model at the residue level with a stochastic path method at the atomic level. Our simulations suggest that IA3 inhibits YPrA through an induced-fit mechanism where the enzyme (YPrA) induces conformational change of its inhibitor (IA3). This expands the definition of an induced-fit model from its original meaning that the binding of substrate (IA3) drives conformational change in the protein (YPrA). Our result is consistent with recent kinetic experiments and provides a microscopic explanation for the underlying mechanism. We also discuss the important roles of non-native interactions and backtracking. These results enrich our understanding of the enzyme-inhibition mechanism and may have value in the design of drugs.





Nucl. Acids Res. Table of Contents for Vol. 39, No. 7


Integrative analysis of genomic, functional and protein interaction data predicts long-range enhancer-target gene interactions

  1. Christian Rödelsperger1,2,3,
  2. Gao Guo3,
  3. Mateusz Kolanczyk2,
  4. Angelika Pletschacher3,
  5. Sebastian Köhler1,3,
  6. Sebastian Bauer3,
  7. Marcel H. Schulz2,4 and
  8. Peter N. Robinson1,2,3,*


1Berlin-Brandenburg Center for Regenerative Therapies, Charité-Universitätsmedizin Berlin, 2Max Planck Institute for Molecular Genetics, 3Institute for Medical Genetics, Charité-Universitätsmedizin, Berlin and 4International Max Planck Research School for Computational Biology and Scientific Computing, Berlin, Germany

*To whom correspondence should be addressed. Tel: +49 30 450566042; Fax: +49 30 450569915; Email: peter.robinson@charite.de

  • Received June 25, 2010.
  • Revision received October 14, 2010.
  • Accepted October 14, 2010.
Nucl. Acids Res. (2011) 39 (7): 2492-2502. doi: 10.1093/nar/gkq1081 First published online: November 24, 2010

http://nar.oxfordjournals.org/content/39/7/2492.abstract?etoc

Abstract

Multicellular organismal development is controlled by a complex network of transcription factors, promoters and enhancers. Although reliable computational and experimental methods exist for enhancer detection, prediction of their target genes remains a major challenge. On the basis of available literature and ChIP-seq and ChIP-chip data for enhanceosome factor p300 and the transcriptional regulator Gli3, we found that genomic proximity and conserved synteny predict target genes with a relatively low recall of 12–27% within 2 Mb intervals centered at the enhancers. Here, we show that functional similarities between enhancer binding proteins and their transcriptional targets and proximity in the protein–protein interactome improve prediction of target genes. We used all four features to train random forest classifiers that predict target genes with a recall of 58% in 2 Mb intervals that may contain dozens of genes, representing a better than two-fold improvement over the performance of prediction based on single features alone. Genome-wide ChIP data is still relatively poorly understood, and it remains difficult to assign biological significance to binding events. Our study represents a first step in integrating various genomic features in order to elucidate the genomic network of long-range regulatory interactions.
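For readers less familiar with the metric quoted above, recall is the fraction of true enhancer targets that a predictor recovers. The following is only a minimal sketch of that definition; the function name and the gene names are made up for illustration and do not come from the paper.

```python
def recall(predicted, true_targets):
    """Fraction of the true target genes that appear among the predictions."""
    true_targets = set(true_targets)
    if not true_targets:
        raise ValueError("recall is undefined without any true targets")
    return len(true_targets & set(predicted)) / len(true_targets)


# Hypothetical example: 4 known targets, 3 of them recovered by the predictor.
known = ["geneA", "geneB", "geneC", "geneD"]
called = ["geneA", "geneC", "geneD", "geneX"]
print(recall(called, known))
```

Note that recall ignores false positives such as the hypothetical geneX, which is why it is usually reported alongside precision.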






Bioinformatics Table of Contents for 15 April 2011; Vol. 27, No. 8

MemLoci: predicting subcellular localization of membrane proteins in eukaryotes

  1. Andrea Pierleoni1,2,
  2. Pier Luigi Martelli1 and
  3. Rita Casadio1,*


1Biocomputing Group, Computational Biology Network, via San Giacomo 9/2, 40126 Bologna and 2Externautics s.p.a., Via Fiorentina 1, 53100 Siena, Italy

*To whom correspondence should be addressed.

  • Received December 4, 2010.
  • Revision received February 22, 2011.
  • Accepted February 23, 2011.
Bioinformatics (2011) 27 (9): 1224-1230. doi: 10.1093/bioinformatics/btr108 First published online: March 2, 2011

http://bioinformatics.oxfordjournals.org/content/27/9/1224.full?etoc

Abstract

Motivation: Subcellular localization is a key feature in the process of functional annotation of both globular and membrane proteins. In the absence of experimental data, protein localization is inferred on the basis of annotation transfer upon sequence similarity search. However, predictive tools are necessary when the localization of homologs is not known. This is so particularly for membrane proteins. Furthermore, most of the available predictors of subcellular localization are specifically trained on globular proteins and perform poorly on membrane proteins.

Results: Here we develop MemLoci, a new support vector machine-based tool that discriminates three membrane protein localizations: plasma, internal and organelle membrane. When tested on an independent set, MemLoci outperforms existing methods, reaching an overall accuracy of 70% on predicting the location in the three membrane types, with a generalized correlation coefficient as high as 0.50.

Availability: The MemLoci server is freely available on the web at: http://mu2py.biocomp.unibo.it/memloci. Datasets described in the article can be downloaded at the same site.

Contact: casadio@biocomp.unibo.it

Supplementary information: Supplementary data are available at Bioinformatics online. 



Identifying discriminative classification-based motifs in biological sequences

  1. Celine Vens1,2,*,
  2. Marie-Noëlle Rosso2 and
  3. Etienne G. J. Danchin2


1Katholieke Universiteit Leuven, Department of Computer Science, Celestijnenlaan 200A, 3001 Leuven, Belgium and 2Institut National de la Recherche Agronomique, U.M.R. - I.B.S.V. INRA-UNSA-CNRS, 400 route des Chappes, BP 167, 06903 Sophia-Antipolis Cedex, France

*To whom correspondence should be addressed.

  • Received September 8, 2010.
  • Revision received January 25, 2011.
  • Accepted February 24, 2011.
Bioinformatics (2011) 27 (9): 1231-1238. doi: 10.1093/bioinformatics/btr110 First published online: March 3, 2011

http://bioinformatics.oxfordjournals.org/content/27/9/1231.full?etoc


Abstract

Motivation: Identification of conserved motifs in biological sequences is crucial to unveil common shared functions. Many tools exist for motif identification, including some that allow degenerate positions with multiple possible nucleotides or amino acids. Most efficient methods available today search for conserved motifs in a set of sequences, but do not check their specificity with respect to a set of negative sequences.

Results: We present a tool to identify degenerate motifs, based on a given classification of amino acids according to their physico-chemical properties. It returns the top K motifs that are most frequent in a positive set of sequences involved in a biological process of interest, and absent from a negative set. Thus, our method discovers discriminative motifs in biological sequences that may be used to identify new sequences involved in the same process. We used this tool to identify candidate effector proteins secreted into plant tissues by the root knot nematode Meloidogyne incognita. Our tool identified a series of motifs specifically present in a positive set of known effectors while totally absent from a negative set of evolutionarily conserved housekeeping proteins. Scanning the proteome of M.incognita, we detected 2579 proteins that contain these specific motifs and can be considered as new putative effectors.

Availability and Implementation: The motif discovery tool and the proteins used in the experiments are available at http://dtai.cs.kuleuven.be/ml/systems/merci.

Contact: celine.vens@cs.kuleuven.be

Supplementary Information: Supplementary data are available at Bioinformatics online.
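The idea in the abstract above (motifs written in a reduced amino-acid alphabet, frequent in a positive set and absent from a negative set) can be sketched as follows. This is a deliberately simplified illustration, not the authors' algorithm: it only considers fixed-length, gap-free motifs, and the four-class grouping, the function names and the toy sequences are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical, coarse physico-chemical classes (an assumption, not the paper's table).
CLASSES = {
    "h": "AVLIMFWC",  # hydrophobic
    "p": "STNQYGP",   # polar / small
    "+": "KRH",       # positively charged
    "-": "DE",        # negatively charged
}
AA_TO_CLASS = {aa: c for c, aas in CLASSES.items() for aa in aas}


def encode(seq):
    """Rewrite a protein sequence in the reduced class alphabet ('x' = unknown)."""
    return "".join(AA_TO_CLASS.get(aa, "x") for aa in seq.upper())


def class_kmers(seq, k):
    """Set of distinct class-level k-mers occurring in one sequence."""
    enc = encode(seq)
    return {enc[i:i + k] for i in range(len(enc) - k + 1)}


def top_discriminative_motifs(positives, negatives, k=3, top=5):
    """Top motifs by positive-set support that never occur in the negative set."""
    forbidden = set()
    for seq in negatives:
        forbidden |= class_kmers(seq, k)
    support = Counter()
    for seq in positives:
        # Each sequence contributes at most once per motif.
        support.update(class_kmers(seq, k) - forbidden)
    # Sort by support (descending), then alphabetically for a stable order.
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top]


# Toy sequences, invented for illustration only.
print(top_discriminative_motifs(["MKRDE", "AKRDL"], ["STNQ"], k=3, top=2))
```

Counting each motif once per sequence means the score is the number of positive sequences that contain it, which matches the "most frequent in a positive set" wording more closely than raw occurrence counts would.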





Prediction of microRNA targets in Caenorhabditis elegans using a self-organizing map

  1. Liisa Heikkinen1,2,
  2. Mikko Kolehmainen3 and
  3. Garry Wong1,2,*


1Department of Biosciences, 2Department of Neurobiology, A.I.Virtanen Institute for Molecular Sciences, Biocenter Finland and 3Department of Environmental Science, University of Eastern Finland, Kuopio, Finland

*To whom correspondence should be addressed.

  • Received November 25, 2010.
  • Revision received February 20, 2011.
  • Accepted March 12, 2011. 
Bioinformatics (2011) 27 (9): 1247-1254. doi: 10.1093/bioinformatics/btr144 First published online: March 21, 2011

http://bioinformatics.oxfordjournals.org/content/27/9/1247.full?etoc

Abstract

Motivation: MicroRNAs (miRNAs) are small non-coding RNAs that regulate transcriptional processes via binding to the target gene mRNA. In animals, this binding is imperfect, which makes the computational prediction of animal miRNA targets a challenging task. The accuracy of miRNA target prediction can be improved with the use of machine learning methods. Previous work has described methods using supervised learning, but they suffer from the lack of adequate training examples, a common problem in miRNA target identification, which often leads to deficient generalization ability.

Results: In this work, we introduce mirSOM, a miRNA target prediction tool based on clustering of short 3′-untranslated region (3′-UTR) substrings with self-organizing map (SOM). As our method uses unsupervised learning and a large set of verified Caenorhabditis elegans 3′-UTRs, we did not need to resort to training using a known set of targets. Our method outperforms seven other methods in predicting the experimentally verified C.elegans true and false miRNA targets.

Availability: mirSOM miRNA target predictions are available at http://kokki.uku.fi/bioinformatics/mirsom.

Contact: liisa.heikkinen@uef.fi

Supplementary information: Supplementary data are available at Bioinformatics online.



Comparative visualization of genetic and physical maps with Strudel

  1. Micha Bayer1,*,
  2. Iain Milne1,
  3. Gordon Stephen1,
  4. Paul Shaw1,
  5. Linda Cardle1,
  6. Frank Wright2 and
  7. David Marshall1


1Genetics Programme, Scottish Crop Research Institute and 2Biomathematics and Statistics Scotland, Invergowrie, Dundee, DD2 5DA, UK

*To whom correspondence should be addressed.

  • Received December 21, 2010.
  • Revision received February 11, 2011.
  • Accepted February 24, 2011.
Bioinformatics (2011) 27 (9): 1307-1308. doi: 10.1093/bioinformatics/btr111 First published online: March 3, 2011

http://bioinformatics.oxfordjournals.org/content/27/9/1307.full?etoc


Abstract

Summary: Data visualization can play a key role in comparative genomics, for example, underpinning the investigation of conserved synteny patterns. Strudel is a desktop application that allows users to easily compare both genetic and physical maps interactively and efficiently. It can handle large datasets from several genomes simultaneously, and allows all-by-all comparisons between these.

Availability and implementation: Installers for Strudel are available for Windows, Linux, Solaris and Mac OS X at http://bioinf.scri.ac.uk/strudel/.

Contact: strudel@scri.ac.uk; micha.bayer@scri.ac.uk

INTRODUCTION

Crop genetics is still dominated by species for which fully sequenced and well-annotated genomes are unavailable. Comparative genomics is an important means of annotating unfinished genomes, and requires powerful visualization tools that elucidate the relationships with already annotated genomes.

There are a number of tools in this area, which range from web-based applications with database back-ends to standalone desktop applications (Fang et al., 2003; Lewis et al., 2002; Meyer et al., 2009; Mueller et al., 2008; Pan et al., 2005; Sawkins et al., 2004). The challenges faced by any comparative visualization application are the increasing volume of data, fast delivery of these to users, efficient on-screen rendering of a large amount of information and layout constraints.

Here, we present Strudel, a standalone Java desktop application that aims to combine ease of installation with ease of use, and allows the simultaneous multi-way comparison of several genomes. Usability has been a major design criterion for Strudel, and in early acceptance testing users were able to start generating insights into their data within minutes of downloading the application, without having to first consult the manual. Strudel's graphical interface has been designed to reduce visual clutter as much as possible, and a critical condition for this is that homologies between two chromosomes are never drawn across other genomes.



GeCo++: a C++ library for genomic features computation and annotation in the presence of variants

  1. Matteo Cereda1,2,
  2. Manuela Sironi1,
  3. Matteo Cavalleri3 and
  4. Uberto Pozzoli1,*

1Bioinformatics Lab, Scientific Institute I.R.C.C.S. ‘E. Medea’, Via Don L. Monza, 23852 Bosisio Parini (LC), Italy, 2Department of Theoretical Physics, University of Turin, Via P. Giuria 1 -10125, Torino and 3Bioengineering Lab, Scientific Institute I.R.C.C.S. ‘E. Medea’, Via Don L. Monza, 23852 Bosisio Parini (LC), Italy

*To whom correspondence should be addressed.

  • Received December 15, 2010.
  • Revision received February 15, 2011.
  • Accepted March 1, 2011.

Bioinformatics (2011) 27 (9): 1313-1315. doi: 10.1093/bioinformatics/btr123 First published online: March 12, 2011

http://bioinformatics.oxfordjournals.org/content/27/9/1313.full?etoc

Abstract

Summary: We propose a C++ class library developed to make the implementation of sequence analysis algorithms easier and faster when genomic annotations and variations need to be considered. The library provides a class hierarchy to seamlessly bind together annotations of genomic elements to sequences and to algorithm results; it allows the effect of mutations/variations to be evaluated in terms of both element position shifts and algorithm results, limiting recalculation to the minimum. Particular care has been taken to keep memory and time overhead within acceptable limits.

Availability and Implementation: A complete tutorial as well as a detailed doxygen generated documentation and source code is freely available at http://bioinformatics.emedea.it/geco, under the GPL license. The library was written in standard ISO C++, and does not depend on external libraries.

Contact: uberto.pozzoli@bp.lnf.it




Rapid membrane protein topology prediction

  1. Aron Hennerdal and
  2. Arne Elofsson*


Department of Biochemistry and Biophysics, Stockholm Bioinformatics Center, Center for Biomembrane Research, Swedish e-science Research Center, Stockholm University, 106 91 Stockholm, Sweden

*To whom correspondence should be addressed.

  • Received December 2, 2010.
  • Revision received February 16, 2011.
  • Accepted February 28, 2011.
 
Bioinformatics (2011) 27 (9): 1322-1323. doi: 10.1093/bioinformatics/btr119

http://bioinformatics.oxfordjournals.org/content/27/9/1322.full?etoc

Abstract

Summary: State-of-the-art methods for topology of α-helical membrane proteins are based on the use of time-consuming multiple sequence alignments obtained from PSI-BLAST or other sources. Here, we examine if it is possible to use the consensus of topology prediction methods that are based on single sequences to obtain a similar accuracy as the more accurate multiple sequence-based methods. Here, we show that TOPCONS-single performs better than any of the other topology prediction methods tested here, but ∼6% worse than the best method that is utilizing multiple sequence alignments.

Availability and Implementation: TOPCONS-single is available as a web server from http://single.topcons.net/ and is also included for local installation from the web site. In addition, consensus-based topology predictions for the entire international protein index (IPI) are available from the web server and will be updated at regular intervals.

Contact: arne@bioinfo.se

Supplementary information: Supplementary data are available at Bioinformatics online.
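To make the notion of a consensus over single-sequence topology predictors concrete, here is a minimal sketch that takes a per-residue majority vote over several predicted topology strings. It is an illustration only: the abstract does not describe how TOPCONS-single builds its consensus, and the actual procedure may well differ. The function name, the 'i'/'M'/'o' labels and the tie-breaking rule are assumptions made for the example.

```python
from collections import Counter


def majority_topology(predictions):
    """Per-residue majority vote over equal-length topology strings.

    Each string uses 'i' (inside), 'M' (membrane) and 'o' (outside).
    Ties are broken in favour of the label that comes first in "iMo".
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        # max() returns the first maximal label, which implements the tie rule.
        best = max("iMo", key=lambda label: counts[label])
        consensus.append(best)
    return "".join(consensus)


# Three hypothetical predictions for a 7-residue toy sequence.
print(majority_topology(["iiMMMoo", "iiiMMoo", "iMMMMMo"]))
```

A plain vote like this can produce biologically implausible output, such as membrane segments that are too short, so a real consensus method would need additional constraints.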




ogaraK: a population genetics simulator for malaria

  1. Tiago Antao* and
  2. Ian M. Hastings


Department of Molecular and Biochemical Parasitology, Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK

*To whom correspondence should be addressed.

  • Received December 19, 2010.
  • Revision received February 18, 2011.
  • Accepted March 9, 2011.
     
Bioinformatics (2011) 27 (9): 1335-1336. doi: 10.1093/bioinformatics/btr139 First published online: March 16, 2011

http://bioinformatics.oxfordjournals.org/content/27/9/1335.full?etoc

Abstract

Motivation: The evolution of resistance in Plasmodium falciparum malaria against most available treatments is a major global health threat. Population genetics approaches are commonly used to model the spread of drug resistance. Due to uncommon features in malaria biology, existing forward-time population genetics simulators cannot suitably model Plasmodium falciparum malaria.

Results: Here we present ogaraK, a population genetics simulator for modelling the spread of drug-resistant malaria. OgaraK is designed to make malaria simulation computationally tractable as it models infections, not individual parasites. OgaraK is also able to model the life cycle of the parasite which includes both haploid and diploid phases and sexual and asexual reproduction. We also allow for the simulation of different inbreeding levels, an important difference between high and low transmission areas and a fundamental factor influencing the outcome of strategies to control or eliminate malaria.

Availability: OgaraK is available as free software (GPL) from the address http://popgen.eu/soft/ogaraK.

Contact: tra@popgen.eu

Supplementary information: Supplementary data is available at Bioinformatics online.





Science, 15 April 2011 (Volume 332, Issue 6027)
http://www.sciencemag.org/content/vol332/issue6027/index.dtl?etoc



Science 15 April 2011:
Vol. 332 no. 6027 pp. 342-346
DOI: 10.1126/science.1202998

DNA Origami with Complex Curvatures in Three-Dimensional Space

  1. Dongran Han1,2,*,
  2. Suchetan Pal1,2,
  3. Jeanette Nangreave1,2,
  4. Zhengtao Deng1,2,
  5. Yan Liu1,2,*, and
  6. Hao Yan1,2,*


1The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA.

2Department of Chemistry and Biochemistry, Arizona State University, Tempe, AZ 85287, USA.

*To whom correspondence should be addressed. E-mail: hao.yan@asu.edu (H.Y.); dongran.han@asu.edu (D.H.); yan_liu@asu.edu (Y.L.)

http://www.sciencemag.org/content/332/6027/342.abstract?sa_campaign=Email%2Ftoc%2F15-April-2011%2F10.1126%2Fscience.1202998

Abstract

We present a strategy to design and construct self-assembling DNA nanostructures that define intricate curved surfaces in three-dimensional (3D) space using the DNA origami folding technique. Double-helical DNA is bent to follow the rounded contours of the target object, and potential strand crossovers are subsequently identified. Concentric rings of DNA are used to generate in-plane curvature, constrained to 2D by rationally designed geometries and crossover networks. Out-of-plane curvature is introduced by adjusting the particular position and pattern of crossovers between adjacent DNA double helices, whose conformation often deviates from the natural, B-form twist density. A series of DNA nanostructures with high curvature—such as 2D arrangements of concentric rings and 3D spherical shells, ellipsoidal shells, and a nanoflask—were assembled.

  • Received for publication 18 January 2011.
  • Accepted for publication 4 March 2011.


March 17 -- April 3 (2011) by Carolin Frank

posted Apr 10, 2011 4:31 PM by UCmerced CompBioJournalClub

Molecular Systems Biology
 
Article

Subject Categories: Bioinformatics | Functional genomics

Molecular Systems Biology 7 Article number: 473  doi:10.1038/msb.2011.6
Published online: 15 March 2011
Citation: Molecular Systems Biology 7:473

Toward molecular trait-based ecology through integration of biogeochemical, geographical and metagenomic data

Jeroen Raes1,2, Ivica Letunic1, Takuji Yamada1, Lars Juhl Jensen1,3 & Peer Bork1,4

  1. Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  2. Molecular and Cellular Interactions Department, VIB – Vrije Universiteit Brussel, Brussels, Belgium
  3. NNF Center for Protein Research, Copenhagen, Denmark
  4. Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany

Correspondence to: Peer Bork1,4 Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, Heidelberg 69117, Germany. Tel.: +49 6 221 387 8526; Fax: +49 6 221 387 8517; Email: bork@embl.de

Received 4 May 2010; Accepted 25 January 2011; Published online 15 March 2011

This is an open-access article distributed under the terms of the Creative Commons Attribution Noncommercial Share Alike 3.0 Unported License, which allows readers to alter, transform, or build upon the article and then distribute the resulting work under the same or similar license to this one. The work must be attributed back to the original author and commercial use is not permitted without specific permission.


Abstract

Using metagenomic ‘parts lists’ to infer global patterns on microbial ecology remains a significant challenge. To deduce important ecological indicators such as environmental adaptation, molecular trait dispersal, diversity variation and primary production from the gene pool of an ecosystem, we integrated 25 ocean metagenomes with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the biomolecular repertoire of each sample and the main limiting factor on functional trait dispersal (absence of biogeographic provincialism). Molecular functional richness and diversity show a distinct latitudinal gradient peaking at 20°N and correlate with primary production. The latter can also be predicted from the molecular functional composition of an environmental sample. Together, our results show that the functional community composition derived from metagenomes is an important quantitative readout for molecular trait-based biogeography and ecology.



RNA Table of Contents Alert

A new issue of RNA is available online:
1 April 2011; Vol. 17, No. 4 

The below Table of Contents is available online at: http://rnajournal.cshlp.org/content/vol17/issue4/?etoc

RNAcode: Robust discrimination of coding and noncoding regions in comparative sequence data

  1. Stefan Washietl1,2,8
  2. Sven Findeiß3
  3. Stephan A. Müller4,
  4. Stefan Kalkhof4
  5. Martin von Bergen4
  6. Ivo L. Hofacker2,
  7. Peter F. Stadler2,3,5,6,7 and 
  8. Nick Goldman1

  1. EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
  2. Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria
  3. Bioinformatics Group, Department of Computer Science; and Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
  4. Department of Proteomics, Helmholtz Centre for Environmental Research, 04318 Leipzig, Germany
  5. Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  6. RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, 04103 Leipzig, Germany
  7. Santa Fe Institute, Santa Fe, New Mexico 87501, USA

Abstract

With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied “out of the box,” without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as “noncoding.” RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode.


Nucleic Acids Research Table of Contents Alert

A new issue of Nucleic Acids Research is available online:
Vol. 39, No. 5
The below Table of Contents is available online at: http://nar.oxfordjournals.org/content/vol39/issue5/index.dtl

The pseudogenes of Mycobacterium leprae reveal the functional relevance of gene order within operons

  1. Enrique M. Muro1,*
  2. Nancy Mah1
  3. Gabriel Moreno-Hagelsieb2 and
  4. Miguel A. Andrade-Navarro1

1Computational Biology and Data Mining Group, Max Delbrück Center for Molecular Medicine, Robert-Rössle Strasse 10, 13125, Berlin, Germany and 2Department of Biology, Wilfrid Laurier University, Waterloo, Ontario, Canada

*To whom correspondence should be addressed. Tel: (+49) 30 9406 4227; Fax: (+49) 30 9406 4240; Email: enrique.muro@mdc-berlin.de

  • Received March 26, 2010.
  • Revision received October 13, 2010.
  • Accepted October 14, 2010.

Abstract

Almost 50 years following the discovery of the prokaryotic operon, the functional relevance of gene order within operons remains unclear. In this work, we take advantage of the eroded genome of Mycobacterium leprae to add evidence supporting the notion that functionally less important genes have a tendency to be located at the end of its operons. M. leprae’s genome includes 1133 pseudogenes and 1614 protein-coding genes and can be compared with the close genome of M. tuberculosis. Assuming M. leprae’s pseudogenes to represent dispensable genes, we have studied the position of these pseudogenes in the operons of M. leprae and of their orthologs in M. tuberculosis. We observed that both tend to be located in the 3′ (downstream) half of the operon (P-values of 0.03 and 0.18, respectively). Analysis of pseudogenes in all available prokaryotic genomes confirms this trend (P-value of 7.1 × 10⁻⁷). In a complementary analysis, we found a significant tendency for essential genes to be located at the 5′ (upstream) half of the operon (P-value of 0.006). Our work provides an indication that, in prokarya, functionally less important genes have a tendency to be located at the end of operons, while more relevant genes tend to be located toward operon starts.
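The abstract does not say which statistical test produced these P-values. As one plausible way to frame the question, the sketch below runs a one-sided binomial test of whether genes fall in the 3′ half of their operon more often than the 50% expected by chance. The function name and the counts are hypothetical and are not taken from the paper.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts: 14 of 20 pseudogenes fall in the 3' half of their operon.
print(binomial_upper_tail(14, 20))
```

A small tail probability would indicate that the observed excess in the downstream half is unlikely under a null model in which each half is equally likely.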


tRNA 5′-end repair activities of tRNAHis guanylyltransferase (Thg1)-like proteins from Bacteria and Archaea

  1. Bhalchandra S. Rao1,2
  2. Emily L. Maris1 and 
  3. Jane E. Jackman1,2,*

1Department of Biochemistry and Center for RNA Biology and 2Molecular, Cellular and Developmental Biology Graduate Program, The Ohio State University, Columbus, OH 43210, USA

*To whom correspondence should be addressed. Tel: +1 614 247 8097; Fax: +1 614 292 6773; Email: jackman.14@osu.edu

  • Received August 5, 2010.
  • Revision received September 30, 2010.
  • Accepted October 1, 2010.

Abstract

The tRNAHis guanylyltransferase (Thg1) family comprises a set of unique 3′–5′ nucleotide addition enzymes found ubiquitously in Eukaryotes, where they function in the critical G−1 addition reaction required for tRNAHis maturation. However, in most Bacteria and Archaea, G−1 is genomically encoded; thus post-transcriptional addition of G−1 to tRNAHis is not necessarily required. The presence of highly conserved Thg1-like proteins (TLPs) in more than 40 bacteria and archaea therefore suggests unappreciated roles for TLP-catalyzed 3′–5′ nucleotide addition. Here, we report that TLPs from Bacillus thuringiensis (BtTLP) and Methanosarcina acetivorans (MaTLP) display biochemical properties consistent with a prominent role in tRNA 5′-end repair. Unlike yeast Thg1, BtTLP strongly prefers addition of missing N+1 nucleotides to 5′-truncated tRNAs over analogous additions to full-length tRNA (kcat/KM enhanced 5–160-fold). Moreover, unlike for −1 addition, BtTLP-catalyzed additions to truncated tRNAs are not biased toward addition of G, and occur with tRNAs other than tRNAHis. Based on these distinct biochemical properties, we propose that rather than functioning solely in tRNAHis maturation, bacterial and archaeal TLPs are well-suited to participate in tRNA quality control pathways. These data support more widespread roles for 3′–5′ nucleotide addition reactions in biology than previously expected.


Nature Reviews Molecular Cell Biology


Research Highlight

Nature Reviews Molecular Cell Biology 12, 206 (April 2011) | doi:10.1038/nrm3095

Gene expression: Misreading the code

Joanna E. Huddleston


Ribosomes read the genetic code by matching the base-pairing of the mRNA codon to the correct tRNA anticodon. The discovery of tRNAs with mutations in the body of the tRNA (as opposed to in the anticodon) that result in aberrant decoding showed that tRNAs are not just scaffolds for amino acids and anticodons; the structure of the tRNA itself has a role in reading the code. Now, Schmeing et al. show how these mutant tRNAs lead to miscoding. Using X-ray crystallography, they find that the mutations aid the distortion of the tRNA that is necessary for it to interact simultaneously with the codon and with elongation factor-Tu (EF-Tu) to allow catalysis of protein synthesis.




Science, 25 March 2011 (Volume 331, Issue 6024) 
http://www.sciencemag.org/content/vol331/issue6024/index.dtl?etoc



Science 25 March 2011: 
Vol. 331 no. 6024 p. 1513 
DOI: 10.1126/science.331.6024.1513
  • NEWS FOCUS
MICROBIOLOGY

Going Viral: Exploring the Role Of Viruses in Our Bodies

In the past decade, scientists have learned that the vast bacterial world inside the human body plays a role in regulating the energy we take in from food, primes the immune system, and performs a variety of other functions that help maintain our health. Now, researchers are gaining similar respect for the viruses we carry around. For a start, the variety and sheer number of viruses that inhabit us put our bacterial companions to shame. Many of the viruses prey on the bacteria in our bodies, altering their numbers and diversity and shuffling genes—including genes for antibiotic resistance—from one bacterium to another. At the International Human Microbiome Congress earlier this month, one provocative, albeit preliminary, finding emerged: Infants with unexplained fevers harbor many more viruses than healthy infants.


BUSINESS OFFICE FEATURE

LIFE SCIENCE TECHNOLOGIES: Synthetic Genomics - Building a Better Bacterium

The May 20, 2010, online edition of Science magazine contained pieces on Brownian motion and gravitational waves, small RNAs and drug delivery--items of interest to narrow slices of the research community. One article, though, generated instant worldwide attention. Entitled "Creation of a bacterial cell controlled by a chemically synthesized genome," the report detailed the world's first "synthetic cell," and it was at once praised and panned. Watchdog groups weighed in, as did U.S. President Barack Obama. Powered by advances in DNA synthesis and genome manipulation, the study was merely a proof-of-principle: Mycoplasma mycoides JCVI-syn1.0 has no practical scientific or commercial value. Yet its cobalt blue colonies represent the living embodiment of an entirely new, and previously unimaginable, branch of biology. Welcome to the age of synthetic genomics.


PLoS Computational Biology: a peer-reviewed open-access journal published by the Public Library of Science

On the Origin of DNA Genomes: Evolution of the Division of Labor between Template and Catalyst in Model Replicator Systems

Nobuto Takeuchi1*, Paulien Hogeweg2, Eugene V. Koonin1

1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America, 2 Theoretical Biology and Bioinformatics Group, Utrecht University, Utrecht, The Netherlands


Abstract

The division of labor between template and catalyst is a fundamental property of all living systems: DNA stores genetic information whereas proteins function as catalysts. The RNA world hypothesis, however, posits that, at the earlier stages of evolution, RNA acted as both template and catalyst. Why would such division of labor evolve in the RNA world? We investigated the evolution of DNA-like molecules, i.e. molecules that can function only as template, in minimal computational models of RNA replicator systems. In the models, RNA can function as both template-directed polymerase and template, whereas DNA can function only as template. Two classes of models were explored. In the surface models, replicators are attached to surfaces with finite diffusion. In the compartment models, replicators are compartmentalized by vesicle-like boundaries. Both models displayed the evolution of DNA and the ensuing division of labor between templates and catalysts. In the surface model, DNA provides the advantage of greater resistance against parasitic templates. However, this advantage is at least partially offset by the disadvantage of slower multiplication due to the increased complexity of the replication cycle. In the compartment model, DNA can significantly delay the intra-compartment evolution of RNA towards catalytic deterioration. These results are explained in terms of the trade-off between template and catalyst that is inherent in RNA-only replication cycles: DNA releases RNA from this trade-off by making it unnecessary for RNA to serve as template and so rendering the system more resistant against evolving parasitism. Our analysis of these simple models suggests that the lack of catalytic activity in DNA by itself can generate a sufficient selective advantage for RNA replicator systems to produce DNA. Given the widespread notion that DNA evolved owing to its superior chemical properties as a template, this study offers a novel insight into the evolutionary origin of DNA.

Trends in Genetics

Volume 27, Issue 4,  Pages 127-164 (April 2011)

Review

Horizontal gene transfer between bacteria and animals

Julie C. Dunning Hotopp a

a Institute for Genome Science, Department of Microbiology & Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA


Available online 18 February 2011. 

Horizontal gene transfer is increasingly described between bacteria and animals. Such transfers that are vertically inherited have the potential to influence the evolution of animals. One classic example is the transfer of DNA from mitochondria and chloroplasts to the nucleus after the acquisition of these organelles by eukaryotes. Even today, many of the described instances of bacteria-to-animal transfer occur as part of intimate relationships such as those of endosymbionts and their invertebrate hosts, particularly insects and nematodes, while numerous transfers are also found in asexual animals. Both of these observations are consistent with modern evolutionary theory, in particular the serial endosymbiotic theory and Muller's ratchet. Although it is tempting to suggest that these particular lifestyles promote horizontal gene transfer, it is difficult to ascertain given the nonrandom sampling of animal genome sequencing projects and the lack of a systematic analysis of animal genomes for such transfers.


PLoS Biology: a peer-reviewed open-access journal published by the Public Library of Science




Resolving Difficult Phylogenetic Questions: Why More Sequences Are Not Enough

Hervé Philippe1*, Henner Brinkmann1, Dennis V. Lavrov2, D. Timothy J. Littlewood3, Michael Manuel4, Gert Wörheide5,6, Denis Baurain7

1 Département de Biochimie, Centre Robert-Cedergren, Université de Montréal, Montréal, Québec, Canada, 2 Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa, United States of America, 3 Department of Zoology, The Natural History Museum, London, United Kingdom, 4 Université Paris 6, UMR 7138 "Systématique, Adaptation, Evolution" UPMC CNRS IRD MHNH, Paris, France, 5 Department of Earth and Environmental Sciences, Ludwig-Maximilians-Universität München, München, Germany, 6 GeoBio-Center, Ludwig-Maximilians-Universität München, München, Germany, 7 Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, Liège, Belgium

Citation: Philippe H, Brinkmann H, Lavrov DV, Littlewood DTJ, Manuel M, et al. (2011) Resolving Difficult Phylogenetic Questions: Why More Sequences Are Not Enough. PLoS Biol 9(3): e1000602. doi:10.1371/journal.pbio.1000602

Academic Editor: David Penny, Massey University, New Zealand

Published: March 15, 2011

Copyright: © 2011 Philippe et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The work was funded by NSERC (www.nserc-crsng.gc.ca), CRC (www.chairs-chaires.gc.ca), Agence Nationale de la Recherche (http://www.agence-nationale-recherche.fr/), ARC Biomod (www.cfwb.be), and DFG (http://www.dfg.de/en/index.jsp). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Abbreviations: BS, bootstrap support; EST, expressed sequence tag; LBA, long branch attraction

* E-mail: herve.philippe@umontreal.ca

In the quest to reconstruct the Tree of Life, researchers have increasingly turned to phylogenomics, the inference of phylogenetic relationships using genome-scale data (Box 1). Mesmerized by the sustained increase in sequencing throughput, many phylogeneticists entertained the hope that the incongruence frequently observed in studies using single or a few genes [1] would come to an end with the generation of large multigene datasets. Yet, as so often happens, reality has turned out to be far more complex, as three recent large-scale analyses, one published in PLoS Biology [2]–[4], make clear. The studies, which deal with the early diversification of animals, produced highly incongruent (Box 2) findings despite the use of considerable sequence data (see Figure 1). Clearly, merely adding more sequences is not enough to resolve the inconsistencies.


  BMC Biology   

Methodology article

A next-generation sequencing method for overcoming the multiple gene copy problem in polyploid phylogenetics, applied to Poa grasses

Philippa C Griffin, Charles Robin and Ary A Hoffmann

BMC Biology 2011, 9:19 doi:10.1186/1741-7007-9-19

Published: 23 March 2011

Abstract (provisional)

Background

Polyploidy is important from a phylogenetic perspective because of its immense past impact on evolution and its potential future impact on diversification, survival and adaptation, especially in plants. Molecular population genetics studies of polyploid organisms have been difficult because of problems in sequencing multiple-copy nuclear genes using Sanger sequencing. This paper describes a method for sequencing a barcoded mixture of targeted gene regions using next-generation sequencing methods to overcome these problems.



Science, 1 April 2011 (Volume 332, Issue 6025) 
http://www.sciencemag.org/content/vol332/issue6025/index.dtl?etoc

Also online at Science:

Science 1 April 2011: 
Vol. 332 no. 6025 pp. 43-44 
DOI: 10.1126/science.1200486
  • PERSPECTIVE
IMMUNOLOGY

Danger, Microbes, and Homeostasis

Brian P. Lazzaro1 and Jens Rolff2

1 Department of Entomology, Cornell University, Ithaca, NY 14853, USA.
2 Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK.
E-mail: jor@sheffield.ac.uk

The immune system is conventionally viewed as a means to fight infection. It has become clear, however, that what is considered the “immune” system has also evolved to maintain homeostasis and regulate commensal microbes that normally inhabit the body. Such varied functions demand nuanced and context-appropriate control of immune responses. The thoughts on how immunity becomes activated include two views: by recognition of “nonself” molecules of infectious agents (1) or by recognition of “danger” signals—host molecules released by damaged host cells (2). Empirical evidence supports both models, but also reveals their limits. Insights from recent studies on insect immune systems, which are generalizable to vertebrates, suggest that the two models may be compatible. That is, a host determines the balance of nonself elicitors and danger signals to decide when to activate the immune system against pathogenic infection while also maintaining healthy relationships with commensals.

  • REVIEW

Beyond Predictions: Biodiversity Conservation in a Changing Climate

Terence P. Dawson1, Stephen T. Jackson2, Joanna I. House3, Iain Colin Prentice3,4,5 and Georgina M. Mace4,6,*

1 School of the Environment, University of Dundee, Dundee DD1 4HN, Scotland, UK.
2 Department of Botany, Program in Ecology, and Berry Biodiversity Conservation Center, University of Wyoming, Laramie, WY 82071, USA.
3 QUEST, Department of Earth Sciences, University of Bristol, Bristol BS8 1RJ, UK.
4 Grantham Institute for Climate Change and Division of Biology, Imperial College London, London SW7 2AZ, UK.
5 Department of Biological Sciences, Macquarie University, North Ryde, NSW 2109, Australia.
6 Centre for Population Biology, Imperial College London, Ascot SL5 7PY, UK.
* To whom correspondence should be addressed. E-mail: g.mace@imperial.ac.uk

ABSTRACT

Climate change is predicted to become a major threat to biodiversity in the 21st century, but accurate predictions and effective solutions have proved difficult to formulate. Alarming predictions have come from a rather narrow methodological base, but a new, integrated science of climate-change biodiversity assessment is emerging, based on multiple sources and approaches. Drawing on evidence from paleoecological observations, recent phenological and microevolutionary responses, experiments, and computational models, we review the insights that different approaches bring to anticipating and managing the biodiversity consequences of climate change, including the extent of species’ natural resilience. We introduce a framework that uses information from different sources to identify vulnerability and to support the design of conservation responses. Although much of the information reviewed is on species, our framework and conclusions are also applicable to ecosystems, habitats, ecological communities, and genetic diversity, whether terrestrial, marine, or fresh water.

  • REPORT

Bacteria-Phage Antagonistic Coevolution in Soil

Pedro Gómez1,2,* and Angus Buckling1,3

1 Department of Zoology, University of Oxford, Oxford OX1 3PS, UK.
2 Centro de Edafología y Biología Aplicada del Segura, Consejo Superior de Investigaciones Científicas (CEBAS-CSIC), Murcia (Espinardo) 30100, Spain.
3 Biosciences, University of Exeter, Penryn TR10 9EZ, UK.
* To whom correspondence should be addressed. E-mail: pedro.gomezlopez@zoo.ox.ac.uk

ABSTRACT

Bacteria and their viruses (phages) undergo rapid coevolution in test tubes, but the relevance to natural environments is unclear. By using a “mark-recapture” approach, we showed rapid coevolution of bacteria and phages in a soil community. Unlike coevolution in vitro, which is characterized by increases in infectivity and resistance through time (arms race dynamics), coevolution in soil resulted in hosts more resistant to their contemporary than past and future parasites (fluctuating selection dynamics). Fluctuating selection dynamics, which can potentially continue indefinitely, can be explained by fitness costs constraining the evolution of high levels of resistance in soil. These results suggest that rapid coevolution between bacteria and phage is likely to play a key role in structuring natural microbial communities.

PLoS Computational Biology: a peer-reviewed open-access journal published by the Public Library of Science


RESEARCH ARTICLE

Reconciliation of Genome-Scale Metabolic Reconstructions for Comparative Systems Analysis

Matthew A. Oberhardt1#, Jacek Puchałka2#, Vítor A. P. Martins dos Santos1,2*, Jason A. Papin1*

1 Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America, 2 Helmholtz Center for Infection Research (HZI), Braunschweig, Germany

Abstract

In the past decade, over 50 genome-scale metabolic reconstructions have been built for a variety of single- and multi-cellular organisms. These reconstructions have enabled a host of computational methods to be leveraged for systems-analysis of metabolism, leading to greater understanding of observed phenotypes. These methods have been sparsely applied to comparisons between multiple organisms, however, due mainly to the existence of differences between reconstructions that are inherited from the respective reconstruction processes of the organisms to be compared. To circumvent this obstacle, we developed a novel process, termed metabolic network reconciliation, whereby non-biological differences are removed from genome-scale reconstructions while keeping the reconstructions as true as possible to the underlying biological data on which they are based. This process was applied to two organisms of great importance to disease and biotechnological applications, Pseudomonas aeruginosa and Pseudomonas putida, respectively. The result is a pair of revised genome-scale reconstructions for these organisms that can be analyzed at a systems level with confidence that differences are indicative of true biological differences (to the degree that is currently known), rather than artifacts of the reconstruction process. The reconstructions were re-validated with various experimental data after reconciliation. With the reconciled and validated reconstructions, we performed a genome-wide comparison of metabolic flexibility between P. aeruginosa and P. putida that generated significant new insight into the underlying biology of these important organisms. Through this work, we provide a novel methodology for reconciling models, present new genome-scale reconstructions of P. aeruginosa and P. putida that can be directly compared at a network level, and perform a network-wide comparison of the two species. These reconstructions provide fresh insights into the metabolic similarities and differences between these important Pseudomonads, and pave the way towards full comparative analysis of genome-scale metabolic reconstructions of multiple species.


Noise Contributions in an Inducible Genetic Switch: A Whole-Cell Simulation Study


Elijah Roberts1,2, Andrew Magis3, Julio O. Ortiz4, Wolfgang Baumeister4, Zaida Luthey-Schulten1,2,3*

1 Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America, 2 Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America, 3 Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America, 4 Department of Molecular Structural Biology, Max Planck Institute of Biochemistry, Martinsried, Germany

Abstract

Stochastic expression of genes produces heterogeneity in clonal populations of bacteria under identical conditions. We analyze and compare the behavior of the inducible lac genetic switch using well-stirred and spatially resolved simulations for Escherichia coli cells modeled under fast and slow-growth conditions. Our new kinetic model describing the switching of the lac operon from one phenotype to the other incorporates parameters obtained from recently published in vivo single-molecule fluorescence experiments along with in vitro rate constants. For the well-stirred system, investigation of the intrinsic noise in the circuit as a function of the inducer concentration and in the presence/absence of the feedback mechanism reveals that the noise peaks near the switching threshold. Applying maximum likelihood estimation, we show that the analytic two-state model of gene expression can be used to extract stochastic rates from the simulation data. The simulations also provide mRNA–protein probability landscapes, which demonstrate that switching is the result of crossing both mRNA and protein thresholds. Using cryoelectron tomography of an E. coli cell and data from proteomics studies, we construct spatial in vivo models of cells and quantify the noise contributions and effects on repressor rebinding due to cell structure and crowding in the cytoplasm. Compared to systems without spatial heterogeneity, the model for the fast-growth cells predicts a slight decrease in the overall noise and an increase in the repressor rebinding rate due to anomalous subdiffusion. The tomograms for E. coli grown under slow-growth conditions identify the positions of the ribosomes and the condensed nucleoid. The smaller slow-growth cells have increased mRNA localization and a larger internal inducer concentration, leading to a significant decrease in the lifetime of the repressor–operator complex and an increase in the frequency of transcriptional bursts.
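
For readers unfamiliar with the two-state model of gene expression mentioned in this abstract, the following is a minimal well-stirred sketch using the Gillespie stochastic simulation algorithm. It is not the authors' lac model: the function name, the rate names (k_on, k_off, k_tx, k_deg) and any values passed to it are illustrative assumptions, and it tracks only promoter state and mRNA count.

```python
import random


def simulate_two_state(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of a promoter that switches on/off and makes mRNA.

    All rate names are illustrative assumptions, not values from the paper.
    Returns a list of (time, gene_on, mrna_count) tuples, one per event.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, False, 0
    trajectory = [(t, gene_on, mrna)]
    while True:
        # Propensities of the four possible reactions in the current state.
        rates = [
            0.0 if gene_on else k_on,   # 0: promoter switches on
            k_off if gene_on else 0.0,  # 1: promoter switches off
            k_tx if gene_on else 0.0,   # 2: transcription (only when on)
            k_deg * mrna,               # 3: degradation of one mRNA
        ]
        total = sum(rates)
        if total == 0.0:
            break
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its rate.
        reaction = rng.choices(range(4), weights=rates)[0]
        if reaction == 0:
            gene_on = True
        elif reaction == 1:
            gene_on = False
        elif reaction == 2:
            mrna += 1
        else:
            mrna -= 1
        trajectory.append((t, gene_on, mrna))
    return trajectory


if __name__ == "__main__":
    events = simulate_two_state(k_on=0.5, k_off=0.5, k_tx=5.0, k_deg=1.0, t_end=20.0)
    print(len(events), "events; final state:", events[-1])
```

Transcription occurs only while the promoter is on, so mRNA is produced in bursts; this is the kind of trajectory from which switching rates could in principle be estimated.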



Genome Research 


Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes

Ovidiu Popa1, Einat Hazkani-Covo2, Giddy Landan3, William Martin1 and Tal Dagan1,4

1 Institute of Botany III, Heinrich-Heine University Düsseldorf, Düsseldorf 40225, Germany;
2 Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27705, USA;
3 Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204-5001, USA

    Abstract

    Lateral gene transfer (LGT) plays a major role in prokaryote evolution with only a few genes that are resistant to it; yet the nature and magnitude of barriers to lateral transfer are still debated. Here, we implement directed networks to investigate donor–recipient events of recent lateral gene transfer among 657 sequenced prokaryote genomes. For 2,129,548 genes investigated, we detected 446,854 recent lateral gene transfer events through nucleotide pattern analysis. Among these, donor–recipient relationships could be specified through phylogenetic reconstruction for 7% of the pairs, yielding 32,028 polarized recent gene acquisition events, which constitute the edges of our directed networks. We find that the frequency of recent LGT is linearly correlated both with genome sequence similarity and with proteome similarity of donor–recipient pairs. Genome sequence similarity accounts for 25% of the variation in gene-transfer frequency, with proteome similarity adding only 1% to the variability explained. The range of donor–recipient GC content similarity within the network is extremely narrow, with 86% of the LGTs occurring between donor–recipient pairs having ≤5% difference in GC content. Hence, genome sequence similarity and GC content similarity are strong barriers to LGT in prokaryotes. But they are not insurmountable, as we detected 1530 recent transfers between distantly related genomes. The directed network revealed that recipient genomes of distant transfers encode proteins of nonhomologous end-joining (NHEJ; a DNA repair mechanism) far more frequently than the recipient lacking that mechanism. This implicates NHEJ in genes spread across distantly related prokaryotes through bypassing the donor–recipient sequence similarity barrier.
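
As a rough illustration of the directed-network idea in this abstract, the sketch below stores each polarized transfer as a donor-to-recipient edge and computes the share of transfers between genomes of similar GC content. The genome names, GC values and helper names are invented for illustration and are not data or code from the paper; only the final ratio uses figures quoted in the abstract (32,028 polarized events out of 446,854 detected, about 7%).

```python
from collections import Counter


def build_directed_network(events):
    """Count transfers per (donor, recipient) edge; direction matters."""
    return Counter(events)


def fraction_within_gc(events, gc_content, max_diff=5.0):
    """Share of transfers whose donor and recipient differ by <= max_diff % GC."""
    close = sum(
        1
        for donor, recipient in events
        if abs(gc_content[donor] - gc_content[recipient]) <= max_diff
    )
    return close / len(events)


# Hypothetical toy data: genome names and GC percentages are made up.
gc_content = {"genome_A": 50.0, "genome_B": 52.5, "genome_C": 65.0}
events = [
    ("genome_A", "genome_B"),
    ("genome_A", "genome_B"),
    ("genome_B", "genome_A"),
    ("genome_C", "genome_A"),
]

network = build_directed_network(events)
print(network[("genome_A", "genome_B")])       # edge weight, donor A to recipient B
print(fraction_within_gc(events, gc_content))  # share of transfers within 5% GC
print(round(32028 / 446854, 3))                # polarized share reported in the abstract
```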

    Nature Structural & Molecular Biology

    NATURE STRUCTURAL & MOLECULAR BIOLOGY | ARTICLE


    How mutations in tRNA distant from the anticodon affect the fidelity of decoding

    Nature Structural & Molecular Biology 18, 432–436 (2011)
    doi:10.1038/nsmb.2003
    Received 19 October 2010; Accepted 08 December 2010; Published online 06 March 2011

    Abstract

    The ribosome converts genetic information into protein by selecting aminoacyl tRNAs whose anticodons base-pair to an mRNA codon. Mutations in the tRNA body can perturb this process and affect fidelity. The Hirsh suppressor is a well-studied tRNATrp harboring a G24A mutation that allows readthrough of UGA stop codons. Here we present crystal structures of the 70S ribosome complexed with EF-Tu and aminoacyl tRNA (native tRNATrp, G24A tRNATrp or the miscoding A9C tRNATrp) bound to cognate UGG or near-cognate UGA codons, determined at 3.2-Å resolution. The A9C and G24A mutations lead to miscoding by facilitating the distortion of tRNA required for decoding. A9C accomplishes this by increasing tRNA flexibility, whereas G24A allows the formation of an additional hydrogen bond that stabilizes the distortion. Our results also suggest that each native tRNA will adopt a unique conformation when delivered to the ribosome that allows accurate decoding.


    NATURE STRUCTURAL & MOLECULAR BIOLOGY | ARTICLE


    mRNA translocation occurs during the second step of ribosomal intersubunit rotation

    Nature Structural & Molecular Biology 18, 457–462 (2011)
    doi:10.1038/nsmb.2011
    Received 09 June 2010; Accepted 15 December 2010; Published online 13 March 2011

    Abstract

    During protein synthesis, mRNA and tRNA undergo coupled translocation through the ribosome in a process that is catalyzed by elongation factor G (EF-G). On the basis of cryo-EM reconstructions, counterclockwise and clockwise rotational movements between the large and small ribosomal subunits have been implicated in a proposed ratcheting mechanism to drive the unidirectional movement of translocation. We used a combination of two fluorescence-based approaches to study the timing of these events: intersubunit fluorescence resonance energy transfer measurements to observe relative rotational movement of the subunits, and a fluorescence quenching assay to monitor translocation of mRNA. Binding of EF-G–GTP first induces rapid counterclockwise intersubunit rotation, followed by a slower, clockwise reversal of the rotational movement. We compared the rates of these movements and found that mRNA translocation occurs during the second, clockwise rotation event, corresponding to the transition from the hybrid state to the classical state.

    Feb 28 - Mar 17 (2011) by Alyssa Carrell

    posted Mar 28, 2011 11:42 AM by UCmerced CompBioJournalClub

    A Boolean-based systems biology approach to predict novel genes associated with cancer: Application to colorectal cancer

    Shivashankar H Nagaraj and Antonio Reverter

    Computational and Systems Biology, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Division of Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia

    BMC Systems Biology 2011, 5:35 doi:10.1186/1752-0509-5-35

    Published: 26 February 2011

    Abstract

    Background

    Cancer has remarkable complexity at the molecular level, with multiple genes, proteins, pathways and regulatory interconnections being affected. We introduce a systems biology approach to study cancer that formally integrates the available genetic, transcriptomic, epigenetic and molecular knowledge on cancer biology and, as a proof of concept, we apply it to colorectal cancer.

    Results

    We first classified all the genes in the human genome into cancer-associated and non-cancer-associated genes based on extensive literature mining. We then selected a set of functional attributes proven to be highly relevant to cancer biology that includes protein kinases, secreted proteins, transcription factors, post-translational modifications of proteins, DNA methylation and tissue specificity. These cancer-associated genes were used to extract 'common cancer fingerprints' through these molecular attributes, and a Boolean logic was implemented in such a way that both the expression data and functional attributes could be rationally integrated, allowing for the generation of a guilt-by-association algorithm to identify novel cancer-associated genes. Finally, these candidate genes are interlaced with the known cancer-related genes in a network analysis aimed at identifying highly conserved gene interactions that impact cancer outcome. We demonstrate the effectiveness of this approach using colorectal cancer as a test case and identify several novel candidate genes that are classified according to their functional attributes. These genes include the following: 1) secreted proteins as potential biomarkers for the early detection of colorectal cancer (FXYD1, GUCA2B, REG3A); 2) kinases as potential drug candidates to prevent tumor growth (CDC42BPB, EPHB3, TRPM6); and 3) potential oncogenic transcription factors (CDK8, MEF2C, ZIC2).

    Conclusion

    We argue that this is a holistic approach that faithfully mimics cancer characteristics, efficiently predicts novel cancer-associated genes and has universal applicability to the study and advancement of cancer research.

    Genome Research

    Adaptive seeds tame genomic sequence comparison

    Szymon M. Kiełbasa1, Raymond Wan2, Kengo Sato3, Paul Horton2 and Martin C. Frith2,4

    1 Department of Computational Biology, Max Planck Institute for Molecular Genetics, Berlin D-14195, Germany;
    2 Computational Biology Research Center, Tokyo 135-0064, Japan;
    3 Graduate School of Frontier Sciences, University of Tokyo, Chiba 277-8561, Japan

      Abstract

      The main way of analyzing biological sequences is by comparing and aligning them to each other. It remains difficult, however, to compare modern multi-billion-base DNA data sets. The difficulty is caused by the nonuniform (oligo)nucleotide composition of these sequences, rather than their size per se. To solve this problem, we modified the standard seed-and-extend approach (e.g., BLAST) to use adaptive seeds. Adaptive seeds are matches that are chosen based on their rareness, instead of using fixed-length matches. This method guarantees that the number of matches, and thus the running time, increases linearly, instead of quadratically, with sequence length. LAST, our open source implementation of adaptive seeds, enables fast and sensitive comparison of large sequences with arbitrarily nonuniform composition.
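
The idea of choosing seeds by rareness rather than by fixed length can be sketched in a few lines. This is a toy illustration, not LAST's implementation: it assumes that a seed at each query position is the shortest match occurring at most max_hits times in the reference, it counts occurrences by naive scanning rather than with an index such as a suffix array, and the function and parameter names are made up for this example.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def adaptive_seeds(query, reference, max_hits=2):
    """For each query position, lengthen the match until it is rare enough.

    Returns (query_position, seed, hits) for every position where some
    prefix of the remaining query occurs between 1 and max_hits times.
    """
    seeds = []
    for i in range(len(query)):
        for end in range(i + 1, len(query) + 1):
            hits = count_occurrences(reference, query[i:end])
            if hits == 0:
                break  # no match at all; a longer pattern cannot match either
            if hits <= max_hits:
                seeds.append((i, query[i:end], hits))
                break  # shortest sufficiently rare match found
    return seeds


if __name__ == "__main__":
    # The repetitive "ACACAC" region forces a longer seed at query position 0.
    for seed in adaptive_seeds("ACGT", "ACACACGT", max_hits=1):
        print(seed)
```

In repetitive regions the seed grows until it becomes rare, while in unusual regions a short seed suffices; because each query position yields at most max_hits matches, the total number of seed matches is bounded by max_hits times the query length.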

      Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons

      Brian J. Haas1,9, Dirk Gevers1, Ashlee M. Earl1, Mike Feldgarden1, Doyle V. Ward1, Georgia Giannoukos1, Dawn Ciulla1, Diana Tabbaa1, Sarah K. Highlander2,3, Erica Sodergren4, Barbara Methé5, Todd Z. DeSantis6, The Human Microbiome Consortium, Joseph F. Petrosino2,3, Rob Knight7,8 and Bruce W. Birren1

      1 Genome Sequencing and Analysis Program, The Broad Institute, Cambridge, Massachusetts 02142, USA;
      2 Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA;
      3 Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, Texas 77030, USA;
      4 The Genome Center, Washington University School of Medicine, St. Louis, Missouri 63108, USA;
      5 Human Genomic Medicine, J. Craig Venter Institute, Rockville, Maryland 20850, USA;
      6 Earth Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA;
      7 Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309, USA;
      8 Howard Hughes Medical Institute, University of Colorado, Boulder, Colorado 80309, USA

        Abstract

        Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys.
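
To make the notion of a chimera concrete, here is a toy check on pre-aligned, equal-length sequences. It is explicitly not the Chimera Slayer algorithm: it simply asks whether a two-parent model (left part from one parent, right part from another) explains the query with clearly fewer mismatches than any single parent. The function names, the min_gain threshold and the example sequences are assumptions made for illustration.

```python
def mismatches(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_chimera_model(query, parents):
    """Best (score, breakpoint, left_parent, right_parent) over all splits.

    `parents` maps a name to a sequence aligned to, and as long as, `query`.
    Returns None if fewer than two parents are supplied.
    """
    best = None
    for breakpoint in range(1, len(query)):
        for left_name, left_seq in parents.items():
            for right_name, right_seq in parents.items():
                if left_name == right_name:
                    continue
                score = mismatches(query[:breakpoint], left_seq[:breakpoint])
                score += mismatches(query[breakpoint:], right_seq[breakpoint:])
                if best is None or score < best[0]:
                    best = (score, breakpoint, left_name, right_name)
    return best


def looks_chimeric(query, parents, min_gain=2):
    """True if a two-parent model beats the best single parent by min_gain."""
    best_single = min(mismatches(query, seq) for seq in parents.values())
    model = best_chimera_model(query, parents)
    return model is not None and best_single - model[0] >= min_gain


if __name__ == "__main__":
    parents = {"A": "AAAAAAAA", "B": "CCCCCCCC"}
    print(best_chimera_model("AAAACCCC", parents))
    print(looks_chimeric("AAAACCCC", parents))
    print(looks_chimeric("AAAAAAAA", parents))
```

A production tool would also have to align the sequences, search a large reference database for candidate parents and assess the statistical support for each breakpoint.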

        Bio::Phylo - phyloinformatic analysis using Perl

        Rutger A Vos1, Jason Caravas2, Klaas Hartmann3, Mark A Jensen4 and Chase Miller5

        1 School of Biological Sciences, University of Reading, UK

        2 Department of Biological Sciences, Wayne State University, Detroit, MI, USA

        3 Tasmanian Aquaculture and Fisheries Institute, University of Tasmania, Australia

        4 Fortinbras Research, Rockville, MD, USA

        5 Center for Infection and Immunity, Columbia University, New York, NY, USA

        BMC Bioinformatics 2011, 12:63 doi:10.1186/1471-2105-12-63

        Published: 27 February 2011

        Abstract

        Background

        Phyloinformatic analyses involve large amounts of data and metadata of complex structure. Collecting, processing, analyzing, visualizing and summarizing these data and metadata should be done in steps that can be automated and reproduced. This requires flexible, modular toolkits that can represent, manipulate and persist phylogenetic data and metadata as objects with programmable interfaces.

        Results

        This paper presents Bio::Phylo, a Perl5 toolkit for phyloinformatic analysis. It implements classes and methods that are compatible with the well-known BioPerl toolkit, but is independent from it (making it easy to install) and features a richer API and a data model that is better able to manage the complex relationships between different fundamental data and metadata objects in phylogenetics. It supports commonly used file formats for phylogenetic data including the novel NeXML standard, which allows rich annotations of phylogenetic data to be stored and shared. Bio::Phylo can interact with BioPerl, thereby giving access to the file formats that BioPerl supports. Many methods for data simulation, transformation and manipulation, the analysis of tree shape, and tree visualization are provided.

        Conclusions

        Bio::Phylo is composed of 59 richly documented Perl5 modules. It has been deployed successfully on a variety of computer architectures (including various Linux distributions, Mac OS X versions, Windows, Cygwin and UNIX-like systems). It is available as open source (GPL) software from http://search.cpan.org/dist/Bio-Phylo

        compomics-utilities: an open-source Java library for computational proteomics

        Harald Barsnes1,2, Marc Vaudel3, Niklaas Colaert4,5, Kenny Helsens4,5, Albert Sickmann3, Frode S Berven1 and Lennart Martens4,5

        Proteomics Unit, Department of Biomedicine, University of Bergen, Norway

        Computational Biology Unit, UniComputing, Bergen, Norway

        Leibniz - Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany

        Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium

        Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium


        BMC Bioinformatics 2011, 12:70 doi:10.1186/1471-2105-12-70

        Published: 8 March 2011

        Abstract

        Background

        The growing interest in the field of proteomics has increased the demand for software tools and applications that process and analyze the resulting data. And even though the purpose of these tools can vary significantly, they usually share a basic set of features, including the handling of protein and peptide sequences, the visualization of (and interaction with) spectra and chromatograms, and the parsing of results from various proteomics search engines. Developers typically spend considerable time and effort implementing these support structures, which detracts from working on the novel aspects of their tool.

        Results

        In order to simplify the development of proteomics tools, we have implemented an open-source support library for computational proteomics, called compomics-utilities. The library contains a broad set of features required for reading, parsing, and analyzing proteomics data. compomics-utilities is already used by a long list of existing software, ensuring library stability and continued support and development.

        Conclusions

        As a user-friendly, well-documented and open-source library, compomics-utilities greatly simplifies the implementation of the basic features needed in most proteomics tools. Implemented in 100% Java, compomics-utilities is fully portable across platforms and architectures. Our library thus allows the developers to focus on the novel aspects of their tools, rather than on the basic functions, which can contribute substantially to faster development, and better tools for proteomics.

        Science 4 March 2011:
        Vol. 331 no. 6021 pp. 1185-1188 
        DOI: 10.1126/science.1199707
        • REPORT

        Pseudomonas sax Genes Overcome Aliphatic Isothiocyanate–Mediated Non-Host Resistance in Arabidopsis

        1. Jun Fan1,*
        2. Casey Crooks1,2,
        3. Gary Creissen1
        4. Lionel Hill1
        5. Shirley Fairhurst1
        6. Peter Doerner3,4,*, and
        7. Chris Lamb1,

        +Author Affiliations

        1. 1John Innes Centre, Norwich NR4 7UH, UK.
        2. 2USDA Forest Products Laboratory, 1 Gifford Pinchot Drive, Madison, WI 53726, USA.
        3. 3Institute of Molecular Plant Sciences, University of Edinburgh, Mayfield Road, Edinburgh EH9 3JH, UK.
        4. 4Laboratoire de Physiologie Cellulaire Végétale, CNRS, CEA, INRA, and Université Joseph Fourier, F-38000 Grenoble, France

        +Author Notes

        •  Deceased.

        1. *To whom correspondence should be addressed. E-mail: jun.fan@bbsrc.ac.uk (J.F.); peter.doerner@ed.ac.uk (P.D.)
        1.  These authors contributed equally to this work.

        ABSTRACT

        Most plant-microbe interactions do not result in disease; natural products restrict non-host pathogens. We found that sulforaphane (4-methylsulfinylbutyl isothiocyanate), a natural product derived from aliphatic glucosinolates, inhibits growth in Arabidopsis of non-host Pseudomonas bacteria in planta. Multiple sax genes (saxCAB/F/D/G) were identified in Pseudomonas species virulent on Arabidopsis. These sax genes are required to overwhelm isothiocyanate-based defenses and facilitate a disease outcome, especially in the young leaves critical for plant survival. Introduction of saxCAB genes into non-host strains enabled them to overcome these Arabidopsis defenses. Our study shows that aliphatic isothiocyanates, previously shown to limit damage by herbivores, are also crucial, robust, and developmentally regulated defenses that underpin non-host resistance in the Arabidopsis-Pseudomonas pathosystem.

        • Received for publication 28 October 2010.
        • Accepted for publication 24 January 2011.

        • Science 18 March 2011: 
          Vol. 331 no. 6023 pp. 1383-1384 
          DOI: 10.1126/science.331.6023.1383
          • NEWS FOCUS
          CONSERVATION ECOLOGY

          Embracing Invasives

          Much of the fauna and flora of the Galápagos islands is unique, but introduced species are taking over. Conservationists have spent the past 50 years attempting to remove introduced species and restore the islands' flora and fauna to prehuman days. There have been some successes: Goats have been eliminated from several islands. But the effort to eradicate blackberry, guava, and 34 other invasive plant species has cost more than $1 million and succeeded in eliminating just four. The most invasive and problematic of these aliens—blackberry and guava—have developed into forests where nothing else grows, birds cannot nest, and even insects are rare. The main reason for this failure is that invasive plants are far more competitive than native plants. Seeds of invasive species, such as blackberries, are long-lived and accumulate in high numbers in the soil, and restoration activities can have the paradoxical effect of stimulating them to germinate. Now, a group of maverick ecologists is promoting the idea that the addition of nonnative species to natives in a region leads to "novel" or "hybrid" ecosystems that have ecological value and may be worthy of conservation.

        • Cover

          The population dynamics of bacteria in physically structured habitats and the adaptive virtue of random motility

          1. Yan Weia,b
          2. Xiaolin Wangc
          3. Jingfang Liuc
          4. Ilya Nememand,e,
          5. Amoolya H. Singha,e
          6. Howie Weissc, and 
          7. Bruce R. Levina,1

          +Author Affiliations

          1. aDepartment of Biology,
          2. bGraduate Program in Population Biology, Ecology, and Evolution,
          3. dDepartments of Physics and Biology, and
          4. eComputational and Life Sciences Strategic Initiative, Emory University, Atlanta, GA 30322; and
          5. cDepartment of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332
          1. Edited* by Robert May, University of Oxford, Oxford, United Kingdom, and approved January 12, 2011 (received for review September 9, 2010)

          Abstract

          Why is motility so common in bacteria? An obvious answer to this ecological and evolutionary question is that in almost all habitats, bacteria need to go someplace and particularly in the direction of food. Although the machinery required for motility and chemotaxis (acquiring and processing the information needed to direct movement toward nutrients) are functionally coupled in contemporary bacteria, they are coded for by different sets of genes. Moreover, information that resources are more abundant elsewhere in a habitat would be of no value to a bacterium unless it already had the means to get there. Thus, motility must have evolved before chemotaxis, and bacteria with flagella and other machinery for propulsion in random directions must have an advantage over bacteria relegated to moving at the whim of external forces alone. However, what are the selection pressures responsible for the evolution and maintenance of undirected motility in bacteria? Here we use a combination of mathematical modeling and experiments with Escherichia coli to generate and test a parsimonious and ecologically general hypothesis for the existence of undirected motility in bacteria: it enables bacteria to move away from each other and thereby obtain greater individual shares of resources in physically structured environments. The results of our experiments not only support this hypothesis, but are quantitatively and qualitatively consistent with the predictions of our model.
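The hypothesis that undirected motility lets cells move apart and claim larger individual shares of resources can be illustrated with a toy lattice random walk. This is a minimal sketch, not the authors' mathematical model: the one-dimensional lattice, the step rule, and the function name are all assumptions made for illustration.

```python
import random


def occupied_sites(n_cells, n_steps, motile, seed=0):
    """Return how many distinct lattice sites a founder population ends up on.

    All cells start at site 0. Motile cells take unbiased +/-1 steps
    (random motility with no chemotaxis); non-motile cells never move.
    """
    rng = random.Random(seed)
    positions = [0] * n_cells
    if motile:
        for _ in range(n_steps):
            positions = [p + rng.choice((-1, 1)) for p in positions]
    return len(set(positions))


# If each site holds one unit of resource, more occupied sites means a
# larger average share per cell.
sessile_sites = occupied_sites(20, 50, motile=False)
motile_sites = occupied_sites(20, 50, motile=True)
```

Non-motile cells all compete for the single founding site, whereas the randomly moving population spreads over several sites without any directional information.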

          Clustering to identify RNA conformations constrained by secondary structure

          1. Adelene Y. L. Sima and 
          2. Michael Levittb,1

          +Author Affiliations

          1. aDepartment of Applied Physics, Stanford University, Stanford, CA 94305; and
          2. bDepartment of Structural Biology, Stanford University School of Medicine, D100 Fairchild Building, Stanford, CA 94305
          1. Contributed by Michael Levitt, December 21, 2010 (sent for review October 24, 2010)

          Abstract

          RNA often folds hierarchically, so that its sequence defines its secondary structure (helical base-paired regions connected by single-stranded junctions), which subsequently defines its tertiary fold. To preserve base-pairing and chain connectivity, the three-dimensional conformations that RNA can explore are strongly confined compared to when secondary structure constraints are not enforced. Using three examples, we studied how secondary structure confines and dictates an RNA’s preferred conformations. We made use of Macromolecular Conformations by SYMbolic programming (MC-Sym) fragment assembly to generate RNA conformations constrained by secondary structure. Then, to understand the correlations between different helix placements and orientations, we robustly clustered all RNA conformations by employing unique methods to remove outliers and estimate the best number of conformational clusters. We observed that the preferred conformation (as judged by largest cluster size) for each type of RNA junction molecule tested is consistent with its biological function. Further, the improved quality of models in our pruned datasets facilitates subsequent discrimination using scoring functions based either on statistical analysis (knowledge based) or experimental data.


          Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample

          1. J. Gregory Caporasoa
          2. Christian L. Lauberb
          3. William A. Waltersc,
          4. Donna Berg-Lyonsb
          5. Catherine A. Lozuponea,
          6. Peter J. Turnbaughd,
          7. Noah Fiererb,e, and 
          8. Rob Knighta,f,1

          +Author Affiliations

          1. aDepartment of Chemistry and Biochemistry,
          2. bCooperative Institute for Research in Environmental Sciences,
          3. cDepartment of Molecular, Cellular, and Developmental Biology, and
          4. eDepartment of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO 80309;
          5. dHarvard FAS Center for Systems Biology, Cambridge, MA 02138; and
          6. fHoward Hughes Medical Institute, Boulder, CO 80309
          1. Edited by Jeffrey I. Gordon, Washington University School of Medicine, St. Louis, MO, and approved April 30, 2010 (received for review February 27, 2010)

          Abstract

          The ongoing revolution in high-throughput sequencing continues to democratize the ability of small groups of investigators to map the microbial component of the biosphere. In particular, the coevolution of new sequencing platforms and new software tools allows data acquisition and analysis on an unprecedented scale. Here we report the next stage in this coevolutionary arms race, using the Illumina GAIIx platform to sequence a diverse array of 25 environmental samples and three known “mock communities” at a depth averaging 3.1 million reads per sample. We demonstrate excellent consistency in taxonomic recovery and recapture diversity patterns that were previously reported on the basis of metaanalysis of many studies from the literature (notably, the saline/nonsaline split in environmental samples and the split between host-associated and free-living communities). We also demonstrate that 2,000 Illumina single-end reads are sufficient to recapture the same relationships among samples that we observe with the full dataset. The results thus open up the possibility of conducting large-scale studies analyzing thousands of samples simultaneously to survey microbial communities at an unprecedented spatial and temporal resolution.
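The observation that 2,000 reads per sample recapture the same relationships is, in practice, tested by subsampling each sample to a fixed depth. The sketch below shows one common way to do that (sampling reads without replacement); the function name and the taxon counts are hypothetical, and the paper's own pipeline may differ.

```python
import random
from collections import Counter


def rarefy(taxon_counts, depth, seed=0):
    """Subsample a sample's reads to a fixed depth without replacement.

    taxon_counts maps a taxon name to its number of reads.
    Returns a Counter of taxon -> reads retained in the subsample.
    """
    reads = [taxon for taxon, n in taxon_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


# Hypothetical sample; real samples in the study averaged millions of reads.
sample = {"Proteobacteria": 15000, "Firmicutes": 10000, "Bacteroidetes": 6000}
subsample = rarefy(sample, depth=2000)
```

Rarefying every sample to the same depth before computing between-sample distances keeps differences in sequencing effort from being mistaken for differences in community composition.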



          Journal of Computational Biology
          naiveBayesCall: An Efficient Model-Based Base-Calling Algorithm for High-Throughput Sequencing

          To cite this article:
          Wei-Chun Kao, Yun S. Song. Journal of Computational Biology. March 2011, 18(3): 365-377. doi:10.1089/cmb.2010.0247.

          Published in Volume: 18 Issue 3: March 8, 2011



          Wei-Chun Kao1 and
          Yun S. Song1,2
          1Department of EECS, University of California, Berkeley, California.
          2Department of Statistics, University of California, Berkeley, California.
          Address correspondence to:
          Dr. Yun S. Song
          Departments of EECS and Statistics
          University of California
          Berkeley, CA 94720, USA
          E-mail: yss@eecs.berkeley.edu

          Abstract

          Immense amounts of raw instrument data (i.e., images of fluorescence) are currently being generated using ultra high-throughput sequencing platforms. An important computational challenge associated with this rapid advancement is to develop efficient algorithms that can extract accurate sequence information from raw data. To address this challenge, we recently introduced a novel model-based base-calling algorithm that is fully parametric and has several advantages over previously proposed methods. Our original algorithm, called BayesCall, significantly reduced the error rate, particularly in the later cycles of a sequencing run, and also produced useful base-specific quality scores with a high discrimination ability. Unfortunately, however, BayesCall is too computationally expensive to be of broad practical use. In this article, we build on our previous model-based approach to devise an efficient base-calling algorithm that is orders of magnitude faster than BayesCall, while still maintaining a comparably high level of accuracy. Our new algorithm is called naiveBayesCall, and it utilizes approximation and optimization methods to achieve scalability. We describe the performance of naiveBayesCall and demonstrate how improved base-calling accuracy may facilitate de novo assembly and SNP detection when the sequence coverage depth is low to moderate.
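Base-specific quality scores are conventionally reported on the Phred scale, which maps an estimated error probability to a score. The abstract does not state which scale the authors use, so the sketch below assumes the standard Phred definition; the function names are illustrative.

```python
import math


def phred_quality(p_error):
    """Convert an estimated base-call error probability to a Phred score."""
    return -10.0 * math.log10(p_error)


def error_probability(q):
    """Convert a Phred score back to an error probability."""
    return 10.0 ** (-q / 10.0)


# An error probability of 1 in 1,000 corresponds to a score of about 30.
q30 = phred_quality(0.001)
```

A score with high discrimination ability is one where bases given a high score really are wrong less often, which matters for downstream steps such as SNP detection at low coverage.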

          Molecular Systems Biology

          Synopsis

          Subject Categories: Bioinformatics | Functional genomics

          Molecular Systems Biology 7 Article number: 473  doi:10.1038/msb.2011.6
          Published online: 15 March 2011
          Citation: Molecular Systems Biology 7:473

          Toward molecular trait-based ecology through integration of biogeochemical, geographical and metagenomic data

          Jeroen Raes1,2, Ivica Letunic1, Takuji Yamada1, Lars Juhl Jensen1,3 & Peer Bork1,4

          1. Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
          2. Molecular and Cellular Interactions Department, VIB – Vrije Universiteit Brussel, Brussels, Belgium
          3. NNF Center for Protein Research, Copenhagen, Denmark
          4. Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany

          Correspondence to: Peer Bork1,4 Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, Heidelberg 69117, Germany. Tel.: +49 6 221 387 8526; Fax: +49 6 221 387 8517; Email: bork@embl.de

          Received 4 May 2010; Accepted 25 January 2011; Published online 15 March 2011


          Article highlights

          • Climatic factors drive functional and phylogenetic composition of ocean microbial communities.
          • Function dispersal is controlled by environmental conditions.
          • Functional richness has a clear latitudinal gradient and correlates with primary production.
          • Metagenomic data can be used as a predictor for ecosystem processes.
          • To understand the relationship between community composition and environment, functional readouts are the most direct. Metagenomic data enable such trait-based ecology at the molecular level.

          Synopsis

          Metagenomics (shotgun sequencing of pooled DNA of complete microbial communities) is widely used to investigate ecosystem functioning of environmental and clinical samples. However, the nature of this data (usually a gigantic collection of gene fragments of 1000s of organisms) makes it very hard to infer global patterns on microbial ecology of the environment at hand. To address important ecological questions such as ‘How do microbial communities adapt to the environmental conditions?’, ‘What drives the functional variation across the globe and to what extent do genes disperse?’ and ‘What drives variation of CO2 uptake across different locations and communities?’, we integrated 25 ocean metagenomes from the Global Ocean Sampling project with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the functional and phylogenetic composition of an environment and the main limiting factor on whether functions disperse across the planet. We find a distinct latitudinal gradient in the size and diversity of the functional repertoire of ocean microbial communities, which peaks at 20°N and correlates with oceanic CO2 uptake. The latter can also be predicted from the molecular functional composition of an environmental sample. Together, our results show that the functional community composition derived from metagenomes can be used as a quantitative predictor for molecular trait-based biogeography and ecology.


          PLoS Computational Biology: a peer-reviewed open-access journal published by the Public Library of Science

          Dynamic Phenotypic Clustering in Noisy Ecosystems

          Morten Ernebjerg1, Roy Kishony1,2*

          1 Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America, 2 School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States of America

          Abstract

          In natural ecosystems, hundreds of species typically share the same environment and are connected by a dense network of interactions such as predation or competition for resources. Much is known about how fixed ecological niches can determine species abundances in such systems, but far less attention has been paid to patterns of abundances in randomly varying environments. Here, we study this question in a simple model of competition between many species in a patchy ecosystem with randomly fluctuating environmental conditions. Paradoxically, we find that introducing noise can actually induce ordered patterns of abundance-fluctuations, leading to a distinct periodic variation in the correlations between species as a function of the phenotypic distance between them; here, difference in growth rate. This is further accompanied by the formation of discrete, dynamic clusters of abundant species along this otherwise continuous phenotypic axis. These ordered patterns depend on the collective behavior of many species; they disappear when only individual or pairs of species are considered in isolation. We show that they arise from a balance between the tendency of shared environmental noise to synchronize species abundances and the tendency for competition among species to make them fluctuate out of step. Our results demonstrate that in highly interconnected ecosystems, noise can act as an ordering force, dynamically generating ecological patterns even in environments lacking explicit niches.

          Author Summary

          In natural ecosystems, hundreds of species with different characteristics typically live side by side, some competing for the same foods and some preying on others. A central question in ecology is how the abundance of a given species in such an ecosystem depends on its particular characteristics (its phenotype). Clearly, fixed environments can favor certain phenotypes (thick fur in a cold climate), but what happens when environmental conditions fluctuate randomly, as the weather does? We investigated this question using a simple mathematical model of an ecosystem with many competing species. We found that, paradoxically, randomness in the environment can lead to the appearance of ordered clusters of abundant species with similar phenotypes, with the species adopting intermediate phenotypes being much less abundant (a mountains-and-valleys pattern). The clusters move around so that different phenotypes are favored at different times. We found that these effects arise from the tension between the tendency of noise to level out differences in abundance and the tendency of competition to create larger abundance differences.


          A protein fold classifier formed by fusing different modes of pseudo amino acid composition via PSSM

          Kaveh Kavousia, Behzad Moshiria, Mehdi Sadeghib,d (corresponding author), Babak N. Araabia and Ali Akbar Moosavi-Movahedic

          a Control and Intelligent Processing Center of Excellence, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran

          b National Institute of Genetic Engineering and Biotechnology, Tehran, Iran

          c Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran

          d School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran

          Received 18 July 2010;  
          revised 13 November 2010;  
          accepted 13 December 2010.  
          Available online 17 December 2010. 

          Abstract

          Protein function is related to its chemical reaction to the surrounding environment, including other proteins. This in turn depends on the spatial shape and tertiary structure of the protein and the folding of its constituent components in space. The correct identification of a protein domain fold using only information extracted from the protein sequence is a complicated and controversial task in current computational biology. In this article a combined classifier based on the information content of features extracted from the primary structure of the protein is introduced to address this challenging problem. In the first stage of our proposed two-tier architecture, there are several classifiers, each of which is trained with a different sequence-based feature vector. Apart from the predicted secondary structure, hydrophobicity, van der Waals volume, polarity, polarizability, and different dimensions of pseudo-amino acid composition vectors used in similar studies, the position specific scoring matrix (PSSM) has also been used to improve the correct classification rate (CCR) in this study. Using K-fold cross validation on a training dataset related to 27 famous folds of SCOP, the 28 dimensional probability output vector from each evidence theoretic K-NN classifier is used to determine the information content or expertness of the corresponding feature for discrimination in each fold class. In the second stage, the outputs of the classifiers for the test dataset are fused using the Sugeno fuzzy integral operator to make a better decision for the target fold class. The expertness factor of each classifier in each fold class has been used to calculate the fuzzy integral operator weights. The results make it possible to provide a deeper interpretation of the effectiveness of each feature for discrimination in target classes for query proteins.
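The second-stage fusion can be illustrated with the discrete Sugeno fuzzy integral, which combines per-classifier support values with a fuzzy measure describing how much each group of classifiers is trusted. The sketch below is a generic implementation of that operator for a single fold class; the classifier names, support values, and measure are invented for illustration, whereas the paper derives its weights from the expertness factors.

```python
def sugeno_integral(scores, measure):
    """Discrete Sugeno fuzzy integral.

    scores  : dict mapping a source name to its support in [0, 1]
    measure : callable taking a frozenset of source names and returning
              the fuzzy measure of that subset in [0, 1]
    """
    # Visit sources from strongest to weakest support.
    ordered = sorted(scores, key=scores.get, reverse=True)
    best = 0.0
    for i in range(1, len(ordered) + 1):
        subset = frozenset(ordered[:i])  # the i sources with the highest support
        best = max(best, min(scores[ordered[i - 1]], measure(subset)))
    return best


# Hypothetical support of three feature-specific classifiers for one fold class.
scores = {"pssm": 0.9, "hydrophobicity": 0.6, "polarity": 0.3}

# Hypothetical monotone fuzzy measure over subsets of classifiers.
table = {
    frozenset(["pssm"]): 0.5,
    frozenset(["hydrophobicity"]): 0.3,
    frozenset(["polarity"]): 0.2,
    frozenset(["pssm", "hydrophobicity"]): 0.8,
    frozenset(["pssm", "polarity"]): 0.7,
    frozenset(["hydrophobicity", "polarity"]): 0.5,
    frozenset(["pssm", "hydrophobicity", "polarity"]): 1.0,
}

fused = sugeno_integral(scores, table.__getitem__)
```

Repeating this for every fold class and choosing the class with the largest fused value gives the final decision in a scheme of this kind.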

          Research highlights

          • In this study we use a combined classifier for identification of protein domain fold.
          • Information content of extracted features of protein has been introduced to face this problem.
          • We show that position specific scoring matrix improves the correct classification rate.
          • Results provide deeper interpretation about the effectiveness of each feature for discrimination.

          Keywords: Sequence based feature; Position specific scoring matrix; Information content; Protein fold classification; Combined classifier




        Feb 14 through Feb 27 by Joshua Phillips

        posted Mar 14, 2011 9:33 AM by UCmerced CompBioJournalClub   [ updated Mar 28, 2011 10:48 AM ]



        A quality metric for homology modeling: the H-factor

        Eric di Luccio1,2 and Patrice Koehl1

        Computer Science Department, Room 4337, Genome Center, GBSF University of California Davis 451 East Health Sciences Drive Davis, CA 95616, USA

        School of Applied Biosciences, Kyungpook National University (KNU), 1370 Sangyeok-dong, Buk-gu, Daegu, 702-701, Republic of Korea

        BMC Bioinformatics 2011, 12:48

        http://dx.doi.org/10.1186/1471-2105-12-48


        Published: 4 February 2011

        Abstract

        Background

        The analysis of protein structures provides fundamental insight into most biochemical functions and consequently into the cause and possible treatment of diseases. As the structures of most known proteins cannot be solved experimentally because of technical or sometimes simply time constraints, in silico protein structure prediction is expected to step in and generate a more complete picture of the protein structure universe. Molecular modeling of protein structures is a fast growing field and a tremendous amount of work has been done since the publication of the very first model. The growth of modeling techniques, and more specifically of those that rely on the existing experimental knowledge of protein structures, is intimately linked to the development of high-resolution experimental techniques such as NMR, X-ray crystallography and electron microscopy. This strong connection between experimental and in silico methods is, however, not devoid of criticism and concerns among modelers as well as among experimentalists.

        Results

        In this paper, we focus on homology modeling and, more specifically, we review how it is perceived by the structural biology community and what can be done to impress on experimentalists that it can be a valuable resource to them. We review the common practices and provide a set of guidelines for building better models. For that purpose, we introduce the H-factor, a new indicator for assessing the quality of homology models, mimicking the R-factor in X-ray crystallography. The method for computing the H-factor is fully described and validated on a series of test cases.

        Conclusions

        We have developed a web service for computing the H-factor for models of a protein structure. This service is freely accessible at http://koehllab.genomecenter.ucdavis.edu/toolkit/h-factor.
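For readers unfamiliar with the quantity the H-factor is modeled on, the crystallographic R-factor measures the relative disagreement between observed and model-derived structure-factor amplitudes. The abstract does not give the H-factor formula, so the sketch below shows only the standard R-factor; the function name is illustrative and the amplitudes are assumed to be on a common scale already.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    f_obs and f_calc are equal-length sequences of structure-factor
    amplitudes, assumed to be already placed on a common scale.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes: a perfect model gives 0, larger values mean worse agreement.
perfect = r_factor([10, 20, 30, 40], [10, 20, 30, 40])
imperfect = r_factor([10, 20, 30, 40], [12, 18, 33, 37])
```

As with the R-factor, a single summary number of this kind is most useful for comparing alternative models of the same protein rather than as an absolute measure of correctness.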

        A novel approach to the clustering of microarray data via nonparametric density estimation

        Riccardo De Bin and Davide Risso

        Department of Statistical Sciences, University of Padova, Padova, Italy

        BMC Bioinformatics 2011, 12:49

        http://dx.doi.org/10.1186/1471-2105-12-49

        Abstract

        Background

        Cluster analysis is a crucial tool in several biological and medical studies dealing with microarray data. Such studies pose challenging statistical problems due to dimensionality issues, since the number of variables can be much higher than the number of observations.

        Results

        Here, we present a general framework to deal with the clustering of microarray data, based on a three-step procedure: (i) gene filtering; (ii) dimensionality reduction; (iii) clustering of observations in the reduced space. Via a nonparametric model-based clustering approach we obtain promising results both in simulated and real data.

        Conclusions

        The proposed algorithm is a simple and effective tool for the clustering of microarray data, in an unsupervised setting.
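The three-step structure (gene filtering, dimensionality reduction, clustering in the reduced space) can be sketched with simple stand-ins for each step. The choices here are assumptions made for illustration and are not the authors' methods: a variance filter, projection onto the first principal component found by power iteration, and a split at the largest gap in place of the nonparametric density-based clustering.

```python
import math


def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def filter_genes(samples, keep):
    """Step (i): keep the `keep` genes (columns) with the highest variance."""
    n_genes = len(samples[0])
    ranked = sorted(range(n_genes),
                    key=lambda j: variance([s[j] for s in samples]),
                    reverse=True)
    kept = sorted(ranked[:keep])
    return [[s[j] for j in kept] for s in samples]


def first_component_scores(samples, iterations=100):
    """Step (ii): project centered samples onto the first principal component."""
    p = len(samples[0])
    means = [sum(s[j] for s in samples) / len(samples) for j in range(p)]
    centered = [[s[j] - means[j] for j in range(p)] for s in samples]
    v = [1.0 / math.sqrt(p)] * p  # starting direction for power iteration
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(scores[i] * centered[i][j] for i in range(len(centered)))
             for j in range(p)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


def split_at_largest_gap(scores):
    """Step (iii), simplified: two clusters separated at the widest gap."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    gaps = [scores[order[k + 1]] - scores[order[k]] for k in range(len(order) - 1)]
    cut = gaps.index(max(gaps))
    labels = [0] * len(scores)
    for i in order[cut + 1:]:
        labels[i] = 1
    return labels


# Hypothetical data: 6 observations x 4 genes; only the first two genes are informative.
samples = [
    [1.0, 1.2, 5.0, 3.0], [1.1, 0.9, 5.1, 3.0], [0.9, 1.0, 5.0, 3.1],
    [5.0, 5.1, 5.0, 3.0], [5.2, 4.9, 5.1, 3.1], [4.9, 5.0, 4.9, 3.0],
]
labels = split_at_largest_gap(first_component_scores(filter_genes(samples, keep=2)))
```

Filtering first matters because microarray data have far more genes than observations; reducing the number of variables before clustering is what makes the final step tractable.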



        Intercellular Nanotubes Mediate Bacterial Communication

        Authors

        Cell, Volume 144, Issue 4, 590-600, 18 February 2011

        http://dx.doi.org/10.1016/j.cell.2011.01.015

        Summary

        Bacteria are known to communicate primarily via secreted extracellular factors. Here we identify a previously uncharacterized type of bacterial communication mediated by nanotubes that bridge neighboring cells. Using Bacillus subtilis as a model organism, we visualized transfer of cytoplasmic fluorescent molecules between adjacent cells. Additionally, by coculturing strains harboring different antibiotic resistance genes, we demonstrated that molecular exchange enables cells to transiently acquire nonhereditary resistance. Furthermore, nonconjugative plasmids could be transferred from one cell to another, thereby conferring hereditary features to recipient cells. Electron microscopy revealed the existence of variously sized tubular extensions bridging neighboring cells, serving as a route for exchange of intracellular molecules. These nanotubes also formed in an interspecies manner, between B. subtilis and Staphylococcus aureus, and even between B. subtilis and the evolutionarily distant bacterium Escherichia coli. We propose that nanotubes represent a major form of bacterial communication in nature, providing a network for exchange of cellular molecules within and between species.

        Site down at time of composition (links via NCBI-PubMed):

        Massive genomic rearrangement acquired in a single catastrophic event during cancer development.

        Stephens PJ, Greenman CD, Fu B, Yang F, Bignell GR, Mudie LJ, Pleasance ED, Lau KW, Beare D, Stebbings LA, McLaren S, Lin ML, McBride DJ, Varela I, Nik-Zainal S, Leroy C, Jia M, Menzies A, Butler AP, Teague JW, Quail MA, Burton J, Swerdlow H, Carter NP, Morsberger LA, Iacobuzio-Donahue C, Follows GA, Green AR, Flanagan AM, Stratton MR, Futreal PA, Campbell PJ.

        Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

        Cell. 2011 Jan 7;144(1):27-40

        http://www.ncbi.nlm.nih.gov/sites/entrez/21215367?dopt=Abstract&holding=f1000%2Cf1000m

        Abstract

        Cancer is driven by somatically acquired point mutations and chromosomal rearrangements, conventionally thought to accumulate gradually over time. Using next-generation sequencing, we characterize a phenomenon, which we term chromothripsis, whereby tens to hundreds of genomic rearrangements occur in a one-off cellular crisis. Rearrangements involving one or a few chromosomes crisscross back and forth across involved regions, generating frequent oscillations between two copy number states. These genomic hallmarks are highly improbable if rearrangements accumulate over time and instead imply that nearly all occur during a single cellular catastrophe. The stamp of chromothripsis can be seen in at least 2%-3% of all cancers, across many subtypes, and is present in ∼25% of bone cancers. We find that one, or indeed more than one, cancer-causing lesion can emerge out of the genomic crisis. This phenomenon has important implications for the origins of genomic remodeling and temporal emergence of cancer.
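The hallmark described above, frequent oscillation between two copy number states, can be summarized from a segmented copy-number profile by counting state switches and distinct states. This is a toy summary, not the authors' statistical test; the function name and both example profiles are hypothetical.

```python
def copy_number_oscillations(segment_states):
    """Summarize a chromosome's ordered copy-number segments.

    Returns (number of switches between adjacent segments,
             number of distinct copy-number states).
    """
    switches = sum(1 for a, b in zip(segment_states, segment_states[1:]) if a != b)
    return switches, len(set(segment_states))


# Many switches confined to two states is the pattern attributed to a single event.
one_off = copy_number_oscillations([2, 1, 2, 1, 2, 1, 2, 1, 2])
# Stepwise gains visit many different states, as expected if changes accumulate.
gradual = copy_number_oscillations([2, 3, 3, 4, 5, 5, 6])
```

Under gradual accumulation, each additional rearrangement tends to create new copy-number states, so many breakpoints with only two states is what makes the single-catastrophe explanation more plausible.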



        Modeling aqueous solvation with semi-explicit assembly

        1. Christopher J. Fennell (a),
        2. Charles W. Kehoe (b), and
        3. Ken A. Dill (a,1)

        Author Affiliations

        1. (a) Department of Pharmaceutical Chemistry, and
        2. (b) Graduate Group in Bioinformatics, University of California, San Francisco, CA 94143
        1. Contributed by Ken A. Dill, December 1, 2010 (sent for review October 1, 2010)

        PNAS February 22, 2011 vol. 108 no. 8 3234-3239

        http://dx.doi.org/10.1073/pnas.1017130

        Abstract

        We describe a computational solvation model called semi-explicit assembly (SEA). SEA water captures much of the physics of explicit-solvent models but with computational speeds approaching those of implicit-solvent models. We use an explicit-water model to precompute properties of water solvation shells around simple spheres, then assemble a solute’s solvation shell by combining the shells of these spheres. SEA improves upon implicit-solvent models of solvation free energies by accounting for local solute curvature, accounting for near-neighbor nonadditivities, and treating water’s dipole as being asymmetrical with respect to positive or negative solute charges. SEA does not involve parameter fitting, because parameters come from the given underlying explicit-solvation model. SEA is about as accurate as explicit simulations as shown by comparisons against four different homologous alkyl series, a set of 504 varied solutes, solutes taken retrospectively from two solvation-prediction events, and a hypothetical polar-solute series, and SEA is about 100-fold faster than Poisson–Boltzmann calculations.

        Experimental support for the evolution of symmetric protein architecture from a simple peptide motif

        1. Jihun Lee and
        2. Michael Blaber1

        Author Affiliations

        1. Department of Biomedical Sciences, Florida State University, Tallahassee FL 32306-4300
        1. Edited* by Brian W. Matthews, University of Oregon, Eugene, OR, and approved November 10, 2010 (received for review October 6, 2010)

        PNAS January 4, 2011 vol. 108 no. 1 126-130

        http://dx.doi.org/10.1073/pnas.1015032108

        Abstract

        The majority of protein architectures exhibit elements of structural symmetry, and “gene duplication and fusion” is the evolutionary mechanism generally hypothesized to be responsible for their emergence from simple peptide motifs. Despite the central importance of the gene duplication and fusion hypothesis, experimental support for a plausible evolutionary pathway for a specific protein architecture has yet to be effectively demonstrated. To address this question, a unique “top-down symmetric deconstruction” strategy was utilized to successfully identify a simple peptide motif capable of recapitulating, via gene duplication and fusion processes, a symmetric protein architecture (the threefold symmetric β-trefoil fold). The folding properties of intermediary forms in this deconstruction agree precisely with a previously proposed “conserved architecture” model for symmetric protein evolution. Furthermore, a route through foldable sequence-space between the simple peptide motif and extant protein fold is demonstrated. These results provide compelling experimental support for a plausible evolutionary pathway of symmetric protein architecture via gene duplication and fusion processes.

        News lead:

        Leading the dog of selection by its mutational nose

        1. Daniel S. Fisher1

        Author Affiliations

        1. Department of Applied Physics, Stanford University, Stanford, CA 94305
        PNAS February 15, 2011 vol. 108 no. 7 2633-2634

        http://dx.doi.org/10.1073/pnas.1100339108

        There are two simple caricatures of evolutionary dynamics: the phenotypic caricature focuses on continuous and predictable selection on variability of quantitative traits, whereas the genotypic caricature focuses on discrete, stochastic mutations. Although the apparent contradictions between these pictures were reconciled long ago, our quantitative understanding of the interplay between them is still surprisingly primitive. Indeed, even the simplest models of the dynamics of large asexual populations in which many alleles and many new mutations contribute to the evolving fitness have resisted solution. The PNAS paper by Hallatschek (1) is a substantial advance in the development of the mathematical methods needed to analyze these and more complex models.

        Referenced article from above article:

        The noisy edge of traveling waves

        1. Oskar Hallatschek1

        Author Affiliations

        1. Biophysics and Evolutionary Dynamics Group, Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
        1. Edited by* Pierre C. Hohenberg, New York University, New York, NY, and approved November 30, 2010 (received for review September 12, 2010)

        PNAS February 1, 2011 vol. 108 no. 5 1783-1787

        http://dx.doi.org/10.1073/pnas.1013529108

        Abstract

        Traveling waves are ubiquitous in nature and control the speed of many important dynamical processes, including chemical reactions, epidemic outbreaks, and biological evolution. Despite their fundamental role in complex systems, traveling waves remain elusive because they are often dominated by rare fluctuations in the wave tip, which have defied any rigorous analysis so far. Here, we show that by adjusting nonlinear model details, noisy traveling waves can be solved exactly. The moment equations of these tuned models are closed and have a simple analytical structure resembling the deterministic approximation supplemented by a nonlocal cutoff term. The peculiar form of the cutoff shapes the noisy edge of traveling waves and is critical for the correct prediction of the wave speed and its fluctuations. Our approach is illustrated and benchmarked using the example of fitness waves arising in simple models of microbial evolution, which are highly sensitive to number fluctuations. We demonstrate explicitly how these models can be tuned to account for finite population sizes and determine how quickly populations adapt as a function of population size and mutation rates. More generally, our method is shown to apply to a broad class of models, in which number fluctuations are generated by branching processes. Because of this versatility, the method of model tuning may serve as a promising route toward unraveling universal properties of complex discrete particle systems.

        Aphid genome expression reveals host–symbiont cooperation in the production of amino acids

        1. Allison K. Hansen1 and
        2. Nancy A. Moran

        Author Affiliations

        1. Department of Ecology and Evolutionary Biology, Yale University, West Haven, CT 06516-7388
        1. Edited by Trudy F. C. Mackay, North Carolina State University, Raleigh, NC, and approved January 3, 2011 (received for review September 8, 2010)

        PNAS February 15, 2011 vol. 108 no. 7 2849-2854

        http://dx.doi.org/10.1073/pnas.1013465108

        Abstract

        The evolution of intimate symbiosis requires the coordination of gene expression and content between the distinct partner genomes; this coordination allows the fusion of capabilities of each organism into a single integrated metabolism. In aphids, the 10 essential amino acids are scarce in the phloem sap diet and are supplied by the obligate bacterial endosymbiont (Buchnera), which lives inside specialized cells called bacteriocytes. Although Buchnera’s genome encodes most genes for essential amino acid biosynthesis, several genes in essential amino acid pathways are missing, as are most genes for production of nonessential amino acids. Additionally, it is unresolved whether the supply of nitrogen for amino acid biosynthesis is supplemented by recycling of waste ammonia. We compared pea aphid gene expression between bacteriocytes and other body tissues using RNA sequencing and pathway analysis and exploiting the genome sequences available for both partners. We found that 26 genes underlying amino acid biosynthesis were up-regulated in bacteriocytes. Seven of these up-regulated genes fill the gaps of Buchnera’s essential amino acid pathways. In addition, genes underlying five nonessential amino acid pathways lost from Buchnera are up-regulated in bacteriocytes. Finally, our results reveal that two genes, glutamine synthetase and glutamate synthase, which potentially work together in the incorporation of ammonium nitrogen into glutamate (GOGAT) cycle to assimilate ammonia into glutamate, are up-regulated in bacteriocytes. Thus, host gene expression and symbiont capabilities are closely integrated within bacteriocytes, which function as specialized organs of amino acid production. Furthermore, the GOGAT cycle may be a key source of nitrogen fueling the integrated amino acid metabolism of the aphid–Buchnera partnership.

        Exercise training increases size of hippocampus and improves memory

        1. Kirk I. Erickson (a),
        2. Michelle W. Voss (b,c),
        3. Ruchika Shaurya Prakash (d),
        4. Chandramallika Basak (e),
        5. Amanda Szabo (f),
        6. Laura Chaddock (b,c),
        7. Jennifer S. Kim (b),
        8. Susie Heo (b,c),
        9. Heloisa Alves (b,c),
        10. Siobhan M. White (f),
        11. Thomas R. Wojcicki (f),
        12. Emily Mailey (f),
        13. Victoria J. Vieira (f),
        14. Stephen A. Martin (f),
        15. Brandt D. Pence (f),
        16. Jeffrey A. Woods (f),
        17. Edward McAuley (b,f), and
        18. Arthur F. Kramer (b,c,1)

        Author Affiliations

        1. (a) Department of Psychology, University of Pittsburgh, Pittsburgh, PA 15260;
        2. (b) Beckman Institute for Advanced Science and Technology, and
        3. (f) Department of Kinesiology and Community Health, University of Illinois, Champaign-Urbana, IL 61801;
        4. (c) Department of Psychology, University of Illinois, Champaign-Urbana, IL 61820;
        5. (d) Department of Psychology, Ohio State University, Columbus, OH 43210; and
        6. (e) Department of Psychology, Rice University, Houston, TX 77251
        1. Edited* by Fred Gage, Salk Institute, San Diego, CA, and approved December 30, 2010 (received for review October 23, 2010)

        PNAS February 15, 2011 vol. 108 no. 7 3017-3022

        http://dx.doi.org/10.1073/pnas.1015950108

        Abstract

        The hippocampus shrinks in late adulthood, leading to impaired memory and increased risk for dementia. Hippocampal and medial temporal lobe volumes are larger in higher-fit adults, and physical activity training increases hippocampal perfusion, but the extent to which aerobic exercise training can modify hippocampal volume in late adulthood remains unknown. Here we show, in a randomized controlled trial with 120 older adults, that aerobic exercise training increases the size of the anterior hippocampus, leading to improvements in spatial memory. Exercise training increased hippocampal volume by 2%, effectively reversing age-related loss in volume by 1 to 2 y. We also demonstrate that increased hippocampal volume is associated with greater serum levels of BDNF, a mediator of neurogenesis in the dentate gyrus. Hippocampal volume declined in the control group, but higher preintervention fitness partially attenuated the decline, suggesting that fitness protects against volume loss. Caudate nucleus and thalamus volumes were unaffected by the intervention. These theoretically important findings indicate that aerobic exercise training is effective at reversing hippocampal volume loss in late adulthood, which is accompanied by improved memory function.




        Opportunity and Means: Horizontal Gene Transfer from the Human Host to a Bacterial Pathogen

        1. Mark T. Anderson and
        2. H. Steven Seifert

        Author Affiliations

        1. Department of Microbiology-Immunology, Northwestern University, Feinberg School of Medicine, Chicago, Illinois, USA
        1. Address correspondence to H. Steven Seifert, h-seifert@northwestern.edu.
        1. Editor Stanley Maloy, San Diego State University

        15 February 2011 mBio vol. 2 no. 1 e00005-11

        http://dx.doi.org/10.1128/mBio.00005-11
         
        ABSTRACT

        The acquisition and incorporation of genetic material between nonmating species, or horizontal gene transfer (HGT), has been frequently described for phylogenetically related organisms, but far less evidence exists for HGT between highly divergent organisms. Here we report the identification and characterization of a horizontally transferred fragment of the human long interspersed nuclear element L1 to the genome of the strictly human pathogen Neisseria gonorrhoeae. A 685-bp sequence exhibiting 98 to 100% identity to copies of the human L1 element was identified adjacent to the irg4 gene in some N. gonorrhoeae genomes. The L1 fragment was observed in ~11% of the N. gonorrhoeae population sampled but was not detected in Neisseria meningitidis or commensal Neisseria isolates. In addition, N. gonorrhoeae transcripts containing the L1 sequence were detected by reverse transcription-PCR, indicating that an L1-derived gene product may be produced. The high degree of identity between human and gonococcal L1 sequences, together with the absence of L1 sequences from related Neisseria species, indicates that this HGT event occurred relatively recently in evolutionary history. The identification of L1 sequences in N. gonorrhoeae demonstrates that HGT can occur between a mammalian host and a resident bacterium, which has important implications for the coevolution of both humans and their associated microorganisms.



        Editorial:

        Devil in the details

        Nature 470, 305–306 (17 February 2011); published online 16 February 2011

        http://dx.doi.org/10.1038/470305b

        To ensure their results are reproducible, analysts should show their workings.



        Towards the prediction of protein interaction partners using physical docking

        Mark Nicholas Wass1,2, Gloria Fuentes1,a, Carles Pons3,4, Florencio Pazos5 & Alfonso Valencia1

        1. Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
        2. Structural Bioinformatics Group, Centre for Bioinformatics, Imperial College London, London, UK
        3. Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
        4. Computational Bioinformatics, National Institute of Bioinformatics (INB), Barcelona, Spain
        5. Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
        Molecular Systems Biology 7 Article number: 469

        http://dx.doi.org/10.1038/msb.2011.3

        Abstract

        Deciphering the whole network of protein interactions for a given proteome (‘interactome’) is the goal of many experimental and computational efforts in Systems Biology. Separately the prediction of the structure of protein complexes by docking methods is a well-established scientific area. To date, docking programs have not been used to predict interaction partners. We provide a proof of principle for such an approach. Using a set of protein complexes representing known interactors in their unbound form, we show that a standard docking program can distinguish the true interactors from a background of 922 non-redundant potential interactors. We additionally show that true interactions can be distinguished from non-likely interacting proteins within the same structural family. Our approach may be put in the context of the proposed ‘funnel-energy model’; the docking algorithm may not find the native complex, but it distinguishes binding partners because of the higher probability of favourable models compared with a collection of non-binders. The potential exists to develop this proof of principle into new approaches for predicting interaction partners and reconstructing biological networks.




        PLoS Genetics

        The Evolution of Host Specialization in the Vertebrate Gut Symbiont Lactobacillus reuteri

        Steven A. Frese1, Andrew K. Benson1, Gerald W. Tannock2, Diane M. Loach2, Jaehyoung Kim1, Min Zhang1, Phaik Lyn Oh1, Nicholas C. K. Heng3, Prabhu B. Patil1,4, Nathalie Juge5, Donald A. MacKenzie5, Bruce M. Pearson5, Alla Lapidus6, Eileen Dalin6, Hope Tice6, Eugene Goltsman6, Miriam Land7, Loren Hauser7, Natalia Ivanova6, Nikos C. Kyrpides6, Jens Walter1*

        1 Department of Food Science and Technology, University of Nebraska, Lincoln, Nebraska, United States of America, 2 Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand, 3 Sir John Walsh Research Institute (Faculty of Dentistry), University of Otago, Dunedin, New Zealand, 4 Institute of Microbial Technology (IMTECH), Chandigarh, India, 5 Institute of Food Research, Norwich Research Park, Norwich, United Kingdom, 6 Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America, 7 Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America

        http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1001314

        Abstract

        Recent research has provided mechanistic insight into the important contributions of the gut microbiota to vertebrate biology, but questions remain about the evolutionary processes that have shaped this symbiosis. In the present study, we showed in experiments with gnotobiotic mice that the evolution of Lactobacillus reuteri with rodents resulted in the emergence of host specialization. To identify genomic events marking adaptations to the murine host, we compared the genome of the rodent isolate L. reuteri 100-23 with that of the human isolate L. reuteri F275, and we identified hundreds of genes that were specific to each strain. In order to differentiate true host-specific genome content from strain-level differences, comparative genome hybridizations were performed to query 57 L. reuteri strains originating from six different vertebrate hosts in combination with genome sequence comparisons of nine strains encompassing five phylogenetic lineages of the species. This approach revealed that rodent strains, although showing a high degree of genomic plasticity, possessed a specific genome inventory that was rare or absent in strains from other vertebrate hosts. The distinct genome content of L. reuteri lineages reflected the niche characteristics in the gastrointestinal tracts of their respective hosts, and inactivation of seven out of eight representative rodent-specific genes in L. reuteri 100-23 resulted in impaired ecological performance in the gut of mice. The comparative genomic analyses suggested fundamentally different trends of genome evolution in rodent and human L. reuteri populations, with the former possessing a large and adaptable pan-genome while the latter being subjected to a process of reductive evolution. In conclusion, this study provided experimental evidence and a molecular basis for the evolution of host specificity in a vertebrate gut symbiont, and it identified genomic events that have shaped this process.

        Correlated Evolution of Nearby Residues in Drosophilid Proteins

        Benjamin Callahan1*, Richard A. Neher2¤, Doris Bachtrog3, Peter Andolfatto4, Boris I. Shraiman2,5

        1 Department of Applied Physics, Stanford University, Stanford, California, United States of America, 2 Kavli Institute for Theoretical Physics, University of California Santa Barbara, Santa Barbara, California, United States of America, 3 Department of Integrative Biology and Center for Theoretical Evolutionary Genomics, University of California Berkeley, Berkeley, California, United States of America, 4 Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America, 5 Department of Physics, University of California Santa Barbara, Santa Barbara, California, United States of America

        http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1001315

        Abstract

        Here we investigate the correlations between coding sequence substitutions as a function of their separation along the protein sequence. We consider both substitutions between the reference genomes of several Drosophilids as well as polymorphisms in a population sample of Zimbabwean Drosophila melanogaster. We find that amino acid substitutions are “clustered” along the protein sequence, that is, the frequency of additional substitutions is strongly enhanced within ≈10 residues of a first such substitution. No such clustering is observed for synonymous substitutions, supporting a “correlation length” associated with selection on proteins as the causative mechanism. Clustering is stronger between substitutions that arose in the same lineage than it is between substitutions that arose in different lineages. We consider several possible origins of clustering, concluding that epistasis (interactions between amino acids within a protein that affect function) and positional heterogeneity in the strength of purifying selection are primarily responsible. The role of epistasis is directly supported by the tendency of nearby substitutions that arose on the same lineage to preserve the total charge of the residues within the correlation length and by the preferential cosegregation of neighboring derived alleles in our population sample. We interpret the observed length scale of clustering as a statistical reflection of the functional locality (or modularity) of proteins: amino acids that are near each other on the protein backbone are more likely to contribute to, and collaborate toward, a common subfunction.
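
        The clustering measurement described above can be illustrated with a small Python sketch. It rests on stated assumptions and is not the authors' method: the function names and toy positions are hypothetical, and the null model simply places the same number of substitutions uniformly at random on distinct sites. The sketch counts substituted-residue pairs by their separation along the sequence and compares each count with the uniform expectation.

```python
from collections import Counter
from itertools import combinations


def separation_histogram(positions, max_sep):
    """Count pairs of substituted residues by separation (in residues)."""
    counts = Counter()
    for i, j in combinations(sorted(set(positions)), 2):
        d = j - i
        if d <= max_sep:
            counts[d] += 1
    return counts


def expected_uniform_pairs(n_sites, n_subs, d):
    """Expected pair count at separation d under uniform random placement.

    There are (n_sites - d) site pairs at separation d, and each pair is
    doubly substituted with probability
    n_subs * (n_subs - 1) / (n_sites * (n_sites - 1)).
    """
    return (n_sites - d) * n_subs * (n_subs - 1) / (n_sites * (n_sites - 1))


# Hypothetical toy data: substituted positions in a 100-residue protein.
toy_positions = [3, 5, 6, 40, 44, 90]
observed = separation_histogram(toy_positions, max_sep=10)
enrichment = {
    d: observed[d] / expected_uniform_pairs(100, len(toy_positions), d)
    for d in sorted(observed)
}
print(enrichment)
```

        Enrichment values above 1 at short separations would indicate clustering in this toy setting. A real analysis would pool many proteins and treat synonymous substitutions as a control, as the abstract describes.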


        PLoS Computational Biology

        Molecular Dynamics Simulations of Forced Unbending of Integrin αVβ3

        Wei Chen1¤a, Jizhong Lou2¤b, Jen Hsin3, Klaus Schulten3, Stephen C. Harvey2,4, Cheng Zhu1,2,5*

        1 Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America, 2 Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America, 3 Department of Physics and Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America, 4 School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America, 5 Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America

        http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1001086

        Abstract

        Integrins may undergo large conformational changes during activation, but the dynamic processes and pathways remain poorly understood. We used molecular dynamics to simulate forced unbending of a complete integrin αVβ3 ectodomain in both unliganded and liganded forms. Pulling the head of the integrin readily induced changes in the integrin from a bent to an extended conformation. Pulling at a cyclic RGD ligand bound to the integrin head also extended the integrin, suggesting that force can activate integrins. Interactions at the interfaces between the hybrid and β tail domains and between the hybrid and epidermal growth factor 4 domains formed the major energy barrier along the unbending pathway, which could be overcome spontaneously in ~1 µs to yield a partially-extended conformation that tended to rebend. By comparison, a fully-extended conformation was stable. A newly-formed coordination between the αV Asp457 and the α-genu metal ion might contribute to the stability of the fully-extended conformation. These results reveal the dynamic processes and pathways of integrin conformational changes with atomic details and provide new insights into the structural mechanisms of integrin activation.

        PLoS Biology

        Self-Organization and Regulation of Intrinsically Disordered Proteins with Folded N-Termini

        Philip C. Simister1, Fred Schaper2, Nicola O'Reilly3, Simon McGowan4, Stephan M. Feller1*

        1 Cell Signalling Group, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom, 2 Department of Systems Biology, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany, 3 Peptide Synthesis Laboratory, Cancer Research UK London Research Institute, London, United Kingdom, 4 Computational Biology Research Group, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom

        http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1000591

        Summary

        Here we hypothesize that some proteins use their structured N-terminal domains (SNTDs) to organize the remaining protein chain by means of intramolecular interactions, so generating partially condensed proteins. This model has several attractive features: as the nascent protein chain emerges from the ribosome, the SNTD folds spontaneously and then serves as a nucleation point for the yet unstructured amino acid chain, creating more compact shapes. This reduces the risk of protein degradation or aggregation. Moreover, an interspersed pattern of SNTD-docked regions and free loops can coordinate assembly of sub-complexes in defined loop-sections and enables novel regulatory mechanisms, for example through posttranslational modifications of docked regions.



        On origin of genetic code and tRNA before translation

        Andrei S Rodin1,2, Eörs Szathmáry2,3,4 and Sergei N Rodin2,5

        Human Genetics Center, School of Public Health, University of Texas, Houston, TX 77225, USA

        Collegium Budapest (Institute for Advanced Study), Szentháromság u. 2, H-1014 Budapest, Hungary

        Parmenides Center for the Study of Thinking, Kirchplatz 1, D-82049 Munich/Pullach, Germany

        Institute of Biology, Eötvös University, 1c Pázmány Péter sétány, H-1117 Budapest, Hungary

        Department of Molecular and Cellular Biology, Beckman Research Institute of the City of Hope, Duarte, CA 91010, USA

        Biology Direct 2011, 6:14

        http://dx.doi.org/10.1186/1745-6150-6-14

        Abstract

        Background

        Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental veto on "foresight evolution", 2) modular structures of tRNAs and aminoacyl-tRNA synthetases, and 3) the updated library of aa-binding sites in RNA aptamers successfully selected in vitro for eight amino acids.

        Results

        The aa-binding sites of arginine, isoleucine and tyrosine contain both their cognate triplets, anticodons and codons. We have noticed that these cases might be associated with palindrome-dinucleotides. For example, one-base shift to the left brings arginine codons CGN, with CG at 1-2 positions, to the respective anticodons NCG, with CG at 2-3 positions. Formally, the concomitant presence of codons and anticodons is also expected in the reverse situation, with codons containing palindrome-dinucleotides at their 2-3 positions, and anticodons exhibiting them at 1-2 positions. A closer analysis reveals that, surprisingly, RNA binding sites for Arg, Ile and Tyr "prefer" (exactly as in the actual genetic code) the anticodon(2-3)/codon(1-2) tetramers to their anticodon(1-2)/codon(2-3) counterparts, despite the seemingly perfect symmetry of the latter. However, since in vitro selection of aa-specific RNA aptamers apparently had nothing to do with translation, this striking preference provides a new strong support to the notion of the genetic code emerging before translation, in response to catalytic (and possibly other) needs of ancient RNA life. Consistently with the pre-translation origin of the code, we propose here a new model of tRNA origin by the gradual, Fibonacci process-like, elongation of a tRNA molecule from a primordial coding triplet and 5'DCCA3' quadruplet (D is a base-determinator) to the eventual 76 base-long cloverleaf-shaped molecule.

        Conclusion

        Taken together, our findings necessarily imply that primordial tRNAs, tRNA aminoacylating ribozymes, and (later) the translation machinery in general have been co-evolving to ''fit'' the (likely already defined) genetic code, rather than the opposite way around. Coding triplets in this primal pre-translational code were likely similar to the anticodons, with second and third nucleotides being more important than the less specific first one. Later, when the code was expanding in co-evolution with the translation apparatus, the importance of 2-3 nucleotides of coding triplets "transferred" to the 1-2 nucleotides of their complements, thus distinguishing anticodons from codons. This evolutionary primacy of anticodons in genetic coding makes the hypothesis of primal stereo-chemical affinity between amino acids and cognate triplets, the hypothesis of coding coenzyme handles for amino acids, the hypothesis of tRNA-like genomic 3' tags suggesting that tRNAs originated in replication, and the hypothesis of ancient ribozymes-mediated operational code of tRNA aminoacylation not mutually contradicting but rather co-existing in harmony.


        Current Biology

        High Spontaneous Rate of Gene Duplication in Caenorhabditis elegans

        Authors


        Current Biology, Volume 21, Issue 4, 306-310, 03 February 2011

        http://dx.doi.org/10.1016/j.cub.2011.01.026

        Summary

        Gene and genome duplications are the primary source of new genes and novel functions and have played a pivotal role in the evolution of genomic and organismal complexity [1,2]. The spontaneous rate of gene duplication is a critical parameter for understanding the evolutionary dynamics of gene duplicates; yet few direct empirical estimates exist and differ widely. The presence of a large population of recently derived gene duplicates in sequenced genomes suggests a high rate of spontaneous origin, also evidenced by population genomic studies reporting rampant copy-number polymorphism at the intraspecific level [3,4,5,6]. An analysis of long-term mutation accumulation lines of Caenorhabditis elegans for gene copy-number changes with array comparative genomic hybridization yields the first direct estimate of the genome-wide rate of gene duplication in a multicellular eukaryote. The gene duplication rate in C. elegans is quite high, on the order of 10^−7 duplications/gene/generation. This rate is two orders of magnitude greater than the spontaneous rate of point mutation per nucleotide site in this species and also greatly exceeds an earlier estimate derived from the frequency distribution of extant gene duplicates in the sequenced C. elegans genome.
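
        To put the quoted rates in perspective, the arithmetic can be sketched in a few lines of Python. The duplication rate and the "two orders of magnitude" ratio come from the summary above; the gene count is a hypothetical round number chosen only for illustration and is not taken from the paper.

```python
import math

DUPLICATION_RATE = 1e-7      # duplications per gene per generation (order of magnitude, from the summary)
ORDERS_ABOVE_POINT = 2       # "two orders of magnitude greater" (from the summary)
ASSUMED_GENE_COUNT = 20_000  # hypothetical round number, NOT from the summary

# Point mutation rate per nucleotide site implied by the stated ratio.
implied_point_rate = DUPLICATION_RATE / 10 ** ORDERS_ABOVE_POINT

# Expected duplications per genome per generation under the assumed gene count.
dups_per_genome = DUPLICATION_RATE * ASSUMED_GENE_COUNT

# Equivalent waiting time: generations per new duplication in a single lineage.
generations_per_dup = 1 / dups_per_genome

print(f"implied point mutation rate ~ {implied_point_rate:.0e} per site per generation")
print(f"expected duplications per genome per generation ~ {dups_per_genome:.0e}")
print(f"roughly one duplication every {generations_per_dup:.0f} generations per lineage")
```

        Under these assumptions a single lineage would acquire a new gene duplicate roughly once every few hundred generations; a different assumed gene count scales that figure proportionally.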

        Jan 31 through Feb 13 by David H. Ardell

        posted Feb 25, 2011 4:23 PM by UCmerced CompBioJournalClub   [ updated Feb 28, 2011 10:55 AM ]

        Proceedings of the Royal Society B: Biological Sciences


          Before senescence: the evolutionary demography of ontogenesis

          1. Daniel A. Levitis*

          Author Affiliations

          1. Laboratory of Evolutionary Biodemography, Max Planck Institute for Demographic Research, Konrad-Zuse-Strasse 1, 18057 Rostock, Germany
          1. * levitis@demogr.mpg.de

          Abstract

          The age-specific mortality curve for many species, including humans, is U-shaped: mortality declines with age in the developing cohort (ontogenescence) before increasing with age (senescence). The field of evolutionary demography has long focused on understanding the evolution of senescence while largely failing to address the evolution of ontogenescence. The current review is the first to gather the few available hypotheses addressing the evolution of ontogenescence, examine the basis and assumptions of each and ask what the phylogenetic extent of ontogenescence may be. Ontogenescence is among the most widespread of life-history traits, occurring in every population for which I have found sufficiently detailed data, in major groups throughout the eukaryotes, across many causes of death and many life-history types. Hypotheses seeking to explain ontogenescence include those based on kin selection, the acquisition of robustness, heterogeneous frailties and life-history optimization. I propose a further hypothesis, arguing that mortality drops with age because most transitions that could trigger the risks caused by genetic and developmental malfunctions are concentrated in early life. Of these hypotheses, only those that frame ontogenescence as an evolutionary by-product rather than an adaptation can explai



          Science 11 February 2011: 
          Vol. 331 no. 6018 pp. 728-729 
          DOI: 10.1126/science.1197891
          • PERSPECTIVE

          On the Future of Genomic Data

          1. Scott D. Kahn

          Author Affiliations

          1. Illumina, 9885 Towne Centre Drive, San Diego, CA 92121, USA.

          ABSTRACT

          Many of the challenges in genomics derive from the informatics needed to store and analyze the raw sequencing data that is available from highly multiplexed sequencing technologies. Because single week-long sequencing runs today can produce as much data as did entire genome centers a few years ago, the need to process terabytes of information has become de rigueur for many labs engaged in genomic research. The availability of deep (and large) genomic data sets raises concerns over information access, data security, and subject/patient privacy that must be addressed for the field to continue its rapid advances.


          Changing the Equation on Scientific Data Visualization

          1. Peter Fox and 
          2. James Hendler*

          Author Affiliations

          1. Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
          1. *To whom correspondence should be addressed. E-mail: hendler@cs.rpi.edu

          ABSTRACT

          An essential facet of the data deluge is the need for different types of users to apply visualizations to understand how data analyses and queries relate to each other. Unfortunately, visualization too often becomes an end product of scientific analysis, rather than an exploration tool that scientists can use throughout the research life cycle. However, new database technologies, coupled with emerging Web-based technologies, may hold the key to lowering the cost of visualization generation and allow it to become a more integral part of the scientific process.

          Science 11 February 2011: 
          Vol. 331 no. 6018 pp. 694-695 
          DOI: 10.1126/science.331.6018.694
          • NEWS

          Rescue of Old Data Offers Lesson for Particle Physicists

          Accustomed to working in large collaborations and moving swiftly on to bigger, better machines, particle physicists have no standard format for sharing or storing information after an experiment shuts down. Old data can end up scattered across the globe, stored haphazardly on old tapes, or lost entirely. This tendency has prompted some in the field to call for better care to be taken of data after an experiment has finished. For a very small fraction of the experiment's budget, they argue, data could be preserved in a form usable by later generations of physicists. To promote this strategy, researchers from a half-dozen major labs around the world, including CERN, formed a working group in 2009 called Data Preservation in High Energy Physics. One of the group's aims is to create the new post of "data archivist," someone within each experimental team who will ensure that information is properly managed.


          American Society For Microbiology The Journal of Bacteriology 

          GENOME ANNOUNCEMENT

          Genome Sequence of Leuconostoc inhae KCTC 3774, Isolated from Kimchi ▿

          Dae-Soo Kim,1,† Sang-Haeng Choi,1,† Dong-Wook Kim,1 Ryong Nam Kim,1 Seong-Hyeuk Nam,1 Aram Kang,1,2 Aeri Kim,1,2 and Hong-Seog Park1,2*

          Genome Resource Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Republic of Korea,1 University of Science and Technology (UST), 113 Gwahangno, Yuseong-gu, Daejeon 305-806, Republic of Korea2

          Received 1 December 2010/ Accepted 13 December 2010

          ABSTRACT

          Leuconostoc inhae strain KCTC 3774 is a Gram-positive, non-spore-forming, heterofermentative, spherical or lenticular lactic acid bacterium. Here we announce the draft genome sequence of Leuconostoc inhae KCTC 3774, isolated from traditional Korean kimchi, and describe major findings from its annotation.

          Journal of Bacteriology, March 2011, p. 1183-1190, Vol. 193, No. 5
          doi:10.1128/JB.00925-10
          Copyright © 2011 American Society for Microbiology. All Rights Reserved.

          Complete Genome Sequence of the Metabolically Versatile Plant Growth-Promoting Endophyte Variovorax paradoxus S110 ▿,‡

          Jong-In Han,1,†* Hong-Kyu Choi,2,† Seung-Won Lee,11,3 Paul M. Orwin,4 Jina Kim,1 Sarah L. LaRoe,5 Tae-gyu Kim,1 Jennifer O'Neil,5 Jared R. Leadbetter,6 Sang Yup Lee,7 Cheol-Goo Hur,3 Jim C. Spain,8 Galina Ovchinnikova,9 Lynne Goodwin,10 and Cliff Han10

          Department of Civil and Environmental Engineering, KAIST, Daejeon, Republic of Korea,1 Department of Genetic Engineering, Dong-A University, Busan, Republic of Korea,2 Bioinformatics Research Center, KRIBB, Daejeon, Republic of Korea,3 California State University at San Bernardino, Department of Biology, San Bernardino, California,4 Department of Civil and Environmental Engineering, Rensselaer Polytechnic Institute, 110 8th St., Troy, New York,5 Divisions of Biology and Environmental Science and Engineering, California Institute of Technology, Pasadena, California,6 Department of Chemical and Biomolecular Engineering, KAIST, Daejeon, Republic of Korea,7 School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia,8 DOE Joint Genome Institute, Walnut Creek, California,9 Los Alamos National Laboratory, Los Alamos, New Mexico,10 Department of Agricultural Bio-Resources, Genomics Division, National Academy of Agricultural Science, Suwon, Republic of Korea11

          Received 7 August 2010/ Accepted 11 November 2010

          Variovorax paradoxus is a microorganism of special interest due to its diverse metabolic capabilities, including the biodegradation of both biogenic compounds and anthropogenic contaminants. V. paradoxus also engages in mutually beneficial interactions with both bacteria and plants. The complete genome sequence of V. paradoxus S110 is composed of 6,754,997 bp with 6,279 predicted protein-coding sequences within two circular chromosomes. Genomic analysis has revealed multiple metabolic features for autotrophic and heterotrophic lifestyles. These metabolic diversities enable independent survival, as well as a symbiotic lifestyle. Consequently, S110 appears to have evolved into a superbly adaptable microorganism that is able to survive in ever-changing environmental conditions. Based on our findings, we suggest V. paradoxus S110 as a potential candidate for agrobiotechnological applications, such as biofertilizer and biopesticide. Because it has many associations with other biota, it is also suited to serve as an additional model system for studies of microbe-plant and microbe-microbe interactions.


          Biology Direct      

          Article alert


          The latest articles from Biology Direct, published between 27-Jan-2011 and 09-Feb-2011

          Discovery notes

          Evolutionary patterns of phosphorylated serines

          Yerbol Z Kurmangaliyev1,2, Alexander Goland1 and Mikhail S Gelfand1,3

          Institute for Information Transmission Problems (the Kharkevich Institute) RAS, Bolshoi Karetny pereulok 19, Moscow, 127994, Russia

          National Center for Biotechnology of the Republic of Kazakhstan, Valikhanov str., 13/1, Astana, 010000, Republic of Kazakhstan

          Faculty of Bioengineering and Bioinformatics, Moscow State University, Vorobievy Gory 1-73, Moscow, 119991, Russia

          Biology Direct 2011, 6:8 doi:10.1186/1745-6150-6-8

          Published: 9 February 2011

          Abstract

          Posttranslationally modified amino acids are chemically distinct types of amino acids and in terms of evolution they might behave differently from their non-modified counterparts. In order to check this possibility, we reconstructed the evolutionary history of phosphorylated serines in several groups of organisms. Comparisons of substitution vectors have revealed some significant differences in the evolution of modified and corresponding non-modified amino acids. In particular, phosphoserines are more frequently substituted to aspartate and glutamate, compared to non-phosphorylated serines.

          Research

          Gene gain and loss events in Rickettsia and Orientia species

          Kalliopi Georgiades, Vicky Merhej, Khalid El Karkouri, Didier Raoult and Pierre Pontarotti

          Biology Direct 2011, 6:6 doi:10.1186/1745-6150-6-6

          Published: 8 February 2011

          Abstract (provisional)

          Background

          Genome degradation is an ongoing process in all members of the Rickettsiales order, which makes these bacterial species an excellent model for studying reductive evolution through interspecies variation in genome size and gene content. In this study, we evaluated the degree to which gene loss shaped the content of some Rickettsiales genomes. We shed light on the role played by horizontal gene transfers in the genome evolution of Rickettsiales.

          Results

          Our phylogenomic tree, based on whole-genome content, presented a topology distinct from that of the whole core gene concatenated phylogenetic tree, suggesting that the gene repertoires involved have different evolutionary histories. Indeed, we present evidence for 3 possible horizontal gene transfer events from various organisms to Orientia and 6 to Rickettsia spp., while we also identified 3 possible horizontal gene transfer events from Rickettsia and Orientia to other bacteria. We found 17 putative genes in Rickettsia spp. that are probably the result of de novo gene creation; 2 of these genes appear to be functional. On the basis of these results, we were able to reconstruct the gene repertoires of "proto-Rickettsiales" and "proto-Rickettsiaceae", which correspond to the ancestors of Rickettsiales and Rickettsiaceae, respectively. Finally, we found that 2,135 genes were lost during the evolution of the Rickettsiaceae to an intracellular lifestyle.

          Conclusions

          Our phylogenetic analysis allowed us to track the gene gain and loss events occurring in bacterial genomes during their evolution from a free-living to an intracellular lifestyle. We have shown that the primary mechanism of evolution and specialization in strictly intracellular bacteria is gene loss. Despite the intracellular habitat, we found several horizontal gene transfers between Rickettsiales species and various prokaryotic, viral and eukaryotic species. Open peer review: Reviewed by Arcady Mushegian, Eugene V. Koonin and Patrick Forterre. For the full reviews please go to the Reviewers' comments section.


          Opinion

          The Multiple Personalities of Watson and Crick Strands

          Reed A Cartwright and Dan Graur

          Biology Direct 2011, 6:7 doi:10.1186/1745-6150-6-7

          Published: 8 February 2011

          Abstract (provisional)

          Background

          In genetics it is customary to refer to double-stranded DNA as containing a 'Watson strand' and a 'Crick strand.' However, there seems to be no consensus in the literature on the exact meaning of these two terms, and the many usages contradict one another as well as the original definition. Here, we review the history of the terminology and suggest retaining a single sense that is currently the most useful and consistent.

          Proposal

          The Saccharomyces Genome Database defines the Watson strand as the strand which has its 5'-end at the short-arm telomere and the Crick strand as its complement. The Watson strand is always used as the reference strand in their database. Using this as the basis of our standard, we recommend that Watson and Crick strand terminology only be used in the context of genomics. When possible, the centromere or other genomic feature should be used as a reference point, dividing the chromosome into two arms of unequal lengths. Under our proposal, the Watson strand is standardized as the strand whose 5'-end is on the short arm of the chromosome, and the Crick strand as the one whose 5'-end is on the long arm. Furthermore, the Watson strand should be retained as the reference (plus) strand in a genomic database. This usage not only makes the determination of Watson and Crick unambiguous but also allows unambiguous selection of reference strands for genomics.

          Reviewers

          This article was reviewed by John M. Logsdon, Igor B. Rogozin (nominated by Andrey Rzhetsky), and William Martin.
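
The naming rule in this abstract is simple enough to write down as code. The following is a minimal sketch of one reading of the proposal, not the authors' or the Saccharomyces Genome Database's implementation. The `Chromosome` class, its field names and the "top"/"bottom" labels are all hypothetical: "top" here means the strand whose 5'-end sits at coordinate 1, and "bottom" is its complement, whose 5'-end sits at the far end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chromosome:
    """Hypothetical minimal record; field names are illustrative only."""
    name: str
    length: int      # total length in bp
    centromere: int  # position of the reference point, counted from coordinate 1


def name_strands(chrom: Chromosome) -> dict:
    """Assign 'Watson' and 'Crick' to the top and bottom strands.

    The top strand has its 5'-end at coordinate 1, so it is the Watson
    strand exactly when coordinate 1 lies on the short arm.
    """
    left_arm = chrom.centromere                  # arm containing coordinate 1
    right_arm = chrom.length - chrom.centromere  # arm containing the far end
    if left_arm == right_arm:
        # The proposal relies on two arms of unequal length.
        raise ValueError(f"{chrom.name}: arms are equal, rule is undefined")
    if left_arm < right_arm:
        return {"top": "Watson", "bottom": "Crick"}
    return {"top": "Crick", "bottom": "Watson"}


if __name__ == "__main__":
    print(name_strands(Chromosome("chrA", 1000, 300)))
    print(name_strands(Chromosome("chrB", 1000, 700)))
```

Raising an error for equal arms is a design choice of this sketch; the abstract only says the reference point should divide the chromosome into arms of unequal lengths.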



          Mechanism for the Alteration of the Substrate Specificities of Template-Independent RNA Polymerases

          Structure, Volume 19, Issue 2, 232-243, 9 February 2011
          Copyright © 2011 Elsevier Ltd All rights reserved.
          doi:10.1016/j.str.2010.12.006

           

          Highlights
          • Crystal structure of eubacterial polyA polymerase and its complex with ATP
          • The size and shape of the nucleobase interacting pocket are suitable for only ATP
          • The RNA-binding and catalytic domains together dictate the substrate specificity of polyA polymerase
          • The mechanism of ATP selection by polyA polymerase is distinct from that by the CCA-adding enzyme

          Summary

          PolyA polymerase (PAP) adds a polyA tail onto the 3′-end of RNAs without a nucleic acid template, using adenosine-5′-triphosphate (ATP) as a substrate. The mechanism for the substrate selection by eubacterial PAP remains obscure. Structural and biochemical studies of Escherichia coli PAP (EcPAP) revealed that the shape and size of the nucleobase-interacting pocket of EcPAP are maintained by an intra-molecular hydrogen-network, making it suitable for the accommodation of only ATP, using a single amino acid, Arg197. The pocket structure is sustained by interactions between the catalytic domain and the RNA-binding domain. EcPAP has a flexible basic C-terminal region that contributes to optimal RNA translocation for processive adenosine 5′-monophosphate (AMP) incorporations onto the 3′-end of RNAs. A comparison of the EcPAP structure with those of other template-independent RNA polymerases suggests that structural changes of domain(s) outside the conserved catalytic core domain altered the substrate specificities of the template-independent RNA polymerases.

          PNAS Online Table of Contents Alert

          A new issue of Proceedings of the National Academy of Sciences is available online:
          8 February 2011; Vol. 108, No. 6 

          The below Table of Contents is available online at: http://www.pnas.org/content/vol108/issue6/?etoc

          Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen

          1. Scott DiGuistini (a)
          2. Ye Wang (a)
          3. Nancy Y. Liao (b)
          4. Greg Taylor (b)
          5. Philippe Tanguay (c)
          6. Nicolas Feau (d)
          7. Bernard Henrissat (e)
          8. Simon K. Chan (b)
          9. Uljana Hesse-Orce (a)
          10. Sepideh Massoumi Alamouti (a)
          11. Clement K. M. Tsui (f)
          12. Roderick T. Docking (b)
          13. Anthony Levasseur (g)
          14. Sajeet Haridas (a)
          15. Gordon Robertson (b)
          16. Inanc Birol (b)
          17. Robert A. Holt (b)
          18. Marco A. Marra (b)
          19. Richard C. Hamelin (c)
          20. Martin Hirst (b)
          21. Steven J. M. Jones (b)
          22. Jörg Bohlmann (f, h, 1), and
          23. Colette Breuil (a, 1)

          Author Affiliations

          (a) Department of Wood Science,
          (f) Department of Forest Science, University of British Columbia, Vancouver, BC, Canada V6T 1Z4;
          (b) British Columbia Cancer Agency Genome Sciences Centre, Vancouver, BC, Canada V5Z 4E6;
          (c) Natural Resources Canada, Ste-Foy, QC, Canada G1V 4C7;
          (d) Unité Mixte de Recherche 1202, Institut National de la Recherche Agronomique-Université Bordeaux I, Biodiversité, Gènes et Communautés, Institut National de la Recherche Agronomique Bordeaux-Aquitaine, 33612 Cestas Cedex, France;
          (e) Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche-6098, Centre National de la Recherche Scientifique, Universités Aix-Marseille I & II, 13288 Marseille cedex 9, France;
          (g) Biotechnologie des Champignons Filamenteux, Unité Mixte de Recherche-1161, Institut National de la Recherche, Universités de Provence et de la Méditerranée, 13288 Marseille cedex 09, France; and
          (h) Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada V6T 1Z3

          Edited by Rodney B. Croteau, Washington State University, Pullman, WA, and approved December 27, 2010 (received for review August 2, 2010)

          Abstract

          In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system.


          Nucleic Acids Research
          Nucleic Acids Research Table of Contents Alert

          A new issue of Nucleic Acids Research is available online:
          Vol. 39, No. 3
          The below Table of Contents is available online at: http://nar.oxfordjournals.org/content/vol39/issue3/index.dtl

          Investigating the predictability of essential genes across distantly related organisms using an integrative approach

          1. Jingyuan Deng1,2
          2. Lei Deng1,3
          3. Shengchang Su4
          4. Minlu Zhang5
          5. Xiaodong Lin6
          6. Lan Wei7,
          7. Ali A. Minai3
          8. Daniel J. Hassett4 and 
          9. Long J. Lu1,2,5,8,*

          Author Affiliations

          1. 1Division of Biomedical Informatics, Cincinnati Children’s Hospital Research Foundation, Cincinnati, OH 45229, 2Department of Biomedical Engineering, 3Department of Electrical and Computer Engineering, 4Department of Molecular Genetics, Biochemistry and Microbiology, 5Department of Computer Science, University of Cincinnati, Cincinnati, OH 45229, 6Department of Management Science and Information Systems, Rutgers University, Piscataway, NJ 08854, 7School of Medicine, Yale University, New Haven, CT 06511 and 8Department of Environmental Health, University of Cincinnati, Cincinnati, OH 45229, USA
          1. *To whom correspondence should be addressed. Tel: +1 513 636 8720; Fax: +1 513 636 2056; Email: long.lu@cchmc.org
          • Received May 28, 2010.
          • Revision received August 15, 2010.
          • Accepted August 18, 2010.

          Abstract

          Rapid and accurate identification of new essential genes in under-studied microorganisms will significantly improve our understanding of how a cell works and the ability to re-engineer microorganisms. However, predicting essential genes across distantly related organisms remains a challenge. Here, we present a machine learning-based integrative approach that reliably transfers essential gene annotations between distantly related bacteria. We focused on four bacterial species that have well-characterized essential genes, and tested the transferability between three pairs among them. For each pair, we trained our classifier to learn traits associated with essential genes in one organism, and applied it to make predictions in the other. The predictions were then evaluated by examining the agreements with the known essential genes in the target organism. Ten-fold cross-validation in the same organism yielded AUC scores between 0.86 and 0.93. Cross-organism predictions yielded AUC scores between 0.69 and 0.89. The transferability is likely affected by growth conditions, quality of the training data set and the evolutionary distance. We are thus the first to report that gene essentiality can be reliably predicted using features trained and tested in a distantly related organism. Our approach proves more robust and portable than existing approaches, significantly extending our ability to predict essential genes beyond orthologs.
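
The AUC scores quoted in this abstract can be read as the probability that a randomly chosen essential gene receives a higher score than a randomly chosen non-essential one. A minimal sketch of that evaluation step follows. It is not the authors' pipeline: the function name, the toy scores and the toy labels are hypothetical, and the classifier that would produce such scores is left out entirely.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted essentiality scores, one per gene.
    labels: 1 for a known essential gene, 0 for a non-essential one.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one gene of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy stand-in for scores from a model trained on one organism and
    # applied to the genes of another (values invented for illustration).
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
    toy_labels = [1, 1, 0, 1, 0, 0]
    print(auc(toy_scores, toy_labels))
```

In the toy data, eight of the nine essential/non-essential pairs are ranked correctly, so the AUC is 8/9. The pairwise loop is quadratic in the number of genes; a rank-based formula would be the usual choice for genome-scale data.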

          C-terminal domain of archaeal O-phosphoseryl-tRNA kinase displays large-scale motion to bind the 7-bp D-stem of archaeal tRNASec

          1. R. Lynn Sherrer1
          2. Yuhei Araiso2,3
          3. Caroline Aldag1
          4. Ryuichiro Ishitani2
          5. Joanne M. L. Ho1,
          6. Dieter Söll1,4,* and 
          7. Osamu Nureki2,3,*

          Author Affiliations

          1. 1Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA, 2Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, 3Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama-shi, Kanagawa 226-8501, Japan and 4Department of Chemistry, Yale University, New Haven, Connecticut 06520-8114, USA
          1. *To whom correspondence should be addressed. Tel: +81 3 5841 4392; Fax: +81 3 5841 8057; Email: nureki@ims.u-tokyo.ac.jp
          2. Correspondence may also be addressed to Dieter Söll. Tel: +1 203 432 6200; Fax: +1 203 432 6202; Email: dieter.Soll@yale.edu
          • Received August 17, 2010.
          • Revision received September 6, 2010.
          • Accepted September 8, 2010.

          Abstract

          O-Phosphoseryl-tRNA kinase (PSTK) is the key enzyme in recruiting selenocysteine (Sec) to the genetic code of archaea and eukaryotes. The enzyme phosphorylates Ser-tRNASec to produce O-phosphoseryl-tRNASec (Sep-tRNASec) that is then converted to Sec-tRNASec by Sep-tRNA:Sec-tRNA synthase. Earlier we reported the structure of the Methanocaldococcus jannaschii PSTK (MjPSTK) complexed with AMPPNP. This study presents the crystal structure (at 2.4-Å resolution) of MjPSTK complexed with an anticodon-stem/loop truncated tRNASec (Mj*tRNASec), a good enzyme substrate. Mj*tRNASec is bound between the enzyme’s C-terminal domain (CTD) and N-terminal kinase domain (NTD) that are connected by a flexible 11 amino acid linker. Upon Mj*tRNASec recognition the CTD undergoes a 62-Å movement to allow proper binding of the 7-bp D-stem. This large reorganization of the PSTK quaternary structure likely provides a means by which the unique tRNASec species can be accurately recognized with high affinity by the translation machinery. However, while the NTD recognizes the tRNA acceptor helix, shortened versions of MjPSTK (representing only 60% of the original size, in which the entire CTD, linker loop and an adjacent NTD helix are missing) are still active in vivo and in vitro, albeit with reduced activity compared to the full-length enzyme.


          RNA Table of Contents Alert

          A new issue of RNA is available online:
          1 March 2011; Vol. 17, No. 3 

          The below Table of Contents is available online at: http://rnajournal.cshlp.org/content/vol17/issue3/?etoc




          Proofreading and spellchecking: A two-tier strategy for pre-mRNA splicing quality control

          1. Defne E. Egecioglu1,2 and 
          2. Guillaume Chanfreau1,2

          Author Affiliations

          1. 1Department of Chemistry and Biochemistry, University of California Los Angeles, Los Angeles, California 90095-1569, USA
          2. 2Molecular Biology Institute, University of California Los Angeles, Los Angeles, California 90095-1569, USA

          Abstract

          Multi-tier strategies exist in many biochemical processes to ensure a maximal fidelity of the reactions. In this review, we focus on the two-tier quality control strategy that ensures the quality of the products of the pre-mRNA splicing reactions catalyzed by the spliceosome. The first step in the quality control process relies on kinetic proofreading mechanisms that are internal to the spliceosome and that are performed by ATP-dependent RNA helicases. The second quality control step, spellchecking, involves recognition of unspliced pre-mRNAs or aberrantly spliced mRNAs that have escaped the first proofreading mechanisms, and subsequent degradation of these molecules by degradative enzymes in the nucleus or in the cytoplasm. This two-tier quality control strategy highlights a need for high fidelity and a requirement for degradative activities that eliminate defective molecules. The presence of multiple quality control activities during splicing underscores the importance of this process in the expression of genetic information.

          Identification of compounds that decrease the fidelity of start codon recognition by the eukaryotic translational machinery

          1. Julie E. Takacs1
          2. Timothy B. Neary1
          3. Nicholas T. Ingolia2,3,4,
          4. Adesh K. Saini5
          5. Pilar Martin-Marcos5
          6. Jerry Pelletier6,
          7. Alan G. Hinnebusch5 and 
          8. Jon R. Lorsch1

          Author Affiliations

          1. 1Department of Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
          2. 2Department of Cellular and Molecular Pharmacology and Howard Hughes Medical Institute, University of California, San Francisco, California 94158, USA
          3. 3California Institute for Quantitative Biosciences, San Francisco, California 94158, USA
          4. 4Department of Embryology, Carnegie Institution, Baltimore, Maryland 21218, USA
          5. 5Laboratory of Gene Regulation and Development, Eunice Kennedy Shriver Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, USA
          6. 6Department of Biochemistry and McGill Cancer Center, McGill University, Montreal, Quebec H3G 1Y6, Canada

          Abstract

          Translation initiation in eukaryotes involves more than a dozen protein factors. Alterations in six factors have been found to reduce the fidelity of start codon recognition by the ribosomal preinitiation complex in yeast, a phenotype referred to as Sui−. No small molecules are known that affect the fidelity of start codon recognition. Such compounds would be useful tools for probing the molecular mechanics of translation initiation and its regulation. To find compounds with this effect, we set up a high-throughput screen using a dual luciferase assay in S. cerevisiae. Screening of over 55,000 compounds revealed two structurally related molecules that decrease the fidelity of start codon selection by approximately twofold in the dual luciferase assay. This effect was confirmed using additional in vivo assays that monitor translation from non-AUG start codons. Both compounds increase translation of a natural upstream open reading frame previously shown to initiate translation at a UUG. The compounds were also found to exacerbate increased use of UUG as a start codon (Sui− phenotype) conferred by haploinsufficiency of wild-type eukaryotic initiation factor (eIF) 1, or by mutation in eIF1. Furthermore, the effects of the compounds are suppressed by overexpressing eIF1, which is known to restore the fidelity of start codon selection in strains harboring Sui− mutations in various other initiation factors. Together, these data strongly suggest that the compounds affect the translational machinery itself to reduce the accuracy of selecting AUG as the start codon.

          Distinct regulatory programs establish widespread sex-specific alternative splicing in Drosophila melanogaster

          1. Britta Hartmann1,2,3
          2. Robert Castelo2,4
          3. Belén Miñana1,2
          4. Erin Peden5,
          5. Marco Blanchette6,8
          6. Donald C. Rio6
          7. Ravinder Singh5 and
          8. Juan Valcárcel7

          Author Affiliations

          1. 1Centre de Regulació Genòmica, Dr. Aiguader 88, 08003 Barcelona, Spain
          2. 2Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain
          3. 3Centre for Biological Signalling Studies (BIOSS), Albert-Ludwigs-Universität, Habsburgerstrasse 49, 79104 Freiburg, Germany
          4. 4Institut Municipal d'Investigació Mèdica, Dr. Aiguader 88, 08003 Barcelona, Spain
          5. 5Department of Molecular, Cellular and Developmental Biology, University of Colorado, Boulder, Colorado 80309, USA
          6. 6Department of Molecular and Cell Biology, University of California at Berkeley, California 94720-3204, USA
          7. 7Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
          • 8 Present address: Stowers Institute for Biomedical Research, Kansas City, MO 64110, USA.

          Abstract

          In Drosophila melanogaster, female-specific expression of Sex-lethal (SXL) and Transformer (TRA) proteins controls sex-specific alternative splicing and/or translation of a handful of regulatory genes responsible for sexual differentiation and behavior. Recent findings in 2009 by Telonis-Scott et al. document widespread sex-biased alternative splicing in fruitflies, including instances of tissue-restricted sex-specific splicing. Here we report results arguing that some of these novel sex-specific splicing events are regulated by mechanisms distinct from those established by female-specific expression of SXL and TRA. Bioinformatic analysis of SXL/TRA binding sites, experimental analysis of sex-specific splicing in S2 and Kc cell lines and of the effects of SXL knockdown in Kc cells indicate that SXL-dependent and SXL-independent regulatory mechanisms coexist within the same cell. Additional determinants of sex-specific splicing can be provided by sex-specific differences in the expression of RNA binding proteins, including Hrp40/Squid. We report that sex-specific alternative splicing of the gene hrp40/squid leads to sex-specific differences in the levels of this hnRNP protein. The significant overlap between sex-regulated alternative splicing changes and those induced by knockdown of hrp40/squid and the presence of related sequence motifs enriched near subsets of Hrp40/Squid-regulated and sex-regulated splice sites indicate that this protein contributes to sex-specific splicing regulation. A significant fraction of sex-specific splicing differences are absent in germline-less tudor mutant flies. Intriguingly, these include alternative splicing events that are differentially spliced in tissues distant from the germline. Collectively, our results reveal that distinct genetic programs control widespread sex-specific splicing in Drosophila melanogaster.



          Faculty of 1000
          Monday 07 February 2011




          Proc Natl Acad Sci U S A. 2010 Nov 16;107(46):19820-5. Epub 2010 Nov 1.

          Stochastic reaction-diffusion kinetics in the microscopic limit.

          Fange D, Berg OG, Sjöberg P, Elf J.

          Department of Cell and Molecular Biology, Uppsala University, 75124 Uppsala, Sweden.

          Abstract

          Quantitative analysis of biochemical networks often requires consideration of both spatial and stochastic aspects of chemical processes. Despite significant progress in the field, it is still computationally prohibitive to simulate systems involving many reactants or complex geometries using a microscopic framework that includes the finest length and time scales of diffusion-limited molecular interactions. For this reason, spatially or temporally discretized simulation schemes are commonly used when modeling intracellular reaction networks. The challenge in defining such coarse-grained models is to calculate the correct probabilities of reaction given the microscopic parameters and the uncertainty in the molecular positions introduced by the spatial or temporal discretization. In this paper we have solved this problem for the spatially discretized Reaction-Diffusion Master Equation; this enables a seamless and physically consistent transition from the microscopic to the macroscopic frameworks of reaction-diffusion kinetics. We exemplify the use of the methods by showing that a phosphorylation-dephosphorylation motif, commonly observed in eukaryotic signaling pathways, is predicted to display fluctuations that depend on the geometry of the system.
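
          For readers new to stochastic kinetics, the following is a minimal sketch of a well-mixed (non-spatial) Gillespie simulation of a phosphorylation-dephosphorylation cycle. It is not the spatially discretized Reaction-Diffusion Master Equation method of the paper: it ignores diffusion and geometry entirely and only illustrates how reaction events are sampled one at a time. The function name, rate constants and molecule count are illustrative assumptions, not values from the article.

```python
import random


def simulate_cycle(n_total=100, k_phos=1.0, k_dephos=1.0, t_end=10.0, seed=0):
    """Well-mixed Gillespie simulation of S <-> Sp (hypothetical parameters).

    Returns a list of (time, number of phosphorylated molecules) pairs.
    """
    rng = random.Random(seed)
    t, n_p = 0.0, 0
    trajectory = [(t, n_p)]
    while True:
        # Propensities of the two reactions given the current state.
        a_phos = k_phos * (n_total - n_p)
        a_dephos = k_dephos * n_p
        a_total = a_phos + a_dephos
        if a_total == 0:
            break
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_phos:
            n_p += 1
        else:
            n_p -= 1
        trajectory.append((t, n_p))
    return trajectory
```

          Calling simulate_cycle() several times with different seeds gives trajectories that fluctuate around the same mean, which is the kind of intrinsic noise the abstract refers to.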


          Proc Natl Acad Sci U S A. 2010 Apr 13;107(15):6946-51. Epub 2010 Mar 24.

          Metabolic cycling in single yeast cells from unsynchronized steady-state populations limited on glucose or phosphate.

          Silverman SJ, Petti AA, Slavov N, Parsons L, Briehof R, Thiberge SY, Zenklusen D, Gandhi SJ, Larson DR, Singer RH, Botstein D.

          Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA.

          Abstract

          Oscillations in patterns of expression of a large fraction of yeast genes are associated with the "metabolic cycle," usually seen only in prestarved, continuous cultures of yeast. We used FISH of mRNA in individual cells to test the hypothesis that these oscillations happen in single cells drawn from unsynchronized cultures growing exponentially in chemostats. Gene-expression data from synchronized cultures were used to predict coincident appearance of mRNAs from pairs of genes in the unsynchronized cells. Quantitative analysis of the FISH results shows that individual unsynchronized cells growing slowly because of glucose limitation or phosphate limitation show the predicted oscillations. We conclude that the yeast metabolic cycle is an intrinsic property of yeast metabolism and does not depend on either synchronization or external limitation of growth by the carbon source.


          Proc Natl Acad Sci U S A. 2008 Dec 30;105(52):20705-10. Epub 2008 Dec 19.

          Determination of cell fate selection during phage lambda infection.

          St-Pierre F, Endy D.

          Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

          Abstract

          Bacteriophage lambda infection of Escherichia coli can result in distinct cell fate outcomes. For example, some cells lyse whereas others survive as lysogens. A quantitative biophysical model of lambda infection supports the hypothesis that spontaneous differences in the timing of individual molecular events during lambda infection lead to variation in the selection of cell fates. Building from this analysis, the lambda lysis-lysogeny decision now serves as a paradigm for how intrinsic molecular noise can influence cellular behavior, drive developmental processes, and produce population heterogeneity. Here, we report experimental evidence that warrants reconsidering this framework. By using cell fractioning, plating, and single-cell fluorescent microscopy, we find that physical differences among cells present before infection bias lambda developmental outcomes. Specifically, variation in cell volume at the time of infection can be used to help predict cell fate: an approximately 2-fold increase in cell volume results in a 4- to 5-fold decrease in the probability of lysogeny. Other cell fate decisions now thought to be stochastic might also be determined by pre-existing variation.




          Trends in Microbiology

          Volume 19, Issue 2,  Pages 49-104 (February 2011)








          Opinion

          Are low temperature habitats hot spots of microbial evolution driven by viruses?

          Alexandre M. Anesio a and Christopher M. Bellas a

          a Bristol Glaciology Centre, School of Geographical Sciences, University of Bristol, Bristol BS8 1SS, UK


          Available online 3 December 2010. 

          There is an increasing body of evidence to show that viruses are important drivers of microbial evolution and that they can store a great deal of the Earth's microbial diversity in their genomes. Examination of microbial diversity in polar regions has revealed a higher than expected diversity of viruses, bacteria and eukaryotic microbes. Further, the few available studies in polar regions reveal that viral control of microbial mortality is important in these habitats. In this opinion article, we argue that strong relationships between viruses and their hosts in a range of polar habitats could be key in explaining why polar regions are in fact hot spots of microbial diversity and evolution. Further, we argue that periodic glaciations, and particularly the Neoproterozoic low-latitude glaciation, known as ‘snowball Earth’, could have been periods of intense diversification in aquatic refuges.


          Opinion

          Time to recognise that mitochondria are bacteria?

          Mark J. Pallen a

          a Centre for Systems Biology, School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK


          Available online 1 December 2010. 

          The scientific community is comfortable with recognising mitochondria as organelles that happen to be descendants of bacteria. Here, I playfully explore the arguments for and against a phylogenetic fundamentalism that states that mitochondria are bacteria and should be given their own taxonomic family, the Mitochondriaceae. I also explore the consequences of recognizing mitochondria as bacteria for our understanding of the systemic response to trauma and for the prospects of creating transgenic mitochondria.


           


          Molecular Cell
          Volume 41, Issue 3, 4 February 2011, Pages 247-248 



          Article

          Nascent Peptide in the Ribosome Exit Tunnel Affects Functional Properties of the A-Site of the Peptidyl Transferase Center

          Haripriya Ramu1, Nora Vázquez-Laslop1, Dorota Klepacki1, Qing Dai2, Joseph Piccirilli3, Ronald Micura4 and Alexander S. Mankin1

          1 Center for Pharmaceutical Biotechnology, University of Illinois, Chicago, IL 60607, USA

          2 Department of Molecular Genetics and Cell Biology, University of Illinois, Chicago, IL 60607, USA

          3 Department of Biochemistry and Molecular Biology and Department of Chemistry, University of Chicago, Chicago, IL 60637, USA

          4 Institute of Organic Chemistry and Center for Molecular Biosciences, University of Innsbruck, Innsbruck, Austria

          Received 4 October 2010;  
          revised 30 October 2010;  
          accepted 11 November 2010.  
          Published: February 3, 2011. 
          Available online 3 February 2011. 


          Referred to by: Peptides in the Ribosomal Tunnel Talk Back
          Molecular Cell, Volume 41, Issue 3, 4 February 2011, Pages 247-248
          Daniel N. Wilson

          Summary

          The ability to monitor the nascent peptide structure and to respond functionally to specific nascent peptide sequences is a fundamental property of the ribosome. An extreme manifestation of such response is nascent peptide-dependent ribosome stalling, involved in the regulation of gene expression. The molecular mechanisms of programmed translation arrest are unclear. By analyzing ribosome stalling at the regulatory cistron of the antibiotic resistance gene ermA, we uncovered a carefully orchestrated cooperation between the ribosomal exit tunnel and the A-site of the peptidyl transferase center (PTC) in halting translation. The presence of an inducing antibiotic and a specific nascent peptide in the exit tunnel abrogate the ability of the PTC to catalyze peptide bond formation with a particular subset of amino acids. The extent of the conferred A-site selectivity is modulated by the C-terminal segment of the nascent peptide, where the third-from-last residue plays a critical role.


          Highlights

          ► In the presence of erythromycin, the ribosome stalls at the 8th codon of ermAL1
          ► Stalled ribosome is unable to catalyze peptide bond formation with the 9th amino acid
          ► The nature of the A-site codon is critical for stalling
          ► Nascent peptide sequence renders the A-site selective to the nature of aminoacyl-tRNA


          Article

          The Base-Pairing RNA Spot 42 Participates in a Multioutput Feedforward Loop to Help Enact Catabolite Repression in Escherichia coli

          Chase L. Beisel1 and Gisela Storz1

          1 Cell Biology and Metabolism Program, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892-5430, USA

          Received 13 August 2010;  
          revised 29 October 2010;  
          accepted 1 December 2010.  
          Published: February 3, 2011. 
          Available online 3 February 2011. 


          Referred to by: Sweet Business: Spot42 RNA Networks with CRP to Modulate Catabolite Repression
          Molecular Cell, Volume 41, Issue 3, 4 February 2011, Pages 245-246
          Kai Papenfort, Jörg Vogel

          Summary

          Bacteria selectively consume some carbon sources over others through a regulatory mechanism termed catabolite repression. Here, we show that the base-pairing RNA Spot 42 plays a broad role in catabolite repression in Escherichia coli by directly repressing genes involved in central and secondary metabolism, redox balancing, and the consumption of diverse nonpreferred carbon sources. Many of the genes repressed by Spot 42 are transcriptionally activated by the global regulator CRP. Since CRP represses Spot 42, these regulators participate in a specific regulatory circuit called a multioutput feedforward loop. We found that this loop can reduce leaky expression of target genes in the presence of glucose and can maintain repression of target genes under changing nutrient conditions. Our results suggest that base-pairing RNAs in feedforward loops can help shape the steady-state levels and dynamics of gene expression.


          Highlights

          ► Spot 42 regulates the consumption of numerous nonpreferred carbon sources
          ► Spot 42 participates with CRP in a multioutput coherent feedforward loop
          ► The loop can reduce the leaky expression of glucose-repressed genes
          ► The loop can help maintain glucose repression under changing nutrient conditions


          Preview

          Peptides in the Ribosomal Tunnel Talk Back

          Daniel N. Wilson1

          1 Gene Center and Department for Biochemistry, Center for Protein Science-Munich (CiPS-M), University of Munich, Feodor-Lynen-Strasse 25, D-81377 Munich, Germany


          Available online 3 February 2011. 


          Refers to: Nascent Peptide in the Ribosome Exit Tunnel Affects Functional Properties of the A-Site of the Peptidyl Transferase Center
          Molecular Cell, Volume 41, Issue 3, 4 February 2011, Pages 321-330
          Haripriya Ramu, Nora Vázquez-Laslop, Dorota Klepacki, Qing Dai, Joseph Piccirilli, Ronald Micura, Alexander S. Mankin

          In this issue of Molecular Cell, Ramu et al. demonstrate that nascent peptides located within the ribosomal tunnel can talk back to the peptidyl transferase center to induce translational stalling by restricting the species of aminoacyl-tRNAs that can bind there.


          Nature Structural & Molecular Biology
          TABLE OF CONTENTS

          February 2011 Volume 18, Issue 2

          RNA secondary structure in mutually exclusive splicing pp159 - 168
          Yun Yang, Leilei Zhan, Wenjing Zhang, Feng Sun, Wenfeng Wang, Nan Tian, Jingpei Bi, Haitao Wang, Dike Shi, Yajian Jiang, Yaozhou Zhang and Yongfeng Jin
          doi:10.1038/nsmb.1959
          Alternative splicing plays a major role in the generation of functional diversity but the underlying mechanisms remain poorly understood. In a comparative genome analysis of 73 arthropod species, spanning around 420 million years of evolution, Yongfeng Jin and coworkers find built-in intronic elements that lead to mutually exclusive splicing. These elements are species- or clade-specific, but evolutionarily conserved at the secondary structure level.

          Alternate rRNA secondary structures as regulators of translation pp169 - 176
          Shu Feng, Heng Li, Jing Zhao, Konstantin Pervushin, Ky Lowenhaupt, Thomas U Schwartz and Peter Dröge
          doi:10.1038/nsmb.1962
          Understanding the structural dynamics of ribosomal components is key to understanding translation. The Z-DNA– and Z-RNA–binding domain from the human RNA editing enzyme ADAR1-L is now shown to bind to specific regions of ribosomal RNAs affecting translation, suggesting that these regions might at least transiently form Z-RNA structure not observed in crystal structures.

          NATURE STRUCTURAL & MOLECULAR BIOLOGY | ARTICLE


          Dynamic local unfolding in the serpin α-1 antitrypsin provides a mechanism for loop insertion and polymerization

          Nature Structural & Molecular Biology 18, 222–226 (2011)
          doi:10.1038/nsmb.1976
          Received 30 April 2010; Accepted 12 November 2010; Published online 23 January 2011

          Abstract

          The conformational plasticity of serine protease inhibitors (serpins) underlies both their activities as protease inhibitors and their susceptibility to pathogenic misfolding and aggregation. Here, we structurally characterize a sheet-opened state of the serpin α-1 antitrypsin (α1AT) and show how local unfolding allows functionally essential strand insertion. Mutations in α1AT that cause polymerization-induced serpinopathies map to the labile region, suggesting that the evolution of serpin function required sampling of high risk conformations on a dynamic energy landscape.


          Transcriptome-wide sequencing reveals numerous APOBEC1 mRNA-editing targets in transcript 3′ UTRs

          Nature Structural & Molecular Biology 18, 230–236 (2011)
          doi:10.1038/nsmb.1975
          Received 31 August 2010; Accepted 10 November 2010; Published online 23 January 2011

          Abstract

          Apolipoprotein B–editing enzyme, catalytic polypeptide-1 (APOBEC1) is a cytidine deaminase initially identified by its activity in converting a specific cytidine (C) to uridine (U) in apolipoprotein B (apoB) mRNA transcripts in the small intestine. Editing results in the translation of a truncated apoB isoform with distinct functions in lipid transport. To address the possibility that APOBEC1 edits additional mRNAs, we developed a transcriptome-wide comparative RNA sequencing (RNA-Seq) screen. We identified and validated 32 previously undescribed mRNA targets of APOBEC1 editing, all of which are located in AU-rich segments of transcript 3′ untranslated regions (3′ UTRs). Further analysis established several characteristic sequence features of editing targets, which were predictive for the identification of additional APOBEC1 substrates. The transcriptomics approach to RNA editing presented here dramatically expands the list of APOBEC1 mRNA editing targets and reveals a novel cellular mechanism for the modification of transcript 3′ UTRs.

          Omics Gateway

          Use of stable isotope labeling by amino acids in cell culture as a spike-in standard in quantitative proteomics

          Nature Protocols 6, 147–157 (2011)
          doi:10.1038/nprot.2010.192

          Abstract

          Mass spectrometry (MS)-based proteomics is increasingly applied in a quantitative format, often based on labeling of samples with stable isotopes that are introduced chemically or metabolically. In the stable isotope labeling by amino acids in cell culture (SILAC) method, two cell populations are cultured in the presence of heavy or light amino acids (typically lysine and/or arginine), one of them is subjected to a perturbation, and then both are combined and processed together. In this study, we describe a different approach—the use of SILAC as an internal or 'spike-in' standard—wherein SILAC is only used to produce heavy labeled reference proteins or proteomes. These are added to the proteomes under investigation after cell lysis and before protein digestion. The actual experiment is therefore completely decoupled from the labeling procedure. Spike-in SILAC is very economical, robust and in principle applicable to all cell- or tissue-based proteomic analyses. Applications range from absolute quantification of single proteins to the quantification of whole proteomes. Spike-in SILAC is especially advantageous when analyzing the proteomes of whole tissues or organisms. The protocol describes the quantitative analysis of a tissue sample relative to super-SILAC spike-in, a mixture of five SILAC-labeled cell lines that accurately represents the tissue. It includes the selection and preparation of the spike-in SILAC standard, the sample preparation procedure, and analysis and evaluation of the results.


            BMC Genomics   

          Article alert


          The latest articles from BMC Genomics, published between 18-Jan-2011 and 31-Jan-2011

          GC content around splice sites affects splicing through pre-mRNA secondary structures

          Jing Zhang1, CC Jay Kuo1 and Liang Chen2

          Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA

          Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA

          BMC Genomics 2011, 12:90 doi:10.1186/1471-2164-12-90

          Published: 31 January 2011

          Abstract

          Background

          Alternative splicing increases protein diversity by generating multiple transcript isoforms from a single gene through different combinations of exons or through different selections of splice sites. It has been reported that RNA secondary structures are involved in alternative splicing. Here we perform a genomic study of RNA secondary structures around splice sites in humans (Homo sapiens), mice (Mus musculus), fruit flies (Drosophila melanogaster), and nematodes (Caenorhabditis elegans) to further investigate this phenomenon.

          Results

          We observe that GC content around splice sites is closely associated with the splice site usage in multiple species. RNA secondary structure is the possible explanation, because the structural stability difference among alternative splice sites, constitutive splice sites, and skipped splice sites can be explained by the GC content difference. Alternative splice sites tend to be GC-enriched and exhibit more stable RNA secondary structures in all of the considered species. In humans and mice, splice sites of first exons and long exons tend to be GC-enriched and hence form more stable structures, indicating the special role of RNA secondary structures in promoter proximal splicing events and the splicing of long exons. In addition, GC-enriched exon-intron junctions tend to be overrepresented in tissue-specific alternative splice sites, indicating the functional consequence of the GC effect. Compared with regions far from splice sites and decoy splice sites, real splice sites are GC-enriched. We also found that the GC-content effect is much stronger than the nucleotide-order effect to form stable secondary structures.

          Conclusion

          All of these results indicate that GC content is related to splice site usage and it may mediate the splicing process through RNA secondary structures.
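
          As a rough illustration of the quantity discussed above, the sketch below computes the GC fraction in a window around a splice-site position. The function names, the 20-nucleotide flank and the toy sequence are assumptions made for this example; the article's actual window sizes and structure predictions are not reproduced here.

```python
def gc_content(seq):
    """Fraction of G/C among the unambiguous bases (A, C, G, T, U) in seq."""
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def gc_around_site(seq, site, flank=20):
    """GC fraction in the window [site - flank, site + flank) of seq.

    `site` is a 0-based index of the exon-intron boundary (hypothetical
    convention); the window is clipped at the sequence ends.
    """
    start = max(0, site - flank)
    return gc_content(seq[start:site + flank])


if __name__ == "__main__":
    toy = "AAAAGGGGCCCCAAAA"  # made-up sequence, not real data
    print(gc_around_site(toy, site=8, flank=4))
```

          Comparing this value between alternative, constitutive and skipped splice sites is the kind of contrast the Results paragraph describes; estimating secondary-structure stability would require a separate RNA folding tool.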

          Population transcriptomics of Drosophila melanogaster females

          Lena Müller1, Stephan Hutter1, Rayna Stamboliyska1, Sarah S Saminadin-Peter1,2, Wolfgang Stephan1 and John Parsch1

          Department of Biology II, University of Munich (LMU), 82152 Planegg-Martinsried, Germany

          Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA

          BMC Genomics 2011, 12:81 doi:10.1186/1471-2164-12-81

          Published: 28 January 2011

          Abstract

          Background

          Variation at the level of gene expression is abundant in natural populations and is thought to contribute to the adaptive divergence of populations and species. Gene expression also differs considerably between males and females. Here we report a microarray analysis of gene expression variation among females of 16 Drosophila melanogaster strains derived from natural populations, including eight strains from the putative ancestral range in sub-Saharan Africa and eight strains from Europe. Gene expression variation among males of the same strains was reported previously.

          Results

          We detected relatively low levels of expression polymorphism within populations, but much higher expression divergence between populations. A total of 569 genes showed a significant expression difference between the African and European populations at a false discovery rate of 5%. Genes with significant over-expression in Europe included the insecticide resistance gene Cyp6g1, as well as genes involved in proteolysis and olfaction. Genes with functions in carbohydrate metabolism and vision were significantly over-expressed in the African population. There was little overlap between genes expressed differently between populations in females and males.

          Conclusions

          Our results suggest that adaptive changes in gene expression have accompanied the out-of-Africa migration of D. melanogaster. Comparison of female and male expression data indicates that the vast majority of genes differing in expression between populations do so in only one sex and suggests that most regulatory adaptation has been sex-specific.

          Conserved generation of short products at piRNA loci

          Philipp Berninger1,2, Lukasz Jaskiewicz1, Mohsen Khorshid1 and Mihaela Zavolan1

          Biozentrum, Universität Basel and Swiss Institute of Bioinformatics, Klingelbergstrasse 50-70, 4056 Basel, Switzerland

          EMBL Grenoble, 6 rue Jules Horowitz, 38042 Grenoble, France

          BMC Genomics 2011, 12:46 doi:10.1186/1471-2164-12-46

          Published: 19 January 2011

          Abstract

          Background

          The piRNA pathway operates in animal germ lines to ensure genome integrity through retrotransposon silencing. The Piwi protein-associated small RNAs (piRNAs) guide Piwi proteins to retrotransposon transcripts, which are degraded and thereby post-transcriptionally silenced through a ping-pong amplification process. Cleavage of the retrotransposon transcript defines at the same time the 5' end of a secondary piRNA that will in turn guide a Piwi protein to a primary piRNA precursor, thereby amplifying primary piRNAs. Although several studies provided evidence that this mechanism is conserved among metazoa, how the process is initiated and what enzymatic activities are responsible for generating the primary and secondary piRNAs are not entirely clear.

          Results

          Here we analyzed small RNAs from three mammalian species, seeking to gain further insight into the mechanisms responsible for the piRNA amplification loop. We found that in all these species piRNA-directed targeting is accompanied by the generation of short sequences that have a very precisely defined length, 19 nucleotides, and a specific spatial relationship with the guide piRNAs.

          Conclusions

          This suggests that the processing of the 5' product of piRNA-guided cleavage occurs while the piRNA target is engaged by the Piwi protein. Although they are not stabilized through methylation of their 3' ends, the 19-mers are abundant not only in testes lysates but also in immunoprecipitates of Miwi and Mili proteins. They will enable more accurate identification of piRNA loci in deep sequencing data sets.

            BMC Systems Biology   

          Article alert


          The latest articles from BMC Systems Biology, published between 18-Jan-2011 and 31-Jan-2011

          Noise regulation by quorum sensing in low mRNA copy number systems

          Marc Weber and Javier Buceta

          Computer Simulation and Modelling (Co.S.Mo.) Lab, Parc Científic de Barcelona, C/Baldiri Reixac 10-12, Barcelona 08028, Spain

          BMC Systems Biology 2011, 5:11 doi:10.1186/1752-0509-5-11

          Published: 20 January 2011

          Abstract

          Background

          Cells must face the ubiquitous presence of noise at the level of signaling molecules. The latter constitutes a major challenge for the regulation of cellular functions including communication processes. In the context of prokaryotic communication, the so-called quorum sensing (QS) mechanism relies on small diffusive molecules that are produced and detected by cells. This poses the intriguing question of how bacteria cope with the fluctuations for setting up a reliable information exchange.

          Results

          We present a stochastic model of gene expression that accounts for the main biochemical processes that describe the QS mechanism close to its activation threshold. Within that framework we study, both numerically and analytically, the role that diffusion plays in the regulation of the dynamics and the fluctuations of signaling molecules. In addition, we unveil the contribution of different sources of noise, intrinsic and transcriptional, in the QS mechanism.

          Conclusions

          The interplay between noisy sources and the communication process produces a repertoire of dynamics that depends on the diffusion rate. Importantly, the total noise shows a non-monotonic behavior as a function of the diffusion rate. QS systems seem to avoid values of the diffusion rate that maximize the total noise. These results suggest that bacteria have adapted their communication mechanisms in order to improve the signal-to-noise ratio.



          Science, 11 February 2011 (Volume 331, Issue 6018) 
          http://www.sciencemag.org/content/vol331/issue6018/index.dtl?etoc


          PLoS Genetics: a peer-reviewed open-access journal published by the Public Library of Science

          New Articles in PLoS Genetics

          Published February 10, 2011

          Correlated Evolution of Nearby Residues in Drosophilid Proteins

          Benjamin Callahan1*, Richard A. Neher2¤, Doris Bachtrog3, Peter Andolfatto4, Boris I. Shraiman2,5

          1 Department of Applied Physics, Stanford University, Stanford, California, United States of America, 2 Kavli Institute for Theoretical Physics, University of California Santa Barbara, Santa Barbara, California, United States of America, 3 Department of Integrative Biology and Center for Theoretical Evolutionary Genomics, University of California Berkeley, Berkeley, California, United States of America, 4 Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America, 5 Department of Physics, University of California Santa Barbara, Santa Barbara, California, United States of America

          Abstract

          Here we investigate the correlations between coding sequence substitutions as a function of their separation along the protein sequence. We consider both substitutions between the reference genomes of several Drosophilids as well as polymorphisms in a population sample of Zimbabwean Drosophila melanogaster. We find that amino acid substitutions are “clustered” along the protein sequence, that is, the frequency of additional substitutions is strongly enhanced within ≈10 residues of a first such substitution. No such clustering is observed for synonymous substitutions, supporting a “correlation length” associated with selection on proteins as the causative mechanism. Clustering is stronger between substitutions that arose in the same lineage than it is between substitutions that arose in different lineages. We consider several possible origins of clustering, concluding that epistasis (interactions between amino acids within a protein that affect function) and positional heterogeneity in the strength of purifying selection are primarily responsible. The role of epistasis is directly supported by the tendency of nearby substitutions that arose on the same lineage to preserve the total charge of the residues within the correlation length and by the preferential cosegregation of neighboring derived alleles in our population sample. We interpret the observed length scale of clustering as a statistical reflection of the functional locality (or modularity) of proteins: amino acids that are near each other on the protein backbone are more likely to contribute to, and collaborate toward, a common subfunction.


          Epistatic Interaction Maps Relative to Multiple Metabolic Phenotypes

          Evan S. Snitkin1,2, Daniel Segrè1,3*

          1 Program in Bioinformatics, Boston University, Boston, Massachusetts, United States of America, 2 Genetics and Molecular Biology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America, 3 Department of Biology and Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America

          Abstract

          An epistatic interaction between two genes occurs when the phenotypic impact of one gene depends on another gene, often exposing a functional association between them. Due to experimental scalability and to evolutionary significance, abundant work has been focused on studying how epistasis affects cellular growth rate, most notably in yeast. However, epistasis likely influences many different phenotypes, affecting our capacity to understand cellular functions, biochemical networks adaptation, and genetic diseases. Despite its broad significance, the extent and nature of epistasis relative to different phenotypes remain fundamentally unexplored. Here we use genome-scale metabolic network modeling to investigate the extent and properties of epistatic interactions relative to multiple phenotypes. Specifically, using an experimentally refined stoichiometric model for Saccharomyces cerevisiae, we computed a three-dimensional matrix of epistatic interactions between any two enzyme gene deletions, with respect to all metabolic flux phenotypes. We found that the total number of epistatic interactions between enzymes increases rapidly as phenotypes are added, plateauing at approximately 80 phenotypes, to an overall connectivity that is roughly 8-fold larger than the one observed relative to growth alone. Looking at interactions across all phenotypes, we found that gene pairs interact incoherently relative to different phenotypes, i.e. antagonistically relative to some phenotypes and synergistically relative to others. Specific deletion-deletion-phenotype triplets can be explained metabolically, suggesting a highly informative role of multi-phenotype epistasis in mapping cellular functions. Finally, we found that genes involved in many interactions across multiple phenotypes are more highly expressed, evolve slower, and tend to be associated with diseases, indicating that the importance of genes is hidden in their total phenotypic impact.
Our predictions indicate a pervasiveness of nonlinear effects in how genetic perturbations affect multiple metabolic phenotypes. The approaches and results reported could influence future efforts in understanding metabolic diseases and the role of biochemical regulation in the cell.
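
The "three-dimensional matrix" of deletion-deletion-phenotype interactions described in this abstract can be illustrated with a minimal sketch. It assumes a multiplicative definition of epistasis (the double-deletion phenotype minus the product of the two single-deletion phenotypes, all relative to wild type), which may differ from the scaled measure the authors actually use. The gene names and numbers are hypothetical, and sign conventions for "synergistic" versus "antagonistic" vary between papers, so only the sign of the deviation is reported.

```python
from itertools import combinations


def epistasis(w_a, w_b, w_ab):
    """Deviation of the double mutant from the multiplicative expectation.

    All values are phenotypes relative to wild type (assumed definition).
    """
    return w_ab - w_a * w_b


def epistasis_tensor(single, double, tol=1e-6):
    """Map each gene pair to a list of signs, one per phenotype.

    +1: double mutant above expectation, -1: below, 0: no interaction.
    """
    tensor = {}
    for a, b in combinations(sorted(single), 2):
        signs = []
        for w_a, w_b, w_ab in zip(single[a], single[b], double[(a, b)]):
            eps = epistasis(w_a, w_b, w_ab)
            signs.append(0 if abs(eps) <= tol else (1 if eps > 0 else -1))
        tensor[(a, b)] = signs
    return tensor


def is_incoherent(signs):
    """True if a pair deviates positively for one phenotype and negatively for another."""
    return 1 in signs and -1 in signs


# Hypothetical data: two gene deletions scored against three flux phenotypes.
single = {"geneA": [0.8, 0.5, 1.0], "geneB": [0.5, 0.8, 1.0]}
double = {("geneA", "geneB"): [0.6, 0.2, 1.0]}

tensor = epistasis_tensor(single, double)
print(tensor[("geneA", "geneB")], is_incoherent(tensor[("geneA", "geneB")]))
```

Counting the non-zero entries per pair across phenotypes gives the kind of multi-phenotype connectivity the abstract compares against growth alone.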


          Pervasive Adaptive Protein Evolution Apparent in Diversity Patterns around Amino Acid Substitutions in Drosophila simulans

          Shmuel Sattath1, Eyal Elyashiv1, Oren Kolodny1, Yosef Rinott2, Guy Sella1*

          1 Department of Ecology, Evolution, and Behavior, Hebrew University of Jerusalem, Jerusalem, Israel, 2 Department of Statistics, Hebrew University of Jerusalem, Jerusalem, Israel

          Abstract

          In Drosophila, multiple lines of evidence converge in suggesting that beneficial substitutions to the genome may be common. All suffer from confounding factors, however, such that the interpretation of the evidence—in particular, conclusions about the rate and strength of beneficial substitutions—remains tentative. Here, we use genome-wide polymorphism data in D. simulans and sequenced genomes of its close relatives to construct a readily interpretable characterization of the effects of positive selection: the shape of average neutral diversity around amino acid substitutions. As expected under recurrent selective sweeps, we find a trough in diversity levels around amino acid but not around synonymous substitutions, a distinctive pattern that is not expected under alternative models. This characterization is richer than previous approaches, which relied on limited summaries of the data (e.g., the slope of a scatter plot), and relates to underlying selection parameters in a straightforward way, allowing us to make more reliable inferences about the prevalence and strength of adaptation. Specifically, we develop a coalescent-based model for the shape of the entire curve and use it to infer adaptive parameters by maximum likelihood. Our inference suggests that ~13% of amino acid substitutions cause selective sweeps. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. These estimates therefore help to reconcile the apparent conflict among previously published estimates of the strength of selection. 
More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation.

          Parallel Evolution of a Type IV Secretion System in Radiating Lineages of the Host-Restricted Bacterial Pathogen Bartonella

          Philipp Engel1, Walter Salzburger2, Marius Liesch1, Chao-Chin Chang3, Soichi Maruyama4, Christa Lanz5, Alexandra Calteau6, Aurélie Lajus6, Claudine Médigue6, Stephan C. Schuster7, Christoph Dehio1*

          1 Focal Area Infection Biology, Biozentrum, University of Basel, Basel, Switzerland, 2 Zoological Institute, University of Basel, Basel, Switzerland, 3 College of Veterinary Medicine, National Chung Hsing University, Taichung, Taiwan, 4 Nihon University, Fujisawa, Kanagawa, Japan, 5 Max Planck Institute for Developmental Biology, Tübingen, Germany, 6 Commissariat à l'Energie Atomique (CEA), Direction des Sciences du Vivant, Institut de Génomique, Genoscope and CNRS-UMR 8030, Laboratoire d'Analyse Bioinformatique en Génomique et Métabolisme, Evry, France, 7 Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania, United States of America

          Abstract

          Adaptive radiation is the rapid origination of multiple species from a single ancestor as the result of concurrent adaptation to disparate environments. This fundamental evolutionary process is considered to be responsible for the genesis of a great portion of the diversity of life. Bacteria have evolved enormous biological diversity by exploiting an exceptional range of environments, yet diversification of bacteria via adaptive radiation has been documented in a few cases only and the underlying molecular mechanisms are largely unknown. Here we show a compelling example of adaptive radiation in pathogenic bacteria and reveal their genetic basis. Our evolutionary genomic analyses of the α-proteobacterial genus Bartonella uncover two parallel adaptive radiations within these host-restricted mammalian pathogens. We identify a horizontally-acquired protein secretion system, which has evolved to target specific bacterial effector proteins into host cells as the evolutionary key innovation triggering these parallel adaptive radiations. We show that the functional versatility and adaptive potential of the VirB type IV secretion system (T4SS), and thereby translocated Bartonella effector proteins (Beps), evolved in parallel in the two lineages prior to their radiations. Independent chromosomal fixation of the virB operon and consecutive rounds of lineage-specific bep gene duplications followed by their functional diversification characterize these parallel evolutionary trajectories. Whereas most Beps maintained their ancestral domain constitution, strikingly, a novel type of effector protein emerged convergently in both lineages. This resulted in similar arrays of host cell-targeted effector proteins in the two lineages of Bartonella as the basis of their independent radiation. The parallel molecular evolution of the VirB/Bep system displays a striking example of a key innovation involved in independent adaptive processes and the emergence of bacterial pathogens.
Furthermore, our study highlights the remarkable evolvability of T4SSs and their effector proteins, explaining their broad application in bacterial interactions with the environment.

          The Architecture of Gene Regulatory Variation across Multiple Human Tissues: The MuTHER Study

          Alexandra C. Nica1,2, Leopold Parts1, Daniel Glass3, James Nisbet1, Amy Barrett4, Magdalena Sekowska1, Mary Travers4, Simon Potter1, Elin Grundberg1,3, Kerrin Small1,3, Åsa K. Hedman4, Veronique Bataille3, Jordana Tzenova Bell3,4, Gabriela Surdulescu3, Antigone S. Dimas2,4, Catherine Ingle1, Frank O. Nestle5, Paola di Meglio5, Josine L. Min4, Alicja Wilk1, Christopher J. Hammond3, Neelam Hassanali4, Tsun-Po Yang1, Stephen B. Montgomery2, Steve O'Rahilly6, Cecilia M. Lindgren4, Krina T. Zondervan4, Nicole Soranzo1,3, Inês Barroso1,6, Richard Durbin1, Kourosh Ahmadi3, Panos Deloukas1*, Mark I. McCarthy4,7,8*, Emmanouil T. Dermitzakis2*, Timothy D. Spector3*, The MuTHER Consortium

          1 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom, 2 Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland, 3 Department of Twin Research, King's College London, London, United Kingdom, 4 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom, 5 St. John's Institute of Dermatology, King's College London, London, United Kingdom, 6 University of Cambridge Metabolic Research Labs, Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge, United Kingdom, 7 Oxford Centre for Diabetes, Endocrinology, and Metabolism, University of Oxford, Churchill Hospital, Oxford, United Kingdom, 8 Oxford NIHR Biomedical Research Centre, Churchill Hospital, Oxford, United Kingdom

          Abstract

          While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis, MCTA) permits immediate replication of eQTLs using co-twins (93%–98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%–20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.

          Quantitative Models of the Mechanisms That Control Genome-Wide Patterns of Transcription Factor Binding during Early Drosophila Development

          Tommy Kaplan1, Xiao-Yong Li2, Peter J. Sabo3, Sean Thomas3, John A. Stamatoyannopoulos3, Mark D. Biggin4*, Michael B. Eisen1,2,4*

          1 Department of Molecular and Cell Biology, California Institute of Quantitative Biosciences, University of California Berkeley, Berkeley, California, United States of America, 2 Howard Hughes Medical Institute, University of California Berkeley, Berkeley, California, United States of America, 3 Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America, 4 Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America

          Abstract

          Transcription factors that drive complex patterns of gene expression during animal development bind to thousands of genomic regions, with quantitative differences in binding across bound regions mediating their activity. While we now have tools to characterize the DNA affinities of these proteins and to precisely measure their genome-wide distribution in vivo, our understanding of the forces that determine where, when, and to what extent they bind remains primitive. Here we use a thermodynamic model of transcription factor binding to evaluate the contribution of different biophysical forces to the binding of five regulators of early embryonic anterior-posterior patterning in Drosophila melanogaster. Predictions based on DNA sequence and in vitro protein-DNA affinities alone achieve a correlation of ~0.4 with experimental measurements of in vivo binding. Incorporating cooperativity and competition among the five factors, and accounting for spatial patterning by modeling binding in every nucleus independently, had little effect on prediction accuracy. A major source of error was the prediction of binding events that do not occur in vivo, which we hypothesized reflected reduced accessibility of chromatin. To test this, we incorporated experimental measurements of genome-wide DNA accessibility into our model, effectively restricting predicted binding to regions of open chromatin. This dramatically improved our predictions to a correlation of 0.6–0.9 for various factors across known target genes. Finally, we used our model to quantify the roles of DNA sequence, accessibility, and binding competition and cooperativity. Our results show that, in regions of open chromatin, binding can be predicted almost exclusively by the sequence specificity of individual factors, with a minimal role for protein interactions. 
We suggest that a combination of experimentally determined chromatin accessibility data and simple computational models of transcription factor binding may be used to predict the binding landscape of any animal transcription factor with significant precision.
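
As a rough illustration of the idea in this abstract, the sketch below computes the equilibrium occupancy of a single binding site by a single factor and then scales it by chromatin accessibility. It is a deliberately simplified, hypothetical version: the paper's model covers five factors with cooperativity and competition, whereas here the log-affinity `score`, the `concentration` parameter and the multiplicative use of `accessibility` are all assumptions made for the example.

```python
import math


def occupancy(score, concentration=1.0, accessibility=1.0):
    """Probability that one site is bound, for one factor at equilibrium.

    score:         log-affinity of the site (e.g. a PWM log-odds score, in units of kT)
    concentration: factor concentration in arbitrary units (assumed parameter)
    accessibility: fraction between 0 and 1 describing how open the chromatin is
    """
    weight = concentration * math.exp(score)   # statistical weight of the bound state
    return accessibility * weight / (1.0 + weight)


# Hypothetical sites: (name, sequence score, accessibility)
sites = [("strong_open", 3.0, 1.0), ("strong_closed", 3.0, 0.0), ("weak_open", -1.0, 1.0)]
for name, score, access in sites:
    print(name, round(occupancy(score, accessibility=access), 3))
```

In this toy version a high-affinity site in closed chromatin is predicted to be unbound, which mirrors the abstract's observation that many sequence-based predictions fail where chromatin is inaccessible.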


          PLoS Computational Biology: a peer-reviewed open-access journal published by the Public Library of Science


          A Mathematical Framework for Protein Structure Comparison

          Wei Liu, Anuj Srivastava*, Jinfeng Zhang*

          Department of Statistics, Florida State University, Tallahassee, Florida, United States of America

          Abstract

          Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem, rigorous mathematical or statistical frameworks have seldom been pursued for general protein structure comparison. One notable issue in this field is that with many different distances used to measure the similarity between protein structures, none of them are proper distances when protein structures of different sequences are compared. Statistical approaches based on those non-proper distances or similarity scores as random variables are thus not mathematically rigorous. In this work, we develop a mathematical framework for protein structure comparison by treating protein structures as three-dimensional curves. Using an elastic Riemannian metric on spaces of curves, geodesic distance, a proper distance on spaces of curves, can be computed for any two protein structures. In this framework, protein structures can be treated as random variables on the shape manifold, and means and covariance can be computed for populations of protein structures. Furthermore, these moments can be used to build Gaussian-type probability distributions of protein structures for use in hypothesis testing. The covariance of a population of protein structures can reveal the population-specific variations and be helpful in improving structure classification. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions. We show that our method performs comparably with commonly used methods in protein structure classification on a large manually annotated data set.

          Accurate Quantification of Functional Analogy among Close Homologs

          Maria D. Chikina1, Olga G. Troyanskaya2,3*

          1 Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America, 2 Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America, 3 Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America

          Abstract

          Correctly evaluating functional similarities among homologous proteins is necessary for accurate transfer of experimental knowledge from one organism to another, and is of particular importance for the development of animal models of human disease. While the fact that sequence similarity implies functional similarity is a fundamental paradigm of molecular biology, sequence comparison does not directly assess the extent to which two proteins participate in the same biological processes, and has limited utility for analyzing families with several paralogous members. Nevertheless, we show that it is possible to provide a cross-organism functional similarity measure in an unbiased way through the exclusive use of high-throughput gene-expression data. Our methodology is based on probabilistic cross-species mapping of functionally analogous proteins based on Bayesian integrative analysis of gene expression compendia. We demonstrate that even among closely related genes, our method is able to predict functionally analogous homolog pairs better than relying on sequence comparison alone. We also demonstrate that the landscape of functional similarity is often complex and that definitive “functional orthologs” do not always exist. Even in these cases, our method and the online interface we provide are designed to allow detailed exploration of sources of inferred functional similarity that can be evaluated by the user.

          New Articles in PLoS Computational Biology

          Gene Expression Noise in Spatial Patterning: hunchback Promoter Structure Affects Noise Amplitude and Distribution in Drosophila Segmentation

          David M. Holloway1,2*, Francisco J. P. Lopes3, Luciano da Fontoura Costa4, Bruno A. N. Travençolo4,5, Nina Golyandina6, Konstantin Usevich6, Alexander V. Spirov7

          1 Mathematics Department, British Columbia Institute of Technology, Burnaby, British Columbia, Canada, 2 Biology Department, University of Victoria, Victoria, British Columbia, Canada, 3 Instituto de Biofisica, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil, 4 Instituto de Fisica de Sao Carlos, Universidade de Sao Paulo, Sao Carlos, Sao Paulo, Brazil, 5 Faculty of Computing, Federal University of Uberlândia, Uberlândia, Brazil, 6 Mathematics and Mechanics Faculty, St. Petersburg State University, St. Petersburg, Russia, 7 Computer Science and Center of Excellence in Wireless and Information Technology, Stony Brook University, Stony Brook, New York, United States of America

          Abstract

          Positional information in developing embryos is specified by spatial gradients of transcriptional regulators. One of the classic systems for studying this is the activation of the hunchback (hb) gene in early fruit fly (Drosophila) segmentation by the maternally-derived gradient of the Bicoid (Bcd) protein. Gene regulation is subject to intrinsic noise which can produce variable expression. This variability must be constrained in the highly reproducible and coordinated events of development. We identify means by which noise is controlled during gene expression by characterizing the dependence of hb mRNA and protein output noise on hb promoter structure and transcriptional dynamics. We use a stochastic model of the hb promoter in which the number and strength of Bcd and Hb (self-regulatory) binding sites can be varied. Model parameters are fit to data from WT embryos, the self-regulation mutant hb14F, and lacZ reporter constructs using different portions of the hb promoter. We have corroborated model noise predictions experimentally. The results indicate that WT (self-regulatory) Hb output noise is predominantly dependent on the transcription and translation dynamics of its own expression, rather than on Bcd fluctuations. The constructs and mutant, which lack self-regulation, indicate that the multiple Bcd binding sites in the hb promoter (and their strengths) also play a role in buffering noise. The model is robust to the variation in Bcd binding site number across a number of fly species. This study identifies particular ways in which promoter structure and regulatory dynamics reduce hb output noise. Insofar as many of these are common features of genes (e.g. multiple regulatory sites, cooperativity, self-feedback), the current results contribute to the general understanding of the reproducibility and determinacy of spatial patterning in early development.


          Structural Properties of the Caenorhabditis elegans Neuronal Network

          Lav R. Varshney1, Beth L. Chen2, Eric Paniagua3, David H. Hall4, Dmitri B. Chklovskii5*

          1 Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, 2 Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America, 3 California Institute of Technology, Pasadena, California, United States of America, 4 Albert Einstein College of Medicine, Bronx, New York, United States of America, 5 Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia, United States of America

          Abstract

          Despite recent interest in reconstructing neuronal networks, complete wiring diagrams on the level of individual synapses remain scarce and the insights into function they can provide remain unclear. Even for Caenorhabditis elegans, whose neuronal network is relatively small and stereotypical from animal to animal, published wiring diagrams are neither accurate nor complete and self-consistent. Using materials from White et al. and new electron micrographs we assemble whole, self-consistent gap junction and chemical synapse networks of hermaphrodite C. elegans. We propose a method to visualize the wiring diagram, which reflects network signal flow. We calculate statistical and topological properties of the network, such as degree distributions, synaptic multiplicities, and small-world properties, that help in understanding network signal propagation. We identify neurons that may play central roles in information processing, and network motifs that could serve as functional modules of the network. We explore propagation of neuronal activity in response to sensory or artificial stimulation using linear systems theory and find several activity patterns that could serve as substrates of previously described behaviors. Finally, we analyze the interaction between the gap junction and the chemical synapse networks. Since several statistical properties of the C. elegans network, such as multiplicity and motif distributions are similar to those found in mammalian neocortex, they likely point to general principles of neuronal networks. The wiring diagram reported here can help in understanding the mechanistic basis of behavior by generating predictions about future experiments involving genetic perturbations, laser ablations, or monitoring propagation of neuronal activity in response to stimulation.
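
Two of the statistics named in this abstract, degree distributions and synaptic multiplicities, are simple to compute once a wiring diagram is available as an edge list. The sketch below assumes a directed chemical-synapse network stored as `(pre, post, count)` tuples. The neuron labels and counts are invented placeholders, not values from the reported wiring diagram.

```python
from collections import Counter


def degree_stats(synapses):
    """Summarise a directed synapse list of (pre, post, count) tuples.

    Returns per-neuron out-degree and in-degree (number of distinct partners),
    the out-degree distribution, and the distribution of synaptic multiplicities.
    """
    out_degree, in_degree, multiplicity_hist = Counter(), Counter(), Counter()
    for pre, post, count in synapses:
        out_degree[pre] += 1            # one more postsynaptic partner for `pre`
        in_degree[post] += 1            # one more presynaptic partner for `post`
        multiplicity_hist[count] += 1   # how many connections have `count` synapses
    out_hist = Counter(out_degree.values())
    return out_degree, in_degree, out_hist, multiplicity_hist


# Hypothetical toy network with placeholder neuron names.
synapses = [("n1", "n2", 3), ("n1", "n3", 1), ("n2", "n3", 1), ("n3", "n1", 2)]
out_degree, in_degree, out_hist, multiplicity_hist = degree_stats(synapses)
print(dict(out_degree), dict(in_degree), dict(out_hist), dict(multiplicity_hist))
```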

          Stochastic Theory of Early Viral Infection: Continuous versus Burst Production of Virions

          John E. Pearson1, Paul Krapivsky2, Alan S. Perelson1*

          1 Theoretical Biology & Biophysics, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America, 2 Department of Physics, Boston University, Boston, Massachusetts, United States of America

          Abstract

          Viral production from infected cells can occur continuously or in a burst that generally kills the cell. For HIV infection, both modes of production have been suggested. Standard viral dynamic models formulated as sets of ordinary differential equations cannot distinguish between these two modes of viral production, as the predicted dynamics is identical as long as infected cells produce the same total number of virions over their lifespan. Here we show that in stochastic models of viral infection the two modes of viral production yield different early-term dynamics. Further, we analytically determine the probability that infections initiated with any number of virions and infected cells reach extinction, the state when both the population of virions and infected cells vanish, and show this too has different solutions for continuous and burst production. We also compute the distributions of times to establish infection as well as the distribution of times to extinction starting from either a single virion or a single infected cell for both modes of virion production.
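
The claim that the two production modes give different extinction probabilities can be illustrated with a simplified branching-process sketch. This is not the paper's derivation. It assumes that each virion independently infects a new cell with probability `p_infect` and is otherwise cleared, that a bursting cell releases exactly `n_virions` at death, and that a continuously producing cell with an exponentially distributed lifetime releases a geometrically distributed number of virions with the same mean. The extinction probability starting from one infected cell is then the smallest fixed point of the offspring generating function.

```python
def extinction_burst(n_virions, p_infect, tol=1e-12, max_iter=100000):
    """Smallest solution of q = (1 - p + p*q)**N, found by iterating from q = 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 - p_infect + p_infect * q) ** n_virions
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def extinction_continuous(mean_virions, p_infect):
    """Smallest solution of q = 1 / (1 + N*p*(1 - q)), which is min(1, 1/(N*p))."""
    r0 = mean_virions * p_infect   # mean number of newly infected cells per infected cell
    return 1.0 if r0 <= 1.0 else 1.0 / r0


# Same mean output (10 virions per cell) and same infection probability in both modes.
q_burst = extinction_burst(10, 0.2)
q_cont = extinction_continuous(10, 0.2)
print(q_burst, q_cont)
```

Under these assumptions both modes share the same mean number of secondary infections, so a deterministic model would treat them identically, yet continuous production goes extinct more often because the number of virions released per cell is more variable.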




          Global Funding Outlook 2011       

          What country has the sunniest forecast?





          Science, 4 February 2011 (Volume 331, Issue 6017) 
          http://www.sciencemag.org/content/vol331/issue6017/index.dtl?etoc


          Science 4 February 2011: 
          Vol. 331 no. 6017 p. 511 
          DOI: 10.1126/science.1203356
          • EDITORIAL

          Lessons from Genomics

          Bruce Alberts
          Bruce Alberts is Editor-in-Chief of Science.

          In February 2001, Nature and Science provided the first detailed look at the human genome: a string of some 3 billion nucleotides whose unique sequence forms the genetic blueprint for each individual. This momentous occasion made headlines around the world. Now that a decade has elapsed, where has this achievement led us and where are we going with other such ambitious endeavors? Throughout this month in Science, the News and Commentary sections will present viewpoints and analyses of the effects of the genomics revolution on science and society (see http://scim.ag/genome10). Many lessons can be derived from the Human Genome Project that should be helpful in guiding other large science projects through their inevitable challenges.*

          The editors suggest the following Related Resources on Science sites

          In Science Magazine

          • PERSPECTIVE
          CELL BIOLOGY

          A Translational Pause to Localize

          David Ron1 and Koreaki Ito2

          1 Institute of Metabolic Sciences, University of Cambridge, Cambridge, CB2 0QQ, UK.
          2 Faculty of Life Sciences, Kyoto Sangyo University, Kyoto 603-8555, Japan.
          E-mail: dr360@medschl.cam.ac.uk, kito@cc.kyoto-su.ac.jp

          The unconventional splicing of a messenger RNA (mRNA) is key to a mechanism that controls the cellular response to unfolded proteins that accumulate in the endoplasmic reticulum (ER). Mammalian cells attempt to counterbalance this state of stress by expressing specific genes through the transcription factor XBP1 (1). The synthesis of this transcription factor requires splicing to generate its encoding mRNA, a process that occurs at the cytoplasmic face of the ER membrane. On page 586 of this issue (2), Yanagitani et al. reveal how translational pausing of the mRNA to be spliced contributes to this localization. The finding reveals surprising similarities in mechanisms regulating translation in eukaryotes and prokaryotes.

          The editors suggest the following Related Resources on Science sites

          In Science Magazine

          • Translational Pausing Ensures Membrane Targeting and Cytoplasmic Splicing of XBP1u mRNA
            • Kota Yanagitani
            • Yukio Kimata
            • Hiroshi Kadokura
            • and Kenji Kohno
            Science 4 February 2011: 586-589. Published online 13 January 2011

            Science 4 February 2011: 
            Vol. 331 no. 6017 pp. 586-589 
            DOI: 10.1126/science.1197142
            • REPORT

            Translational Pausing Ensures Membrane Targeting and Cytoplasmic Splicing of XBP1u mRNA

            Kota Yanagitani, Yukio Kimata, Hiroshi Kadokura, and Kenji Kohno*

            Laboratory of Molecular and Cell Genetics, Graduate School of Biological Sciences, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan.
            *To whom correspondence should be addressed. E-mail: kkouno@bs.naist.jp

            ABSTRACT

            Upon endoplasmic reticulum (ER) stress, an endoribonuclease, inositol-requiring enzyme-1α, splices the precursor unspliced form of X-box–binding protein 1 messenger RNA (XBP1u mRNA) on the ER membrane to yield an active transcription factor (XBP1s), leading to the alleviation of the stress. The nascent peptide encoded by XBP1u mRNA drags the mRNA–ribosome–nascent chain (R-RNC) complex to the membrane for efficient cytoplasmic splicing. We found that translation of the XBP1u mRNA was briefly paused to stabilize the R-RNC complex. Mutational analysis of XBP1u revealed an evolutionarily conserved peptide module at the carboxyl terminus that was responsible for the translational pausing and was required for the efficient targeting and splicing of the XBP1u mRNA. Thus, translational pausing may be used for unexpectedly diverse cellular processes in mammalian cells.

           
           BMC Bioinformatics   

          Article alert


          The latest articles from BMC Bioinformatics, published between 20-Jan-2011 and 02-Feb-2011


          Research article

          Investigating the effect of paralogs on microarray gene-set analysis

          Andre J Faure (1,2), Cathal Seoighe (1,3) and Nicola J Mulder (1)

          Computational Biology Group, Department of Clinical Laboratory Sciences, University of Cape Town, Cape Town, South Africa

          EMBL-European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK

          School of Mathematics, Statistics and Applied Mathematics, National University of Ireland Galway, Ireland


          BMC Bioinformatics 2011, 12:29 doi:10.1186/1471-2105-12-29

          Published: 24 January 2011

          Abstract

          Background

          In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research.

          Results

          We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene (http://www.cbio.uct.ac.za/indygene), a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs.

          Conclusions

          The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.
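
          The reduction step described in the Results above (shrinking a gene list until no pair of genes is related above a similarity threshold) can be illustrated with a minimal sketch. This is not Indygene's actual algorithm, which the abstract does not describe; it assumes a simple greedy strategy that repeatedly drops the gene with the most remaining paralog links. The function name `reduce_paralogs`, the gene identifiers and the similarity scores are all hypothetical.

```python
from collections import defaultdict


def reduce_paralogs(genes, similarity, threshold):
    """Return a subset of `genes` with no pair scoring >= threshold.

    genes: list of gene identifiers (hypothetical names in the example).
    similarity: dict mapping frozenset({gene_a, gene_b}) -> similarity score.
    threshold: pairs scoring at or above this value are treated as paralogs.

    Assumption: a greedy heuristic, not the method used by Indygene.
    """
    gene_set = set(genes)
    neighbours = defaultdict(set)

    # Build an undirected graph whose edges are above-threshold pairs.
    for pair, score in similarity.items():
        if len(pair) != 2 or score < threshold:
            continue
        a, b = tuple(pair)
        if a in gene_set and b in gene_set:
            neighbours[a].add(b)
            neighbours[b].add(a)

    removed = set()
    while True:
        # Pick the remaining gene with the most remaining paralog links.
        worst = max(
            (g for g in genes if g not in removed),
            key=lambda g: len(neighbours[g] - removed),
            default=None,
        )
        # Stop once no remaining gene has a paralog left in the list.
        if worst is None or not (neighbours[worst] - removed):
            break
        removed.add(worst)

    # Preserve the input order of the surviving genes.
    return [g for g in genes if g not in removed]


# Hypothetical input: A1 is similar to both A2 and A3; B1 is unrelated.
example_genes = ["A1", "A2", "A3", "B1"]
example_similarity = {
    frozenset({"A1", "A2"}): 90.0,
    frozenset({"A1", "A3"}): 85.0,
    frozenset({"A2", "A3"}): 40.0,
    frozenset({"A1", "B1"}): 10.0,
}
reduced = reduce_paralogs(example_genes, example_similarity, threshold=80.0)
```

          With a threshold of 80, the sketch removes only A1, because that single removal breaks both above-threshold pairs. A reduced list of this kind would then be passed to the GSA method in place of the original gene list.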


          nocoRNAc: Characterization of non-coding RNAs in prokaryotes

          Alexander Herbig and Kay Nieselt

          Center for Bioinformatics Tübingen, University of Tübingen, Sand 14, 72076 Tübingen, Germany


          BMC Bioinformatics 2011, 12:40 doi:10.1186/1471-2105-12-40

          Published: 31 January 2011