In vertebrates, protein-coding genes have a GC-peak near their transcription start site. This pattern impacts how transcribed mRNAs are exported from the nucleus and translated into proteins. Although it is assumed that patterns of GC-content are shaped solely by adaptive evolution, we demonstrate in this paper that non-adaptive processes such as historical GC-biased gene conversion play a major role. In fact, we show that the GC-peak is not preserved by selection and is currently decreasing toward the mutational equilibrium.
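As an illustration of what a GC-peak means in practice, GC-content can be computed in consecutive windows along a sequence that starts at the transcription start site. This is only a minimal sketch: the function names and the window size are hypothetical choices and are not taken from the paper.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Windows without any unambiguous base have no defined GC-content.
    return gc / total if total else float("nan")


def gc_profile(seq: str, window: int = 100) -> List[float]:
    """GC-content in consecutive, non-overlapping windows along seq.

    Assumption: seq starts at the transcription start site, so the first
    values describe the region where the GC-peak is expected.
    """
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]
```

A profile whose first windows are higher than the following ones would correspond to a GC-peak near the start site.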
In population genetics, a key objective is to elucidate the demographic and molecular mechanisms behind genetic diversity across genomes. Recent advances have highlighted the value of forward-in-time simulations, particularly those facilitated by the SLiM framework, for their capacity to simulate intricate evolutionary processes. Canonical models define meiotic recombination as a variable rate per site per generation. Yet the passage from one generation to the next is not limited to meiosis, given that mitotic cell divisions are prevalent in most species. This paper studies how the frequency of meiosis versus mitosis, recombination rates, and selection coefficients interact to shape genetic diversity. Our findings reveal that selective sweeps reduce diversity across the entire genome when meiotic frequency is low. We further show that the recombination rate per meiosis, rather than meiotic frequency, defines the breadth of the genetic linkage valley surrounding a selective sweep. Additionally, we explore how the dominance coefficient influences the loss of genomic diversity. By examining selective sweeps and their genetic consequences, we highlight the importance of realistically modeling the most essential parameters, such as recombination rate.
One major goal of molecular evolutionary biology is to identify genomic regions under selection and/or adaptation. In this perspective, we aim to refine our definitions of selection and genetic drift, as well as of additional mechanisms that constrain the evolution of the genome in diverse contexts. We highlight that all of these processes need to be taken into account to correctly identify the targets of selection.
Our study explores the processes driving genomic diversity in regions of low recombination using a combination of simulations, theory and analyses of human data. We show that selection against several partially recessive variants can, through its effect on linked neutral diversity (associative overdominance), increase diversity very strongly, in some cases up to 3-fold relative to the neutral expectation. We also characterize the conditions (selection, dominance and recombination parameters) under which associative overdominance is strong. The increase in diversity is driven by the maintenance of complementary haplotypes, such that the effects of recessive variants are masked in the heterozygous state, which can be considered a form of balancing selection. Finally, we perform a genome scan on 1000G human populations and identify several genomic regions possibly subject to associative overdominance.
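The masking effect behind complementary haplotypes can be illustrated with a small calculation. The sketch below assumes a multiplicative fitness model and two haplotypes that each carry k deleterious variants at different loci. The model, the function name and the parameter values are illustrative assumptions, not the settings used in the study.

```python
def genotype_fitness(n_hom: int, n_het: int, s: float, h: float) -> float:
    """Multiplicative fitness of a genotype.

    n_hom: loci homozygous for the deleterious variant (cost s each)
    n_het: loci heterozygous for the deleterious variant (cost h * s each)
    """
    return (1 - s) ** n_hom * (1 - h * s) ** n_het


# Hypothetical parameters: 10 variants per haplotype, weak selection,
# partially recessive effects (h < 0.5).
k, s, h = 10, 0.01, 0.1

# Two copies of the same haplotype: all k variants are homozygous.
w_homozygote = genotype_fitness(k, 0, s, h)

# Two complementary haplotypes: all 2k variants are heterozygous and masked.
w_heterozygote = genotype_fitness(0, 2 * k, s, h)

print(w_homozygote, w_heterozygote)
```

Under these assumptions the individual carrying two complementary haplotypes is fitter than either homozygote, which mimics heterozygote advantage at the level of the whole region.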
In this study, we examine the genomic diversity of human populations and show that purifying selection at linked sites (i.e. background selection) and GC-biased gene conversion (gBGC) affect as much as 95% of the variants of our genome. The magnitude and relative importance of these processes are largely determined by variation in recombination rate and base composition. By conditioning on genomic regions with recombination rates above 1.5 cM/Mb and mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by background selection or gBGC, and that avoids these biases in the reconstruction of human history.
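The conditioning described above can be sketched as a simple filter. The `SNP` record, its field names and the function name are hypothetical and only serve the example. Only the threshold (above 1.5 cM/Mb) and the two mutation types (C↔G, A↔T) come from the summary.

```python
from dataclasses import dataclass


@dataclass
class SNP:
    """Hypothetical minimal SNP record used only for this illustration."""
    chrom: str
    pos: int
    ref: str
    alt: str
    rec_rate_cm_per_mb: float  # local recombination rate, in cM/Mb


# Allele pairs that do not change GC-content, hence unaffected by gBGC.
GBGC_NEUTRAL_PAIRS = {frozenset("CG"), frozenset("AT")}


def is_reference_snp(snp: SNP, min_rate: float = 1.5) -> bool:
    """Keep C<->G or A<->T SNPs in regions recombining above min_rate."""
    alleles = frozenset((snp.ref.upper(), snp.alt.upper()))
    return snp.rec_rate_cm_per_mb > min_rate and alleles in GBGC_NEUTRAL_PAIRS


snps = [
    SNP("1", 1000, "C", "G", 2.3),  # kept: C<->G, high recombination
    SNP("1", 2000, "A", "G", 2.3),  # dropped: A<->G is subject to gBGC
    SNP("1", 3000, "A", "T", 0.4),  # dropped: low recombination
]
kept = [snp for snp in snps if is_reference_snp(snp)]
```

The high-recombination condition limits the effect of background selection, while the restriction to C↔G and A↔T changes removes the variants on which gBGC can act.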
This paper has been highlighted in eLife by an Insight article from Kelley Harris, "Neutral evolution: the randomness that shapes our DNA", and by an eLife digest. It also benefits from a press release by the Swiss Institute of Bioinformatics and by the University of Bern. You can read the outreach summary in English, French or German.
For more information on the importance of our paper in the debate over neutral evolution within the evolutionary genetics community, you can read this article, which cites our paper:
<< With the accumulating evidence for adaptation in the human genome, it seems likely that some large fraction of the genome would be subject to the effects of linked selection, he suggested. “We just don’t know how large that fraction is.” [says Andrew Kern] A recent paper in eLife by Fanny Pouyet and her computational-geneticist colleagues at the University of Bern and the Swiss Institute of Bioinformatics pins down that number. [...] [T]hey concluded that less than 5 percent of the human genome evolved by chance alone. As the editors of eLife noted in their summary of the paper, “This suggests that while most of our genetic material is formed of non-functional sequences, the vast majority of it evolves indirectly under some type of selection.” >>
This study uses gene regulatory network models to examine the functional consequences of yeast GAL3 sequence variants. We link the genetic variation that exists within a population to changes in the parameter values of the regulatory GAL network. We combine this numerical approach with experimental analyses of the yeast GAL network and show that GAL3 natural variation is sufficient to convert a gradual response into a binary switch. Finally, dynamic network modeling allows us to map alleles to specific locations of the parameter space and to functionally infer the consequences of DNA polymorphisms in the population. This framework can be applied more generally to the mechanistic interpretation of genetic variants.
Here, we study the variation in synonymous codon usage among genes involved in different functional categories in humans. We show that synonymous codon usage is not driven by constraints on tRNA abundance, but by large-scale variation in GC-content, caused by meiotic recombination via the non-adaptive process of GC-biased gene conversion (gBGC). First, we observe that expression in meiotic cells varies among functional categories. Then, we demonstrate that meiotic expression is associated with a decrease in recombination within genes and, as a consequence, with a reduced level of gBGC. Overall, differences in gBGC strength explain 70% of the variance in synonymous codon usage among genes. We argue that the strong heterogeneity of synonymous codon usage induced by gBGC in mammalian genomes precludes any optimization of the tRNA pool to the demand in codon usage.
We present a codon substitution model named SENCA (site evolution of nucleotides, codons, and amino acids) that disentangles three levels of gene evolution. SENCA separately describes 1) the nucleotide processes that apply to all sites of a sequence, such as the mutational bias, 2) the preferences between synonymous codons, and 3) the preferences among amino acids. We study the core genome of 21 prokaryotes intraspecifically and of five Enterobacteria interspecifically. We retrieve a universal mutational bias toward AT. We also argue that most synonymous substitutions are not neutral and must be taken into account to estimate the selection parameter on nonsynonymous substitutions. We propose new summary statistics to measure the relative importance of these three levels.
Bio++ is a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics. Bio++ is designed to be both easy to use and computationally efficient by providing researchers with a set of reusable tools. This paper presents the second major release of the libraries, which notably provides built-in access to sequence databases and new data structures for handling and manipulating sequences from the omics era. Complex models of sequence evolution, such as mixture models and generic n-tuple alphabets, are also included. You can find the description of these tools here.