Discussion

4.0 An underwhelming result

The lack of change in our production rates over time, together with the convergence of codon usage towards an even distribution, suggests that our population is evolving by genetic drift rather than under selection.
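As a rough illustration of why an even codon distribution is what drift alone would predict, the sketch below assumes that each codon position flips between fast and slow with the same per-generation probability and that no selection acts. This symmetric mutation scheme, the function name, and the parameter `mu` are hypothetical and are not taken from our simulation code.

```python
def expected_fast_fraction(p0, mu, generations):
    """Expected fraction of fast codons under neutral, symmetric mutation.

    p0: starting fraction of fast codons (assumed value)
    mu: per-generation probability that a codon flips speed (assumed value)
    """
    p = p0
    for _ in range(generations):
        # Fast codons that stay fast, plus slow codons that mutate to fast.
        p = p * (1 - mu) + (1 - p) * mu
    return p


if __name__ == "__main__":
    for gens in (0, 50, 500, 5000):
        print(gens, expected_fast_fraction(0.9, 0.01, gens))
```

Under these assumptions the deviation from 0.5 shrinks by a factor of (1 - 2*mu) each generation, so any starting composition relaxes towards an even split when selection has no effect.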

4.1 Mechanistic Reasoning

The codon evolution model oversimplifies the process of transcription and translation in a cell. For example, the simulated cellular environment contained only two transcripts, A and B, and two possible codon speeds, fast and slow. By omitting key components that might add selective pressure to our system, such as degradation of the target transcript, we lose the impact of that perturbation on the environment.

Beyond oversimplifying the translation environment, the model also neglects transcript degradation over time. Ribosomal collisions trigger transcript decay and ribosome rescue mechanisms in the cell (Simms et al., 2017), and transcripts whose codon sequence promotes such collisions would be quickly degraded, impeding expression of their encoded protein product. In such an environment, a transcript that encodes a spacing ramp would not only benefit from producing proteins at a faster rate, but would also dominate once the competing transcript is knocked out of the pool.

Figure 11. Ribosome collision and transcript degradation, from Simms et al., 2017.

From Quax et al., 2015
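The collision-triggered decay described above could be added to the simulation along the following lines. This is a minimal sketch under strong assumptions, none of which come from our current model: each ribosome occupies a single codon, dwell times are deterministic, a blocked initiation does not count as a collision, and the first collision degrades the transcript. The function name and parameters are hypothetical.

```python
def simulate_transcript(dwell, interval, n_ribosomes):
    """Return (proteins_made, first_collision_time) for one transcript.

    dwell: time a ribosome spends on each codon (slow codons = larger values)
    interval: time between initiation attempts
    n_ribosomes: number of ribosomes that attempt to translate
    """
    length = len(dwell)
    prev = None              # entry times of the ribosome directly ahead
    finish_times = []
    first_collision = None

    for k in range(n_ribosomes):
        enter = [0.0] * (length + 1)   # enter[length] is the termination time
        start = k * interval
        if prev is not None:
            # Initiation waits until the ribosome ahead has left codon 0.
            start = max(start, prev[1])
        enter[0] = start
        for j in range(length):
            ready = enter[j] + dwell[j]
            nxt = j + 1
            if prev is not None and nxt < length and ready < prev[nxt + 1]:
                # The next codon is still occupied: record a collision and wait.
                if first_collision is None or ready < first_collision:
                    first_collision = ready
                ready = prev[nxt + 1]
            enter[nxt] = ready
        finish_times.append(enter[length])
        prev = enter

    if first_collision is None:
        return len(finish_times), None
    # Assumed decay rule: only proteins finished before the first collision count.
    proteins = sum(1 for t in finish_times if t <= first_collision)
    return proteins, first_collision


if __name__ == "__main__":
    ramp = [5, 1, 1, 5, 1]       # slow codon at the 5' end spaces ribosomes
    no_ramp = [1, 1, 1, 5, 1]    # same internal slow codon, no ramp
    print(simulate_transcript(ramp, interval=2, n_ribosomes=3))
    print(simulate_transcript(no_ramp, interval=2, n_ribosomes=3))
```

In this toy setup the ramped transcript completes all three proteins without a collision, while the transcript without a ramp collides at the internal slow codon before any protein is finished. A fitness function built on such a rule would penalize collision-prone transcripts far more strongly than production rate alone.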

4.2 Biological Reasoning

Codon ramps are typically found on transcripts of highly expressed genes (Miller et al., 2019), so by putting pressure on each transcript to produce as many proteins as possible within the simulation time, we expected to see an increase in production rate over the course of thousands of generations. Instead, mutations appeared to fix in the population at random, consistent with genetic drift. If our simulation is not at fault, what could explain this?

Tuller and Zur (2014) suggest that "the unusual codon frequency distribution at the 5′ end of the ORF is not due to direct selection related to expression regulation, but due to weaker indirect selection related to codon bias in this region as compared to the rest of the ORF". In addition, a recent paper from Miller and colleagues (2018) found that, in ramp screens of thousands of organisms, ramp sequences were present in only 10% of transcripts within a transcriptome. If selection on ramps is weak or indirect, a drift-like outcome in our simulation would not be unexpected.

4.3 Future Goals

The next step in modifying this simulation would be to impose stronger selection pressure by taking into account transcript degradation due to ribosome collisions. In addition, tracking mutation accumulation over generations will help us determine which changes promote ramp traits. To do this, I would like to use the Baum-Welch algorithm (Figure 12).

Figure 12. Overview of the Baum-Welch (forward-backward) algorithm for analyzing transcript-level hidden states. A) The hidden states of this system are not known, and we do not have a training dataset. B) To detect hidden patterns in the changes of fast or slow codon usage, you can initialize the system with random or arbitrary state probabilities. Using these initial estimates, you can then run the forward-backward pass on your set of transcripts and measure how far the resulting transition probabilities are from your initial, predicted ones. C) By iterating until the measured and predicted probabilities agree, you converge on a local maximum of the likelihood. This approach is limited in that you may not always converge on the globally best set of transition probabilities.

Courtesy of R Memes for Statistical Fiends
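The procedure in Figure 12 can be sketched in plain Python as follows. This is an illustrative implementation, not code from our simulation: it assumes a single transcript encoded as a list of integers (0 = fast codon, 1 = slow codon, a hypothetical encoding), two hidden states, and random initial probabilities. All function names are hypothetical, and a tiny floor is added when normalizing to avoid division by zero.

```python
import math
import random


def normalize(row, floor=1e-12):
    # The small floor keeps every probability strictly positive.
    row = [x + floor for x in row]
    total = sum(row)
    return [x / total for x in row]


def forward_backward(obs, start, trans, emit):
    """Scaled forward-backward pass; returns (alpha, beta, scales)."""
    n, T = len(start), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scales = [0.0] * T
    for t in range(T):
        for j in range(n):
            if t == 0:
                a = start[j]
            else:
                a = sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
            alpha[t][j] = a * emit[j][obs[t]]
        scales[t] = sum(alpha[t])
        alpha[t] = [a / scales[t] for a in alpha[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scales[t + 1]
    return alpha, beta, scales


def baum_welch_step(obs, start, trans, emit):
    """One re-estimation step; also returns the log-likelihood of the inputs."""
    n, T, m = len(start), len(obs), len(emit[0])
    alpha, beta, scales = forward_backward(obs, start, trans, emit)
    # gamma[t][i]: posterior probability of being in state i at position t.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_trans = [[0.0] * n for _ in range(n)]
    for t in range(T - 1):
        for i in range(n):
            for j in range(n):
                # Expected number of i -> j transitions at position t.
                new_trans[i][j] += (alpha[t][i] * trans[i][j]
                                    * emit[j][obs[t + 1]] * beta[t + 1][j]
                                    / scales[t + 1])
    new_emit = [[0.0] * m for _ in range(n)]
    for t in range(T):
        for i in range(n):
            new_emit[i][obs[t]] += gamma[t][i]
    loglik = sum(math.log(c) for c in scales)
    return (normalize(gamma[0]), [normalize(r) for r in new_trans],
            [normalize(r) for r in new_emit], loglik)


def baum_welch(obs, n_states=2, n_symbols=2, max_iter=200, tol=1e-6, seed=0):
    """Iterate from a random start until the log-likelihood stops improving."""
    rng = random.Random(seed)
    start = normalize([rng.random() + 0.1 for _ in range(n_states)])
    trans = [normalize([rng.random() + 0.1 for _ in range(n_states)])
             for _ in range(n_states)]
    emit = [normalize([rng.random() + 0.1 for _ in range(n_symbols)])
            for _ in range(n_states)]
    history = []   # log-likelihood of the parameters entering each step
    for _ in range(max_iter):
        start, trans, emit, loglik = baum_welch_step(obs, start, trans, emit)
        converged = bool(history) and loglik - history[-1] < tol
        history.append(loglik)
        if converged:
            break
    return start, trans, emit, history


if __name__ == "__main__":
    transcript = [1] * 8 + [0] * 30   # slow codons first, then fast codons
    start, trans, emit, history = baum_welch(transcript, seed=1)
    print("transition probabilities:", trans)
    print("log-likelihood per iteration:", history)
```

Because the result depends on the random starting point, one practical option is to rerun `baum_welch` with several `seed` values and keep the fit with the highest final log-likelihood, which reduces but does not remove the risk of stopping at a poor local maximum. Extending the sketch to a set of transcripts would require summing the expected counts across sequences before normalizing.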

4.4 In Summary

Before this model can be used to test the evolutionary changes that give rise to codon ramps, our selection process needs to be revised so that it has a stronger effect on the population.