Protein evolution

Protein-coding genes make up only a small fraction of the genome in higher organisms, but their protein products are crucial to the operation of the cell. They are the workers behind just about every task in the cell, including digesting food, synthesizing chemicals, providing structural support, converting energy, reproducing the cell and making new proteins. And like a finely tuned machine, proteins do their work very well. Proteins are ubiquitous in all of life and must date back to the very early stages of evolution. So evolution predicts that proteins evolved when life first appeared, or not long after. But despite enormous research efforts, the science clearly shows that such protein evolution is astronomically unlikely.

One reason the evolution of proteins is so difficult is that most proteins are extremely specific designs in an otherwise rugged fitness landscape. This means it is difficult for natural selection to guide mutations toward the needed proteins. In fact, four different studies, done by different groups and using different methods, all report that roughly 10^70 evolutionary experiments would be needed to get close enough to a workable protein before natural selection could take over to refine the protein design. For instance, one study concluded that 10^63 attempts would be required for a relatively short protein. (Reidhaar-Olson) And a similar result (10^65 attempts required) was obtained by comparing protein sequences. (Yockey) Another study found that from 10^64 to 10^77 attempts are required (Axe) and another study concluded that 10^70 attempts would be required. (Hayashi) In that case the protein was only a part of a larger protein which otherwise was intact, thus making for an easier search. Furthermore these estimates are optimistic because the experiments searched only for single-function proteins whereas real proteins perform many functions.

This conservative estimate of 10^70 attempts required to evolve a simple protein is astronomically larger than the number of attempts that are feasible. And explanations of how evolution could achieve a large number of searches, or somehow obviate this requirement, require the preexistence of proteins and so are circular. For example, one paper estimated that evolution could have made 10^43 such attempts. But the study assumed the entire history of the Earth is available, rather than the limited time window that evolution actually would have had. Even more importantly, it assumed the preexistence of a large population of bacteria (it assumed the earth was completely covered with bacteria). And of course, bacteria are full of proteins. Clearly such bacteria would not exist before the first proteins evolved. (Dryden) Even with these helpful and unrealistic assumptions the result was twenty-seven orders of magnitude short of the requirement.
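
The twenty-seven orders of magnitude quoted above follow directly from the two figures in this paragraph. As a minimal sketch, assuming only those two numbers (10^70 attempts required, 10^43 attempts feasible), the arithmetic can be checked as follows; the function and constant names are illustrative and do not come from any of the cited studies.

```python
from fractions import Fraction

# Figures quoted in the text, written as base-10 exponents.
REQUIRED_EXPONENT = 70  # attempts said to be required: 10^70
FEASIBLE_EXPONENT = 43  # attempts estimated as feasible: 10^43 (Dryden)


def shortfall_orders(required_exp: int, feasible_exp: int) -> int:
    """Orders of magnitude by which the feasible attempts fall short."""
    return required_exp - feasible_exp


def fraction_covered(required_exp: int, feasible_exp: int) -> Fraction:
    """Exact fraction of the required attempts that the feasible number covers."""
    return Fraction(10 ** feasible_exp, 10 ** required_exp)


if __name__ == "__main__":
    gap = shortfall_orders(REQUIRED_EXPONENT, FEASIBLE_EXPONENT)
    covered = fraction_covered(REQUIRED_EXPONENT, FEASIBLE_EXPONENT)
    print(f"Shortfall: {gap} orders of magnitude")
    print(f"Fraction of required attempts covered: 1 in 10^{gap}")
    assert covered == Fraction(1, 10 ** gap)
```

Working with exponents and exact fractions, rather than floating-point values, keeps the comparison exact for numbers this large.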

Given these several significant problems, the chances of evolution finding proteins from a random start are, as one evolutionist explained, “highly unlikely.” (Tautz) Or as another evolutionist put it, “Although the origin of the first, primordial genes may ultimately be traced back to some precursors in the so-called ‘RNA world’ billions of years ago, their origins remain enigmatic.” (Kaessmann)


Axe, D. 2004. “Estimating the prevalence of protein sequences adopting functional enzyme folds.” J Molecular Biology 341:1295-1315.

Dryden, David, Andrew Thomson, John White. 2008. “How much of protein sequence space has been explored by life on Earth?” J. Royal Society Interface 5:953-956.

Hayashi, Y., T. Aita, H. Toyota, Y. Husimi, I. Urabe, T. Yomo. 2006. “Experimental Rugged Fitness Landscape in Protein Sequence Space.” PLoS ONE 1:e96.

Kaessmann, H. 2010. “Origins, evolution, and phenotypic impact of new genes.” Genome Research 20:1313-1326.

Reidhaar-Olson, J., R. Sauer. 1990. “Functionally acceptable substitutions in two alpha-helical regions of lambda repressor.” Proteins 7:306-316.

Tautz, Diethard, Tomislav Domazet-Lošo. 2011. “The evolutionary origin of orphan genes.” Nature Reviews Genetics 12:692-702.

Yockey, Hubert. 1977. “A calculation of the probability of spontaneous biogenesis by information theory.” J Theoretical Biology 67:377–398.