Enrico Cappellini

ecappellini@snm.ku.dk

Enamel proteome sequences from Dmanisi (Georgia) enable molecular phylogeny of fauna remains beyond the limits of ancient DNA preservation

Enrico Cappellini§, Victor J. Moreno Mayar£, Luca Pandolfi#, Frido Welker§, Maia Bukhsianidze±, Jesper V. Olsenʭ, David Lordkipanidze≠, Eske Willerslev¤

§ Natural History Museum of Denmark, University of Copenhagen, 1350 Copenhagen, Denmark; £ Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, 1350 Copenhagen, Denmark; # Dipartimento di Scienze, sezione di Geologia, Università degli Studi “Roma Tre”, Roma, Italy; ± Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany; ʭ Georgian National Museum, 0105 Tbilisi, Georgia; ≠ Novo Nordisk Foundation Center for Protein Research, Faculty of Health Science, University of Copenhagen, 2200 Copenhagen, Denmark; ¤ Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, United Kingdom

We do not know how species that went extinct more than one million years (1 Ma) ago are genetically related to living ones, because ancient DNA degrades fully and irreversibly after ∼0.5-1 Ma [1, 2]. Here we show that this limit can be overcome by using proteomics to: (i) sequence million-year-old proteins extracted from fossil dental enamel, (ii) measure evolutionary changes in amino acid sequences, and (iii) reconstruct evolutionary histories beyond the limits of ancient DNA preservation. Deep-time palaeoproteomics enables: (a) unprecedented access to genetic evidence from epochs still considered out of reach for routine biomolecular investigation, and (b) molecular investigation of major evolutionary processes so far intractable for molecular phylogenetics, potentially including those of relevance to palaeoanthropology.

To make recovery of deep-time genetic information routine, a simple and inexpensive procedure is needed to obtain extended and reliable protein sequence coverage from a ubiquitous and abundant starting material. Dental enamel is the hardest tissue in vertebrates and is frequently recovered and identified at palaeontological sites [3]. Teeth are therefore a key source of evidence for studies of fossil mammalian ecology and evolution.

We present a novel analytical approach based on high-resolution, high-sensitivity tandem mass spectrometry (MS). It retrieves a population of peptides that are “mapped” to enamel protein reference sequences from extant species. The reconstructed sequences are then aligned and compared with homologous sequences from extant species using conventional phylogenetic methods.
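The mapping and comparison steps described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes exact string matching of peptides to a reference (real proteomic search engines also tolerate substitutions and modifications), and the function names `sequence_coverage` and `p_distance` and all sequences are hypothetical toy examples.

```python
from __future__ import annotations


def sequence_coverage(reference: str, peptides: list[str]) -> float:
    """Fraction of reference residues covered by at least one peptide.

    Assumption: peptides match the reference exactly; every occurrence
    of a peptide in the reference is counted as covered.
    """
    covered = [False] * len(reference)
    for pep in peptides:
        start = reference.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = reference.find(pep, start + 1)
    return sum(covered) / len(reference) if reference else 0.0


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an unknown
    residue ('X') are skipped, since ancient sequences are incomplete.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in "-X" or b in "-X":
            continue
        compared += 1
        differing += a != b
    return differing / compared if compared else float("nan")


# Toy data, invented for illustration only.
toy_reference = "MPLPPHPGHPGYINF"
toy_peptides = ["MPLPP", "PGYINF"]
coverage = sequence_coverage(toy_reference, toy_peptides)

# Aligned toy fragments: one substitution, one gapped site.
distance = p_distance("MPLPP-HPG", "MPLAPQHPG")
```

Pairwise distances of this kind are only the simplest input to a phylogenetic analysis; model-based methods would normally be applied to the full alignment.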

From most of the faunal specimens analysed, limited peptide fragments of collagen type 1-alpha 1 and 2, as well as collagen type 3-alpha 1, were identified in bone and dentine, while extended stretches of amelogenin, enamelin, and ameloblastin were identified in enamel samples. To our knowledge, such extended coverage has never before been achieved from samples of similar age and geographic origin. Glutamine deamidation, a spontaneous modification extensively observed in ancient samples [4], was surprisingly high. This observation is a strong indicator that the sequences retrieved are authentically ancient and endogenous. Authenticity is further supported by the tissue specificity of the proteins identified: enamel proteins are not expressed in other tissues, never appear among common laboratory contaminants, and are not detected in ordinary saliva proteomes. Finally, they were absent from all extraction and injection blanks included in the study.
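As a rough illustration of how the extent of glutamine deamidation might be summarised, the sketch below counts deamidated glutamine residues across identified peptides. It is a simplification under stated assumptions: the `PeptideMatch` structure and `deamidation_fraction` function are hypothetical, the data are invented, and each observed residue is counted equally, whereas published approaches typically weight by spectral intensity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideMatch:
    """Hypothetical record of one identified peptide."""
    sequence: str
    # 0-based positions of residues observed in the deamidated form.
    deamidated_positions: frozenset[int] = frozenset()


def deamidation_fraction(matches: list[PeptideMatch], residue: str = "Q") -> float:
    """Fraction of observed `residue` sites that are deamidated.

    Count-based simplification: every observed site contributes
    equally, regardless of signal intensity.
    """
    total = modified = 0
    for match in matches:
        for i, aa in enumerate(match.sequence):
            if aa == residue:
                total += 1
                modified += i in match.deamidated_positions
    return modified / total if total else float("nan")


# Invented example: three glutamines observed, two of them deamidated.
toy_matches = [
    PeptideMatch("PQQPM", frozenset({1})),
    PeptideMatch("LQPH", frozenset({1})),
]
fraction = deamidation_fraction(toy_matches)
```

A fraction close to 1 would be consistent with extensive damage, while modern contaminant proteins would be expected to show much lower values.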