Computational paleoproteomics

PhD candidate: Petra Gutenbrunner

Objectives

Algorithms for both search-engine-driven and unrestricted identification of protein modifications will be developed to address the specific challenges of ancient samples.

Expected results

The development of dedicated computational algorithms is expected to substantially improve the identification rate of proteins in ancient samples, leading to a better understanding of the production techniques and chemical preservation of cultural heritage materials.

My project

Investigating archaeological samples for species identification can be particularly challenging, because the genomes of many non-domestic species used to produce everyday objects, such as elk and deer for bone combs or mink for elaborate fur garments, are not yet available. Similarly, streamlined identification of pathological variants, which are reported in publicly available protein databases but commonly ignored in ordinary peptide-spectrum matching, could favour the identification of pathologies in ancient samples. In these cases, reliable options for homology search, identification of sequence variants and amino acid substitutions, or de novo sequencing, integrated within the main search engine, would greatly improve the identification of cryptic, potentially species-diagnostic peptides. Finally, improved algorithms for the blind identification of spontaneous post-translational modifications will allow a better characterisation of the damage pattern affecting each sample, with direct relevance for conservation.

The analysis of ancient proteins from cultural heritage material poses several particular challenges, all of which will be addressed by developing software modules and algorithms in close collaboration with the experimental partners. These will be implemented as extensions to the established MaxQuant and Andromeda platforms. Several applications in ancient sample analysis require peptide identification from large search spaces. For this purpose, the Andromeda search engine is being further optimized to enable robust identification in this situation. Alternatively, existing de novo sequencing techniques are being further developed so that peptides and proteins can be identified independently of a sequence database. This includes clustering the de novo identifications from single spectra in order to confidently identify regions or domains of proteins that have sufficient MS/MS sequence coverage, and to map out their modification content.
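As a rough illustration of what clustering single-spectrum de novo identifications could look like, the sketch below greedily merges overlapping peptide reads into longer contigs. This is not the MaxQuant or Andromeda implementation: the function names, the minimum overlap of three residues, and the decision to write isobaric I and L as L are all assumptions made for this example.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` residues exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge de novo reads into contigs (hypothetical helper).

    Assumption: I and L are indistinguishable by mass, so both are written as L.
    """
    # Normalise I/L and drop exact duplicates while keeping input order.
    contigs = list(dict.fromkeys(r.replace("I", "L") for r in reads))
    while True:
        # Discard reads that are fully contained in another contig.
        contigs = [c for c in contigs
                   if not any(c != o and c in o for o in contigs)]
        # Find the pair with the longest suffix/prefix overlap.
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Three overlapping, made-up reads collapse into a single contig.
print(assemble(["GPAGPQG", "PQGPRGD", "RGDKGET"]))
```

A short minimum overlap such as three residues will produce spurious merges on real data, so a practical version would presumably also weigh per-residue de novo confidence and spectrum quality.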

Algorithms for both search-engine-driven and unrestricted identification of protein modifications will be developed to address the specific challenges of ancient samples. An expert system will be developed and trained to automatically apply domain-specific rules for the interpretation of MS/MS spectra. This includes special rules for the proper treatment of deamidation and other modifications, to help distinguish clear cases of processes specific to ancient samples from by-products of sample preparation. An expert system-informed localization score will apply this knowledge to calculate reliable localization probabilities for PTMs. Furthermore, a library will be built from the information obtained on ancient peptides with and without PTMs, so that this knowledge can be applied to new samples and improve the dynamic range of detection in them.
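To make the deamidation case concrete, the sketch below lists the candidate N/Q sites of a peptide and checks how many deamidation events would explain an observed precursor mass. It uses the monoisotopic deamidation shift of +0.984016 Da. The helper names, the 10 ppm tolerance, the reduced residue-mass table, and the example peptide are assumptions for illustration, not part of the project's software.

```python
from typing import Optional

# Monoisotopic mass shift of deamidation (amide -NH2 replaced by -OH), in Da.
DEAMIDATION_SHIFT = 0.984016
WATER = 18.010565

# Monoisotopic residue masses in Da; only the residues used in this example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "P": 97.05276,
    "N": 114.04293, "Q": 128.05858, "K": 128.09496,
}


def candidate_sites(peptide: str) -> list[int]:
    """1-based positions of residues that can deamidate (N and Q)."""
    return [i for i, aa in enumerate(peptide, start=1) if aa in "NQ"]


def peptide_mass(peptide: str, n_deamidated: int = 0) -> float:
    """Neutral monoisotopic mass with `n_deamidated` deamidation events."""
    if n_deamidated > len(candidate_sites(peptide)):
        raise ValueError("more deamidations than N/Q residues")
    backbone = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return backbone + n_deamidated * DEAMIDATION_SHIFT


def count_deamidations(peptide: str, observed_mass: float,
                       tol_ppm: float = 10.0) -> Optional[int]:
    """Number of deamidations that explains the observed mass, or None."""
    for n in range(len(candidate_sites(peptide)) + 1):
        theoretical = peptide_mass(peptide, n)
        if abs(observed_mass - theoretical) / theoretical * 1e6 <= tol_ppm:
            return n
    return None


peptide = "GPAGPQGPNGK"  # made-up collagen-like sequence
print(candidate_sites(peptide))
print(count_deamidations(peptide, peptide_mass(peptide, 1)))
```

The precursor mass only gives the number of deamidation events. Deciding which of the candidate sites carries the modification requires fragment-ion evidence, which is the role of the localization score described above. The shift also lies close to the 13C isotope spacing of about 1.00336 Da, so high mass accuracy is needed to separate the two.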

The development of top-down paleoproteomics is supported by a top-down version of MaxQuant that will be able to identify and quantify proteins from relatively complex mixtures and to determine their PTMs from MS/MS spectra acquired with a variety of fragmentation types. This will be done in close collaboration with Patrick Rüther (@UCPH) and Diana Samodova (@UCPH).

Networking

Planned secondments:

Secondment period of 6 months at UCPH (JVO co-supervision) to test the developed algorithms on Patrick Rüther's (@UCPH) datasets, in collaboration with Georgia Ntasi (@UNINA).
Secondment period of 4 months at Thermo (Bremen) to develop and test new control software for the Orbitrap Fusion Lumos Tribrid mass spectrometer.

Contacts

E-mail: cox@biochem.mpg.de

At: Max Planck Institute of Biochemistry

Supervisor: Jürgen Cox