Three-dimensional structures of proteins are important for understanding protein function and are useful for applications such as drug development. While determining protein structures has become easier over the past few decades, it has not kept pace with the massive number of novel genes being sequenced. Physical methods such as X-ray crystallography, NMR spectroscopy, and cryo-EM remain the preferred ways to solve structures, but many proteins contain enough information to determine 3D structure solely from their amino acid sequences. Various computational methods have sought to do this, including the EV Fold approach developed in the Marks lab.
These approaches are based on multiple sequence alignments of many members of a family of related proteins. From these alignments, covariation matrices are computed that record, for each pair of positions in the protein, how often each possible pair of amino acids occurs at those two positions. The strategy rests on the idea that, if two residues form a contact, a substitution at one of the two positions should be followed by a compensatory mutation at the other to maintain the interaction. These strategies have been used to accurately predict the structures of many proteins (1,2,3) (Figure 1).
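The pair-counting idea can be illustrated with a minimal Python sketch. It tabulates how often each amino acid pair occurs at two alignment columns and scores the covariation of the columns with plain mutual information. Mutual information is used here only as a simple stand-in and should not be read as the statistical model behind EV Fold. The function names and the four-column toy alignment are hypothetical, invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def pair_frequencies(alignment, i, j):
    """Fraction of sequences showing each amino acid pair at columns i and j."""
    n = len(alignment)
    counts = Counter((seq[i], seq[j]) for seq in alignment)
    return {pair: count / n for pair, count in counts.items()}


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), p_ab in pair_frequencies(alignment, i, j).items():
        # Compare the observed pair frequency with the frequency expected
        # if the two columns varied independently of each other.
        expected = (col_i[a] / n) * (col_j[b] / n)
        mi += p_ab * log2(p_ab / expected)
    return mi


def rank_column_pairs(alignment):
    """All column pairs, sorted from most to least covarying."""
    length = len(alignment[0])
    scores = {
        (i, j): mutual_information(alignment, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment (not real data): columns 1 and 3 swap K and E
# together, mimicking a compensatory charge swap between contacting residues.
toy_alignment = ["AKLE", "AKLE", "AELK", "AELK", "AKVE", "AEVK"]

if __name__ == "__main__":
    for (i, j), score in rank_column_pairs(toy_alignment):
        print(f"columns {i} and {j}: MI = {score:.3f} bits")
```

In this toy alignment, columns 1 and 3 always change together and receive the highest score, while the fully conserved column 0 scores zero against every other column. A real analysis would also have to handle alignment gaps, uneven sampling of sequences, and indirect correlations between positions, none of which this sketch attempts.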
One situation where the EV Fold strategy could be useful is the analysis of certain groups of G-protein coupled receptors (GPCRs). GPCRs are a large class of seven-transmembrane-domain proteins that operate as regulators of signaling and constitute over 30% of current FDA-approved therapeutic targets (4). While much work has been done to study both the basic biology of GPCRs and native ligand-GPCR interactions, drug discovery for these receptors remains a challenging problem. This is due, in part, to the large number of effectors that can signal downstream of the initial activation of a GPCR by an agonist, which complicates assay development. In addition, there are over 800 annotated GPCRs in the human genome, making off-target agonism a concern for GPCRs that cannot be knocked out.
Expression of human GPCRs in yeast is an exciting alternative for developing new GPCR screens because of the orthogonality and simplicity of yeast signaling pathways, the ease and speed of culturing, and the ability to work in a larger library space (5). Yeast have only two GPCR signaling pathways, the glucose-sensing and pheromone response pathways, and there is no cross-reactivity between them. The pheromone response pathway has been engineered to couple with many human GPCRs, providing a single readout from any GPCR subtype. In mammalian-based systems, by contrast, multiple assays are needed to cover the range of human GPCRs, because many effectors signal only with certain classes of GPCRs.
While yeast have proven useful for both the study and screening of human GPCRs, not all human GPCRs express and signal in yeast. To date, it has not been determined why certain receptors can be heterologously expressed while others cannot. Some of this variability almost certainly comes from differences in the ability of receptors to traffic through the yeast ER, and from the difference in sterol composition between yeast and human plasma membranes (6,7,8). Although these factors are known to affect heterologous expression, there is currently no way to determine which receptors they will affect. GPCRs have historically been very difficult to solve experimentally because they are membrane proteins that are allosterically modulated by G proteins. For this project, I decided to use 3D predictions from the EV Couplings strategy for receptors without solved structures to determine whether and how structure affects heterologous expression of human GPCRs in yeast.
Figure 1. Overview of the EV Fold strategy. Conserved couplings are determined from multiple sequence alignments and are then translated into 3D structures.