Triple-resonance protein sequence assignment

Introduction

If the program is not already open, start CcpNmr Analysis on the command line by typing:

-> analysis

(This assumes that the CCPN bin/ directory is on your path; otherwise you will need to type the full path or be in the bin/ directory.)

We will now look at sequentially assigning protein backbone spin systems using triple-resonance experiments. There are two basic, although linked, parts to the process. The first is the linking of sequential spin systems (collections of resonances that relate to one residue) on the basis of matching peak positions. The second is the matching of runs of unassigned spin systems to residues within a sequence.

Open an existing project

We will now leave the previous CCPN project behind and load a new one. This differs from the old one in that it has two more spectra, HNcoCACB and HNCACB, and in that all the peaks have been picked and linked to the amide 'root' resonances as described above (i.e. using the Assignment: Pick & Assign from Roots option). In the Analysis menu bar select M: Project: Open Project. Choose to close the existing project; there is no need to save it. Navigate to find and select the CcpnCourse2a project, then click [Open] (the spectra are found in the data/dataStore1 directory).

Spectrum Setup

First make sure that window2 and window4 are visible and arranged so that window2 is tall and narrow and sits to the left of a tall and wide window4 (both windows should be as tall as possible). Both window2 and window4 have HCN axes and can display the triple-resonance spectra. Note that to make a new window you could either use M: Window: New window or clone an existing window via R: Window: Clone. Using the [Spectra] button in the windows, ensure that the HNcoCA and HNcoCACB are the only spectra turned on in window2 and that all four triple-resonance spectra are turned on in window4.

Select M: Assignment: Protein Sequence Assignment. You should start out in the {Window & Spectra} tab. Set window2 as the "13C Window" in the Query section and window4 as the "13C Window" in the Match section. In the top table, double-click to make sure that the "Use" column is set to "Yes" for the Query HNcoCA and HNcoCACB spectra. In the bottom table, set the "Use" column to "Yes" for only the HNCA and HNCACB spectra (i.e. not the through-carbonyl experiments).

This setup means that we are going to compare specified 13C peak positions in the HNcoCA and HNcoCACB experiments with potentially matching peak positions in the HNCA and HNCACB experiments.

The rationale here is that, at a given amide location, the peaks of the through-carbonyl experiments carry the carbon shifts of the alpha and beta carbons of the preceding residue along the polypeptide chain, whereas the HNCA and HNCACB show both the intra-residue and the preceding-residue alpha and beta carbon peaks for each amide. Thus we can potentially use both kinds of spectra to say that two amide spin systems are sequentially connected, by saying that an inter-residue peak of the HNcoCA or HNcoCACB derives from the same resonance as an intra-residue peak of the HNCA or HNCACB.
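The matching idea described above can be illustrated with a minimal Python sketch. This is not how Analysis is implemented; the `SpinSystem` class, the `rank_matches` function, the 0.3 ppm tolerance and all of the chemical shift values are assumptions invented for illustration. A candidate is kept if at least one of its intra-residue carbons lies within tolerance of the query's previous-residue carbons, and candidates that match both CA and CB are ranked first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpinSystem:
    """Hypothetical container for the 13C shifts seen at one amide."""
    serial: int
    ca_prev: Optional[float] = None   # CA(i-1), e.g. from HNcoCA
    cb_prev: Optional[float] = None   # CB(i-1), e.g. from HNcoCACB
    ca_intra: Optional[float] = None  # CA(i), e.g. from HNCA
    cb_intra: Optional[float] = None  # CB(i), e.g. from HNCACB


def rank_matches(query: SpinSystem, candidates: List[SpinSystem],
                 tol: float = 0.3) -> List[int]:
    """Return serials of candidates that could be residue i-1 of the query.

    Ranking: more matching carbons first, then smaller mean deviation.
    The tolerance (ppm) is an assumed value, not an Analysis default.
    """
    ranked = []
    for cand in candidates:
        if cand.serial == query.serial:
            continue
        pairs = ((query.ca_prev, cand.ca_intra),
                 (query.cb_prev, cand.cb_intra))
        diffs = [abs(q - c) for q, c in pairs
                 if q is not None and c is not None and abs(q - c) <= tol]
        if diffs:
            ranked.append((-len(diffs), sum(diffs) / len(diffs), cand.serial))
    ranked.sort()
    return [serial for _, _, serial in ranked]


# Invented shifts: three candidates match, but only one matches both carbons.
query = SpinSystem(8, ca_prev=58.2, cb_prev=32.5)
candidates = [
    SpinSystem(71, ca_intra=58.25, cb_intra=32.4),  # CA and CB both match
    SpinSystem(15, ca_intra=58.0, cb_intra=40.1),   # CA only
    SpinSystem(23, ca_intra=62.0, cb_intra=32.6),   # CB only
    SpinSystem(30, ca_intra=52.0, cb_intra=19.0),   # no match
]
print(rank_matches(query, candidates))  # [71, 23, 15]
```

Ranking two-carbon matches above one-carbon matches mirrors the reasoning used when inspecting strips by eye: a coincidental match in CA alone is common, whereas a simultaneous CA and CB match is much more selective.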

Note that this system can readily use other backbone experiments, such as HNcaCO, HNCO, HAcacoNH and HAcaNH, with the same approach; this tutorial uses only the alpha and beta carbon experiments for simplicity.

Linking Sequential Spin Systems

Now go to the {Spin System Table} tab and in the top table scroll down and click on the row that corresponds to spin system 8. Enlarging the popup window to the full height of the screen will help you view the tables more clearly. You will see that the two triple-resonance windows move to new locations. window2 is now located at spin system {8} and split into two regions, one for the CA peak and the other for the CB peak. The other window, window4, has moved to the position of any peaks that match the carbon frequencies of the query peaks. In this case there are three potential matches, but the first strip, which corresponds to spin system {71}, is the only one that matches both the CA and CB positions well. Note that if the "Filter 13C By Inter/Intra Type" option in the {Options} tab is set to off then spin system {8} matches even more strips. Because we already have a good match we don't really need this option on, but it is useful if the previous-residue and same-residue peaks overlap significantly in the match strips. In the case of spin system {8} matching {71}, the CA position does not intersect the purple HNcoCA position; instead it clearly matches the separate orange HNCA peak (and similarly for the CB position).

In the {Spin System Table} tab, go to the middle "Match Peak Positions" table and click to highlight the row corresponding to spin system {71} (Rank 1), and then click [Set Seq Link]. You will now see that the tables of the popup update to show that {71} is set as "i-1" of {8}.

Also note that in the spectrum window the peak annotations of the aligned CA and CB peaks have changed to show that they are both assigned to the same 13C resonances. Now click [Goto i-1], which will repeat the carbon shift matching, but this time from one position earlier in the sequence. Repeat the above procedure for spin system {71}: select the best match and link. For this exercise go on to sequentially link six spin systems in total. If all goes according to plan, the order of spin systems (going i-1 each time) will be {8} - {71} - {42} - {37} - {48} - {41}.
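As a small illustration of what the sequential links represent, the following Python sketch stores each "i-1" link in a dictionary and walks it to recover the linked segment. The `chain_from` function and the `prev_link` dictionary are hypothetical names, not part of the Analysis API; the spin system numbers are the ones expected in this exercise.

```python
def chain_from(start, prev_link):
    """Follow i-1 links from `start` until a spin system has no link."""
    chain = [start]
    seen = {start}
    while chain[-1] in prev_link:
        previous = prev_link[chain[-1]]
        if previous in seen:
            # Guard against inconsistent links that would loop forever.
            raise ValueError("cyclic sequential links at %r" % previous)
        chain.append(previous)
        seen.add(previous)
    return chain


# Each key's i-1 neighbour is its value, as set with [Set Seq Link].
prev_link = {8: 71, 71: 42, 42: 37, 37: 48, 48: 41}

walk = chain_from(8, prev_link)   # order visited with [Goto i-1]
sequence_order = walk[::-1]       # N-terminal to C-terminal order

print(walk)            # [8, 71, 42, 37, 48, 41]
print(sequence_order)  # [41, 48, 37, 42, 71, 8]
```

Because every step goes to the preceding residue, the order in which the spin systems are visited is the reverse of their order along the polypeptide chain.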

Assigning Segments to the Sequence

You will see that as you select the various spin systems in the main table, the lower right 'Residue Types' table displays the probable types of amino acid residue. This prediction is based upon how well the shifts within the spin system match the chemical shifts in the RefDB database. As spin systems are connected sequentially, amino acid type predictions are made for the whole sequentially connected section. In the lower left table the connected spin systems, given their probable amino acid types, are matched to the protein sequence. Here the highest scoring positions of residue type match for various five-residue sections are listed. You will see that the unassigned residue positions are coloured grey, and that assigned regions become blue.
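To give a feel for how residue type prediction from carbon shifts can work, here is a minimal Python sketch that scores a spin system against per-residue reference distributions using a Gaussian log-likelihood. It is an assumed scheme, not the one used by Analysis, and the `REFERENCE` means and standard deviations are rounded, approximate values included purely for illustration; they are not taken from RefDB.

```python
import math

# Approximate (mean, standard deviation) in ppm; illustrative values only.
REFERENCE = {
    "Ala": {"CA": (53.1, 2.0), "CB": (19.0, 1.8)},
    "Gly": {"CA": (45.3, 1.3)},                      # glycine has no CB
    "Leu": {"CA": (55.6, 2.1), "CB": (42.3, 1.9)},
    "Ser": {"CA": (58.7, 2.1), "CB": (63.8, 1.5)},
    "Thr": {"CA": (62.2, 2.6), "CB": (69.7, 1.6)},
}


def rank_residue_types(observed, reference=REFERENCE):
    """Rank residue types by Gaussian log-likelihood of the observed shifts.

    `observed` maps atom names ("CA", "CB") to shifts in ppm. Residue types
    lacking an observed atom (e.g. Gly when a CB is seen) are excluded.
    """
    scores = {}
    for res_type, ref in reference.items():
        if any(atom not in ref for atom in observed):
            continue
        score = 0.0
        for atom, shift in observed.items():
            mean, sd = ref[atom]
            z = (shift - mean) / sd
            score += -0.5 * z * z - math.log(sd)
        scores[res_type] = score
    return sorted(scores, key=scores.get, reverse=True)


# Invented shifts with a downfield CB, typical of Ser/Thr-like residues.
ranking = rank_residue_types({"CA": 62.0, "CB": 69.5})
print(ranking[0])
```

A single spin system often gives an ambiguous ranking, which is why predictions over a whole linked segment are far more informative than predictions for isolated residues.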

Click in the upper table on the row for spin system {37}, which is in the middle of our linked region, and you will see that the residue type predictions are strong at and either side of this position. The highest scoring option in the 'Sequence Locations' table (hopefully with a score of about 66.7) should correspond to the region from residue 17 Gly to 21 Leu, with the other sequence locations having lesser scores. Simply select the row for this highest-scoring location and click [Assign Selected] and then [OK] to confirm the assignment. You will see that all the residues in the section (by virtue of their links for the most part) become assigned to the selected section, and that the colours in the 'Sequence Locations' table change.
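The idea of placing a linked segment on the sequence can be sketched in Python as a sliding window: each possible start position is scored by how well the per-residue type predictions agree with the residues in that window. The `best_locations` function, the product-of-probabilities score, the `floor` value, the example sequence and the type probabilities are all assumptions made for illustration; Analysis uses its own scoring, so the numbers here are not comparable to those in the 'Sequence Locations' table.

```python
def best_locations(type_probs, sequence, first_number=1, floor=0.01):
    """Score every placement of a linked segment along a sequence.

    type_probs: one dict per spin system (in sequence order) mapping
                one-letter residue codes to a probability-like score.
    floor:      score given to residue types that were not predicted,
                so that one poor position does not zero the whole window.
    Returns (score, start_residue_number) tuples, best first.
    """
    n = len(type_probs)
    results = []
    for start in range(len(sequence) - n + 1):
        window = sequence[start:start + n]
        score = 1.0
        for probs, residue in zip(type_probs, window):
            score *= probs.get(residue, floor)
        results.append((score, first_number + start))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


# Hypothetical sequence and predictions for a three-residue segment.
sequence = "MKGSTALGATLE"
segment = [
    {"G": 0.9, "A": 0.1},
    {"S": 0.5, "T": 0.5},
    {"T": 0.6, "S": 0.4},
]
top_score, top_start = best_locations(segment, sequence)[0]
print(top_start)  # 3, i.e. the window G3-S4-T5
```

Longer segments discriminate better between locations, because the chance that an incorrect window agrees with every prediction falls quickly with segment length.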

Select the option M: Molecule: Atom Browser. Make sure that the elements [N] and [H] are displayed (click the button to get the green hydrogen assignment options) and look at the amide atom for 19 Thr. You will see that not only is 19 Thr assigned (i.e. the atom option goes dark green), but the atoms in the residues which we just connected sequentially are also assigned. Go through the spectra and the Protein Sequence Assignment popup (with [Goto i+1]) to verify that all the connected spin systems have been assigned. Note that in this instance it was possible to assign the resonances to unique atoms, as well as assign the backbone spin systems to the sequence, because the resonances had their atom type set previously.

Automatic Protein Backbone Assignment

The semi-automated sequence assignment mechanism described above is supplemented in CcpNmr Analysis by automatic assignment routines. At the moment only one called "Nexus" is publicly available, but MARS, AutoAssign and other routines will be incorporated in the future. The automated assignment routines are great for saving time, but they won't always work for all parts of proteins, especially where peaks are missing or severely overlapped. In such instances you can run the automation to assign the easy parts and then fill in the rest more carefully, where possible, using the more manual routines.

Fortunately the data in the example CCPN project being used here is pretty good, and the peak positions have been checked and have undergone a degree of curation. Accordingly, we can assign most of our protein sequence automatically with a couple of clicks. To do this open M: Assignment: Automated Seq. Assignment and in the resulting popup you will see that the four triple-resonance spectra are already selected; changing the "Use" column entry to "No" would mean the spectrum was not used for the assignment. All of these default settings are correct, so move on to the {Spin Systems} tab. In the table here you will be presented with all of the resonances, available in the chosen spectra, that are linked to a backbone amide as either same-residue (Intra Residue) or previous-residue (Inter Residue). Initially the table is quite bare, with only a few resonances (and their chemical shift values) displayed for the few residues that we have already assigned. To fill in the rest of the table click [Find Resonances From Peaks], selecting [OK] at the confirmation dialogue box. This function will then automatically extract and label the CA and CB chemical shifts from the same- and previous-residue peaks.

This is done by considering the overlap of the through-carbonyl spectra with other spectra, together with the peak intensities and relative chemical shifts that are expected to be visible (in an experiment of that type).
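The overlap part of this idea can be shown with a short Python sketch: peaks in a spectrum that shows both residues (e.g. HNCA) are labelled as previous-residue if they coincide with a peak in the corresponding through-carbonyl spectrum (e.g. HNcoCA), and as same-residue otherwise. The `split_intra_inter` function, the 0.2 ppm tolerance and the shift values are invented for illustration, and the sketch deliberately ignores the peak intensities that are also considered according to the description above.

```python
def split_intra_inter(both_shifts, inter_only_shifts, tol=0.2):
    """Label 13C shifts at one amide as intra- or inter-residue.

    both_shifts:       shifts from a spectrum showing i and i-1 (e.g. HNCA)
    inter_only_shifts: shifts from the through-carbonyl spectrum (e.g. HNcoCA)
    tol:               assumed matching tolerance in ppm
    Returns (intra, inter) lists of shifts.
    """
    intra, inter = [], []
    for shift in both_shifts:
        if any(abs(shift - ref) <= tol for ref in inter_only_shifts):
            inter.append(shift)
        else:
            intra.append(shift)
    return intra, inter


# Invented CA shifts for a single amide strip.
hnca_peaks = [58.3, 55.1]
hncoca_peaks = [55.0]
intra, inter = split_intra_inter(hnca_peaks, hncoca_peaks)
print(intra, inter)  # [58.3] [55.1]
```

Note the limitation of using overlap alone: if the same-residue and previous-residue carbons have nearly identical shifts, both peaks fall within tolerance and are labelled inter-residue, which is one reason additional information such as intensity is useful.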

With the resonances populated in the spin systems table, move on to select the {Automation} tab. Here, set "Use existing assignments?" to on, accept the other default settings and click [Run Nexus]. Once the procedure has finished all of its iterations you will be shown a graph of the assignment scores for each residue, where blue means good, yellow dubious and red bad. Moving to the {Predictions} tab you will hopefully see large regions of good (blue) prediction.

Before a spin system (and thus all of its contained resonances) can be assigned to the predicted residue, the prediction must be confirmed. To confirm the prediction either select a series of good rows ('Ctrl' + click in the table) and click [Confirm Selected], or confirm all the blue regions together by clicking [Confirm Above Score Threshold]. With your choices confirmed, click [Commit Assignments] and confirm with [OK] to actually do the full assignment, which will affect the peak labels.

Look at the peaks in window1 to see the result of the assignment, and admire your handiwork.
