CcpNmr Analysis Macros

General principles

The CCPN program CcpNmr Analysis was primarily written for NMR spectrum assignment and analysis. It provides a means by which NMR data, such as spectra, chemical shifts and peak lists, can be visualised and manipulated. For many analytical and structure determination tasks Analysis acts as a hub and becomes the primary means of managing CCPN data. The following tutorial assumes that you have installed and can start Analysis. Most of the examples require some data for demonstration purposes; the downloadable demonstration project is sufficient for this.

Macros

A macro is a Python script that extends the functionality of the Analysis program, and takes advantage of the program's existing functionality. For example, if you want to create a routine that manipulates a list of crosspeaks, then Analysis can provide you with a means of loading the data, selecting the input, displaying the output and saving the result. Also, to further expedite the process, Python macros can make use of the large number of high-level functions that already exist within the CCPN programs. Hence for many common or tedious computational tasks you can choose from an array of functionality that is already written and tested, rather than having to write everything yourself.

Macros are not only the recommended way of adding small bits of bespoke functionality to Analysis (and hence CCPN in general), they also provide a convenient means of developing and testing code. Macros can be reloaded at any time, to take account of any changes to your Python script, so that you do not have to constantly restart Analysis while development takes place.

Argument Server

The Argument Server is the means by which all Analysis macros interact with the data in the currently active CCPN project. This is a link from your script to real data, and without it your Python scripts would have to do a whole lot more (and know about the innards of Analysis) just to access simple information. The Argument Server is a Python object that is passed as the first argument to a macro when it is run. It has many methods (functions that you can call from the object) which enable you to get hold of useful data from within the CCPN project; where the user needs to choose between various options, these methods bring up the graphical user interface. To name just a few things the Argument Server can do: get the current MemopsRoot (project), select spectra, select Restraint Lists, select Peak Lists, provide mouse-selected Peaks and give the current cursor coordinate.
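
When developing a macro it can be handy to exercise it outside Analysis. The sketch below is a stand-in written for this purpose, not CCPN code: StubProject and StubArgServer are hypothetical classes that imitate only two Argument Server methods used in this tutorial, getProject() and showInfo(), so that a macro function can be called from a plain Python session.

```python
# Hypothetical stand-ins for trying out a macro outside Analysis.
# Only getProject() and showInfo() are imitated; the real Argument
# Server is supplied by Analysis and offers many more methods.

class StubProject(object):
    def __init__(self, name):
        self.name = name


class StubArgServer(object):
    def __init__(self, projectName):
        self._project = StubProject(projectName)
        self.messages = []  # records what would be shown to the user

    def getProject(self):
        return self._project

    def showInfo(self, message):
        self.messages.append(message)


def exampleMacro(argServer):
    # A macro always receives the Argument Server as its first argument
    project = argServer.getProject()
    argServer.showInfo('Macro is using project "%s"' % project.name)


stub = StubArgServer('demo')
exampleMacro(stub)
print(stub.messages[0])
```

Because the macro only ever talks to the object it is given, swapping in a stub like this needs no change to the macro itself.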

Moving on

Eventually, if your Python script works and provides a functionality which would be useful to others, you can begin to think about where next to take the code. Your basic options are:

- Upload the macro unaltered to the CCPN web site for others to use.
- Put the functionality into a stand-alone application, perhaps making use of CCPN's Tkinter based graphical user interface library.
- Petition the CCPN team to include your code and/or ideas into Analysis. There are several instances of this happening.

Execution

For our first example of a macro the aim is simply to introduce the macro system. The macro will be a desperately simple Python script, but you can load it into Analysis so that it appears in a menu, then make some alterations and re-load the code and thus gain an appreciation of how macros are a convenient way to write CCPN-linked programs.

Firstly, in a text editor, write some simple Python code as follows, taking note that the first argument, argServer, is always required and will be passed to the function whenever it is called:

def simpleExampleMacro(argServer):
    project = argServer.getProject()
    print('Macro is using project "%s"' % project.name)

Save this script to a file with an informative name, e.g. "ExampleMacro.py". Beware of choosing a name for the file (which is effectively a Python module) that clashes with inbuilt Python modules like "test" or "math". If in doubt, choose a name that is descriptive and errs towards the verbose.
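
One rough way to check a candidate file name before committing to it is to see whether Python can already import a module of that name. The helper below is a hypothetical convenience, not part of CCPN, and note that it really does import the module if one is found.

```python
def clashesWithExistingModule(moduleName):
    """Return True if a module called moduleName can already be imported."""
    try:
        __import__(moduleName)
    except ImportError:
        return False
    return True


print(clashesWithExistingModule('math'))  # True: 'math' is a standard module
print(clashesWithExistingModule('MySecondaryShiftMacro20'))
```

Run the check from a directory that does not already contain your macro file, otherwise the macro itself may be found and reported as a clash.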

Now load this into CcpNmr Analysis by going to the menu option Macros::Organise Macros and clicking [Add Macro]. In the file browser, navigate to the location where you saved the .py file and click on the line for that file. Then in the lower "function" table click on "simpleExampleMacro" (or whatever you called your function) and then click [Load Macro]. Returning to the main macro table, double-click in the "In main menu?" column for the row of your script, so that it says "yes". You will now see that your function is listed in the Macros section of the main menu. By selecting this new option you will execute the function and will see the name of the project printed on the Python command line (in the window where you started Analysis).

It is a bit unfriendly to print text to the screen on the command line. In normal operation it would be better to display a small dialogue box for the user. To use a dialogue box for the message, change the script to the following:

def simpleExampleMacro(argServer):
    project = argServer.getProject()
    message = 'Macro is using project "%s"' % project.name
    argServer.showInfo(message)

Now select Macros::Reload Menu Macros from the main menu, and only then re-run your script. Hopefully the resulting changes will be obvious. As a macro is simply a Python script you can put anything you like into it, drawing on examples given above for using CCPN's Python API. However, to give you a better idea of how this system may be used, we next show you a real CCPN macro which draws together several elements that have already been covered.

A Real-World Example

The following macro is a script that was written to display the secondary chemical shifts (sequence adjusted) for backbone atoms in a specified molecular chain, i.e. the difference between the observed chemical shift of a resonance and the value expected if it had a random coil structure. Here the macro will be discussed a few lines at a time, but you are expected to put all of the code into a single, contiguous block to create a file that you can test by loading, as detailed above. Don't forget to keep the indentation consistent (as shown in the code below), remembering that Python uses indentation to define blocks in functions, loops, conditional statements etc.

Firstly, use the def statement to specify the start of a function. We give the function an informative name and we accept one argument argServer which will be passed into the function. On the second line of the function (remembering to indent with spaces) we add an import statement so that we can use a function from the CCPN high-level function library. In this instance, the imported function does as its name suggests; it provides random coil chemical shifts - which we can compare to our data.

def printShiftDevFromCoil(argServer):
    from ccpnmr.analysis.core.MoleculeBasic import getRandomCoilShift

Then we define the atomNames variable, which is a list containing the names of the protein backbone atoms that we want to get chemical shifts for. We also initialise a variable called lines, to which we will add lines of text to display the results. Initially this is just the title line containing the column headings that are derived from the atom names padded out to six characters.

    atomNames = ['N', 'H', 'HA', 'C', 'CA']
    columnNames = [' %6.6s ' % an for an in atomNames]
    # Pad the title line to the width of the residue label ('%3d %s') used in each row
    lines = '%7s : %s\n' % ('', '|'.join(columnNames))
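
The ' %6.6s ' format used for the column headings both pads and truncates: short strings are right-aligned in a field of six characters, and anything longer is cut to its first six characters. The following standalone snippet (not part of the macro; formatCell is a name made up for this illustration) shows the behaviour.

```python
def formatCell(text):
    """Format text as a table cell: six characters wide, one space either side."""
    return ' %6.6s ' % text


for text in ('N', 'HA', '-', '-12.345678'):
    print('[%s]' % formatCell(text))

# Output:
# [      N ]
# [     HA ]
# [      - ]
# [ -12.34 ]
```

Every cell therefore comes out exactly eight characters wide, which is what keeps the columns in line.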

The next job is to get hold of some CCPN data model objects from the project that is currently loaded into Analysis. This macro requires a list of chemical shifts, i.e. an Nmr.ShiftList object, and a chain of residues, i.e. a MolSystem.Chain object. To get hold of these we simply use the functionality that is built into the Argument Server. If there is more than one Chain or ShiftList in the project then the user will be prompted graphically, in order to choose which one to use.

    chain = argServer.getChain()
    shiftList = argServer.getShiftList()

By querying the CCPN Chain object we get an ordered list of its residues and determine the length of that list. Note that if we were to use chain.residues we would get an unordered (frozen) set of residue objects and not the sorted list we require.

    residues = chain.sortedResidues()
    numRes = len(residues)

With the residues in hand we now loop through the list. This macro uses the built-in Python enumerate() function to not only loop through the residues but also provide a counter, which we denote i in this instance. Then inside the loop we initialise a blank list called context, which will contain a subsequence of five residues (in sequence order). We require a subsequence, rather than just a single residue, so that we may perform a sequence-dependent correction to the random coil chemical shift values.

    for i, residue in enumerate(residues):
        context = []

We fill in the subsequence by looping from the position two residues before the current residue to the position two residues after the current residue, adding Residue objects as we go with the append() command. Note that we use the counter for the current residue, i, to create a new position j, but when residue i is near either end of the sequence the j position would fall off the end of the sequence. In such circumstances the missing positions are filled with None objects. Also note that j comes from range(i-2,i+3), not range(i-2,i+2): the second argument of the range() command is a limit that is not included in the output list.

        for j in range(i-2,i+3):
            if 0 <= j < numRes:
                context.append(residues[j])
            else:
                context.append(None)
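
To see what the window looks like at the ends of a chain, the same loop can be tried on an ordinary list. In this sketch getContext is a hypothetical helper name and plain strings stand in for Residue objects.

```python
def getContext(residues, i):
    """Return the five-item window centred on position i, padded with None."""
    numRes = len(residues)
    context = []
    for j in range(i - 2, i + 3):
        if 0 <= j < numRes:
            context.append(residues[j])
        else:
            context.append(None)
    return context


sequence = ['Met', 'Lys', 'Gly', 'Ala']  # stand-ins for Residue objects
print(getContext(sequence, 0))  # [None, None, 'Met', 'Lys', 'Gly']
print(getContext(sequence, 3))  # ['Lys', 'Gly', 'Ala', None, None]
```

The lower bound in the test matters as much as the upper one: a negative index does not raise an error in Python but counts back from the end of the list, so without the check the start of the chain would silently pick up residues from its far end.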

Next we create a text string to identify the residue. To do this we combine the .seqCode number and the .ccpCode attribute (e.g. "Lys"). Immediately following we define a new blank list for this residue, tableRow, which will contain all of the final results for this residue.

        resId = '%3d %s' % (residue.seqCode, residue.ccpCode)
        tableRow = []

Given that we have a residue, we can now loop through atoms. We loop through the list of atom names we have and then try to find an Atom of that name in the Residue.

The whole point of the macro is to measure chemical shifts and compare them to the random coil values, so we initialise a value delta, which will store this chemical shift difference. Initially the value is actually a single dash "-" character, so that if no chemical shift is found this null symbol will appear in the output. If there is a value, delta will be overwritten with something more informative.

        for atomName in atomNames:
            delta = '-'
            atom = residue.findFirstAtom(name=atomName)

If we cannot find an atom with one of the names it may be that we are looking for an 'HA' in a glycine residue that instead has 'HA2' and 'HA3'. Thus we add the clause that if we have not found an Atom for the name 'HA' we also try 'HA2'. Note that we do not add 'HA2' to our initial list of atom names because we still want to find its data in the 'HA' column, and not have an extra column.

            if (not atom) and (atomName == 'HA'):
                atom = residue.findFirstAtom(name='HA2')

Then there is a final check to make sure we have an atom. We would expect to be missing the amide H in proline, for example. If we do have an Atom we find its chemAtom: this is the reference to that kind of atom in the chemical compound template.

We need this kind of object because of the way that the function which provides the random coil chemical shift (imported at the beginning of the script) works: it uses this template atom and a separate context (residue subsequence). The random coil shift is simply passed back by the function getRandomCoilShift().

            if atom:
                chemAtom = atom.chemAtom
                coilShift = getRandomCoilShift(chemAtom, context=context)

With the Atom object we follow the links to any assignments. This means getting hold of the AtomSet and ResonanceSet objects. As described earlier, these objects link the residue's Atoms to the Resonance objects, which in turn carry the chemical shift data.

                atomSet = atom.atomSet
                # atomSet may be None if the atom has never been assigned
                resonanceSet = atomSet and atomSet.findFirstResonanceSet()

If we managed to find a resonanceSet object for this Atom then it is assigned. We find the first resonance that it is assigned to and then the chemical shift (Shift object) that belongs in the ShiftList that was chosen at the start.

                if resonanceSet:
                    resonance = resonanceSet.findFirstResonance()
                    shift = resonance.findFirstShift(parentList=shiftList)
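
The chain of links from Atom to AtomSet to ResonanceSet to Resonance to Shift can also be wrapped in a small helper that stops as soon as a link is missing. The following is a sketch rather than part of the macro: getAssignedShift is a hypothetical name, it assumes that each missing link comes back as None, and the Stub classes exist only so that the helper can be tried without a CCPN project.

```python
def getAssignedShift(atom, shiftList):
    """Return the Shift for atom in shiftList, or None if any link is missing."""
    atomSet = atom.atomSet
    if not atomSet:
        return None
    resonanceSet = atomSet.findFirstResonanceSet()
    if not resonanceSet:
        return None
    resonance = resonanceSet.findFirstResonance()
    if not resonance:
        return None
    return resonance.findFirstShift(parentList=shiftList)


# Hypothetical stand-ins that mimic only the calls used above
class StubResonance(object):
    def __init__(self, shifts):
        self.shifts = shifts  # maps a shift list to a shift

    def findFirstShift(self, parentList=None):
        return self.shifts.get(parentList)


class StubResonanceSet(object):
    def __init__(self, resonance):
        self.resonance = resonance

    def findFirstResonance(self):
        return self.resonance


class StubAtomSet(object):
    def __init__(self, resonanceSet):
        self.resonanceSet = resonanceSet

    def findFirstResonanceSet(self):
        return self.resonanceSet


class StubAtom(object):
    def __init__(self, atomSet=None):
        self.atomSet = atomSet


resonance = StubResonance({'listA': 'shift in listA'})
assignedAtom = StubAtom(StubAtomSet(StubResonanceSet(resonance)))
unassignedAtom = StubAtom()

print(getAssignedShift(assignedAtom, 'listA'))    # shift in listA
print(getAssignedShift(assignedAtom, 'listB'))    # None
print(getAssignedShift(unassignedAtom, 'listA'))  # None
```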

If there is no recorded Shift then we cannot proceed any further. If there is a Shift object, we interrogate its .value attribute and subtract from it the random coil value that we already have. Finally the difference is formatted to three decimal places to fill in the text variable delta, which is what will appear in the output.

                    if shift:
                        delta = shift.value - coilShift
                        delta = '%.3f' % delta
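
As a standalone illustration of the arithmetic (separate from the macro itself), the secondary shift is the observed value minus the random coil value, written to three decimal places. The function name and the numbers below are made up for the example, and treating a missing random coil value as "no result" is an assumption of this sketch.

```python
def formatSecondaryShift(observed, coil):
    """Return observed minus random coil as text, or '-' if either is missing."""
    if observed is None or coil is None:
        return '-'
    return '%.3f' % (observed - coil)


print(formatSecondaryShift(4.50, 4.32))  # 0.180
print(formatSecondaryShift(4.10, 4.32))  # -0.220
print(formatSecondaryShift(None, 4.32))  # -
```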

We accumulate a delta text string for each of the atoms (even if we are missing data), and we add this text to the tableRow list which will contain all the delta values for the entire residue.

            tableRow.append(' %6.6s ' % delta)

Once we have finished going through all of the atoms for this residue, the row of text is complete. We join it with a "|" character to act as a separator, and then make it into a line (being careful that all of the formatting aligns) that is added to the end of the other lines.

        joinedText = '|'.join(tableRow)
        lines += '%s : %s\n' % (resId, joinedText)
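
The alignment can be checked without any CCPN data by building a small table from invented values. This sketch assumes three-letter residue codes, so that each residue label is seven characters wide, and pads the title line by the same amount; the numbers are made up purely for illustration.

```python
atomNames = ['N', 'H', 'HA', 'C', 'CA']
columnNames = [' %6.6s ' % an for an in atomNames]

# Seven characters of padding match the width of '%3d %s' for a
# three-letter residue code, so the title lines up with the rows.
lines = '%7s : %s\n' % ('', '|'.join(columnNames))

# Invented values, not real data
rows = [(12, 'Lys', ['0.512', '-0.101', '-', '1.204', '0.330']),
        (13, 'Gly', ['-', '-', '0.045', '-', '-1.020'])]

for seqCode, ccpCode, deltas in rows:
    resId = '%3d %s' % (seqCode, ccpCode)
    tableRow = [' %6.6s ' % delta for delta in deltas]
    lines += '%s : %s\n' % (resId, '|'.join(tableRow))

print(lines)
```

If every line of the result has the same length, the separators fall in the same columns all the way down the table.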

And finally we print out the results. These will appear on the Python command line, where Analysis was started from.

    print(lines)

Load and run this macro in Analysis to check that it works. Remember to reload if you want to see the effect of any coding changes.