Step 6

This step can be straightforward or very complicated, depending upon the number and type of atoms whose coordinates are undefined. A useful first step is to list the atoms for which coordinates do not exist. This can be done with the function IdentifyUndefinedCoordinates3, which is used in the following fashion:

# . Get the system.
system = Unpickle ( "../step4/step4.pkl" )

# . Identify undefined coordinates.
IdentifyUndefinedCoordinates3 ( system )

In general, hydrogens are easier to deal with than non-hydrogen (or heavy) atoms, so the function treats the two types separately. The numbers of both types of atoms with undefined coordinates are printed but, by default, only the identities of the heavy atoms are explicitly listed. In the PKA case, all of the hydrogen coordinates need to be built, but there are also two heavy atoms with undefined coordinates: A:LYS.8:CB and I:SER.17:OG. The latter was to be expected because it belongs to the residue that was mutated in entity I, but the former is more of a surprise as, for some reason, it is missing from the original PDB file.
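The kind of report described above can be illustrated with a minimal, self-contained sketch. It does not use pDynamo and is not how IdentifyUndefinedCoordinates3 is implemented. The atom list, the summarize_undefined helper and the hydrogen labels are hypothetical, chosen only to show heavy atoms and hydrogens being handled separately:

```python
# Hypothetical stand-in data: (label, element, coordinates or None if undefined).
atoms = [
    ("A:LYS.8:CA",  "C", (1.0, 2.0, 3.0)),
    ("A:LYS.8:CB",  "C", None),
    ("A:LYS.8:HA",  "H", None),
    ("I:SER.17:CB", "C", (4.0, 5.0, 6.0)),
    ("I:SER.17:OG", "O", None),
    ("I:SER.17:HG", "H", None),
]

def summarize_undefined(atoms):
    """Split the labels of atoms without coordinates into heavy atoms and hydrogens."""
    heavy, hydrogens = [], []
    for label, element, xyz in atoms:
        if xyz is not None:
            continue  # coordinates are defined, nothing to report
        if element == "H":
            hydrogens.append(label)
        else:
            heavy.append(label)
    return heavy, hydrogens

heavy, hydrogens = summarize_undefined(atoms)

# Counts are printed for both types; only the heavy atoms are listed by name.
print("Undefined heavy atoms    : %d" % len(heavy))
print("Undefined hydrogen atoms : %d" % len(hydrogens))
for label in heavy:
    print("  " + label)
```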

The heavy atom coordinates are constructed first and the hydrogens are dealt with afterwards. In this case, probably the easiest approach is to hand-build the coordinates as so few heavy atom coordinates are undefined. Visual inspection of the structure is essential to identify reasonable positions for the atoms but, having done this, a suitable program for construction is:

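# . Imports (module names assume the pDynamo 1.x package layout).
from pCore            import Pickle, Unpickle
from pMolecule        import NBModelABFS
from pMoleculeScripts import BuildHydrogenCoordinates3FromConnectivity

# . Define the atom IDs and the parameters.
_ToBuild = ( ( "A:LYS.8:CB" , "A:LYS.8:CA" , "A:LYS.8:N"  , "A:LYS.8:CG"  , 1.54, 109.5, 0.0 ),
             ( "I:SER.17:OG", "I:SER.17:CB", "I:SER.17:CA", "A:ATP.400:PG", 1.40, 109.5, 0.0 ) )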

# . Get the system.
system = Unpickle ( "../step4/step4.pkl" )
system.Summary ( )

# . Build the coordinates of the missing heavy atoms.
for ( id1, id2, id3, id4, r, theta, phi ) in _ToBuild:
    i = system.sequence.AtomIndex ( id1 )
    j = system.sequence.AtomIndex ( id2 )
    k = system.sequence.AtomIndex ( id3 )
    l = system.sequence.AtomIndex ( id4 )
    system.coordinates3.BuildPointFromDistanceAngleDihedral ( i, j, k, l, r, theta, phi )

# . Build the hydrogen atom coordinates.
BuildHydrogenCoordinates3FromConnectivity ( system )

# . Save the system.
Pickle ( "step6.pkl", system )

# . Calculate an energy if all atoms now have coordinates.
if system.coordinates3.numberUndefined <= 0:
    system.DefineNBModel ( NBModelABFS ( ) )
    system.Energy ( doGradients = True )

The salient points of this program are:

  • The program defines all the information necessary to build the two heavy atoms' coordinates in the tuple _ToBuild. The format of each item in the tuple is similar to a Z-matrix specification (for readers who are familiar with these) and consists of four PDB atom specifications, a bond distance, an angle and a dihedral angle. The first atom is the one whose coordinates are to be constructed whereas the remaining three are required to define the bond, angle and dihedral terms, respectively.
  • After retrieving the system from the PKL file, the coordinates of each of the heavy atoms are built using the method BuildPointFromDistanceAngleDihedral from the Coordinates3 class. This method requires integer indices to refer to the atoms, as opposed to strings, and these are retrieved using the AtomIndex method of the Sequence class.
  • Once the heavy atom coordinates have been constructed the hydrogen atom coordinates are built using the function BuildHydrogenCoordinates3FromConnectivity. The algorithm used by this function is a simple one that employs connectivity information only. It does not make use of any information from a system's MM model (if it has one) nor search for interactions that cannot be determined from the connectivity, such as hydrogen bonds. Such capabilities may be added in the future. The function recovers the indices of the hydrogen atoms whose coordinates require building from the undefined attribute that is stored in each instance of the Coordinates3 class. This attribute is modified as appropriate as each atom's coordinates are built.
  • The program terminates by re-saving the system and, if there are no undefined coordinates, by calculating an energy (at last!).
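For readers unfamiliar with Z-matrix style construction, the geometry behind the first two points can be sketched in plain Python. This is an illustration only and not pDynamo's implementation of BuildPointFromDistanceAngleDihedral: the build_point function, its helpers and the reference coordinates are hypothetical, angles are assumed to be in degrees, and the sign of the dihedral follows the usual IUPAC convention, which may differ from the one pDynamo uses.

```python
import math

def _sub(a, b):   return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def _dot(a, b):   return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def _cross(a, b): return (a[1] * b[2] - a[2] * b[1],
                          a[2] * b[0] - a[0] * b[2],
                          a[0] * b[1] - a[1] * b[0])
def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def build_point(p_j, p_k, p_l, r, theta_deg, phi_deg):
    """Place atom i so that |i-j| = r, angle i-j-k = theta and dihedral i-j-k-l = phi."""
    theta, phi = math.radians(theta_deg), math.radians(phi_deg)
    bc = _unit(_sub(p_j, p_k))               # unit vector from k to j
    n  = _unit(_cross(_sub(p_k, p_l), bc))   # normal to the plane through l, k and j
    m  = _cross(n, bc)                       # completes the local right-handed frame
    d  = (-r * math.cos(theta),
           r * math.sin(theta) * math.cos(phi),
           r * math.sin(theta) * math.sin(phi))
    return tuple(p_j[a] + d[0] * bc[a] + d[1] * m[a] + d[2] * n[a] for a in range(3))

def distance(a, b):
    return math.sqrt(_dot(_sub(a, b), _sub(a, b)))

def angle(a, b, c):
    """Angle a-b-c in degrees."""
    cosine = _dot(_unit(_sub(a, b)), _unit(_sub(c, b)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral p0-p1-p2-p3 in degrees."""
    b0, b1, b2 = _sub(p0, p1), _unit(_sub(p2, p1)), _sub(p3, p2)
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))   # b0 projected off the central bond
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))   # b2 projected off the central bond
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Hypothetical reference positions for atoms j, k and l (not taken from the PKA structure).
p_j, p_k, p_l = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)

# Same distance, angle and dihedral values as the first entry of _ToBuild.
p_i = build_point(p_j, p_k, p_l, 1.54, 109.5, 0.0)
print("distance = %.3f" % distance(p_i, p_j))
print("angle    = %.3f" % angle(p_i, p_j, p_k))
print("dihedral = %.3f" % dihedral(p_i, p_j, p_k, p_l))
```

Measuring the distance, angle and dihedral of the built point against the three reference atoms is a quick way to confirm that a hand-chosen set of parameters gives the intended geometry before visual inspection.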

Various automated algorithms for coordinate building exist as alternatives to the hand-building approach employed above. There are many third-party tools, which may be the best choice if large parts of the structure are missing. pDynamo itself has the beginnings of equivalent tools, which are based upon distance-geometry methodologies. As yet, these are not very efficient, although they work reasonably well for constructing the coordinates of small molecules and of small portions of a larger structure. It is intended that a tutorial on these techniques will appear in due course.