Model Refinement

Roger S. Rowlett

Gordon & Dorothy Kline Professor, Emeritus

Colgate University Department of Chemistry

Preparing the first electron density map

Preparing the first electron density map is an exciting and expectant time. You are either rewarded with the immense joy of actually seeing clear electron density delineating the path of the main chain and positions of many side chains, or you suffer the crushing disappointment of meaningless hash. Preparing the first map and doing the subsequent refinements uses the CCP4 program Refmac for refinement and the molecular display program Coot.

Using the CCP4i environment

Note: If using the CCP4 suite in Windows (or Linux for that matter), place project files in a directory path that contains no spaces. Long filenames, or filenames with spaces in them, can cause unpredictable issues with CCP4. It is recommended that you place your Windows-based CCP4 project files in a folder rooted in one of the system drives, with no spaces in the file path, e.g. D:\CCP4-projects.
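
The path rule above can be checked before a project is created. The following is a minimal Python sketch; the function name path_problems and the 100-character length limit are illustrative assumptions, not CCP4 requirements.

```python
def path_problems(path, max_len=100):
    """Return a list of problems that may upset CCP4 for a given project path.

    max_len is an arbitrary illustrative threshold, not a documented CCP4 limit.
    """
    problems = []
    if " " in path:
        problems.append("contains spaces")
    if len(path) > max_len:
        problems.append(f"longer than {max_len} characters")
    return problems


if __name__ == "__main__":
    for candidate in (r"D:\CCP4-projects", r"C:\Users\My Name\CCP4 projects"):
        issues = path_problems(candidate)
        print(candidate, "->", ", ".join(issues) if issues else "looks fine")
```

An empty list means the path passes both checks; anything else should be fixed before starting a project.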

Performing a REFMAC-based refinement

Refmac is a powerful and simple-to-use refinement package. It will perform coordinate and B-factor refinement of the model against the structure factor data, and automatically write out the amplitude and phase information (map coefficients) that can be used to construct electron density maps in Coot. Refmac refinement is easily configured in CCP4i:

  • In the CCP4i main task window select Refinement from the task menu and click on Run Refmac5. A task window will open (Figure 1).

Figure 1. Task window for running Refmac.


  • Enter a job title (e.g., refmac).
  • Select restrained refinement with no prior phase information.
  • For MTZ in, select the merged structure factor MTZ file for the entire dataset. Note: Do NOT input the MTZ file from Phaser or from a previous refinement cycle. Refmac and Phaser calculate phase information based on your model, and may alter your original data. Always refine against the original structure factor file.
  • For PDB in, select the coordinate file that defines the starting point for the refinement, e.g. the Phaser output file or the rebuilt version of the coordinates from the previous refinement cycle.
  • The crystal, project, and dataset names should be automatically recognized from your input file.
  • Refmac will suggest filenames for the output PDB and MTZ files. You should change these and name them in a systematic way so that you can easily restart from a previous refinement step if necessary. The following file naming protocol is suggested:
    • The input PDB file for the first refinement cycle should be labeled "zero", e.g. hica-d44n-0.pdb. This would normally be the output from Phaser, perhaps with some cofactors added prior to the first refinement cycle.
    • The output files should be designated with the refinement cycle, e.g. hica-d44n-02.pdb and hica-d44n-02.mtz. The numeric label indicates not only the stage of refinement to which these files belong, but also which PDB and MTZ files are paired with each other for inspecting electron density in Coot.
    • When a PDB file is modified in Coot during rebuilding, add a letter suffix to indicate the version, e.g. hica-d44n-02a. Saving sub-versions of files makes it possible to go back to a previous step when things go wrong, without losing all your work.
  • The default number of refinement cycles (10) is usually adequate. For the first refinement cycle up to 20 cycles may be required to converge.
  • Important! You should select the weighting term for the X-ray structure factors carefully. The default value of 0.3 is generally too high for typical data of 2.0-2.5 Å resolution, and will result in excessive distortion or mangling of the model during refinement. To increase the weight of geometric restraints, the X-ray weighting factor should be decreased. Typical values of the weighting factor are 0.05-0.20, depending on the resolution and quality of the data. You should adjust the weighting factor until the RMSDs of bond lengths and bond angles are 0.010-0.020 Å and 1.5-2.0°, respectively. (You can determine the RMSDs of bond lengths and angles by examining the REFMAC log file.) This degree of geometric restraint should generate acceptable structures that are appropriately dependent on the observed X-ray structure factors. Once you have determined an adequate X-ray weighting factor, you may use that value for the remainder of the refinement.
    • Note: In the latest versions of Refmac, the automatic weighting feature works very well, and is a good option to choose. However, if your refinement goes off the rails (this is more likely at lower resolutions), then you should probably try setting the weighting term manually.
  • For low resolution data, consider using jelly-body refinement in the first refinement cycle after getting your molecular replacement solution. Jelly-body refinement places restraints on the movements of nearby atoms during each refinement cycle. If using jelly-body refinement:
    • Do 50-60 refinement cycles to give the algorithm time to converge.
    • Select a sigma (restraint) value of 0.01-0.02.
  • Select a scaling protocol. Both simple and Babinet scaling work well. One scaling method may work better than the other for your data set. If using Babinet scaling for low resolution data, it may be advantageous to fix the solvent B-value. Typical solvent B-values are between 100 and 400 Å², with 280 Å² being a commonly accepted optimal value.
  • Start the job by selecting Run…Run Now at the lower left of the task window. The job will be entered into the job list in the CCP4i window, and you can monitor its status. To change settings, such as disabling the SS linkages, select Run & View Com File instead. A window will open, allowing you to edit the script. Make the desired changes, and then select Continue to start the refinement. You will have to do this each time you run Refmac.
  • When the job is finished, examine the Results or log file by double-clicking on the task in the job pane of the CCP4i window to verify that the job has run correctly.
  • You can open the final result in Coot by clicking on the Coot button under Structure and electron density at the bottom of the Results window.
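
The file naming protocol suggested in the steps above can be sketched as a small Python helper. This is only an illustration: the function name refinement_filenames is hypothetical, and zero-padding the cycle number to two digits is an assumption based on the hica-d44n-02 example.

```python
def refinement_filenames(prefix, cycle, subversion=""):
    """Build paired PDB/MTZ filenames for one refinement cycle.

    prefix     -- project label, e.g. "hica-d44n"
    cycle      -- refinement cycle number (0 for the starting model)
    subversion -- optional letter added after rebuilding in Coot, e.g. "a"
    """
    if cycle < 0:
        raise ValueError("cycle must be zero or positive")
    label = f"{prefix}-{cycle:02d}"
    # Only the PDB file carries the sub-version letter; the MTZ file is
    # written once per Refmac run and keeps the plain cycle label.
    return f"{label}{subversion}.pdb", f"{label}.mtz"


if __name__ == "__main__":
    print(refinement_filenames("hica-d44n", 2))
    print(refinement_filenames("hica-d44n", 2, "a"))
```

Keeping the cycle number identical in both names makes it obvious which PDB and MTZ files belong together when loading them into Coot.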

Restraining metal-ligand distances during refinement in REFMAC

When refining models that contain metal ions, it is often desirable to restrain metal-ligand bond distances to reasonable values. For example, a Zn(II)-S(Cys) bond length is normally about 2.30 Å. It is likely that REFMAC will not be able to set up all metal-ligand bond length restraints properly without a supplemental library file. There are two possible procedures for setting up a properly restrained refinement with metal ligands. The second method (using Coot to make explicit links) is the easiest.
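
To judge whether a metal-ligand distance in a refined model is close to its expected value (for example, about 2.30 Å for Zn(II)-S(Cys)), the distance can be computed directly from two coordinate records. The following Python sketch assumes standard fixed-column PDB ATOM/HETATM records; the function names and the 0.15 Å tolerance are illustrative assumptions, not values taken from REFMAC.

```python
import math


def parse_atom(line):
    """Extract identity and coordinates from a fixed-column PDB ATOM/HETATM record."""
    return {
        "name": line[12:16].strip(),      # atom name, columns 13-16
        "resname": line[17:20].strip(),   # residue name, columns 18-20
        "chain": line[21],                # chain identifier, column 22
        "resseq": int(line[22:26]),       # residue number, columns 23-26
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def check_metal_ligand(metal_line, ligand_line, target, tolerance=0.15):
    """Return (distance in Å, True if within tolerance of the target distance).

    The tolerance is an arbitrary illustrative value.
    """
    metal = parse_atom(metal_line)
    ligand = parse_atom(ligand_line)
    d = math.dist(metal["xyz"], ligand["xyz"])
    return d, abs(d - target) <= tolerance
```

Passing the HETATM record of a zinc ion, the ATOM record of a cysteine SG atom, and target=2.30 returns the observed distance and whether it lies within the chosen tolerance.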

Generate CIF file with dummy REFMAC run

  • Set up a REFMAC refinement job as described above.
  • In the Setup Geometric Restraints section apply the following settings:
    • For Checking against dictionary, select "Check all monomers against Refmac's dictionary description".
    • For Make links between: All others if, select "residues are close only".
  • Start the job by selecting Run...Run Now. The REFMAC job will start, identify potential links, and fail. This is normal.
  • Examine the log file. REFMAC should have identified all your metal-ligand linkages. REFMAC will have written out a dictionary file with a .cif extension that contains information about any missing or problematic atom links. Identify this file for later use.
  • Important! Examine the log file carefully to ensure that REFMAC has not identified any erroneous links between other atoms in your structure. Edit the .cif file to remove any unwanted links, e.g. Zn-C bonds. If necessary, modify your model in Coot to rectify the problem and start over.
  • Select the failed job in the CCP4i job window and select ReRun Job. In the Library filename field, enter the .cif file you previously identified.
  • Start the refinement by selecting Run...Run Now. REFMAC should now identify all the metal-ligand links and conduct a properly restrained refinement.

Note: When you complete the first refinement with the REFMAC-generated library, REFMAC will write LINKR records into your PDB file to record the newly identified and restrained bond linkages. You may delete any inappropriate bond linkages from the PDB file prior to the next refinement cycle. If you change the Make links between: All others if setting to "defined in file only", no further atom links will be determined.
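
Reviewing the link records by eye is easy to get wrong in a large file, so a short script can list them and flag likely mistakes such as Zn-C bonds. The Python sketch below is an illustration under stated assumptions: it assumes that LINKR records use the same atom columns as standard PDB LINK records, and it guesses that a partner atom is carbon when its atom name starts with "C", which is a heuristic that will misfire for residues such as chloride. The function names are hypothetical.

```python
def parse_links(pdb_text):
    """Collect (atom1, atom2) pairs from LINK and LINKR records.

    Each atom is a tuple (atom name, residue name, chain, residue number).
    Column positions follow the standard PDB LINK record layout; that LINKR
    records share this layout is an assumption.
    """
    links = []
    for line in pdb_text.splitlines():
        if line[:6].strip() in ("LINK", "LINKR"):
            p = line.ljust(56)  # guard against short lines
            atom1 = (p[12:16].strip(), p[17:20].strip(), p[21], p[22:26].strip())
            atom2 = (p[42:46].strip(), p[47:50].strip(), p[51], p[52:56].strip())
            links.append((atom1, atom2))
    return links


def suspicious_links(links, metal_resnames=("ZN",)):
    """Flag links between a metal residue and an atom whose name starts with 'C'."""
    flagged = []
    for a, b in links:
        for metal, partner in ((a, b), (b, a)):
            if (metal[1] in metal_resnames
                    and partner[1] not in metal_resnames
                    and partner[0].startswith("C")):
                flagged.append((a, b))
    return flagged
```

Any link returned by suspicious_links is only a candidate for removal; inspect it in Coot before deleting the record from the PDB file.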


Make explicit links in Coot

  • While the model is open in Coot, select Extensions...Modeling...Make Link.
  • Select the two atoms you would like to link explicitly.
  • A dotted line should appear in Coot to indicate the atoms have been linked.
  • Save these coordinates and use them in the next REFMAC cycle. Refmac will normally be able to identify the proper metal-ligand distance restraints for refinement. You can check this in the log file of the REFMAC run.

Note: To make the link you will have to merge any metal ions into the current molecule, if applicable. You should also check the PDB file to ensure there are no extraneous links, e.g. spurious disulfide links that were made during refinement of Cys ligands into metal density in a previous REFMAC cycle.


TLS refinement

TLS (translation, libration, screw rotation) refinement allows the estimation of the anisotropic displacements of atomic positions in models generated from medium- to low-resolution data (> 2.0 Å). Accounting for this anisotropy can improve Rfree significantly, by up to about 5%, and is well worth routine application for medium- to low-resolution data. Instead of estimating the anisotropic displacements of every atom in the model, TLS refinement models the anisotropy of identified rigid components of the model in terms of translation, libration, and screw rotation. The rigid components can be protein chains, domains, or secondary structure elements. TLS refinement is integrated into REFMAC, and should normally be applied near the end of the refinement process. TLS refinement requires setting up the TLS groups and running a REFMAC refinement job:

  • In the task area, select Create/Edit TLS file under Refinement tasks.
    • Select Create a TLS file from scratch.
    • Enter a filename for the TLS output file, e.g., hica08-13.tls
    • Under TLS group definitions, define your TLS groups. Each group should be named under Group title and the identifying residues entered under Include residues. A typical initial assignment is one TLS group for each protein chain in the model.
    • Select Run...Run Now to create the TLS file.
  • In the task area, select Run Refmac5 under Refinement tasks.
    • Select Do TLS & restrained refinement using no prior phase information. Some additional file input dialogs will appear in the Refmac task window.
    • Enter the appropriate input and output filenames for the refinement, including the TLS input file, and any Library files if required.
    • Select the number of TLS cycles to perform. Ten (10) cycles is the default, and is usually satisfactory for the first TLS refinement.
    • Optionally, set all the initial B-factors to 20. This is useful for the first TLS refinement. For subsequent rounds of TLS refinement, this option should be disabled.
    • Select the number of maximum likelihood restrained refinement cycles to perform. Ten (10) cycles is the default, and this is often satisfactory. If the refinement converges in fewer than 10 cycles, this value can be reduced as required.
    • Important! You should select the weighting term for the X-ray structure factors carefully. The default value of 0.3 is generally too high for typical data of 2.0-2.5 Å resolution, and will result in excessive distortion or mangling of the model during refinement. To increase the weight of geometric restraints, the X-ray weighting factor should be decreased. Typical values of the weighting factor are 0.05-0.20, depending on the resolution and quality of the data. You should adjust the weighting factor until the RMSDs of bond lengths and bond angles are 0.010-0.020 Å and 1.5-2.0°, respectively. (You can determine the RMSDs of bond lengths and angles by examining the REFMAC log file.) This degree of geometric restraint should generate acceptable structures that are appropriately dependent on the observed X-ray structure factors. Once you have determined an adequate X-ray weighting factor, you may use that value for the remainder of the refinement.
    • Select Babinet scaling. This is typically more accurate than the default simple scaling, and results in more reasonable B-values for the protein structure. For low resolution data, it may be advantageous to fix the solvent B-value. Typical solvent B-values are between 100 and 400 Å², with 280 Å² being a commonly accepted optimal value.
    • Start the job by selecting Run…Run Now at the lower left of the task window. The job will be entered into the job list in the CCP4i window, and you can monitor its status.
    • When the job is finished, examine the log file from the View Files from Jobs menu in the administration functions pane of the CCP4i window to verify that the job has run correctly.
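
The weight-tuning rule in the steps above (adjust the X-ray weighting factor until the bond-length and bond-angle RMSDs fall within 0.010-0.020 Å and 1.5-2.0°) can be written as a small decision helper. This Python sketch is illustrative only: the function name is hypothetical, the RMSD values must be read from the REFMAC log by hand, and giving priority to "decrease" when the two RMSDs disagree is an assumption.

```python
def suggest_weight_change(rmsd_bond, rmsd_angle,
                          bond_range=(0.010, 0.020),
                          angle_range=(1.5, 2.0)):
    """Suggest how to change the X-ray weighting factor for the next run.

    rmsd_bond  -- RMSD of bond lengths in Å, from the REFMAC log
    rmsd_angle -- RMSD of bond angles in degrees, from the REFMAC log
    Returns "decrease", "increase", or "keep".
    """
    too_loose = rmsd_bond > bond_range[1] or rmsd_angle > angle_range[1]
    too_tight = rmsd_bond < bond_range[0] or rmsd_angle < angle_range[0]
    if too_loose:
        # Geometry is distorted: lower the X-ray weight so that the
        # geometric restraints count for more.
        return "decrease"
    if too_tight:
        # Geometry is over-restrained: let the X-ray data count for more.
        return "increase"
    return "keep"


if __name__ == "__main__":
    print(suggest_weight_change(0.030, 2.5))
    print(suggest_weight_change(0.015, 1.7))
```

The helper only gives a direction; the size of each step within the typical 0.05-0.20 range is left to judgment.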