FAQ

1. Q: I have a protein that contains 3 CYS residues coordinating a Co atom. Because the CYS are so close to one another MCCE is labeling them as CYD (bridged CYS). As a result, the CYS do not show in the final analysis (i.e. no pKa and no titration curves). Is there any way I can have MCCE recognize the CYS as such and not as bridged CYS? Thanks for the help.

A: This is a little tricky. The disulphide bridge subroutine has its parameters built in the code. A quick but tedious way is to rename the residues back in step1_out.pdb and run through step 2-4 again. But if you're running this in batches, then we need to change the code and pull that parameter out to be configurable.

2. Q:I make a mfe.py analysis to one residue with abnormal pKa value with command: mfe.py GLU-A0025_ 4.8 4 titration_point I think it should be the model pKa value for this residue;I try 0.5,1,and 4 for pH_cutoff,but the results are same as: Residue GLU-A0025_ pKa/Em=0.369

A: The total interaction to this particular residue is 0 from all sidechains. It is either a bug from delphi calculations or this residue is extremely exposed. If there is no interaction with any of the protein sidechain, you won't see the difference with different output threshold. My suggestion is to run mfe.py on a couple of residues to see if it is a delphi bug.

3. Q: I am not quite sure what are different columns of the opp file. The manual says as follows: The directory "energies" holds the resulted pairwise interaction lookup table. The first column is the electrostatic pairwise interaction and the second column is van der Waals interaction. The unit of the energy is Kcal/mol. But in my run, I see four columns of values: CODE

00003 NTR01A0001_003 -0.000 -0.435 -0.000 -0.000

00004 NTR+1A0001_004 -0.104 -0.444 -0.095 -0.104 *

00005 MET01A0001_001 0.004 -0.294 0.005 0.005 *

00006 GLN01A0002_001 -0.004 -0.000 -0.004 -0.004 *

What are the columns here and also what are asterisks (*) indicate here. I can see only one asterisk per residue. I am using mcce-2.2 version.

A: The first two columns are the electrostatics and vdw interactions used in monte carlo respectively. The third and forth are from delphi calculations and used for boundary correction to get the first column. The asterisks indicate the conformation used as the default one for setting a single conformer boundary condition. This boundary correction is new in this version and a little complicated. Hopefully we can finish the manuscript soon and upload here.

4. Q: MCCE stops running in the middle of step 3 - no error messages.

The last few lines of progress.log are below:

Doing pairwise 121 of 310 conformers. 0 seconds

Doing pairwise 122 of 310 conformers. 2 seconds

Doing pairwise 123 of 310 conformers. 1 seconds

Doing pairwise 124 of 310 conformers. 2 seconds

Doing pairwise 125 of 310 conformers.

And the last few lines of the output log file: Preparing DelPhi runs at epsilon 8.00 ... 4 focusing runs required for this protein. Done

Computing pairwise from conformer 1 to 310 of 310 total conformers see progress.log for progress...

Any suggestions on what the problem might be?

A: If delphi fails, there should be error message about that (although sometimes it could be still in the buffer and not printed). Check run.prm, if there is not line contains (DELPHI_FOLDER), then all delphi calculations are carried out in a /tmp/delphi_xxxxxx/ folder. Now if you system has /tmp in a different partition, it could run out of space and cause mcce to stop. Run 'df' and see if /tmp shows up. If that's the case, then add CODE

./ delphi temporary file folder, "/tmp" uses node (DELPHI_FOLDER)

in run.prm. You can also add CODE

f clean up delphi focusing directory? (DELPHI_CLEAN)

in case you want to keep the delphi temporary directory for error checking.

Also, which job submission method did you use? There are softwares that can keep running in the background when the terminal is closed. MCCE unfortunately is not one of them. So it could be accidentally killed if you didn't use nohup and closed the terminal.

5. Q: we have been using MCCE on our 32bit Linux cluster for quite a while, however we now have switched our computational resources to a 64bit Linux cluster (based on ROCKS 4.2). For that reason I have the following questions:

1. Are there any 64bit Linux MCCE binaries available?

2. Would it be possible to obtain the source code of the recent MCCE program to compile it ourselves?

3. Is there any possibility to distribute the Delphi calculations onto our cluster? Generating the necessary input files and subsequent batch submission to the cluster should be trivial to implement.

A: We don't have a 64-bit executable available, how I can send you the source code to compile it for 64-bit. One problem with a different system is that delphi might use a different binary I/O depending on the fortran compiling program. We will need to find out what that format is before compiling MCCE.

The step 3 delphi calculations can be divided. In run.prm, there are two lines: CODE

1 delphi start conformer number, 0 based (DELPHI_START) 99999 delphi end conformer number, self included (DELPHI_END)

You can change the starting and ending number to control which conformers will execute delphi calculation.

A simple python script wrote by Junjun can divide step 3 into multiple runs automatically:

CODE

/usr/bin/python

This program splits a working directory into subdirctories that can run partial delphi

import sys, os, zlib

def print_help():

print "subruns.py -s n_steps"

print " This program divides step 3 of MCCE into sub working directories, which"

print " contain partial delphi run setups, and prepare a condor submit script."

print " When the step 3 in sub directories are finished, thia program can combine"

print " the result into current directory.\n"

print " -s n_steps: split to sub directories which contain n_steps of delphi each"

return

def split_dirs(n_steps):

n_confs = len(open("head2.lst").readlines()) - 1

# create a run.prm template

runprm = open("run.prm").readlines()

lines = []

for line in runprm:

if line.find("(DO_PREMCCE)") >=0:

line = "f step 1: pre-run, pdb-> mcce pdb (DO_PREMCCE)\n"

elif line.find("(DO_ROTAMERS)") >=0:

line = "f step 2: make rotatmers (DO_ROTAMERS)\n"

elif line.find("(DO_ENERGY)") >=0:

line = "t step 3: do energy calculations (DO_ENERGY)\n"

elif line.find("(DO_MONTE)") >=0:

line = "f step 4: monte carlo sampling (DO_MONTE)\n"

elif line.find("(DELPHI_START)") >=0:

line = "#START# delphi start conformer number, 0 based (DELPHI_START)\n"

elif line.find("(DELPHI_END)") >=0:

line = "#END# delphi end conformer number, self included (DELPHI_END)\n"

lines.append(line)

# number of sub directories

n_dirs = n_confs/n_steps + 1

# create sub directories, copy run.prm and step2_out.pdb

for i in range(n_dirs):

dir_name = "sub%03d" % (i+1)

try:

os.mkdir(dir_name)

except:

pass

fp = open("%s/run.prm"%dir_name, "w")

for line in lines:

fp.write(line.replace("#START#", "%d" % (i*n_steps+1)).replace("#END#", "%d" % ((i+1)*n_steps)))

fp.close()

try:

os.remove("%s/step2_out.pdb" % dir_name)

except:

pass

os.symlink("../step2_out.pdb", "%s/step2_out.pdb" % dir_name);

# create condor_submit file

lines = []

lines.append("Executable = /usr/local/bin/mcce\n")

lines.append("Universe = vanilla\n")

lines.append("error = condor.err\n")

lines.append("output = run.log\n")

lines.append("Log = condor.log\n")

lines.append("getenv = true\n")

lines.append("Notification = never\n\n")

for i in range(n_dirs):

lines.append("Initialdir = sub%03d\n" % (i+1))

lines.append("queue\n\n")

open("submit", "w").writelines(lines)

return

if name == "main":

if len(sys.argv) < 3:

print_help()

sys.exit()

if sys.argv1 == "-s":

n_steps = int(sys.argv2)

split_dirs(n_steps)

else:

print_help()

sys.exit()

When all delphi runs are finished, you can use zopp program (http://www.sci.ccny.cuny.edu/~mcce/program/downloads.php) to merge all the energies.opp files together.

6. Q: In step2 rotamer making, suppose I have a structure like

**** C1 - C2

(sorry I have to use * to make the spaing right. They don't mean anything.)

In the tpl file I use a line to make rotamers by rotating the C1-C2? bond,

ROTAMER XXX 0 C1 - C2 C3 C4 O1 O2 H1

I found that the rotation only affect C4OOH, C3 stays in the same position, so I have a structure where C4 is overlapping with C3. How come?

A: That is quite odd. The parameter seems to be ok. The only thing I can think of that might go wrong is the spacing between C2 and C3. The format of most tpl parameters is not space separated, but field/column formatted. This is because some atom names from pdb file have space in it, for example "N A" in hemes. For MCCE to recognize C3, it needs to start from column 32.

For the formatting, there is an example: http://scratchpad.wikia.com/wiki/MCCE/How-...ed_by_two_atoms 7. Q: We noticed that after our mcce runs the fort.38 file has two types of magnesium ion, and the occupancy of the +2 charged version is 0 at all pH values. Here are the lines from the file. Am I reading this right, and if so, how do I get the ion to be charged.? I thought the Magnesium topology file being used is from param04 or param08 (where the Mg has a charge of +2) but now I'm not sure.

_MG+2B0602_001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

_MGDMB0602_002 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

_MG+2B0603_001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

_MGDMB0603_002 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

A: The DM (dummy) conformer has 0 atom and no interaction at all with the protein. This conformer represents Mg being released to the solution. We use it to calculate ion binding affinities. For Mg, it is likely that vdw interaction is too repulsive between Mg and its ligand and pushes Mg into the solution. Since Mg is ligated to protein, I don't think electrostatic+vdw would give a correct estimation for binding energy. I would suggest turning off the dummy conformer. To do so, you can change "_MGDMB0602_002 f" in head3.lst to "_MGDMB0602_002 t" and rerun step 4. Or, to avoid this conformer in the future, remove _MGDM from CONFLIST line in _mg.tpl file

8. Q: We are building a *.tpl file for a bound ligand (ATP). I have found lots of references on atomic radii and I'm confident that I can find some reasonable values. My first question is exactly how critical is this number? Second (and this one concerns me more) is the same question regarding partial charge? Specifically, would you suspect the coordination bonds in tridentate MgATP would need to be taken into consideration to obtain the correct partial charges?

Finally, from left field, we were reviewing the param04 and param08 files, and the epsilon amino nitrogen (NZ) of lysine is marked as sp2, not sp3. Is this an error?

A: 1. There are two types of radii used in MCCE calculations. Solvent-accessible radius (defined by RADIUS in tpl file) is used in delphi calculations to defined the boundary between eps=4 and 80 regions. This radius is paired with charge parameters and follows PARSE style. The second radius parameter is van der waals radius (VDW_RAD). This parameter is used to determine van der waals energy terms. Most of the values are taken from AMBER force field. The values for both sets of radii are close. Especially for cofactors, they are often the same. However you need to be aware of how sensitive your results are to small changes of these parameters.

2. For partial charges, we have used partial charges obtained in both ways in the past. For example on hemes, we used a so-called metal-centered charge set, which puts +2/+3 charge on Fe and -0.5 charge on each nitrogen on the porphyrin, and use PARSE charge for His and Met ligands. We've also used the strategy more widely applied, which is to calculate the charge distribution of the whole complex using DFT method. If the site of interest is near Mg-ATP, I would suggest using the second method, combining Mg and ATP. It would take some time to get the charge though. Whichever the method you choose to obtain charge, I would suggest to find some experimental pKas measured around a Mg-ATP binding site in literature and test the charge set on those. One of those I can think of is phosphonoacetaldehyde hydrolase where pKa of a Lys is measured by looking at the pH dependence of enzyme activity. (Zhang et al. Biochemistry (2002) vol. 41 (45) pp. 13370-7) Lys is not well-benchmarked though due to the lack to exp pKas. If you can find other residue types, it would be even better.

3. For Lys, I believe it's a mistake. We have been mostly working on eps=4 calculations. So some of param08 files can be out of date. I'll try to remember fixing this by the next distribution.

9. Q: Also, in head2.lst, the "self" term is "nan"-ed too. Could be it some conflict in the initial structure or other reasons? Is there a cutoff in the vdW terms that I can relax to avoid this probelm?

A: Which residue did you see this error on? MCCE put a threshold for atoms getting too close. If it is a clash (<0.5Ă…), mcce would simply print out 999 in vdw. My guess is there's an error in pdb or tpl file, causing mcce to do impossible calculations such as interpreting a character as number or divide a number by 0.

10. Q: I see that the individual .opp files are written to /tmp/delphi_xxxx, but I'm confused as to how to run mfe.py to dissect out the factors that influence pKa shifts. mfe.py is looking for an energies subdirectory within the directory from which mcce is run. do I manually create this directory and move the files there? Thanks.

A: Since MCCE 2.2, we started to use a binary file to store opp files to save disk space. To get the text file, you can run "zopp -x energies", which would extract the opp files into the energies/ directory.

zopp program can be downloaded here: http://www.sci.ccny.cuny.edu/~mcce/program/downloads.php 11. Q: I guess in step1_out.pdb it is the input strcture and the conformers are stored in step2/3_out.pdb, right? I found my conformers in step2_out.pdb have the same structure except the dissocating H (in a COOH group) having 3 different positions. How do I make the whole side chain flexible?

Also, In my case, step2_out.pdb and step3_out.pdb are the same. Can they be different and in what situations?

A: step1_out.pdb is close to the original pdb file in MCCE format

step2_out.pdb has all the conformations from rotamer generation and used for delphi calculation in step 3. Depending on the level of calculation you do, rotamers are built differently. To add flexibility for the sidechains, make sue (PACK) in run.prm is set to be 't' and (ROTATIONS) to be a number greater than 1 (by default we use 6).

Normally there is no difference between step2 and step3_out.pdb. However if step2_out.pdb is somewhat modified before running step 3, the loading pdb file step would re-sort the conformers. For example, if you delete conformer _001, then in head3.lst and step3_out.pdb, all the subsequent conformers would be renumbered to _001, _002 ...

12. Q: I want to compare the three levels of calculations to determine which one is appropriate for my work. So I also want to compare one conformer calculation with these three levels. Will changing the number of rotamers to 1 work for this purpose?

A: Choosing the level of calculation is sometimes tricky. We did some benchmark work comparing quick and default, and the default level improves the calculation significantly, especially for those residues near the protein surface. However, as the protein gets larger, the cost of calculation goes up rapidly. So for large proteins, we recommend using a hybrid configuration, which is to use the quick configuration for the most of the protein and default/full for the region near the site of interest (http://scratchpad.wikia.com/wiki/MCCE/How-....22gold_list.22)

And yes, the major difference between quick and default is the number of rotamers being 1 or 6 per rotatable bond.

The full configuration is even more expensive and wouldn't be the first thing I recommend to do. Mostly, we only use it when we know that certain conformations are not properly generated and want to explore the reason (more like a debugging mode).

13. Q: I installed today mcce2 on Debian Linux amd64 lenny. I tried to execute the example at section 2.2 of the manual (pKa with 4lzt) but command "mmce" complained that libgdbm.so.2 is not present. That version is too old for my OS, so that I made a symliknk in /usr/lib

ln -s libgdbm.so.3 libgdbm.so.2

and checked that the symlink was created. However, unlike under similar circumstances with executables asking old libraries, mmcce complained as above.

As I dislike trying to install older versions of libraries for a number of reasons, I looked for the executable to compile (which, with ifort, would very likely run much faster than than with gnu compilers, something important for a program that is not parallelized) but I did not find the source and accompanying configure.

Where does mcce expect to find libgdbm.so.2? I set it in /usr/lib (which is on the path) as a symlink to libgdbm.so.1.7.3, from the appropriate debian package and with the appropriate permissions.

Command 'locate' finds the library correctly. Still, command 'mcce' (trying to execute the example as written previously), complains that that so.2 library is not found.

When mcce was installed, that so.2 library was not present in the system. Does perhaps mcce remember that it was not present? During the installation, mcce did not complain about that so.2 library.

Anticipating other problems for old libraries, which is the full name of the expected "glibc2.2", and where expected, if to be on the path does not suffice.

A: Sorry for the delayed reply. I was pretty busy last week. Looks like you need to try recompile the program with the new library. You can download the source code here Run ./install after unzip.

14. Q: I'm currently trying to use MCCE to predict pKa of polymers. In the 3rd step of MCCE, I got an error like this:

WARNING: Delphi failed at focusing depth 1 of MAA01_0001_001, retry

WARNING: Delphi failed at focusing depth 1 of MAA01_0001_001, retry Trying changing scale on trial 2

......

This error was repeated 100 times then the program stopped. Is the delphi code not talking with MCCE correctly or other reasons?

A: There are a few possibilities that can cause this to happen: One most likely to the newly installed mcce is that run.prm is pointing to a different location than where delphi is.

Check the (DELPHI_EXE) line in your run.prm and make sure it has the correct location for delphi executable.

Another possibility to cause this is the incompatibility between mcce and delphi where delphi is compiled by an unknown complier; or, a rare bug in the surface algorithm of delphi, which causes it to crash. Either of which is harder to fix and I'll put the details up later.

15. Q: New to MCCE on Debian Linux amd64 with hot problems. The protein has standard amino acids, though a chloride ion is present as a ligand to these amino acids. The pdb file worked out with Amber10's LEaP sets the name for chloride ion, both residue and atom name, as "Cl-". What about the name for the tpl file? If different from "Cl-", will MCCE be happy with simply replacing "Cl-" with MCCE's preferred name in the pdb file? I plan to use the vdw data used by Amber10 for chloride ion.

A: The current mcce does recognize chloride, however by a different residue name. What you need to do is to rename residue name to _CL and atom name to CL. Do not put -1 into the naming, it is assigned by MCCE (like ASP-1).

16. Q: I am working on a complex DNA/protein, I would like to know the pka of some residus of the protein, particularly some Histidine that are not so far from the dna (4 to 6 Angstroem). Do you have a tpl file already made for the residus of the dna? I saw that some protein/DNA complex are in the database, could you make available the tpl file used for the dna residues?

A: Thanks! Unfortunately, we don't have tpl files for DNA yet. As far as I know, the database calculations did not include DNA even though the structure might be a complex.

If you plan to use MCCE to work on DNA/protein complexes, the answer is more complicated than getting the right DNA tpl files. Due to the high concentration of charge distribution of nucleic acids, it requires a non-linear version of delphi/mcce. Chris Tang from Barry Honig's lab developed a version that can calculate pKs on RNA: http://luna.bioc.columbia.edu/honiglab/sof...ets/RNA_params/

I believe DNA tpl file shouldn't be difficult to develop based on what he has done.

17. Q: I'd very much like to do some mcce runs including ATP, AMP, water and Magnesium ions. Do you have quasi-reasonable tpl files including partial charges and ionization parameters for these already made somewhere?

A: As far as I know, we don't have tpl files made for ATP and AMP. If you wish to create tpl files for those, for a lot of cofactors that are listed on the hic-up website, most part of the tpl file is relatively easy to make by following these steps. The trickier part is to make different ionization states, and get the correct charge and rotamer flexibility. For molecules like ATP and AMP, I'm sure we can find those parameters in amber or charmm parameters. Or, a more complicated way is to use a DFT package to calculate the charge distribution.

18. Q: What should I do if the pka of my residue is out of range?

A: Use sum_crg.out to check whether your residue is neutral or ionized. Your can set the pH range using (TITR_PH0), (TITR_PHD) and (TITR_STEPS) in run.prm file.

19. Q: Can we tell if my residue is accessible to water?

A: The acc.res and acc.atm files stored accessible information for residues and atoms. Also, desolvation energy can be found in the "dsolv" column of head3.lst.

20. Q: Can we get an image of the position of the specific rotamer of my residue used in this calculation?

A: Yes, you can find the most occupied conformer of your residue in fort.38. Using the conformer ID to get the structure in step2_out.pdb.

21. Q: I have several residues in the pK.out file which reads pKa titration curve too sharp or pKa or Em out of range. What does it mean and is there any way

to actually determine the pKa for these residues?

A:Sharp titration usually comes from the monte carlo step not converged well. A general way to spot this is to check the 2nd column (titration slope) and 3rd column (chi 2 of curving fitting) in pK.out file. If the slope is not between 0.8 and 1, or chi 2*1000 is over 20, then you have either not converged monte carlo or the titration is coupled with other residues.To make monte carlo run more converged, you can rerun just step 4, and increase the numbers of the (MONTE_RUNS), (MONTE_NITER) line in run.prm. Or, you can run a standalone monte carlo program in the same directory, which requires a zopp-2.4 program in the path, or using zopp to extract the energies folder first.If it is coupled titration, which usually causes titration to be too shallow instead of sharp, you can plot out the titration from sum_crg.out, or use a tool, getpK2, to globally fit all sum_crg.out entries with a bimodal titration.

22.Q: i tried, just to get used to your programm, to run mcce with 4lzt, with epsilon four and eight for the delphi calculation. but i would like to try it with, e.g. 14 or 18 or 1. my problem is now, that i dont know how i have to change the files respectively how to write such files which run.prm could use.

A: 1. Change (EPSILON_PROT) line to the number you desire, eg. 15.

2. Make a copy of the param04 directory in (MCCE_HOME) directory to be param##, eg. param15

3. This is the somewhat tedious part, you need to change the reference reacton field energy to be eps15 based. Those numbers are stored in RXN line in each tpl file in the param directory. You only need to change asp, glu, lys, arg, his, and tyr, and cofactors in case you use them.

To get the right number for RXN, first set them to be 0. Then run mcce on single amino acids, one by one. After step 3, take the dsolv number from head.lst and put them into the tpl file. If there are more than 1 number, average them.

23. Q: I want to run a single amino acid. And it is not working. This is the pdb file.

ATOM 1 C 1 1.000 0.041 0.667

ATOM 2 C 1 -0.254 -0.809 0.473

ATOM 3 C 1 -1.466 0.088 0.410

ATOM 4 O 1 -1.335 1.307 0.512

ATOM 5 N 1 -2.676 -0.438 0.244

ATOM 6 N 1 1.141 0.990 -0.480

ATOM 7 C 1 2.213 -0.857 0.729

ATOM 8 O 1 2.917 -0.862 1.725

ATOM 9 O 1 2.499 -1.606 -0.255

ATOM 10 H 1 0.917 0.604 1.594

ATOM 11 H 1 -0.357 -1.500 1.307

ATOM 12 H 1 -0.171 -1.372 -0.455

ATOM 13 H 1 1.991 1.566 -0.349

ATOM 14 H 1 0.310 1.605 -0.523

ATOM 15 H 1 1.220 0.455 -1.363

ATOM 16 H 1 -3.487 0.162 0.203

ATOM 17 H 1 -2.784 -1.439 0.160

A: First of all, you need to give mcce a more 'specific' pdb file. The atoms should be named in the same convention as the amino acid. For example, this is how mcce can identify an asp

ATOM 658 N ASP B 3 -11.592 52.843 33.189 1.00 23.86

ATOM 659 CA ASP B 3 -11.764 53.723 32.043 1.00 21.92

ATOM 660 C ASP B 3 -10.423 54.468 31.857 1.00 21.48

ATOM 661 O ASP B 3 -10.022 55.296 32.696 1.00 21.25

ATOM 662 CB ASP B 3 -12.911 54.697 32.359 1.00 23.21

ATOM 663 CG ASP B 3 -13.305 55.575 31.179 1.00 25.86

ATOM 664 OD1 ASP B 3 -12.448 55.997 30.379 1.00 26.46

ATOM 665 OD2 ASP B 3 -14.510 55.874 31.080 1.00 31.25

You need to have the backbone atoms named clearly as 'N CA C O' and sidechains as 'CB CG OD1 OD2'. If you only name all as C or N or O, MCCE cannot distinguish them.

In addition, when you run a single amino acid, turn off this option in the run.prm:

f Label terminal residues? (TERMINALS)

You don't need to label the terminal residues, since you only have one.

24. Q: I have two ZN ions and the calculation seems to be fine until it gets to step 3 and then I get the error message below.

WARNING: no ELECTRON for _ZN+2, 0 assumed

WARNING: No RXN entry for _ZN+2, set to 0.

WARNING: No EM entry for HOH01, set to 0

WARNING: No EM entry for HOHDM, set to 0.

A: The error message is due to a missing parameter in the tpl file. For Zn ions however, as long as you're not titrating against a neutral conformer, it is not important. you do need to check in head3.lst that it is charged with +2 and is fully occupied in step4 (fort.38)