MCCE tools
Here are MCCE tools that help with data analysis and improve runs.
Note: On the server, the Tools folder is found within the folder of MCCE 2.5.1
Data analysis
Author: Junjun Mao
mfe.py calculates the mean field energy (MFE) approximation of residue ionization energy.
Syntax
mfe.py res_id titration_point [pH_cutoff]
Description
"res_id" is the identification string of an ionizable residue, as it appears in the file "pK.out".
"titration_point" is the titration point at which the MFE analysis is carried out. If this number falls between two titration values of the Monte Carlo simulation, a linear interpolation is performed.
"pH_cutoff" is the cutoff value, in pH units, for printing residue pairwise interactions. It is an optional argument; without it, all pairwise interactions are printed.
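The linear interpolation between two simulated titration points can be illustrated with a minimal Python sketch. The function name interpolate_at and the sample numbers are hypothetical and are not taken from mfe.py; the sketch only assumes that the titration points are sorted in increasing order.

```python
def interpolate_at(points, values, x):
    """Linearly interpolate `values` (sampled at increasing `points`) at x."""
    if not points[0] <= x <= points[-1]:
        raise ValueError("x is outside the simulated titration range")
    for i in range(len(points) - 1):
        lo, hi = points[i], points[i + 1]
        if lo <= x <= hi:
            frac = (x - lo) / (hi - lo)  # position of x between lo and hi
            return values[i] + frac * (values[i + 1] - values[i])
    return values[-1]  # only reached when there is a single titration point


# Hypothetical data: a quantity sampled at pH 6, 7 and 8, queried at pH 7.5
ph_points = [6.0, 7.0, 8.0]
sampled = [0.20, 0.50, 0.90]
print(interpolate_at(ph_points, sampled, 7.5))
```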
mfe++ - Mean field energy analysis on conformer
Author: Yifan
This is similar to mfe.py, but it approximates the conformer energy.
Usage:
mfe++ conformer_name {-t threshold_for_print} {-x threshold_for_exclusion}
closeup - Find atoms within the distance to a point
Author: Yifan
This program measures the distance from each atom to the point x, y, z and prints out the pdb lines of the atoms within the threshold.
Synopsis
closeup pdbfile x y z threshold
Description
pdbfile is your input pdb file
x y z are the coordinates of the point from which distances are measured. You can copy x y z from a pdb line and paste them here.
threshold is the distance limit of printing the atoms. Only those within this threshold are printed out.
The serial number field of an ATOM line in the output will be replaced by the atom distance.
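The behavior described above can be sketched in Python as follows. This is not the source of closeup; it is a minimal illustration that assumes the standard fixed-column PDB layout (serial number in columns 7-11, coordinates in columns 31-54), and the function name is hypothetical.

```python
import math


def closeup(pdb_lines, x, y, z, threshold):
    """Return ATOM/HETATM lines within `threshold` of the point (x, y, z).

    The serial number field (columns 7-11) is replaced by the distance.
    """
    selected = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Standard PDB columns 31-38, 39-46 and 47-54 hold x, y and z
        atom_xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        dist = math.dist(atom_xyz, (x, y, z))
        if dist <= threshold:
            selected.append(line[:6] + f"{dist:5.2f}" + line[11:])
    return selected


# Hypothetical input line, built with a format string to keep the columns exact
demo = "ATOM  %5d  N   ALA A%4d    %8.3f%8.3f%8.3f" % (1, 1, 3.0, 4.0, 0.0)
for hit in closeup([demo], 0.0, 0.0, 0.0, 6.0):
    print(hit)
```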
res_confs.py - Report distance of conformers in a residue
Author: Junjun Mao
This program prints out the RMSD between the specified conformer and other conformers in the same residue.
Synopsis
res_confs.py IDstring1 IDstring2 IDstring3 ...
Description
IDstring1 is an identification string that appears in the ATOM lines of that conformer in step3_out.pdb.
IDstring2 is an optional identification string that helps to uniquely identify the conformer.
IDstring3 and any further arguments are additional optional identification strings that help to uniquely identify the conformer.
This program expects the file "step3_out.pdb", so it must be called in a working directory. The order of the ID strings does not matter.
Example
res_confs.py GLU A0073_051
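For reference, the RMSD between two conformers can be computed as in the sketch below. It assumes that both conformers list the same atoms in the same order and that no superposition is applied (both sets of coordinates are taken as given); whether res_confs.py makes the same choices is not stated here, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical two-atom conformers that differ in one atom position
conf_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conf_2 = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(rmsd(conf_1, conf_2))
```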
getinfo - List residues with strong interactions
Author: Yifan
This program finds the residues with strong interactions.
Synopsis
getinfo residue threshold
Description
This program tries to guess which residues are coupled to the input residue. You need to run it in a directory that contains head3.lst and the opp files in energies/.
residue: the residue you want to analyze. It can be the residue number (for example: 212) or a string defining the residue (for example: L0212 or GLUL0212; L212 would not be recognized).
threshold: energy threshold for deciding whether two residues are coupled (in pK units). The default value is 2.
Improving and helping runs
Author: Yanjun Wang
Syntax:
getpdb pdbID [file]
This program gets a pdb file from ftp://ftp.rcsb.org and saves it to a file named after its PDB ID or to a user-named file.
ligandinfo - command line tool to show cofactor information
Author: Junjun Mao
Syntax:
jmao@sibyl:~$ ligandinfo mse
Chemical ID = MSE
Type = L-peptide linking
Molecular Weight = 196.106
Chemical Name = SELENOMETHIONINE
Formula = C5 H11 N O2 SE
Example:
ligandinfo mse
This program gets cofactor information from the PDB. The cofactor name is the 3-letter cofactor ID (in place of the residue ID) in your PDB file.
Author: Junjun Mao
Syntax:
preprocess.py pdbfile [chainID]
Description:
   chainID could be a string of chain IDs that you want to keep in the output PDB file. If not given, all chains are preserved.
This program will
   1) strip lines other than ATOM and HETATM records
   2) keep the first model of an NMR structure
   3) delete H and D atoms
   4) convert MSE residues to MET
   5) keep only one alternate position per atom
   6) keep defined chains, if chain ID(s) are given in command
   7) remove water, some cofactors and salt ions
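The steps above can be sketched in Python as follows. This is a simplified illustration, not the code of preprocess.py. It assumes standard PDB columns, treats only HOH as water (cofactor and salt ion removal is left out), guesses hydrogens from the atom name, and renames only the residue when converting MSE to MET. The function name is hypothetical.

```python
def preprocess(lines, keep_chains=""):
    """Simplified cleanup of PDB lines following the steps listed above."""
    kept, seen = [], set()
    for raw in lines:
        line = raw.rstrip("\n").ljust(80)
        record = line[:6].strip()
        if record == "ENDMDL":
            break  # 2) stop after the first model
        if record not in ("ATOM", "HETATM"):
            continue  # 1) drop all other records
        name, resname, chain = line[12:16], line[17:20], line[21]
        # 3) crude H/D test on the atom name (assumption; would also drop Hg)
        if name.strip().lstrip("0123456789")[:1] in ("H", "D"):
            continue
        if resname == "HOH":
            continue  # 7) water only; cofactors and ions are not handled here
        if keep_chains and chain not in keep_chains:
            continue  # 6) keep only the requested chains
        if resname == "MSE":
            # 4) rename the residue; a full conversion would also rename SE
            resname = "MET"
            line = line[:17] + resname + line[20:]
        key = (chain, line[22:27], resname, name)
        if key in seen:
            continue  # 5) keep the first alternate position of each atom
        seen.add(key)
        kept.append((line[:16] + " " + line[17:]).rstrip())
    return kept
```

Calling preprocess(lines, "AB") on the lines of a PDB file would keep chains A and B only; with the default empty string, all chains are kept.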
Author: Junjun Mao
Usage:
rcom filename command
filename: a file that has one directory per line
This command will enter the specified directory, run the command, and come back to the original directory. If the command has pipes and redirects, use quotes to enclose it. For example:
rcom dirs "preprocess.py temp.pdb > prot.pdb"
Here is an example in which I use this command to run a batch of directories:
portal2net:E08 jmao$ vi run.prm
portal2net:E08 jmao$ rcom dirs cp ../run.prm .
portal2net:E08 jmao$ rcom dirs qsub submit
Your job 34205 ("pKa") has been submitted
Your job 34206 ("pKa") has been submitted
Your job 34207 ("pKa") has been submitted
Your job 34208 ("pKa") has been submitted
Your job 34209 ("pKa") has been submitted
Your job 34210 ("pKa") has been submitted
Your job 34211 ("pKa") has been submitted
Your job 34212 ("pKa") has been submitted
Your job 34213 ("pKa") has been submitted
Your job 34214 ("pKa") has been submitted
Your job 34215 ("pKa") has been submitted
Your job 34216 ("pKa") has been submitted
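The behavior of rcom can be approximated with a short Python sketch. This is an illustration under stated assumptions, not the actual rcom program: the function name and return value are hypothetical, and the command is passed to the shell so that pipes and redirects work as in the quoted example above.

```python
import os
import subprocess


def rcom(dir_list_file, command):
    """Run `command` in every directory listed (one per line) in dir_list_file.

    Returns the list of exit codes, one per directory.
    """
    start = os.getcwd()
    with open(dir_list_file) as handle:
        directories = [ln.strip() for ln in handle if ln.strip()]
    exit_codes = []
    for directory in directories:
        try:
            os.chdir(directory)  # enter the directory
            done = subprocess.run(command, shell=True)  # run the command
            exit_codes.append(done.returncode)
        finally:
            os.chdir(start)  # always come back to the original directory
    return exit_codes
```

With this sketch, rcom("dirs", "preprocess.py temp.pdb > prot.pdb") would mirror the quoted command shown earlier.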
reduce_conf.py - trim unoccupied conformers to get a more precise dielectric boundary
Author: Junjun Mao
This program reads in step3_out.pdb and fort.38, then removes all unoccupied conformers except the native conformers, which have "O000" in the history string. The resulting ATOM lines are printed out and can be redirected to the file step2_out.pdb. The purpose is to use this step2_out.pdb as the input of step3 to get a better dielectric boundary in the DelPhi calculation.
Synopsis
reduce_conf.py
Description
This program expects file "step3_out.pdb" so it must be called in a working directory.
Example
reduce_conf.py > step2_out.pdb
merge_pdb - merge mcce pdb files
Author: Yifan
This program merges several mcce formatted pdb files into one. Duplicated conformers are deleted.
Synopsis
merge_pdb -i input_list_file -o output.pdb -p param_directory
Description
-i input_list_file, input_list_file is a text file listing the pdb files you want to merge, one pdb file per line
-o output.pdb, output.pdb is the merged pdb file
-p param_directory, param_directory is the location of the parameter directory. If you have extra residues that need new.tpl, put new.tpl in the local directory where this program runs
rewrite_opp - make symmetric opp file from head3.lst and opp files
Author: Yifan
This program takes head3.lst and the opp files under the energies/ directory as input and completes the opp files if they are asymmetric.
Synopsis
rewrite_opp (-d)
Description
-d: If this flag is given on the command line and an opp file has more lines than head3.lst (for example, because some conformers in step2_out.pdb have been deleted), the extra conformers in the opp file are deleted. Otherwise, the extra lines are kept at the end of the opp file.
Parameter file preparation
mk_iatom - Create derived entries for a parameter file
Author: Jinran
This program updates the derived entries (CONFLIST, NATOM, IATOM, and ATOMNAME) of a parameter (tpl) file from the connectivity table.
Synopsis
mk_iatom input.tpl output.tpl
Description
input.tpl is the input tpl file.
output.tpl is the output tpl file.
pdbdict2tpl.py - Convert Hicup to MCCE format
Author: Yifan
It uses the connectivity in a PDB dictionary file from the hicup website and writes out the connectivity in the format of our tpl files.
Synopsis
pdbdict2tpl.py hicup_file
Exporting and debugging
Author: Junjun Mao
This program reads in an unformatted pdb file used by DelPhi and prints out its content.
Synopsis
testfmt unpdb
Description
unpdb is an unformatted pdb file such as fort.13.
Example
testfmt fort.13
Author: Junjun Mao
This program prints out a pdb file in which each residue is represented by its most occupied conformer. The output pdb file can be viewed in other protein structure visualization programs.
Synopsis
mostocc.py column_in_fort.38
Description
column_in_fort.38 is the column number counted from 1 in the file "fort.38".
The program reads in fort.38 and step2_out.pdb, then picks out the backbone atoms and the most occupied conformer of each residue at the pH/Eh specified by column_in_fort.38.
Example
mostocc.py 8
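The selection of the most occupied conformer can be sketched in Python as below. The sketch rests on several assumptions that are not confirmed by this page: that the first line of fort.38 is a header, that each following line is a conformer name followed by whitespace-separated occupancies, that column 1 is the conformer name itself, and that a name such as GLU01A0073_051 identifies its residue by characters 1-3 and 6-10. The function name is hypothetical.

```python
def most_occupied(fort38_lines, column):
    """Map each residue to its most occupied conformer in the given column.

    `column` is counted from 1; column 1 is assumed to be the conformer name.
    """
    best = {}
    for line in fort38_lines[1:]:  # assumption: the first line is a header
        fields = line.split()
        if len(fields) < column:
            continue
        conformer = fields[0]
        occupancy = float(fields[column - 1])
        residue = conformer[:3] + conformer[5:10]  # assumed name layout
        if residue not in best or occupancy > best[residue][1]:
            best[residue] = (conformer, occupancy)
    return {residue: pair[0] for residue, pair in best.items()}


# Hypothetical fort.38 content with two titration points
demo = [
    " ph            0.0   7.0",
    "GLU01A0073_001 0.900 0.100",
    "GLU-1A0073_002 0.100 0.900",
]
print(most_occupied(demo, 3))
```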
cmp_yf.py - Subtract two files
Author: Yifan
This program subtracts the number fields of file2 from those of file1 and prints the results bigger than the given threshold. String fields are ignored.
Synopsis
cmp_yf.py file1 file2 option
Description
-t threshold : threshold for printing, default = 0.05
-hl n_headline : skip first n lines, default = 1
-hc n_headcolumn : using first n columns for matching lines between two files
-cf n_column_per_field : use every n column as one field to get numbers, default = delimited by space
-c column_file1 column_file2 : only compare one defined column from each file
Example
to compare two sum_crg.out:
cmp_yf.py sum_crg.out.1 sum_crg.out.2
to compare two fort.38:
cmp_yf.py fort.38.1 fort.38.2 -hc 14
to compare two opp files:
cmp_yf.py file1.opp file2.opp -hl 0 -hc 20 -t 1.0
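The core idea of the comparison can be illustrated with a minimal Python sketch. It is not the code of cmp_yf.py: lines are paired by position rather than matched on head columns, fields are split on whitespace only, and the function names are hypothetical. The defaults mirror the -t and -hl defaults listed above.

```python
def is_number(token):
    """Return True if the token can be read as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def diff_lines(lines1, lines2, threshold=0.05, n_headline=1):
    """Subtract the numeric fields of lines2 from lines1, line by line.

    Returns (label, differences) for each line pair where at least one
    difference exceeds the threshold in absolute value.
    """
    results = []
    for line1, line2 in zip(lines1[n_headline:], lines2[n_headline:]):
        fields1, fields2 = line1.split(), line2.split()
        diffs = [float(a) - float(b)
                 for a, b in zip(fields1, fields2)
                 if is_number(a) and is_number(b)]  # string fields are ignored
        if any(abs(d) > threshold for d in diffs):
            label = fields1[0] if fields1 and not is_number(fields1[0]) else ""
            results.append((label, diffs))
    return results
```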
prune - Prune step2_out.pdb standalone program
Author: Junjun
This program reduces the conformers of step2_out.pdb by eliminating "similar" conformers. Similarity is defined by geometry, the vdw pairwise interaction vector and the ele pairwise interaction vector. If the atom distance difference (in Angstroms) and the vdw and ele pairwise interaction differences (in kcal/mol) between two conformers are all under the threshold values, the second conformer is a "similar" conformer and is deleted. Note that all big vdw interactions (> 50) are treated as identical values.
Synopsis
prune geometry ele vdw
Description
The first line of the output gives the threshold values used for pruning and the number of deleted conformers. The rest is an mcce pdb file that can be used as a new step2_out.pdb. This program takes a few minutes for a 2000-conformer step2_out.pdb.
geometry: threshold of geometry difference, default = 0.5
ele: threshold of electrostatic pairwise interaction difference, default = 1.0
vdw: threshold of vdw pairwise interaction difference, default = 8.0. In Monte Carlo sampling the vdw is scaled down by the protein dielectric constant, so 8.0 here is effectively 1.0 as seen by the Monte Carlo step when epsilon is 8.
Example
to prune step2_out.pdb and write the result to a file:
prune 2 1 8 > temp.pdb
to do it with default values:
prune > temp.pdb
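The similarity test can be sketched in Python as follows. The exact difference measure used by prune is not given on this page; the sketch assumes the largest per-atom distance and the largest per-element difference of the interaction vectors, and caps vdw values at 50 so that all big values compare as identical. The dictionary layout and function name are hypothetical.

```python
import math

VDW_CAP = 50.0  # vdw interactions above this are treated as identical


def is_similar(conf_a, conf_b, geometry=0.5, ele=1.0, vdw=8.0):
    """Return True if conf_b is "similar" to conf_a under all three thresholds.

    Each conformer is a dict with "xyz" (list of atom coordinates), "ele" and
    "vdw" (pairwise interaction vectors against the same set of conformers).
    """
    geo_diff = max((math.dist(p, q)
                    for p, q in zip(conf_a["xyz"], conf_b["xyz"])), default=0.0)
    ele_diff = max((abs(a - b)
                    for a, b in zip(conf_a["ele"], conf_b["ele"])), default=0.0)
    vdw_diff = max((abs(min(a, VDW_CAP) - min(b, VDW_CAP))
                    for a, b in zip(conf_a["vdw"], conf_b["vdw"])), default=0.0)
    return geo_diff < geometry and ele_diff < ele and vdw_diff < vdw
```

A pruning pass would then walk through the conformers of each residue and drop every conformer for which is_similar returns True against an earlier, kept conformer.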
cutnmr.pl - Divide NMR structures
Author: Rajesh
Type "perl cutnmr.pl" at the unix prompt. It will prompt you for the pdb file that contains many structures, then write each structure to an individual file.
sumup_charge.py
Author: Christian
This python script sums up the charges of the different conformers defined in a given .tpl file. The newest version will be on hestia: ~christian/bin/sumup_charge.py
For example:
christian@hestia ~/BC1/v8 $ sumup_charge.py ~mcce/mcce2.2/param04/ubq.tpl
Conf Sum
---------
UBQBK 0.0
UBQ01 8.881784197e-16
UBQ-1 -1.0
UBQ-2 -2.0
UBQP1 -1.11022302463e-16
UBQP2 -5.55111512313e-17
UBQS1 -1.0
UBQS2 -1.0
UBQH2 0.0
split_conf is a C shell script that splits step2_out.pdb into conformer files for visualization.
Syntax
split_conf {conformer_strings}
Description
conformer_strings: this argument defines the conformers for which PDB files are created. It is a string that is matched against head3.lst.
This argument is optional. Without it, PDB files are created for all conformers.
Output:
confs/CONFORMER.pdb: one pdb file for each conformer in confs directory
pymol_load.pml: a pymol script used to load all conformers.
Examples:
to create the PDB file for chain A, residue 85, conformer 1:
split_conf A0085_001
to create PDB files for all conformers of chain A, residue 85:
split_conf A0085
to create PDB files for all conformers of chain A, residues 85, 212 and 216:
split_conf "A0085|A0212|A0216"
mkpqr.py
Author: Yifan
This tool converts an MCCE formatted pdb file into a pqr file, which carries the charge and radius information and can be used as an input file for DelPhi and APBS.
Synopsis
mkpqr.py input.pdb > output.pqr
Description
input.pdb is the MCCE formatted pdb file to be converted
output.pqr is the pqr formatted output
Database
getuniq.php - Get a list of pdb chains with unique sequence clustered at defined sequence identity
Author: Junjun Mao
This tool downloads a list of current pdb chains with unique sequence identity.
Synopsis
getuniq cluster=90 > uniquepdb.txt
Description
cluster= can be 10 to 100 in steps of 10. This number is the clustering level. The lower the number, the more chains are represented by one unique structure and the shorter the list.