Tutorial - Part 2

The following scripts will show how to build topologies and LAMMPS inputs for using the OPLS-aa force field (W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives J. Am. Chem. Soc. 1996, 118, 11225-11236. doi:10.1021/ja9621760) to do simulations of multiple variants of alkanes and alkenes. The major difference to the previous step is, that we will have to assign different (partial) charges to atoms depending on the number of hydrogen neighbors. This part of the tutorial expects that you have completed and understood the first part. If you didn't or run into problems, please revisit it here. Please note that in the following steps the use of "real" MD units (kcal/mol, angstrom, AMU, ...) is assumed and that the LAMMPS data file output is written in atom_style full.Step 2a: Methane

The second part of the tutorial focuses on using TopoTools for building topology files for simple all atoms simulationsStep 2: Building topologies for linear and branched alkanes with an all-atom force field

Also the system has more bond types (CT-CT in addition to CT-HC), more angle types (CT-CT-HC in addition to HC-CT-HC), and a set of dihedrals for the rotation around the C-C bond (HC-CT-CT-HC). To get a roughly similar size bulk system than for methane, the replication is changed to 4 4 3 to yield a 48 ethane molecule configuration.TopoTools::replicatemol top 4 4 3

Overall the largest changes required are not in the topology building script, but in the MD input in.step2b, as it now needs to hold more parameters for the additional interaction types.

Step 2c: Propane

This section discusses the methaneaatodata.tcl script that builds an initial topology for a bulk methane simulation from the methane.pdb coordinate file. The process of building the topology mostly follows the strategy in the previous scripts. The first difference is in loading the coordinates. The atom names in the coordinate file allow VMD to correctly identify the elements and are given in angstrom, so we can trust the automatically computed bonds and don't have to adjust atomic radii:mol new methane.pdb autobonds yes waitfor all

Next we create three atom selections, one for the hydrogens, one for the carbon and one for all atoms and set parameters relevant to the force field:set selh [atomselect top {name H}]

$selh set type HC

$selh set mass 1.00800

$selh set charge 0.060

set selc [atomselect top {name C}]

$selc set type CT

$selc set mass 12.01100

$selc set charge -0.240 ; # = 4x 0.060

set sel [atomselect top all]

The carbon is a "tetrahedral" carbon and thus we set its type to CT. Hydrogens at tetrahedral aliphatic carbons have a charge of +0.060 and thus the carbon with four hydrogens would have a charge of -0.24. We still need to retype the bonds and guess the angles. OPLS-aa also defines dihedral interactions, however, the topo guessdihedrals command will (expectedly) come up empty for methane. The next step is to place the methane molecule into the center of the box and determine suitable box dimensions. The VMD Tcl script code to do this is (note the use of -withradii for measure minmax, which makes sure there is sufficient space between methane molecules; also note that we enlarge the box around the molecule by an additional, empirical 10% using the "vecscale" command to account for thermal expansion):

set center [measure center $sel weight none]

$sel moveby [vecscale -1.0 $center]

set minmax [measure minmax $sel -withradii]

set box [vecscale 1.1 [vecsub [lindex $minmax 1] [lindex $minmax 0]]]

pbc set $box

Now we need to create a 4x4x4 lattice of the single methane molecule to construct a bulk system with 64 molecules. This could be done with the replicate command in LAMMPS, but TopoTools has a special high-level utility method for it as well.

TopoTools::replicatemol top 4 4 4

This will create a new molecule containing the requested replications of the original single molecule data set. Now we can write out the result to a LAMMPS data file, and also write out the coordinate and topology information as .pdb/.psf file combo, so we can use the more space efficient and faster to read .dcd output in the MD code.

topo writelammpsdata data.step2a full

animate write pdb 64xmethane.pdb

animate write psf 64xmethane.psf

This input still needs to be equilibrated to be run at the desired temperature and density or pressure conditions. The attached input in.step2a sets up some initial equilibration at a temperature of 100K with constant volume. Methane should be liquid at this temperature and if the guess for the molecular volume via measure minmax was good, then the pressure should be fluctuating around 0.0 atm. The second input in.step2a-2 replaces 'fix nve' plus 'fix langevin' with 'fix nvt'. This removes the damping and randomization of velocities of the langevin thermostat, which helps in achieving equipartitioning of the kinetic energy quickly, with a proper NVT ensemble time integration. This results in a more realistic dynamics of the atoms and would need to be continued until the system is properly equilibrated.

Step 2b: Ethane

Setting up an initial configuration for ethane from the file ethane.pdb is very similar to the previous script. The commands in the script ethaneaatodata.tcl remain mostly the same, since we still have only two atom types (both carbons are "terminal"), however, the charge on the carbons is different (-0.18 = -1.0*3*0.06), since they are now bound to only three hydrogens instead of four.set selc [atomselect top {name C}]

$selc set type CT

$selc set mass 12.01100

$selc set charge -0.180 ; # = 3x 0.060

The rest is as usual and we can write a LAMMPS input, in.step2c, where we set all the force field parameters. We have now 2 atom types, 2 bond types, 3 angle types and 2 dihedral types. In this case the guess of the volume per propane molecule is a bit too large, so we immediately allow the system size to adjust to standard pressure.

Step 2d: Isobutane

When building a topology from isobutane.pdb we need only minimal changes to get the isobutaneaatodata.tcl script: the "middle" carbon is still located at index 4, and only its charge is different, since it it now bound to three carbons and only one hydrogen. When accounting for charges like this, it is easy to make typos and get part of it wrong. A good sanity check is to compute the total charge of the system:

$sel delete

set sel [atomselect top all]

set totq [measure sumweights $sel weight charge]

if {[expr {abs($totq)}] > 0.0005} { puts "Total system not neutral: $totq" }

and print a warning if the total is not the exepected value, allowing for rounding errors. The MD input file is in.step2d.

Step 2e: Hexane / Cyclohexane 1:1 mixture

For propane we have another incremental change, now there is a "middle" carbon atom type, which requires different partial charges on the hydrogen atoms it is bound to. We need the initial coordinate file propane.pdb and use the script propaneaatodata.tcl. The basic setup is like in the previous two examples and we first define all parameters for carbons and hydrogens in the same way. However, then we have to override the settings for the middle atom, which is identified interactively in VMD. It has the index 4 in this specific case. So the script code to select this carbon its charge is the following:set selc [atomselect top {index 4}]

$selc set charge -0.120 ; # 2x -0.060

In this example, mixedhexanetodata.tcl, we are first building two independent topologies for a single molecule of each, cyclohexane (cyclohexane.pdb) and n-hexane (hexane.pdb) and then combine the two (note, the coordinates of the original coordinates to not overlap) and then replicate and create the data file as previously. For the n-hexane topology we use a new method to identify the carbons that need a different charge. First we get a list of the atom indices for the carbons and then create a selection using the first (list index 0) and last (list index end) entry in that list:set cidx [$selc get index]

set selc [atomselect top "name C and index [lindex $cidx 0] [lindex $cidx end]"]

$selc set charge -0.180 ; # = 3x -0.060

The merging of the two independent molecule topologies is done with another utility function:::TopoTools::mergemols [list $mol1 $mol2]

This utility function can combine an arbitrary number of independent molecules. Note, it does not check for overlaps. The MD simulation input (in.step2e) is almost the same as for the previous step. Only due to using cyclohexane, we need to define one more dihedral parameter set. Due to the large amount of empty space, the simulation cell will shrink a lot during the first few picoseconds.