GPCR GROMACS Tutorial: B2AR in POPC Bilayer
The following tutorial discusses the steps involved in setting up an all-atom explicit MD simulation of a G-protein coupled receptor (GPCR) protein in a lipid bilayer using GROMACS 4.5.5. This tutorial is adapted from the membrane protein tutorial prepared by Justin Lemkul, which can be found here and can be used as a guideline for setting up any membrane protein MD simulation using GROMACS 4.5.5 or higher.
The GPCR protein considered in this tutorial is a class A human amine GPCR, β2-adrenergic receptor (B2AR) downloaded from the RCSB database (PDB id: 2RH1). The file was checked for any missing residues which were built accordingly. The option -missing can be used with "pdb2gmx" command to ignore any missing atoms, but should be used extremely carefully by understanding what is being ignored in this process. The lipid bilayer considered is a POPC (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine) bilayer. The equilibrated POPC bilayer and the POPC topology for GROMACS was downloaded from Peter Tieleman's site. This site also offers similar structures and topologies for other lipid bilayers as well. The orientation of the GPCR molecule in the POPC bilayer can be achieved with “editconf” command using -princ or -rotate options (explained in detail in the section "Protein Insertion in Bilayer"). After insertion, visualize the system using any visualizer and verify that the intra and extracellular loops of the GPCR are just near the hydrophilic lipid head-groups or completely outside the lipid bilayer. The transmembrane (TM) regions should be immersed completely within the hydrophobic tails of the two leaflets. A similar study involving a homology built GPCR protein by Ghosh and co-workers can be found here.
The first step involves processing the PDB file and choosing the desired force-field and other parameters using:
a) pdb2gmx -f B2AR_2RH1.pdb -o B2AR_processed.gro -ignh -ter -water spc
When prompted interactively to choose the desired force-field. Choose 9 (GROMOS96 ff43a1) here. The choice of force-field depends on the system you are simulating. I have found that ff43a1 preserves the chemical and structural characters of both the protein and the lipids over long simulation times (in the range of 1 µs). However, recently the use of CHARMM36 force-field has gained impetus and can be used for GPCRs.
-ignh: Ignore H atoms in the PDB file
-ter: Interactively assign charge states for N- and C-termini (here choose “None” since already added)
-water: Choose a water model (SPC/SPCE/TIP3P etc.) (here choose “SPC”)
-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, His
-missing: ignores any missing atoms (should be used judiciously)
MD simulation of heterogeneous systems involving different biological macro-molecules (like proteins and lipids) is a bit tricky even today as there exist no force-field that can handle all these biological molecules at a time. The “pdb2gmx” engine of GROMACS can only process proteins, nucleic acids, ions and a few co-factors. So additionally we need to add the topology of POPC to the force-field topology files. In GROMACS, "Berger lipids," derived by Berger, Edholm, and Jähnig (Ref) are the most commonly used lipid parameters for membrane protein systems. From Tieleman's website download popc128a.pdb, popc.itp and lipid.itp files. lipid.itp file contains all the atom types, bonded and nonbonded parameters for a wide range of lipids including POPC.
First, make a directory named “gromos43a1_lipid.ff” in the current working directory and copy aminoacids.rtp, aminoacids.hdb, aminoacids.c.tdb, aminoacids.n.tdb, aminoacids.r2b, aminoacids.vsd, ff_dum.itp, ffnonbonded.itp, ffbonded.itp, forcefield.itp, ions.itp, spc.itp and watermodels.dat files from “gromos43a1.ff” under GROMACS installation, into it. Next, copy the entries under the [ atomtypes ], [ nonbond_params ], and [ pairtypes ] sections from lipid.itp file and paste them into the appropriate headings within the ffnonbonded.itp file. Delete all the lines under the section [ nonbond_params ] starting from the line " ; ; parameters for lipid-GROMOS interactions. " till the end of the section. The modified ffnonbonded.itp file can be found at the end of the page. Now, add the contents under the [ dihedraltypes ] section from the lipid.itp file to the corresponding section of ffbonded.itp file. The modified ffbonded.itp file can be found at the end of the page.
A detailed description and explanation of all these procedures can be found in the membrane protein tutorial prepared by Justin Lemkul and hosted here.
Note: If using CHARMM36 FF, then no need to modify the contents of the bonded.itp and nonbonded.itp files, as CHARMM36 already includes parameters for some of the commonly used lipid molecules like POPC. You can directly download the CHARMM36 FF parameters from GROMACS user contribution section and place the directory under /your_path/gromacs_version/gromacs/share/top. Then pdb2gmx will automatically enlist CHARMM36 under its choice of force fields option and you can select it directly to process you lipid pdb/gro files.
Now, in the top section of the “topol.top” file change:
And, towards the end of the “topol.top” file, add the following to include the POPC topologies:
; Include Position restraint file
; Include POPC chain topology
; Include water topology
And, under the [ molecules ] section add POPC 128, to include the 128 POPC molecules.
Protein Insertion in Bilayer
Now, we need to insert the GPCR into the POPC bilayer and properly pack the lipids around the protein to correct density. We have already oriented the B2AR molecule earlier using “editconf” along the principle axis. First, remove the periodicity of the downloaded “popc128a.pdb” using the following commands:
a) editconf -f popc128a.pdb -o popc128a.gro
b) grompp -f EM.mdp -c popc128a.gro -p topol_popc.top -o EM.tpr
c) trjconv -s EM.tpr -f popc128a.pdb -o popc128_whole.gro -pbc mol -ur compact
The EM.mdp and topol_popc.top files can be found at the end of the page. If any error message pops up due to warnings, then use “-maxwarn 2”, but judiciously. Ignoring the warning (pertaining to the charge group radii) works in this particular step since here we are trying to remove the periodicity of a unit cell because of which this warning comes. Ignoring such warning in other simulations may not be wise.
Now, orient the B2AR molecule according to the X-Y-Z box vectors of the POPC unit cell. Copy the X-Y-Z box-vectors from the last line of popc_whole.gro and use:
a) editconf -f B2AR_processed.gro -o B2AR_newbox.gro -c -box 11.14741 11.14741 11.01306
After insertion, visualize the system using any visualizer and verify that the intra and extracellular loops of the GPCR are just near the hydrophilic lipid headgroups or completely outside the lipid bilayer. The transmembrane (TM) regions should be immersed completely within the hydrophobic tails of the two leaflets.
Now, we need to pack the lipids around the GPCR. For this we will use a PERL program InflateGro which can be downloaded from here. First, join the B2AR and POPC coordinate files using:
a) cat B2AR_newbox.gro popc128_whole.gro > system.gro
Remove the intermediate lines at the intersection of the two molecules and update the number of atoms at the top of the “system.gro” file accordingly. Impose strong position restrains on the protein while carrying out minimization steps of the InflateGro procedure. Generate strong position restraints on the protein using:
a) genrestr -f B2AR_newbox.gro -o strong_posre.itp -fc 100000 100000 100000
Now, include this new strong restraints file in the “topol.top” file as follows:
; Include Position restraint file
; Strong position restraints for InflateGRO
; Include POPC chain topology
In the EM.mdp file add the line "define = -DSTRONG_POSRES" to use the new restraints. Follow the steps described in InflateGro website to slowly pack the lipids around the protein till the area per lipid becomes around 65.8 2 (experimental value for POPC). The final value should not be less than the experimental value. Use a cut-off value of 0 to avoid deleting any lipid molecule. Note that InflateGro gives all the values in nm, so this value would be 0.658 nm2.
a) perl inflategro.pl system.gro 4 POPC 0 system_inflated.gro 5 area.dat
Now, carry out a round of energy minimization (restraining the protein).
b) perl inflategro.pl confout.gro 0.95 POPC 0 system_shrink1.gro 5 area_shrink1.dat
(from next step scale-down the lipids by a factor of 0.95; confout.gro is the file obtained after running EM)
We need to carry out these compression and minimization cycles for a quite a number of times (may be 10 or 20 depending on the system) until the desired area per lipid value is reached.
After the desired value for area per lipid is reached, solvate the entire system using the below given commands. To avoid placing water molecules between the lipid acyl chains, keep a copy of “vdwradii.dat” file in the current working directory and in it change the value of C from 0.15 to 0.375. Delete this file immediately after executing the following two steps.
a) editconf -f system_shrink.gro -o B2AR_POPC_box.gro -box 11.14741 11.14741 11.01306
b) genbox -cp B2AR_POPC_box.gro -cs spc216.gro -o B2AR_POPC_sol.gro
Often it is required to extend the solvent layer along the z-axis in order to fully immerse the intra and extracellular loop regions of a GPCR, that protrude above the lipid bilayer. For that case simple increase the z-axis value in the "editconf" command keeping the x and y-axis values same. After adding water check for any water molecule in the bilayer using a visualizer and manually delete those molecules from the .gro file.
Now, neutralize the system by adding the proper number of counter-ions using:
a) grompp -f Ions.mdp -c B2AR_POPC_sol.gro -p topol.top -o ions.tpr
b) genion -s ions.tpr -o B2AR_POPC_sol_ions.gro -p topol.top -pname NA -nname CL -nn 3
Since, B2AR carries a total charge of +3e (which is shown by pdb2gmx and grompp), here we add 3 chloride ions to neutralize the system. Ions.mdp file can be found at the end of the page.
Next, energy minimize the entire system now, using the following commands:
a) grompp -f EM.mdp -c B2AR_POPC_solv_ions.gro -p topol.top -o EM.tpr
b) mdrun -v -deffnm EM
The file EM.mdp can be found at the end of the page. At the end of the run, check whether Epot and Fmax have reasonable values. You can also check the potential energy values by plotting the output of the "g_energy" command using the EM.edr file obtained from the above run.
Equilibration I (NVT)
Equilibration is carried out in two steps, a NVT simulation followed by a NPT one. For GPCRs, or any other membrane protein system, we need to apply temperature and pressure couplings separately to three different groups, "Protein", "Lipid" and "Solvent + Ions". This usually increases the accuracy of the calculations. In order to remove the centre of mass (COM) motion relative to the bilayer (and protein) and the bulk solvent, we will create two special groups as follows:
a) make_ndx -f EM.gro -o index.ndx
When prompted by make_ndx, enter "16 | 14" to merge the SOL and CL to make one groups.
Then merge B2AR and POPC by entering "1 | 13" to make another group. The protein is restrained in this step and the NVT run is carried out for 500 ps. We will run the simulations at a temperature of 300 K which is above the phase transition temperature of pure POPC (271 K). The NVT.mdp file can be found at the end of the page.
b) grompp -f NVT.mdp -c EM.gro -p topol.top -n index.ndx -o NVT.tpr
c) mdrun -deffnm NVT
Like in the previous step, g_energy can be used on NVT.edr to check whether the temperature of the system has stabilized at around 300 K.
Equilibration II (NPT)
Next, we will carry out NPT equilibration under constant pressure condition for 1 ns using:
a) grompp -f NPT.mdp -c NVT.gro -t NVT.cpt -p topol.top -n index.ndx -o NPT.tpr
b) mpirun -np X mdrun_mpi -deffnm NPT (on X number of cores on a MPI cluster)
ORmdrun -deffnm NPT (serial run on a single core, SLOWEST)
mdrun -nt X -deffnm NPT (on X number of cores on a Workstation)
If using the first option, on a MPI based cluster, you need mpirun and mdrun_mpi (parallely configured mdrun of GROMACS) to distribute a job to multiple cores across nodes of the cluster. This enables you to run the jobs much faster. A typical job submission script (JobScript.pbs) on a MPI cluster using PBS job scheduler and InfiniBand network can be found at the end of the page. GROMACS 4.5 and higher version supports threading parallelization. Using the second option you can submit jobs on multiple threads of a multi-core workstation without requiring mpirun. The last option is to run jobs serially on a single core machine, which is not advised. It will take ages to run.
The NVT.mdp can be found at the end of the page. For the purpose pressure coupling we use Nosé-Hoover thermostat which is widely used for membrane protein simulations. In this step also the protein is restrained. Like in the previous step, g_energy can be used on NPT.edr to check whether the pressure of the system has stabilized at around 1 bar.
Once the system has reached the desired temperature and pressure, we can release the position restraints on the protein and carry out production run to collect date. MD.mdp file can be found at the end of the page.
a) grompp -f MD.mdp -c NPT.gro -t NPT.cpt -p topol.top -n index.ndx -o B2AR_POPC_1ns.tpr
b) mpirun -np X mdrun_mpi -deffnm B2AR_POPC_1ns
To further continue the simulations beyond 1 ns, use “tpbconv” with -extend option with desired time-frame.
If you have any queries or suggestions please contact me at firstname.lastname@example.org