Given two structures of the same protein, we calculate the distance of each residue pair in both structures. The difference between the two distances shows how that pair moves between the two states. We can use this information to find promising residue pairs for locking the protein of interest into one state.
For example, we have two structures of CFTR: 5uak (closed state) and 6msm (semi-open state). The distance difference Δd is the distance of a residue pair in 6msm minus the distance of the same residue pair in 5uak. A positive Δd therefore means the pair is farther apart in 6msm (semi-open) than in 5uak (closed).
To lock the protein into the semi-open state, we can connect the two residues of a pair with a molecular linker that keeps them from moving apart (when Δd is negative) or from moving toward each other (when Δd is positive).
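The Δd definition and its interpretation can be sketched in Python as follows. This is only an illustration: the function names are hypothetical and the coordinates are made up, not taken from 6msm or 5uak.

```python
import math

def delta_d(pair_in_6msm, pair_in_5uak):
    """Distance of a residue pair in 6msm minus the same pair's distance in 5uak.

    Each argument is a pair of (x, y, z) coordinates in angstroms.
    """
    return math.dist(*pair_in_6msm) - math.dist(*pair_in_5uak)

def motion_to_block(dd):
    """Which motion a linker must prevent to hold the semi-open (6msm) state."""
    if dd > 0:
        return "toward each other"  # pair is farther apart in 6msm
    if dd < 0:
        return "apart"              # pair is closer together in 6msm
    return "none"

# Made-up coordinates: 12 A apart in the semi-open state, 5 A in the closed state.
dd = delta_d(((0, 0, 0), (0, 0, 12)), ((0, 0, 0), (0, 3, 4)))
print(dd, motion_to_block(dd))  # 7.0 toward each other
```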
Since calculating the distances of all residue pairs in the whole protein takes a long time, and the channel is in the extracellular region, we make a list of the residues in the extracellular region and calculate distances only for residue pairs in that list. The PDB files also have missing residues, so we remove those residues from the list.
For example, in 6msm the extracellular residue ranges are: [81, 138] [195, 241] [308, 351] [860, 932] [991, 1034] [1103, 1150]
Missing residues in 6msm.pdb: [410, 434] [638, 844] [890, 899] [1174, 1201]
Missing residues in 5uak.pdb: [1, 4] [403, 438] [646, 843] [884, 908] [1173, 1206]
So we remove [890, 899] and [884, 908] from the extracellular list. Since [884, 908] contains [890, 899], this amounts to removing residues 884 to 908.
The final list: [81, 138] [195, 241] [308, 351] [860, 883] [909, 932] [991, 1034] [1103, 1150].
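The range bookkeeping above can be checked with a short Python sketch. The function name subtract_ranges is hypothetical; the ranges are the ones listed above, treated as inclusive.

```python
def subtract_ranges(ranges, missing):
    """Remove every missing (start, end) range from each range; all ends inclusive."""
    result = []
    for start, end in ranges:
        pieces = [(start, end)]
        for m_start, m_end in missing:
            next_pieces = []
            for s, e in pieces:
                if m_end < s or m_start > e:
                    next_pieces.append((s, e))            # no overlap, keep as is
                    continue
                if s < m_start:
                    next_pieces.append((s, m_start - 1))  # part left of the gap
                if m_end < e:
                    next_pieces.append((m_end + 1, e))    # part right of the gap
            pieces = next_pieces
        result.extend(pieces)
    return result

extracellular = [(81, 138), (195, 241), (308, 351), (860, 932), (991, 1034), (1103, 1150)]
missing_6msm = [(410, 434), (638, 844), (890, 899), (1174, 1201)]
missing_5uak = [(1, 4), (403, 438), (646, 843), (884, 908), (1173, 1206)]

final = subtract_ranges(extracellular, missing_6msm + missing_5uak)
print(final)
```

With the ranges above, this splits [860, 932] into [860, 883] and [909, 932] and leaves the other ranges unchanged, which matches the final list.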
If the two residues of a pair are too far apart, the pair is not a candidate for a linker, and we do not try to hold that pair open.
How to calculate a distance between two atoms in VMD
#Select two atoms in the Tk Console (Tcl).
For example:
set a [atomselect top "resid 102 and name CA"]
set b [atomselect top "resid 338 and name CA"]
#Get the atom indices (VMD's zero-based index, not the PDB serial number).
$a get index
1732
$b get index
5624
#Create a label for the bond between the two atoms.
label add Bonds 0/1732 0/5624
#Go to "Graphics" and then "Labels"
The distance between the two atoms is shown under "Bonds".
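To cross-check a VMD measurement outside VMD, the same CA-to-CA distance can be computed directly from the PDB file. The following is a minimal Python sketch, not part of the original workflow: the function names are hypothetical, it relies only on the fixed-column layout of PDB ATOM records, and it ignores alternate locations and insertion codes.

```python
import math

def ca_coordinates(pdb_lines, chain=None):
    """Map residue number -> (x, y, z) of its CA atom, read from PDB ATOM records."""
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                      # skips HETATM, so calcium ions are ignored
        if line[12:16].strip() != "CA":   # atom name, columns 13-16
            continue
        if chain is not None and line[21] != chain:  # chain ID, column 22
            continue
        resid = int(line[22:26])          # residue number, columns 23-26
        coords[resid] = (
            float(line[30:38]),           # x, columns 31-38
            float(line[38:46]),           # y, columns 39-46
            float(line[46:54]),           # z, columns 47-54
        )
    return coords

def ca_distance(coords, resid_a, resid_b):
    """Distance in angstroms between the CA atoms of two residues."""
    return math.dist(coords[resid_a], coords[resid_b])
```

For example, passing an open file handle for 6msm.pdb to ca_coordinates and then calling ca_distance(coords, 102, 338) should give the same value that VMD shows for the bond label between the CA atoms of residues 102 and 338.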
We want to narrow down the set of residue pairs that could be linked to lock the protein into the open state. Therefore, I mutated every possible residue pair in the extracellular region to cysteine and calculated the distance between the two sulfhydryl groups (SG atoms). If the two residues of a pair are too far apart, the two cysteines will probably each bind a separate linker instead of being bridged by one, which is not what we want. I also calculated the difference between the SG distance in 6msm and in 5uak to determine whether a pair is promising.
On computer 3, go to cftrProject/6msmCalculateDistance/extracellularSGPair, open a terminal, and run the command:
python generateDistance.py
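The contents of generateDistance.py are not reproduced here. The following is a minimal Python sketch of the kind of screening described above, under stated assumptions: the SG coordinates of the cysteine mutants are already available as dictionaries keyed by residue number, the distance filter is applied in the target structure (6msm), and max_distance is a hypothetical cutoff that must be chosen from the span of the linker actually used. The function name and the demo coordinates are made up.

```python
import math
from itertools import combinations

def screen_pairs(sg_6msm, sg_5uak, max_distance):
    """List residue pairs whose SG-SG distance in 6msm is within max_distance.

    sg_6msm, sg_5uak: dict of residue number -> (x, y, z) of the SG atom.
    Returns (resid_a, resid_b, distance_in_6msm, delta_d) tuples,
    sorted so the pairs with the largest |delta_d| come first.
    """
    shared = sorted(set(sg_6msm) & set(sg_5uak))  # residues present in both
    rows = []
    for a, b in combinations(shared, 2):
        d_6msm = math.dist(sg_6msm[a], sg_6msm[b])
        if d_6msm > max_distance:
            continue                              # too far for one linker to bridge
        d_5uak = math.dist(sg_5uak[a], sg_5uak[b])
        rows.append((a, b, d_6msm, d_6msm - d_5uak))
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows

# Made-up coordinates, for illustration only.
demo_6msm = {100: (0, 0, 0), 200: (0, 0, 10), 300: (0, 30, 0)}
demo_5uak = {100: (0, 0, 0), 200: (0, 0, 6), 300: (0, 30, 0)}
print(screen_pairs(demo_6msm, demo_5uak, max_distance=15))
# [(100, 200, 10.0, 4.0)]
```

Because only distances within each structure are compared, the two structures do not need to be superimposed first.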
Linkers are often composed of flexible residues so that the adjacent protein domains are free to move relative to one another. The span that a cross-linker can bridge is also limited.