Protein Modeling with Foldit

Tutorial 5, © Ron Hills

Assignment: In your lab report for this tutorial you'll summarize what we learn from this game about the principles that govern protein folding and molecular recognition. The molecular driving forces are thermodynamic, steric, electrostatic, geometric, noncovalent, hydrophobic, etc. Outline the basic procedure you used to solve each of the introductory puzzles. Make special note and explanation of where your procedure differed from that presented here and what problems you encountered. In your executive summary, discuss the findings of Horowitz et al.

  • S. Horowitz et al. 2016. Determining crystal structures through crowdsourcing and coursework. Nature Communications 7, 12549. doi: 10.1038/ncomms12549


Key Concepts: global free energy minimum, local energy minimum; hydrophobic interactions/electrostatics/hydrogen bonds; van der Waals, steric volume exclusion; primary amino acid sequence, sidechains, secondary structure (alpha helix/beta sheet), tertiary structure (native backbone fold), hydrophobic aggregation; harmonic restraints, conformational space; molecular dynamics simulation.


Homework: Explain the context of each problem in the Discussion Questions and answer in complete paragraphs. Summarize what you learned in an executive summary.


Focus Question: Explain observed protein folding events/mechanisms in terms of their thermodynamic driving forces. Can you think of a real world analogy for the protein folding process?


Highly Recommended: Bring an external mouse.

Intro Puzzles - Walkthrough 10 levels


  1. Foldit is an online competition in which computer users (with no scientific background necessary) compete to fold proteins whose 3D structure has not yet been characterized experimentally.

  2. Pick a username and create an account on the fold.it website. Search for our group on the website: uneRx, and click the 'Join group' link at the top left of the group page -- on the following page, select the 'Join' button to confirm.

  3. Download the Foldit game from: http://fold.it

  4. Open your Downloads folder and right-click the Foldit application, and then select open.

Mac: you must right-click the program and select Open so that macOS will let you run an application from an unidentified developer.

Ubuntu Linux: run ./Foldit from a terminal within the Downloads directory.

General Advice

If you do it right the puzzles are meant to be solved fairly quickly. But it is very easy to get stuck a few points from unlocking the trophy no matter how much you keep changing the protein. Proteins have an astronomical number of possible conformations so you don't want to be wandering around this misfolded space!

  • If you get stuck more than a few minutes, reset the puzzle. The solution is meant to be a few quick steps from the starting point, though it is easy to miss the precise sequence of steps.

  • Except for a few review exercises, the game expects you to use the tool being introduced, so click on each pop-up message and follow the instructions. Hitting s and then w cleans things up in between steps.

  • When in doubt, ask a partner for help. Besides the explanations below, you may find some online: http://foldit.wikia.com/wiki/Tutorial_Puzzles

  • Note: the puzzles must be completed in order, but you can restart an earlier puzzle for writing your report without losing your progress.

Summary of Commands

Click and drag part of structure to move it closer to the protein

Left click and drag on the background to rotate the viewing angle

Right click and drag (or hold control and drag) to translate/move the protein in the frame

Mouse wheel down/up (or hold shift and drag down/up) to zoom in/out

Shift + drag across two structures to draw a rubber band (click on pink band to remove, control click to weaken it)

Shift + click to fix a single backbone residue (blue segment)

Shift + double click to fix an entire secondary structure element (it turns blue)

Right or control click to refine a local backbone segment (Freeze, Remix, Tweak, Ideal SS, Align Guide)

Left click on a chain to bring up the Move Tool (click + drag rotates; right-click + drag translates; shift drag up/down moves in/out of screen)

2 enter Structure Mode (right-click to assign secondary structure or click and drag to extend)

4 enter Design Mode (click segment and pick new sidechain to mutate to; right-click to insert/delete residues in segment)

1 go back to Pull Mode (click and drag to pull parts of the structure)

Chain coloring: green if folded; red/orange if not properly folded

Red spheres: hydrophobic voids that need to be filled with orange sidechains

Blue sidechains: polar and hydrophilic (should be surface exposed)

Orange sidechains: nonpolar and hydrophobic (should be buried in protein core)

Blue/white rods: hydrogen bonds

Green/yellow rod: disulfide bridge

Red stars: steric clashes

Yellow bubble: exposed hydrophobic sidechain

z undo last step

^y redo

s Shake: energy minimize (relax) sidechains (toggle s key to stop and score your protein)

w Wiggle: energy minimize (relax) backbone (toggle w key to stop and score your protein)

LEVEL 1 - Sidechains Easy

A protein is a polymer chain made up of individual monomer units called amino acid residues, each of which carries a sidechain. In biology, there are 20 different amino acids possible at each position in the protein chain.


1-1 One Small Clash: Click and drag the orange sidechain away to remove the unfavorable steric clash (red ball). Science: Torsional rotation about the phenylalanine Cα-Cβ single bond reduces the van der Waals overlap with the other sidechain. By default, Foldit colors hydrophobic sidechains orange and hydrophilic sidechains blue using stick representation. Hydrogen atoms are not drawn, only carbon, oxygen and nitrogen.


Q1a: Which amino acid could be the sidechain shown in blue? How do you know? (Note how the protein main chain, or backbone, including the peptide bonds and alpha carbons is represented using a flat ribbon.)


1-2 Backbones Collide: We need to move the backbone to relax this helix-turn-helix motif, so shake alone won't work. Click and drag a helix to move it. Science: The polypeptide backbone can rearrange by rotation about the phi and psi single bonds in each amino acid residue between N-Cα and Cα-C, respectively. Note: individual chemical bonds are not visible in the backbone ribbon representation.


1-3 Swing It Around: Click and drag your mouse down over the background to rotate the view so you can see the three sidechains. Notice how the segments of the backbone are colored red for the clashing residues in this 5-residue beta strand (rectangular zigzags). Now, drag the lysine and arginine away from the central threonine residue.


Science: Your "score" in the game is a measure of how well the protein is folded: Score = -energy (not necessarily kJ/mol). Proteins fold to adopt a minimum energy structure. Favorable interactions lower the energy. E.g., salt bridge and hydrogen bond electrostatics lower the enthalpy. From thermodynamics:

$$G = H - TS \qquad \text{(Equation 1)}$$

Enthalpy, H, is a measure of potential energy (bonds broken minus bonds formed), while entropy, S, measures the tendency of systems to be randomly disordered. The Gibbs free energy (G) combines both, and if the change in G for a process is negative, the process is energetically downhill and spontaneous. The folded state is the global free energy minimum for a healthy protein. Besides the enthalpy from backbone hydrogen bonds, the driving force for protein folding is the hydrophobic effect of nonpolar sidechains segregating from water (resulting in increased water entropy upon folding). In certain diseases, toxic protein aggregates form when copies of the protein unfold and aggregate with each other, forming a plaque. Self-aggregation can be caused by a genetic mutation in the protein's amino acid sequence.
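As a rough numerical illustration of Equation 1 applied to a process at constant temperature, the sketch below evaluates the change in free energy from a change in enthalpy and a change in entropy. The function name and all numbers are hypothetical, chosen only to show how an unfavorable enthalpy change can be outweighed by a favorable entropy term; they are not measured values for any protein.

```python
def delta_g(delta_h_kj, delta_s_kj_per_k, temperature_k=298.15):
    """Change in Gibbs free energy (kJ/mol): dG = dH - T*dS."""
    return delta_h_kj - temperature_k * delta_s_kj_per_k

# Hypothetical values for illustration only (not experimental data)
dG = delta_g(delta_h_kj=10.0, delta_s_kj_per_k=0.05, temperature_k=300.0)
verdict = "spontaneous" if dG < 0 else "not spontaneous"
print(f"dG = {dG:.1f} kJ/mol -> {verdict}")
```

With these made-up inputs the entropy term (15 kJ/mol) is larger than the enthalpy cost (10 kJ/mol), so the process comes out downhill even though the enthalpy alone is unfavorable.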

_________________________


LEVEL 2 - Backbone Packing Easy



2-1 Shake! (s): Click the shake button (shortcut: toggle the s key to turn on/off) to try to remove the clashes in the two alpha helices. You must click the button again to stop shaking and count your score (do so as soon as the blue score bar turns yellow!). Notice how with shake the sidechains quickly relax and then stop moving; the backbone remains fixed. This process is called energy minimization. The computer is applying physical forces and geometric principles to find the conformation with the nearest energy minimum. It does so mathematically by evaluating the first derivative of the energy, and taking small steps in the direction of steepest descent until the slope equals zero.

Science: In the mathematical minimization problem, we start with a high-energy, unfavorable structure. How do we relax it? Energy is a function of the 3N xyz atom coordinates, and we need to know in which direction to move the coordinates to lower the energy. The first derivative of the energy, called the gradient (grad), tells us this: it is a vector pointing in the direction of greatest increase, so we step in the opposite direction. For each atomic degree of freedom, the energy gradient tells us whether we need to increase or decrease the coordinate to lower the energy function. Minimization will rapidly get you to the closest local minimum, but will then get stuck because the slope is zero at the minimum. Protein fold prediction is hard because we are looking for the global minimum: we need to compare many local minima to find the best overall structure. Energy minima are separated by energy barriers, which require thermal motion to overcome (wiggle will do this).
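The idea of stepping downhill until the slope vanishes can be sketched in a few lines. The example below is a minimal one-dimensional steepest descent on a made-up double-well "energy landscape"; the energy function, step size, and tolerance are all assumptions for illustration and have nothing to do with Foldit's actual scoring function or minimizer.

```python
def steepest_descent(grad, x0, step=0.01, tol=1e-6, max_iter=10000):
    """Follow the negative gradient from x0 until the slope is ~zero."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:   # slope ~ 0: we have reached a (local) minimum
            break
        x -= step * g      # move downhill, opposite to the gradient
    return x

# Hypothetical 1D energy with two minima of different depth
def energy(x):
    return x**4 - 2 * x**2 + 0.3 * x

def gradient(x):
    return 4 * x**3 - 4 * x + 0.3

left = steepest_descent(gradient, x0=-1.5)
right = steepest_descent(gradient, x0=1.5)
print(f"start -1.5 -> x = {left:.3f}, E = {energy(left):.3f}")
print(f"start +1.5 -> x = {right:.3f}, E = {energy(right):.3f}")
```

The two starting points end up in different wells, and only one of them is the deeper (global) minimum. The minimizer has no way of knowing that a better well exists on the other side of the barrier, which is why minimization alone gets stuck.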


2-2 Close the Gap: The large red sphere means there is a hydrophobic void that should be closed off to water. Drag one of the helices closer to form a helix bundle: the orange sidechains should pack together. Go slowly; your score will jump when you fill the red void (bonus points), but if you go too far you will create steric clashes and your score will start dropping. If your score bar turns yellow, stop immediately to score it and pass the puzzle. The sidechains should touch but not the backbone. You may try shaking the sidechains, but for this intro puzzle it's a very simple move of dragging the helix slightly inward (don't change the starting viewing angle).


Science: The sidechains in blue are the polar and hydrophilic amino acids (Thr, Glu, His, etc.): they prefer being exposed to the surrounding water. The nonpolar sidechains in orange are hydrophobic and want to be buried in the core of the protein (sequestered from water). Minimizing the red voids will expel remaining water in the protein core.


Potential problems: Make sure the helices pack in the correct orientation. Rotate the view by dragging the background to look down the helix axis. The helices actually need to be packed at a slight angle to each other.

2-3 Wiggle! (w): While shake energy minimizes the sidechains, Wiggle All relaxes the backbone conformation, helping elements of secondary structure such as helices to come together to form the overall tertiary fold. Rotate the view so you can see the helix sticking out in solution, tap the w key to wiggle the structure together, then tap w again to stop and score your fold. Wiggle is attempting a molecular dynamics simulation: the simulator is applying thermal motion to allow the protein to take a walk on its energy landscape in search of a more favorable, lower-energy conformation (the mathematical procedure is discussed in Level 3).


Q1b: Discuss the two computational algorithms that correspond to shake and wiggle.

LEVEL 3 - Hydrogen Bonding Easy


3-1 Sheets Together: Beta sheets form via hydrogen bonds between peptide amide N-H donor atoms (small blue knob) and carbonyl (C=O) H-bond acceptor groups (small red knob). By dragging one strand subtly in different directions you should be able to align the blue/red knobs for forming two more H-bonds (white-blue rods). Then hit w to see if the 2-strand beta sheet forms on its own (requires 4-5 H-bonds). Stop wiggling.

Q2: Rotate the structure to view the strand-loop-strand motif. Is the beta sheet parallel or antiparallel? How did you determine the orientation?


3-2 Hide the Hydrophobic: Click and drag the orange phenylalanine sidechain into the middle of the protein to get rid of the yellow bubble (exposed hydrophobic residue). Next, click and drag the blue Glu/Gln sidechain (hydrophilic) so it faces outside the protein. You should not need shake/wiggle.


Q3: Protein surfaces are polar but their core is hydrophobic. A drug binding pocket is usually not far from the protein surface. Do you think it would be polar, nonpolar, or have elements of both? Explain what your choice would mean for designing a drug that is both specific (no cross-receptor reactivity) and a high-affinity binder.


3-3 Lonely Sheets: Wiggle by itself will only partially collapse this structure, so we need a distance restraint to force the two beta strands (rectangles) together. Draw a rubber band by holding shift and clicking and dragging across the two strands. Rotate the view to make sure your band was applied in the right spot. Otherwise, click on the pink band to remove it and try again. With the band in place, wiggle should minimize the distance between the two strands.


Science: The folding problem is complex--there are too many degrees of freedom to simultaneously optimize. We can add springs to the structure to "zip up" parts of the structure we think should fold a certain way. Mathematically, restraints use the equation for a harmonic spring:

$$U(x) = \frac{k\,(x - x_0)^2}{2} \qquad \text{(Equation 2)}$$

Equation 2 expresses the potential energy, U, stored by stretching (or compressing) a spring from its equilibrium value x0. k is the stiffness force constant. Equation 2 is obtained from integrating the force equation for a spring:

$$F = -k\,(x - x_0) \qquad \text{(Equation 3)}$$

which says that the force, F, increases in proportion to the distance stretched. In three dimensions, the gradient tells us the direction and magnitude of the force on an atom: F = -grad U (R), where R is the coordinate vector in x, y, and z. The negative sign means the force points in the direction of decreasing energy, opposite to the gradient. By drawing a rubber band between the two beta strands, we are placing an energy penalty on the strands being far apart. Energy minimization with wiggle will decrease the distance between the strands. To form a beta sheet, the distance between the strands should be about 5 angstroms in order to form hydrogen bonds.
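To make Equations 2 and 3 concrete, here is a minimal sketch of a harmonic "rubber band" between two points in three dimensions. The function name, the force constant k, and the rest length x0 are hypothetical; x0 = 5.0 simply mimics the target strand spacing mentioned above, and the way Foldit actually implements its bands is not something this sketch claims to reproduce.

```python
import math

def band_energy_and_force(pos_a, pos_b, k=1.0, x0=5.0):
    """Harmonic restraint between two distinct points (Equations 2 and 3).

    Returns (U, force_on_a). k and x0 are assumed, illustrative values.
    """
    delta = [a - b for a, b in zip(pos_a, pos_b)]   # vector from b to a
    dist = math.sqrt(sum(d * d for d in delta))     # current band length
    energy = 0.5 * k * (dist - x0) ** 2             # Equation 2
    # F = -grad U: magnitude -k (dist - x0), along the unit vector b -> a
    force_on_a = [-k * (dist - x0) * d / dist for d in delta]
    return energy, force_on_a

# Two points 12 angstroms apart: the band is stretched by 7 angstroms
u, f = band_energy_and_force((12.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(f"U = {u:.2f}, force on a = {f}")
```

The force on point a has a negative x component, so it points back toward point b. A stretched band pulls its two ends together, and the pull disappears once the distance reaches x0.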


Q4: What happens if your rubber band is in the wrong place and the strands are not properly aligned? Reset the puzzle and test if wiggling with an incorrect restraint forms the beta sheet. Describe the process as it tries to fold.


3-4 Control Over Clashing: Under the Behavior menu, reduce the steric Clashing Importance to 0.01 and set the Wiggle Power to Low (coarse-grained refinements are good for big changes to the structure early on). Hit w to wiggle while slowly increasing clashing to 1.0. You may need to hit w to stop then hit s to shake and stop.


Q5: Reset the puzzle and compare different settings: low vs. high wiggle power and low vs. high clashing importance. Which settings tend to make the protein stuck? How do you get it unstuck? When you get close to the final structure, going to lower powers can make the score decrease.

Science: van der Waals interactions are mathematically represented by the Lennard-Jones potential between two particles i and j:

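$$U(r_{ij}) = E_{ij}\left[\left(\frac{r_{\mathrm{min}}}{r_{ij}}\right)^{12} - 2\left(\frac{r_{\mathrm{min}}}{r_{ij}}\right)^{6}\right] \qquad \text{(Equation 4)}$$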

Recall that a parameter helps determine the shape and location of a function. Here, the potential energy U is the dependent variable, and the distance r between atoms i and j is our independent variable. The Lennard-Jones equation has two parameters: E (units of energy) and rmin (units of length). Physically, the energy has a global minimum of depth Eij when the atoms are separated by a distance rmin: U (rmin) = Eij (1 - 2) = -Eij. The van der Waals interactions are maximally favorable at this distance. The LJ potential goes to zero at infinite separation as desired, but the energy goes to infinity, or blows up, as the distance approaches zero.
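The behavior of Equation 4 at short, optimal, and long range can be checked numerically. The sketch below evaluates the Lennard-Jones energy for a few distances; the parameter values (eps = 1.0 energy units, r_min = 3.8 length units) are hypothetical placeholders rather than values taken from any real force field.

```python
def lennard_jones(r, eps=1.0, r_min=3.8):
    """Equation 4: U(r) = eps * [(r_min/r)**12 - 2*(r_min/r)**6].

    eps and r_min are assumed, illustrative parameters.
    """
    ratio6 = (r_min / r) ** 6
    return eps * (ratio6**2 - 2 * ratio6)

for r in (3.0, 3.8, 5.0, 10.0):
    print(f"r = {r:5.1f}   U = {lennard_jones(r):8.3f}")
```

The printed values show a strongly positive (clashing) energy at the shortest distance, the well depth -eps at r = r_min, and an energy that decays toward zero as the atoms move apart.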

Steric clashes in an improperly constructed system will thus have very high energy waiting to be released as kinetic energy, which can potentially split molecules apart. In running molecular dynamics simulations this means that you must first perform a local energy minimization of the system to relax bonds and remove clashes. The dynamics of the system can then be simulated by solving Newton's equations of motion. Given an initial position and velocity for each atom, the new coordinates can be calculated for a short time interval later (known as the time step Δt). This is numerical integration: the force on each atom is obtained by taking the negative derivative of U, and the resulting acceleration (force divided by mass) is used to propagate the positions and velocities:

force = -gradient of potential energy
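One common way to carry out this numerical integration is the velocity Verlet scheme. The sketch below applies it to a single particle on the harmonic spring of Equation 3; the choice of integrator, the unit mass, the time step, and the spring parameters are all assumptions made for illustration, not a description of what Foldit runs internally.

```python
def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=1000):
    """Propagate position x and velocity v with the velocity Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force = -dU/dx at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v

K_SPRING, X_EQ = 1.0, 0.0                  # hypothetical spring parameters

def spring_force(x):
    return -K_SPRING * (x - X_EQ)          # Equation 3

def total_energy(x, v, mass=1.0):
    return 0.5 * mass * v**2 + 0.5 * K_SPRING * (x - X_EQ) ** 2

x_end, v_end = velocity_verlet(x=1.0, v=0.0, force=spring_force)
print("energy at start:", total_energy(1.0, 0.0))
print("energy at end:  ", total_energy(x_end, v_end))
```

With a small enough time step the total (kinetic plus potential) energy stays nearly constant over the run, which is a standard sanity check for a molecular dynamics integrator. A system that starts with large steric clashes would instead convert that stored potential energy into very large velocities.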

Q6a: Since by definition U(r = σ) = 0, replace rij in Eqn. 4 with σ and set the expression equal to zero. Collect rmin on one side of the equation and σ on the other to derive the expression for rmin in terms of σ.

Q6b: Plug rmin(σ) in the expression for U so that you have U(rij) with σ as the parameter constant rather than rmin.


So, in scaling the steric clashes to 0.01, we reduce the van der Waals energy so it's not as important compared to the other energetics in the protein.

3-5 Sheets and Ladders: Pull the 3-stranded sheet together using two rubber bands (one at either end) for each pair of strands (a total of 4 restraints). Wiggle and shake to get the energy close. Then, remove or reduce the strength of the bands to further refine the structure with wiggle (control-click to adjust strength or length of band).

Tip: You can zoom in on the structure if you have a wheel button on your mouse; alternatively, hold shift and drag the left mouse button up/down. Also, the entire protein molecule can be 'translated' by right or control-clicking and dragging the mouse. Science: Translational motion does not involve any of the internal (torsional) degrees of freedom of the protein. Instead, we are merely moving the molecule's center of mass in the Cartesian (xyz) reference frame. This is equivalent to diffusion, and is important for the random encounters between a soluble drug molecule and the site where it binds on its target receptor.

3-6 Lock and Lower: Shift and double-click to FREEZE secondary structures in place (don't click too fast). Freeze each of the four beta strands that are in the plane by using shift and double click, turning the full strands blue (or shift and single click on each residue). The strands now have their structure frozen. Drag the outer strand into its space, which should form a few hydrogen bonds. Then try to zip up the rest of the sheet using shake and wiggle.

Science: The reason we fixed parts of the backbone is because the pairing of two new strands could pull the existing sheet apart. In physics this is termed frustration. Proteins have evolved to fold efficiently and minimize frustration. The energy landscape is said to be funneled, with the native fold lying at the bottom of a steep energy well (global minimum). This is accomplished by folding up small elements of secondary structure first (helices and sheets), before these motifs combine into the correct overall fold.

_________________________

LEVEL 4 - Hydrophobics and Hydrophilics Medium


4-1 Turn It Down: A patch of the helix is red, meaning we need to rotate the amphipathic helix about its axis to line up the hydrophobic contacts with the protein. Adjust the view slightly so you are looking down its axis. Control or right-click on the helix backbone and select TWEAK. Click and hold the purple arrow to rotate counter-clockwise to generate a clash clearing bonus +860. It actually won't let you rotate in the clockwise direction, which would mess up the loop. Click Stop, shake, and then wiggle. Click the pop-up messages and click and drag on the background so you can see the arrows.

Science: Activation of a GPCR involves conformational rotation about the helix axis of one or more of the seven transmembrane helices that sit in the membrane. Recall from biochemistry that alternating register patterns of polar and nonpolar sidechains (hydropathy plot or helical wheel diagram) can be used to predict secondary structure or elements that cross the membrane. E.g., leucine zippers are two aligned helices that each have a hydrophobic leucine residue repeated every seventh residue, or about every two turns of the helix (an alpha helix has 3.6 residues per turn). The patch of leucines on each helix zips the structure together.
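A helical wheel can be mimicked numerically by placing each residue at an angle around the helix axis, assuming an ideal alpha helix with 3.6 residues per turn (100 degrees per residue). The sketch below measures how one-sided the nonpolar residues are. The peptide sequence, the simplified set of "hydrophobic" letters, and the function name are all hypothetical choices for illustration.

```python
import math

# Simplified, assumed set of nonpolar one-letter codes
HYDROPHOBIC = set("AVILMFWC")

def hydrophobic_face(sequence, degrees_per_residue=100.0):
    """Return (strength, direction_deg) of the hydrophobic face on a helical wheel.

    strength is 0 when nonpolar residues are spread evenly around the wheel
    and 1 when they all sit at the same angle.
    """
    angles = [math.radians(i * degrees_per_residue)
              for i, aa in enumerate(sequence) if aa in HYDROPHOBIC]
    if not angles:
        return 0.0, None
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(x, y), math.degrees(math.atan2(y, x)) % 360

peptide = "LEELKKK" * 3   # hypothetical repeat with leucines at positions 1 and 4 of 7
strength, direction = hydrophobic_face(peptide)
print(f"strength = {strength:.2f}, face centered near {direction:.0f} degrees")
```

A strength close to 1 means the nonpolar residues cluster on one side of the helix, which is the signature of an amphipathic helix like the one rotated in puzzle 4-1.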

4-2 Flippin' Sheets: Click and drag on the background to rotate the view until you find the strand that needs to be flipped (it has 3 orange nonpolar sidechains on the outside and 3 blue polar residues on the inside). Right-click on the backbone of this strand and select Tweak. Click the arrow in the direction that the strand needs to shift to align with the existing beta sheet (otherwise the strand will stick even further out from the sheet). Stop, shake, and wiggle.

4-3 Hydrophobic Disaster: For this open-ended puzzle, I chose to freeze the three-stranded sheet and draw two rubber bands to the fourth beta strand. Shake, wiggle, unfreeze, and wiggle.

4-4 The Right Rotation: Rotate the view until you are looking down the helix axis. Right-click the amphipathic helix backbone and select Tweak. Click and hold on the arrow until the orange phenylalanine rotates into the core of the protein. Make sure you rotate in the direction that does not distort the loop connecting the helices. Release the button before the other orange sidechains protrude out the other side of the protein. Stop, shake, and wiggle. You may need to adjust the clashing importance under Behavior to keep the structure moving. Stop and shake again. Video - solved puzzle

_________________________

LEVEL 5 - Tools and Types Medium


5-1 Quest to the Native: Right-click the helix and select ALIGN GUIDE to overlay it with the known structural template in shadow. Next, shift click and drag from the middle of a beta strand to the point in space where its shadow is. You will have to rotate the view in different orientations to verify that it landed on the shadow (you can adjust the band by dragging the end knob). Repeat for the other two beta strands (look at the chain connectivity to make sure you are not mixing up the strands). Hit w and wiggle should pull the 3-stranded beta sheet together.

Next, use shift click and drag across the remaining random coil backbone and the helix. Wiggle should fold up the coil, getting you close in score. To complete the puzzle you may need to select Actions > Disable Bands and shake/wiggle one last time.

Science: Whenever we are comparing two structures of the same protein (whether from different experiments or active vs. inactive conformational states), we first must align the part of the protein that is the same in both structures. The computer does a root-mean-square deviation (RMSD) alignment. Just like computing the standard deviation of a data set, the second protein is translated and rotated by the computer until the root-mean-square of the deviations in all the atomic positions is minimized. The resulting minimum RMSD between the two structures will usually be a few angstroms, meaning on average each atom moves a few bond lengths in space. Structural alignment is useful for predicting the structure of a new protein from that of a closely related gene (homolog).
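The translate-then-rotate idea behind an RMSD alignment can be sketched in two dimensions, where the best rotation angle has a simple closed form. This is a deliberate simplification: real structural alignment works in 3D (typically with an algorithm such as Kabsch superposition), and the coordinates and function names below are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 2D points."""
    n = len(coords_a)
    total = sum((ax - bx) ** 2 + (ay - by) ** 2
                for (ax, ay), (bx, by) in zip(coords_a, coords_b))
    return math.sqrt(total / n)

def center(coords):
    """Translate points so their centroid sits at the origin."""
    n = len(coords)
    cx = sum(x for x, _ in coords) / n
    cy = sum(y for _, y in coords) / n
    return [(x - cx, y - cy) for x, y in coords]

def superpose_2d(mobile, target):
    """Translate and rotate `mobile` onto `target`; return the minimum RMSD."""
    a, b = center(mobile), center(target)
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(cross, dot)          # rotation angle that minimizes RMSD
    c, s = math.cos(theta), math.sin(theta)
    rotated = [(c * x - s * y, s * x + c * y) for x, y in a]
    return rmsd(rotated, b)

# Hypothetical "structure" and a copy rotated by 90 degrees and shifted
target = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.5), (4.5, 0.0)]
mobile = [(10.0 - y, 5.0 + x) for x, y in target]
print("RMSD before alignment:", rmsd(mobile, target))
print("RMSD after alignment: ", superpose_2d(mobile, target))
```

Because the second set of points is just a moved copy of the first, the RMSD drops to essentially zero after superposition. For two genuinely different conformations, the value left over after alignment is the structural difference that cannot be removed by rigid-body motion.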

5-2 Movin' Along: Translation and rotation in 3D space is important for the binding of two molecules together. Left click the center of one of the helices to bring up the pink arrows for the Move Tool. Hover your mouse over them to highlight the arrows, then right-click or control-click and drag to move the helix exactly onto its shadow. This should work if you use the default view, shake, and wiggle. Otherwise, you may try left-click and drag to rotate the helix. You will then need to click off the helix on the background and drag to rotate the view to ensure the helix is completely overlaid; repeat a few times for each helix. Once the dimer is formed, shake and wiggle will complete the puzzle. The solution only requires a subtle change to the view. If it doesn't overlay perfectly, it is more efficient to keep resetting the puzzle and try short runs.

5-3 Electron Density: Bring up the Electron Density window under the Actions menu. Increase the Threshold until you can just make out the strand and helix elements. You should see a shadow where the fourth strand needs to be dragged into to complete the beta sheet. Shift-click on the middle of the beta strand out in space to draw a rubber band to connect it to the middle of the third beta strand in green. Rotate the structure to make sure your band is on the middle of the third and fourth beta strands. Press 'w' to wiggle and the entire protein should eventually fold up to complete the puzzle. Hit 'w' again to stop and score the result when it turns orange! You can select Electron Density > None or Wireframe so you can see what is happening to the protein when you try wiggle. Wiggle is using RMSD to minimize the structure onto the density, but the problem is that all atoms are treated equally and the strand doesn't know how to get there without your rubber band restraint.

Science: Mapping the electron density is actually how the structure of a protein gets solved experimentally. High resolution maps are solved by crystallizing a protein out of solution. The regularly ordered crystal will diffract X-rays in a pattern that predicts the spacing between atoms. Lower resolution maps can also be constructed for larger quaternary protein complexes using electron microscopy.

5-4 Remix: (Easy) First shake and wiggle the whole protein to get ~8400 points. Now, the folding process can be simplified by optimizing pieces of the structure separately. The loop section between the sheet and helix is dark orange rather than green, meaning it needs fixing. Control or right-click on the backbone loop and select wiggle. Notice how the loop barely moves during the minimization.

Next, control/right click on the backbone loop and select Remix. Instead of minimizing, after a few seconds the Remix tool generates a library of random, completely different, conformations for you to choose from. Click the right arrow to cycle through the library. Because you are changing the backbone without relaxing the sidechains, your score at the top will likely decrease. The number under the remix tool estimates what your score would be after shake and wiggle. Cycle to a structure in which both scores are relatively high. Then click stop, shake, and wiggle!

Science: Why don't we just use Remix to search all the possible bond rotations and fold the entire protein? In a 100-residue protein, there are 198 phi and psi torsion angles that can be in one of 3 conformations (see the Ramachandran plot for phi and psi angles corresponding to alpha helix, beta sheet or turn regions). This results in 3^198 ≈ 10^94 total possible protein conformations, an astronomical number. Searching all possible conformations would take far longer than the age of the universe! This was posed in 1969 as Levinthal's paradox, and is taken as evidence for the funnel-like nature of the underlying energy landscape. Rather than searching all conformational space, the unfolded chain folds rapidly on biological timescales (millisecond) driven by the strength of locally attractive sequences of native contacts.
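The size of this search problem is easy to check with a few lines of arithmetic. The sketch below counts the conformations for a 100-residue chain with three states per torsion and compares an exhaustive search time with the age of the universe. The sampling rate of one conformation per picosecond is an assumed, generous figure used only to set the scale.

```python
import math

n_residues = 100
n_torsions = 2 * (n_residues - 1)        # 99 phi + 99 psi = 198
states_per_torsion = 3
n_conformations = states_per_torsion ** n_torsions

# Assumed sampling rate: one conformation per picosecond
rate_per_second = 1e12
search_seconds = n_conformations / rate_per_second
age_of_universe_s = 13.8e9 * 365.25 * 24 * 3600   # roughly 4.4e17 seconds

print(f"torsion angles: {n_torsions}")
print(f"conformations:  about 10^{math.log10(n_conformations):.1f}")
print(f"search time is about 10^{math.log10(search_seconds / age_of_universe_s):.0f} "
      "times the age of the universe")
```

Even at this optimistic rate the exhaustive search outlasts the universe by dozens of orders of magnitude, which is the heart of Levinthal's argument that folding cannot be a random search.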


_________________________

LEVEL 6 - Sequences Easy


6-1 Structure and idealize: So far in Foldit we have been in what's called Pull Mode, where your mouse can manually pull and drag parts of the structure relative to each other. Sometimes you will need to assign or reassign the secondary structure. Notice how one end of the structure starts with the backbone as a thick helix, but most of the peptide is unassigned (a thin random coil). Note how the full length of the backbone is orange, which means it needs refining. Hit the 2 key or select Modes > Structure Mode. In Structure Mode, click and hold your mouse on the helix and drag across the rest of the backbone chain. It should turn into a thick yellow tube, meaning we assigned it to be alpha helix secondary structure. Hit the 1 key to enter Pull Mode, right-click on the backbone, and select Ideal SS. This should redraw the entire peptide as a perfect helix, completing the puzzle.

Science: Difficult structure prediction problems are often bootstrapped by first predicting the secondary structure of local segments. For each 5-7 residue amino acid combination, a statistical database (PDB) is used to predict whether the sequence is most likely to adopt a sheet or helix secondary structure.

6-2 Basic threading: Select Actions > Show Alignment. Click the play button to thread your protein onto the structural template. Start and stop shake, then wiggle until your score is achieved and stop.

Science: Threading refers to aligning (via sequence) your protein with a homologous protein of known structure. Using the template you can arrive at a reasonably folded structure if the two proteins are sufficiently similar (~30% sequence identity). Because your protein has different amino acids (and may contain gap deletions or insertions that result in variable loops), it will need to be relaxed after threading.
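Sequence identity between a query and a template can be computed directly from an alignment. The sketch below counts identical residues over the aligned (non-gap) columns. The two short sequences are invented for illustration, and note that conventions for the denominator vary between tools, so this is one reasonable definition rather than the only one.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Hypothetical pre-aligned sequences; '-' marks an insertion or deletion
query    = "MKT-AYIAKQR"
template = "MKSLAYLA-QR"
print(f"identity: {percent_identity(query, template):.1f}%")
```

Here 7 of the 9 gap-free columns match. Columns containing a gap correspond to the insertions or deletions that show up as variable loops after threading.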

6-3 Cut and Paste: Close the blue cut in the protein backbone by wiggling until it turns yellow. Then stop and click the yellow cut.

6-4 Alignin' sequences: Now try aligning the single-letter amino acid code sequences manually. Click on one letter, then shift click on a later letter to select an entire segment and move it using left/right arrow keys. Insert gaps where needed to obtain the best alignment (there should be two long white shaded bars with a 4-residue gap between them). Note: selecting one residue and moving it right will shift all residues after it. Click the play button to thread the sequence on to the template structure. It will ask you to do a very brief wiggle, stop, and then click the yellow gap to close the cut. Finally, do a good shake and then wiggle until your score is obtained. You may need to shake and wiggle a second time.

_________________________

LEVEL 7 - Conditions Easy


7-1 Make Five Bonds: Rotate the view toward you so you're looking down on the antiparallel beta sheet. Click and drag the end of the lower beta strand a tiny bit closer to the other and you should form five hydrogen bonds, scoring bonus points. No need to shake or wiggle. In design puzzles, conditions help direct solutions toward specific research goals.

7-2 Disulfide bonds: Two cysteine sidechains in proximity can form a covalent sulfur linkage (-S-S-) known as a disulfide bridge (green-yellow rod). Click and drag on the background to find the other two cysteine pairs that are close to each other (red backbone). Zoom in and rotate the protein core into view by holding shift and then click-drag left. Now, click and drag on each yellow sidechain atom to pull it closer in order to form a covalent bond (the yellow thiol group will rotate about the orange methylene -CH2-).

Science: Disulfides do not form in the reducing environment of the cytosol, but are important for secreted hormones such as insulin.

7-3 Combo conditions: (Could be difficult) Recall that a thick tube means alpha helix secondary structure has been assigned to a section of the backbone. In order to convert the backbone into an ideal helix structure, rotate the elongated tube segment into view, right-click on the tube, and select Ideal SS to draw a perfectly straight helix. Satisfying the first condition gives +460 bonus points.

Next, the core existence condition wants us to bury the hydrophobic sidechains, turning the backbone orange. Click and drag on the background to rotate the view (control and drag to translate the entire protein into the frame). Notice how the beta strand is about the same length as the helix, suggesting that they should align. Use shift and double-click-drag to draw a rubber band from the free end of the helix to the free end of the strand. Draw one more band between the middles of the two structures (make sure the band is actually on the peptide; you will hear a nice clicking sound). Wiggle, stop, shake, then wiggle. For me this actually turned the score yellow then blue. Under the Undo menu, select Restore very best and you will have solved the puzzle. You can also check the Show box under Conditions satisfied for Core Existence. Residues that are buried will be highlighted green. Select Actions > Disable Bands at the bottom of the window, then keep pulling parts of the structures on top of each other until you get the beta strand's orange sidechains to pack onto the helix.

Potential Problem: You want the sidechains of the strand to be close to the helix and buried. If the backbone of the beta strand is too close to the helix, the sidechains will actually be pointing away. Ask for help if you can't get it after a few tries.

_________________________

LEVEL 8 - Blueprint Easy


8-1 Hello Blueprint: Actions > Blueprint gives you access to building blocks of specialized loop structures that connect elements of secondary structure. Drag the blueprint panel to the lower left of the window in order to view the protein. Drag the second of the helix-helix connectors in the Building Blocks window on the right over to the intervening -D-L- sequence between the two alpha helices in the Blueprint window. Wiggle and score. Note how linkers of different geometry can result in better scores.

8-2 More Building Blocks: Building blocks place predefined geometry constraints on bond torsions in the loop regions. Analogous to rubber bands, these constraints can help you guide the structure early in the refinement but will end up impeding you from gaining points later on. For all of the building blocks in the sequence, click and drag the yellow boxes off the blueprint screen to remove the torsional constraints. Wiggle will then solve the puzzle.

Science: Notice how we used the term constraints here instead of restraints. In mathematics, a constraint is an absolute restriction imposed on the calculation, while a restraint is a bias that tends to force the calculation to a certain restriction. We can constrain a bond angle to an exact value in the internal molecular coordinates, but in terms of absolute atom coordinates we can only restrain the distance between two objects in three dimensions.
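
The difference between a restraint and a constraint can be sketched numerically. The following is a minimal illustration, not Foldit's scoring code: it assumes a harmonic (spring-like) restraint with a hypothetical force constant k, and a constant outward pull standing in for all the other forces on the atoms. The function names are made up for this example.

```python
def restraint_energy(distance, target, k):
    """Harmonic restraint: the penalty grows with the squared deviation."""
    return 0.5 * k * (distance - target) ** 2


def restrained_equilibrium(target, k, pull):
    """Distance that minimizes restraint_energy(d) - pull * d.

    Setting the derivative k * (d - target) - pull to zero gives
    d = target + pull / k, so a restraint only biases the distance.
    """
    return target + pull / k


def constrained_value(target):
    """A constraint fixes the value exactly, whatever else is pulling."""
    return target


if __name__ == "__main__":
    # A weak spring lets the distance drift; a stiff spring holds it close.
    for k in (1.0, 10.0, 1000.0):
        print(k, restrained_equilibrium(target=3.0, k=k, pull=5.0))
    print("constraint:", constrained_value(3.0))
```

As k becomes very large the restrained distance approaches the target, which is why a very stiff restraint behaves almost like a constraint.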

8-3 De-Novo: the Final Challenge: Select Actions > Blueprint and drag red helix building block segments at the very top over to the Blueprint sequence to elongate an ideal alpha helix. You can also click and drag this helix in the Blueprint window across the entire sequence. Wiggle and stop to solve the structure. Or, click Idealize SS. Lastly, you could drag a loop into the middle of the sequence and Wiggle. Blueprint enables you to build your own protein piece by piece!

_________________________

LEVEL 9 - Protein Design Easy


9-1 Intro to Design: Select Modes > Design Mode (or press the 4 key), and click each bright orange segment to make a point mutation. First, rotate the view to identify the bright orange sidechains (aromatic hydrophilics).

Q7: What are the two sidechains that you are mutating? What residues did you put in to reduce size yet conserve polarity/properties?

Hint: pick a smaller aromatic hydrophilic for each. You may need to shake the resulting structure, but wiggle is not necessary.

9-2 Swappin' sidechains: Press 4 to enter Design Mode and click each of the two bright yellow segments to put in one of the bigger hydrophobic sidechains (in the outer orange ring).

Hint: try combinations of Ile, Leu, and Val. Clusters of these three branched chain amino acids (BCAAs) commonly stabilize the hydrophobic core of proteins and are sold as a supplement for promoting muscle recovery after a workout.

9-3 Mass Mutate: Press 4 and click on each yellow segment to put in five small hydrophilic residues (inner blue ring). Make sure they are small enough (e.g. T, N, Q) so that you don't introduce any clashes. Or, click the Actions > Mutate button to design residues automatically.

Q8: Why should proteins be more picky about hydrophilic sidechains being put in/near the protein core than swapping hydrophobic residues?

9-4 Insertion and Deletion: Get rid of the thin red line (violated constraint) for the two pieces of backbone that should be together. In Design Mode, right-click the middle of the helix and insert two residues there, wiggle, stop, and select Actions > Mutate. Then, mutate the residue whose backbone is shown in red to a hydrophobic residue. Wiggle and solve the puzzle.

_________________________

LEVEL 10 - More Molecules Easy


10-1 Ligand Debut: Click on the ligand to show the purple arrows and reveal the Move Tool. Right-click and drag to translate the ligand just a little (1 Angstrom) to the right in the direction of its binding partner. Click on the background to get your score. If the hydrogen bonds are formed, wiggle to complete the puzzle. Both arginine sidechains hydrogen bond to the red oxygen atoms.

Science: The placement of one molecule relative to another is defined by six numbers: a displacement vector giving the x, y and z translations, plus three orientation angles (the Euler angles). In aviation the three orientation angles are called yaw, pitch and roll. By contrast, spherical coordinates such as latitude/longitude use only two angles, which specify a direction but not the rotation about that direction.
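
The six numbers that place a rigid ligand can be turned into new atom coordinates as follows. This is a sketch under assumptions: it uses the yaw-pitch-roll convention (rotate about x, then y, then z), which is only one of several Euler angle conventions, and the function names are invented for this example.

```python
import math


def rotation_matrix(yaw, pitch, roll):
    """3x3 matrix Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def place(point, translation, yaw, pitch, roll):
    """Rotate one atom position about the origin, then translate it."""
    rot = rotation_matrix(yaw, pitch, roll)
    rotated = [sum(rot[i][j] * point[j] for j in range(3)) for i in range(3)]
    return tuple(rotated[i] + translation[i] for i in range(3))


if __name__ == "__main__":
    # Turn an atom on the x axis by 90 degrees of yaw, then shift 1 Angstrom in x.
    print(place((1.0, 0.0, 0.0), (1.0, 0.0, 0.0), math.pi / 2, 0.0, 0.0))
```

Applying the same translation and angles to every atom of the ligand moves it as a rigid body, which is what the Move Tool does interactively.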

10-2 Ligand Constraints: Use the Move Tool to control-drag the ligand slightly closer in the direction of its binding pocket. It's hard to see, but you just need to shorten the thin red lines a little bit. These lines are constraints, because we know these functional groups in the drug pharmacophore must line up with atoms in the protein binding site. Then Wiggle Sidechains will do the rest, relaxing the protein sidechains while allowing the ligand to diffuse completely into its binding pocket. Stopping should solve the puzzle. If not, reset the puzzle and try dragging the ligand a different distance. The purple atom should dock to the histidine H-bond acceptor (red). The terminal phenyl ring should dock to the tryptophan H-bond donor (blue). Wiggle Sidechains does more than Shake Sidechains by using molecular dynamics rather than energy minimization while slowly reducing the temperature (known as simulated annealing).
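
The idea of simulated annealing can be illustrated with a toy sketch. This is not Foldit's algorithm: it swaps molecular dynamics for simple Metropolis Monte Carlo moves, and the one-dimensional energy function, step size, and temperature schedule are all made-up assumptions chosen only to show how a slowly falling temperature lets the system escape local energy minima early on and settle later.

```python
import math
import random


def energy(x):
    """Made-up rugged landscape: a broad bowl with ripples (local minima)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def anneal(start, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Return the lowest-energy position visited while cooling."""
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    x = best = start
    for i in range(steps):
        # Exponential cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        trial = x + rng.uniform(-0.5, 0.5)
        delta = energy(trial) - energy(x)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = trial
            if energy(x) < energy(best):
                best = x
    return best


if __name__ == "__main__":
    x = anneal(4.0)
    print(x, energy(x))
```

At high temperature most uphill moves are accepted, so the search roams widely; at low temperature it behaves almost like plain energy minimization.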

Science: Computer-aided drug design consists of screening thousands of chemical compounds or molecular fragments for their ability to dock into a protein structure. Flexibility within both the protein target and the drug molecule (ligand) is important in recognition (docking), but it is also the rate-limiting part of the computer calculation and limits how many drugs or targets we can practically screen. Drug discovery software automates this process by employing libraries of many rigid drug and receptor conformations and randomly placing them at intervals on a discrete xyz grid in space.
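
Placing a rigid ligand on a discrete xyz grid can be sketched as follows. This is a toy under stated assumptions: for simplicity it scans every grid point rather than sampling them at random, and the score is just the squared distance to a hypothetical pocket center, not a real docking score. All names and numbers are invented for the illustration.

```python
import itertools


def grid_points(lo, hi, spacing):
    """All (x, y, z) points on a cubic grid from lo to hi inclusive."""
    n = int(round((hi - lo) / spacing)) + 1
    axis = [lo + i * spacing for i in range(n)]
    return list(itertools.product(axis, repeat=3))


def toy_score(position, pocket):
    """Squared distance to the pocket center; lower is better."""
    return sum((p - q) ** 2 for p, q in zip(position, pocket))


def best_placement(pocket, lo=-2.0, hi=2.0, spacing=1.0):
    """Grid point with the best (lowest) toy score."""
    return min(grid_points(lo, hi, spacing),
               key=lambda pos: toy_score(pos, pocket))


if __name__ == "__main__":
    print(len(grid_points(-2.0, 2.0, 1.0)), "grid points")
    print(best_placement((0.9, -1.2, 0.1)))
```

Halving the grid spacing multiplies the number of points by eight, which hints at why finer sampling and added flexibility quickly become the bottleneck.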

Q9: If computers are algorithmic and thus predictable, how can they do anything at random?

Computers use an algorithm that takes a large integer as input, called the 'seed', and then uses a complicated function to return a long sequence of large numbers that appear to be random. For instance, a sequence of algebraic operations can be applied and then we keep only the decimal places of the result. Using the same seed in the pseudorandom number generator actually produces the same sequence of numbers, so a unique seed needs to be specified each time the program is run. Usually the system time is used as the seed, and a function is chosen that gives a sequence of decimals between 0 and 1. If the number generator is perfect, all values will have an equal probability of being produced (uniformly distributed).

csh

set seed=`date +%s` # current time in seconds elapsed since 1970

echo $seed

# The remaining lines are pseudocode, not csh commands:

number = float(seed) # convert the integer seed to a decimal number

loop_begin:

number = randomfunction(number) # apply function to generate the next pseudorandom number in the sequence

randy = number - int(number) # keep only the fractional part, a value between 0 and 1

print randy

goto loop_begin # repeat for as many random numbers as needed
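
The seed-and-iterate scheme can be made concrete with a runnable sketch in Python. It uses a linear congruential generator, one of the simplest pseudorandom functions, with commonly quoted textbook constants; these constants and the function names are choices made for this illustration, and real software uses more sophisticated generators.

```python
import time

# Textbook linear congruential generator constants (an illustrative choice).
MODULUS = 2 ** 31
MULTIPLIER = 1103515245
INCREMENT = 12345


def lcg(seed):
    """Yield an endless stream of pseudorandom decimals in [0, 1)."""
    state = seed % MODULUS
    while True:
        # Each new state depends only on the previous one.
        state = (MULTIPLIER * state + INCREMENT) % MODULUS
        yield state / MODULUS


def first_values(seed, count):
    """Convenience helper: the first `count` numbers for a given seed."""
    gen = lcg(seed)
    return [next(gen) for _ in range(count)]


if __name__ == "__main__":
    seed = int(time.time())  # seconds elapsed since 1970, like `date +%s`
    print(seed, first_values(seed, 3))
    # Reusing a seed reproduces the sequence exactly.
    print(first_values(42, 3) == first_values(42, 3))
```

Running the script twice in different seconds gives different numbers, while a fixed seed gives identical numbers every time, which is useful when a simulation has to be reproduced.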

10-3 DNA pairing: Select Modes > Design Mode and click on the DNA residues to mutate them. You need to match the color in the base pairs (purple or yellow), and select a double purine ring if pairing with an existing single ring or a single pyrimidine ring if pairing with a fused ring. At the outset, only the base pairs with a purple base are mismatched, and you should click on the yellow/pink base to mutate it into a purple purine (G) or pyrimidine (C).
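
The pairing rule in this puzzle (one fused-ring purine opposite one single-ring pyrimidine, G with C and A with T) can be written as a short check. This is a minimal sketch with invented names; it encodes only the standard Watson-Crick pairs and says nothing about Foldit's color scheme.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}  # two fused rings; T and C are single-ring pyrimidines


def is_watson_crick(base1, base2):
    """True if the two bases form a standard A-T or G-C pair."""
    return COMPLEMENT.get(base1) == base2


def ring_type(base):
    """Describe the ring system of a DNA base."""
    if base in PURINES:
        return "purine (two fused rings)"
    return "pyrimidine (single ring)"


def fix_partner(fixed_base):
    """Base to mutate in opposite a base that cannot be changed."""
    return COMPLEMENT[fixed_base]


if __name__ == "__main__":
    for pair in ("GC", "GT", "AT"):
        print(pair, is_watson_crick(pair[0], pair[1]))
    print("opposite G put", fix_partner("G"), "-", ring_type(fix_partner("G")))
```

Every valid pair contains exactly one purine and one pyrimidine, which keeps the width of the double helix constant.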

10-4 DNA and protein: Transcription factors are proteins that bind specific sequences of DNA (turning on or off gene expression) by reading out the bases exposed by the major groove of the double helix.

Step 1) Enter Design Mode and mutate in a polar lysine sidechain where it asks for a long hydrophilic residue. The epsilon amino group will donate a hydrogen bond (white-blue rod) to the sp2 lone pair nitrogen on a guanine base (purine with two fused rings).

Step 2) Mutate in an asparagine where it asks for a short, forked hydrophilic. The Asn amide group forms two hydrogen bonds with the purine adenine. Note how the H-bond donor and acceptor groups align at 120 degree angles. This is due to the sp2 planar geometry/orbital hybridization. The Lys-Asn-Gln sidechains of the protein use hydrogen bonds with nucleobases to form a tight binding interaction with this specific GAG DNA sequence.

That's it for the Intro Puzzles! Now that you have all the tools, you are ready for an online competition. Foldit uses crowdsourcing to solve the protein folding problem by having players across the globe compete to solve structures for which there currently exists no experimental solution. Click on one of the available 'Science Puzzles' to see what a real fold prediction problem looks like. See how far you can rank up in just a few minutes by dragging the protein, using Shake and Wiggle, etc.

To close the Foldit application, under Menu select Save and Exit.

...Tomorrow, we will be learning a visualization program for structural biology data files: Tutorial 6.