Here is a word cloud generated from my papers over the last two years. It does a decent job explaining the kind of work I do and the kinds of techniques I prefer.


Science

I am interested in developing widespread predictive power in biology and bioengineering. I want to leverage the rapid technical advances in genomic sequencing, systems biology, and synthetic biology to develop the kind of far-reaching predictive power seen only in physics today. This necessarily involves working at the intersection of two traditionally disparate fields: theoretical/computational biology and experimental molecular biology.

To that end, I generally use a biophysical approach to complement bioinformatic, genomic, and other computational tools to understand organismal evolutionary constraints at the molecular level. In the future, I would like to apply this newfound predictive power to technological ends.

Software
In addition to my scientific research interests, I am interested in scientific software development. My language base covers Python, R, Bash, and C++, as well as occasional Mathematica, Matlab, and Tcl/Tk. The Wilke lab also maintains a public git repository of the one-off tools we've produced for various projects: https://github.com/clauswilke/WilkeLabProteinEvolutionToolbox. Beware, it is not a complete listing of all the scripts we've produced. We also have a tutorials page that covers some of the basics: http://wilke.openwetware.org/Tutorials.html. I have yet to fill out the structural biology segments, but that will happen eventually.

If there is any piece of software that seems related to what we do but is not in the repository, I may have it lying around; don't be afraid to ask.

Current Work


Geometrical constraints predict adaptive evolution in influenza hemagglutinin

I am currently writing this manuscript and I will post figures as they are completed.

Identifying evolutionary constraints with a biochemically meaningful protein model
This work is detailed in a manuscript currently under review at Science. I will post more details here as it becomes possible.

Past Work in the Wilke Lab

Predicting viral evolution
By combining techniques from bioinformatics, statistical mechanics, biochemistry, molecular biology, and genetics, it may be possible to predict the path of viral escape from host challenges. To accomplish this we can narrow the field of possible mutations with evolutionary intuition and bioinformatics, build a data set for training an in silico evolutionary algorithm, and test our resulting predictions experimentally.

Our first foray into the field was to use steered molecular dynamics to pull apart the two proteins in a complex several dozen times and calculate the force required to successfully separate the complex.

The curve is generated with an anchored receptor and a force applied to the viral spike protein in our test system. The plot shows the average interpolated force over the replicas, with the p-value for the difference in maximum force in the bottom left. This work combines folks from the Wilke, Ellington, and Sawyer labs.
This article has been published in PeerJ and is available here.
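The averaging and comparison described above can be sketched roughly as follows. This is a minimal illustration, not the published analysis: the function names, the toy numbers, and the choice of a permutation test are all assumptions made here (the text does not say which statistical test produced the p-value). Each replica is taken to be a pair of lists, pulling distance and force, which is interpolated onto a shared grid before averaging.

```python
import random
from statistics import mean


def interpolate(xs, ys, grid):
    """Linearly interpolate (xs, ys) onto an ascending grid.

    xs must be strictly increasing; grid points outside the sampled
    range are clamped to the first or last force value.
    """
    out = []
    j = 0
    for g in grid:
        if g <= xs[0]:
            out.append(ys[0])
            continue
        if g >= xs[-1]:
            out.append(ys[-1])
            continue
        # advance to the segment [xs[j], xs[j + 1]] that contains g
        while xs[j + 1] < g:
            j += 1
        t = (g - xs[j]) / (xs[j + 1] - xs[j])
        out.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return out


def mean_force_curve(replicas, grid):
    """Average the interpolated force over all replicas at each grid point."""
    curves = [interpolate(xs, ys, grid) for xs, ys in replicas]
    return [mean(column) for column in zip(*curves)]


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in means of a and b.

    Assumption: this stands in for whatever test was actually used.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical toy replicas: (distance, force) pairs, arbitrary units.
    replicas = [
        ([0.0, 1.0, 2.0], [0.0, 10.0, 4.0]),
        ([0.0, 1.0, 2.0], [0.0, 6.0, 2.0]),
    ]
    grid = [0.0, 0.5, 1.0, 1.5, 2.0]
    print(mean_force_curve(replicas, grid))

    # Hypothetical maximum forces per replica for two variants.
    max_a = [10.0, 11.0, 12.0, 10.5]
    max_b = [6.0, 7.0, 6.5, 5.5]
    print(permutation_p_value(max_a, max_b))
```

Interpolating first matters because replicas rarely report force at identical distances; a permutation test was chosen for the sketch only because it makes no distributional assumptions about the maximum forces.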


Biophysical constraints on protein coding sequence evolution
On a whole genome scale, a direct correlation between relative solvent accessibility (RSA) of sites (amino acids) in proteins and the rate of evolution (measured as rate of non-synonymous versus synonymous substitutions) at that site was established previously here, providing a reasonable model for (relatively) neutral mutations.

By fitting the rate versus RSA curve, one can find sites that differ significantly from neutrality and then compare those predictions to the current state-of-the-art methods for functional prediction. Results from this work were published in two separate articles appearing here in Molecular Biology and Evolution and here in Philosophical Transactions of the Royal Society B.

Red points represent amino acid sites identified in other studies of Hemagglutinin H1. Points that do not fall between the dashed lines show significant deviation from neutrality. Sites above the dashed line in the preceding plots.
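The idea of fitting rate against RSA and flagging sites that fall outside a band can be sketched as below. This is only an illustration under stated assumptions: a straight-line least-squares fit and a cutoff of k residual standard deviations are choices made here, not the model or threshold from the published articles, and the function names are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(rsa, rate, k=2.0):
    """Return (site index, residual) for sites far from the fitted line.

    A site is flagged when its residual exceeds k residual standard
    deviations in absolute value. Assumption: k = 2.0 is an arbitrary
    illustrative cutoff. A positive residual means the site evolves
    faster than its RSA alone would suggest.
    """
    slope, intercept = fit_line(rsa, rate)
    residuals = [r - (slope * x + intercept) for x, r in zip(rsa, rate)]
    # residual standard deviation with n - 2 degrees of freedom
    sd = sqrt(sum(e * e for e in residuals) / (len(residuals) - 2))
    return [(i, e) for i, e in enumerate(residuals) if abs(e) > k * sd]


if __name__ == "__main__":
    # Hypothetical toy data: rate rises with RSA, one site is elevated.
    rsa = [i / 10 for i in range(10)]
    rate = list(rsa)
    rate[5] += 1.0
    for site, residual in flag_outliers(rsa, rate):
        print(site, round(residual, 3))
```

Returning the signed residual keeps the distinction between sites above the band (faster than expected) and below it (more conserved than expected).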

Past Work in the Barrick Lab

Population evolutionary simulator
Written in C++, bpopsim can output several useful statistics regarding organismal evolution. It may be reasonable to compare these to something like the relative frequency of simultaneously sweeping mutations in a laboratory evolution experiment.

Below are some preliminary results from my test runs with bpopsim. The genotype frequencies plot was assembled and plotted in R.

Blue tones show the line of descent to the final dominant genotype. Gray tones represent all lineages that died out.

Past Work in the Sutton Lab

My master's work, in structural biology, was completed in the lab of Dr. Bryan Sutton at Texas Tech University Health Sciences Center in Lubbock, TX. I crystallized several novel constructs of the human proteins Synaptotagmin-1 and Dysferlin; I also used fluorescence spectroscopy to measure the potential for calcium binding of each domain in the proteins.