Very large amounts of data are now being produced by high-throughput sequencing, gene expression studies, genome-wide identification of protein binding sites and, more recently, measurements of the probability that two sites in the genome come into contact in 3D space. These large data sets need novel methods of analysis to identify significant patterns, and the results can be correlated with those obtained by complementary techniques. For example, one can look for a correlation between the binding of a given protein to the DNA and the probability that a nearby gene is expressed. A pattern is significant if it has a low probability of arising by chance and can be associated with a specific phenotype, or observable cellular property. Such a pattern carries information about, for example, how a cell responds to a change in its environment.
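As an illustration only, the idea of a pattern having "a low probability of being created by chance" can be sketched with a permutation test. The example below asks whether genes near a bound site are expressed more often than genes near an unbound site, then shuffles the expression labels many times to estimate how often a difference that large arises by chance. The data are invented toy values, the function names are hypothetical, and the permutation test is just one common choice, not necessarily the method used in our analyses.

```python
import random


def expression_rate_difference(bound, expressed):
    """Fraction of expressed genes near bound sites minus the fraction near unbound sites."""
    near_bound = [e for b, e in zip(bound, expressed) if b]
    near_unbound = [e for b, e in zip(bound, expressed) if not b]
    if not near_bound or not near_unbound:
        raise ValueError("need at least one bound and one unbound gene")
    return sum(near_bound) / len(near_bound) - sum(near_unbound) / len(near_unbound)


def permutation_p_value(bound, expressed, n_permutations=10000, seed=0):
    """Two-sided permutation test for an association between binding and expression."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = expression_rate_difference(bound, expressed)
    shuffled = list(expressed)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling breaks any real link between binding and expression.
        rng.shuffle(shuffled)
        if abs(expression_rate_difference(bound, shuffled)) >= abs(observed):
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy data (invented): 20 genes, 1 = protein bound nearby / gene expressed.
bound = [1] * 10 + [0] * 10
expressed = [1] * 9 + [0] * 1 + [1] * 2 + [0] * 8

observed, p_value = permutation_p_value(bound, expressed)
print(f"difference in expression rate: {observed:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value here says only that the association is unlikely to be a chance effect in this toy data set. Relating it to a phenotype still requires the complementary experiments described above.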
Our collaborators, who are physicists and mathematicians, build theoretical models of the complex biological systems we study. These models help us to better understand our results and to make predictions for future experiments.