There is currently an urgent societal need to better match individual organisms - such as oysters, seagrass, corn, or trees - to the local environment for restoration, breeding, and farming. Many species have evolved substantial genetic variation among individuals, which causes some individuals to grow and survive better than others in a specific environment. By harnessing this genetic information, managers can better match individuals to environments. These basic principles underlie the idea of genomic forecasting, which is a general term for models that are used to predict how an organism will perform in a different environment based on its genotype (DNA sequence).
Many genomic forecasting models have recently been developed, but the field does not fully understand the conditions under which they are accurate or how they are influenced by neutral evolutionary processes such as genetic drift. Our lab is pioneering the way by evaluating existing methods and developing novel statistical methods to integrate data across biological, spatial, and temporal scales. Our current research studies how best to incorporate genomic data into machine learning algorithms for prediction and falls into three areas:
Advancing research by clarifying the philosophy underlying genomic forecasts and providing recommendations on how to test and validate forecasts
Developing empirical datasets for the evaluation of forecasts for applications in breeding and restoration
Testing and evaluating simulations
We have published three important papers that advance the philosophy and validation of genomic forecasts.
Annual Reviews in Ecology, Evolution, and Systematics
This paper reviews the philosophy for rigorous method evaluation and validation. This philosophy underlies much of the lab's work.
This Perspective uses critical reasoning to highlight issues in the way genomic forecasts are being interpreted, and lays the groundwork for different conceptual and mathematical approaches for calculating fitness offsets.
Methods in Ecology and Evolution
This review outlines important considerations for the design of experiments for validating genomic forecasts, and introduces the distinction between a current-future evaluation and a local-foreign evaluation. We apply these principles to our empirical studies discussed below.
In this ongoing project, we are developing genomic forecasting models and testing them both with simulated data and in the field. We are using the Eastern oyster for the field tests, which should be an interesting challenge for these methods given its complex patterns of population structure and adaptation to disease, salinity, and temperature.
Collaborator: Jessica Small, Virginia Institute of Marine Science
Funded by the National Science Foundation
Check out this time-lapse video of the start of our massive experiment. In this video, we crossed hundreds of oysters from different populations in a single day at the Virginia Institute of Marine Science Aquaculture Breeding and Technology Center.
This ongoing project is testing the accuracy of genomic forecasting models for seagrass restoration in the Baltic Sea. The Baltic Sea has a steep salinity gradient to which seagrass and many other species have adapted, so it makes for a good system to test these models.
Collaborators: Marlene Janke and Carl Andre
Funded by Fulbright and the Swedish Research Council
Machine learning methods related to random forests and gradient forests are used for genomic forecasting. In this paper we show how genetic drift can create relationships between genetic offset (a measure of maladaptation to environmental change) and population size, even when there is no adaptation in the population. This can explain why some studies have found relationships between genetic offset and population size, or why some papers find high genetic offsets at range edges, where species have small population sizes. The finding highlights the problem with interpreting genetic offset as a fitness offset.
Published in the Proceedings of the National Academy of Sciences
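The drift effect described above can be illustrated with a minimal sketch. It is not the analysis from the paper: it assumes a simple Wright-Fisher model with neutral loci only, and it uses the mean squared allele-frequency deviation from a shared ancestral frequency as a stand-in for an offset-like score rather than a gradient forest offset. The function names, population sizes, and parameter values are all hypothetical choices for illustration.

```python
import numpy as np

def simulate_neutral_drift(pop_sizes, n_loci=500, n_generations=50, p0=0.5, seed=0):
    """Wright-Fisher drift at neutral loci.

    Returns final allele frequencies with shape (n_populations, n_loci).
    """
    rng = np.random.default_rng(seed)
    pop_sizes = np.asarray(pop_sizes)
    # Every population starts at the same ancestral frequency p0.
    freqs = np.full((pop_sizes.size, n_loci), p0)
    two_n = 2 * pop_sizes[:, None]  # diploid: 2N gene copies per population
    for _ in range(n_generations):
        # Binomial sampling of gene copies is the only evolutionary force here;
        # there is no selection and no migration.
        freqs = rng.binomial(two_n, freqs) / two_n
    return freqs

def divergence_from_ancestor(freqs, p0=0.5):
    """Mean squared allele-frequency deviation per population (offset-like score)."""
    return ((freqs - p0) ** 2).mean(axis=1)

pop_sizes = np.array([25, 50, 100, 200, 400, 800, 1600])
freqs = simulate_neutral_drift(pop_sizes)
scores = divergence_from_ancestor(freqs)
corr = np.corrcoef(np.log(pop_sizes), scores)[0, 1]

for n, s in zip(pop_sizes, scores):
    print(f"N={n:5d}  divergence={s:.4f}")
print(f"correlation with log(N): {corr:.2f}")
```

Because no locus is under selection in this sketch, any association between the score and population size arises from drift alone, which is the reason such a score cannot be read directly as a fitness offset.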
Population geneticists have historically modeled adaptation in meta-populations to a single environmental gradient, under which monotonic clinal patterns in allele frequency evolve at the loci under selection. This study shows that under complex multivariate adaptation, trait clines can evolve despite non-monotonic allele frequency patterns across environmental gradients. These patterns are not discovered by genotype-environment association methods, which are widely used to discover adaptation. This result challenges widely held conceptual models of adaptation via subtle shifts in allele frequencies across environmental gradients, and can explain why genes that underlie environmental traits do not always evolve clines. Additionally, this study shows that even when inference from genotype-environment association methods is inaccurate, multivariate quantitative traits can still be accurately estimated from genotypes and environments.
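The idea that a trait cline can coexist with non-monotonic allele frequencies can be shown with a purely arithmetic toy example. This is a constructed two-locus case with a single additive trait, not the multivariate simulations from the study, and the frequencies, effect size, and helper function below are hypothetical values chosen only for illustration.

```python
import numpy as np

# Positions of 11 populations along a hypothetical environmental gradient (0 to 1).
env = np.linspace(0.0, 1.0, 11)

# Constructed allele frequencies at two loci with equal additive effects.
# Each locus has a linear trend plus a wave; the waves have opposite signs,
# so they cancel when the two loci are summed.
wave = 0.2 * np.sin(2 * np.pi * env)
p_locus1 = 0.25 + 0.5 * env + wave
p_locus2 = 0.25 + 0.5 * env - wave

effect = 1.0  # assumed additive effect per allele copy
# Mean additive trait value in a diploid population: 2 * sum(effect * frequency).
trait_mean = 2 * effect * (p_locus1 + p_locus2)

def is_monotonic(values):
    """True if values never decrease or never increase along the gradient."""
    steps = np.diff(values)
    return bool(np.all(steps >= 0) or np.all(steps <= 0))

print("locus 1 monotonic:", is_monotonic(p_locus1))
print("locus 2 monotonic:", is_monotonic(p_locus2))
print("trait mean monotonic:", is_monotonic(trait_mean))
```

In this toy case the mean trait changes steadily along the gradient even though neither locus does, because different combinations of allele frequencies can produce the same trait value.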