Sorghum Cultivars 2022


The Sorghum-100 dataset is a curated subset of the RGB imagery captured during the TERRA-REF experiments, labelled by cultivar. The data can be used to develop and assess a variety of plant phenotyping models that seek to answer questions about the presence or absence of desirable traits (e.g., "Does this plant exhibit signs of water stress?"). In this contest, we focus on the question: "What cultivar is shown in this image?"

Predicting the cultivar in an image is an especially good challenge problem for familiarizing the machine learning community with the TERRA-REF data. At first blush, predicting the cultivar from an image of a plant may not seem to be the most biologically compelling question to answer: in the context of plant breeding, the cultivar or parental lines are typically known. A high-accuracy machine learning predictor of the cultivar captured by the sensor data, however, can be used to determine where errors in the planting process may have occurred. For example, seed may be mislabeled prior to planting, or planters may jam and deposit seeds non-uniformly in a field. Both types of errors are surprisingly common and can cause major problems when processing data from large-scale field experiments with hundreds of cultivars and complex field planting layouts.
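
As a rough illustration of how such a predictor might be used to locate planting errors, the sketch below takes per-image cultivar predictions, takes a majority vote within each plot, and flags plots where the vote disagrees with the intended planting layout. Everything here is an assumption made for illustration: the function name flag_suspect_plots, the plot and cultivar identifiers, and the min_agreement threshold are hypothetical and are not part of the TERRA-REF data or the contest.

```python
from collections import Counter, defaultdict


def flag_suspect_plots(predictions, planting_map, min_agreement=0.6):
    """Return plots whose majority-predicted cultivar disagrees with the planting map.

    predictions: iterable of (plot_id, predicted_cultivar) pairs, one per image.
    planting_map: dict mapping plot_id -> cultivar intended for that plot.
    min_agreement: fraction of a plot's images that must share the majority
        prediction before the vote is trusted (hypothetical threshold).
    """
    # Tally the predicted cultivars for every plot.
    votes = defaultdict(Counter)
    for plot_id, cultivar in predictions:
        votes[plot_id][cultivar] += 1

    suspects = {}
    for plot_id, counts in votes.items():
        top_cultivar, top_count = counts.most_common(1)[0]
        agreement = top_count / sum(counts.values())
        expected = planting_map.get(plot_id)
        # Flag only confident votes that contradict the intended layout.
        if agreement >= min_agreement and top_cultivar != expected:
            suspects[plot_id] = (expected, top_cultivar, agreement)
    return suspects


# Toy data (made up): plot_02 was meant to hold cultivar_B,
# but every image of it is predicted as cultivar_C.
predictions = [
    ("plot_01", "cultivar_A"), ("plot_01", "cultivar_A"), ("plot_01", "cultivar_B"),
    ("plot_02", "cultivar_C"), ("plot_02", "cultivar_C"), ("plot_02", "cultivar_C"),
]
planting_map = {"plot_01": "cultivar_A", "plot_02": "cultivar_B"}
suspects = flag_suspect_plots(predictions, planting_map)
```

A flagged plot is only a candidate for manual inspection: a disagreement may reflect a mislabeled seed lot or a jammed planter, but it may equally reflect a classifier error.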

Data Description

The Sorghum-100 dataset consists of 48,106 images of 100 different sorghum cultivars grown in June of 2017. The images come from the middle of the growing season, when the plants were quite large but not yet lodging (falling over).

Each image was taken with an RGB spectral camera, giving a vertical view of the sorghum plants in the TERRA-REF field in Arizona.


Start Date: 16 March 2022

End Date: 31 May 2022

Kaggle URL:


  • Abby Stylianou (Saint Louis University)

  • Rashmi Kamath (Saint Louis University)

  • Robert Pless (Saint Louis University)

  • Richard Souvenir (Temple University)