A Project for the 2013 Rubenstein Research Fellowship Awards and the Encyclopedia of Life

"The Encyclopedia of Life organization seeks to provide global access to knowledge about life on Earth by gathering trusted information on all species known to science into a single free, open and actively curated website at eol.org. Now featuring information on over a million species gathered from over 230 content providers, EOL is becoming a crucial resource for researchers, educators and citizens seeking to understand the world around them."

The EOL Rubenstein Research Fellowships have been granted to seven awardees in 2013 to answer research questions using the online data resources of the EOL and its partner sites.

In my proposal for the 2013 EOL Rubenstein Fellows program I hope to show the usefulness of online-accessible scientific archives such as the EOL in research. I will be using data found on the EOL and partner sites to investigate the distribution of blue flowers in North America, in response to the question "Is blue coloration more likely to occur in high altitude plant species?" This project will use images, morphological descriptions, and habitat information from the EOL, BHL articles and EOL partners to identify species of plants with blue flowers. Each species will be mapped by latitude, longitude, and altitude to identify any trends. These data can then be used in research to answer questions about selective pressures acting on flower coloration.

Throughout this project I will be working in collaboration with Katja Seltmann at the American Museum of Natural History.

This website is currently under construction, and will feature major updates on the project and data collected. Brief updates and images will be posted on the project Tumblr.


12/05/2013 - Writing of the paper continues, as we look for places to submit where our data and thoughts can be brought to the world. The learning process of writing a real scientific paper, a first for me, is challenging and exciting. It was eye-opening to realize that what isn't in our dataset is just as important as what is.

Please look forward to our analysis on EOL and here!  As always, the data is in the documents to the left - play with it yourself, see if anything interests you or pops up.

11/01/2013 - Writing!  So much writing!

10/12/2013 - We've been analyzing data and, as we began to realize when the collection was complete, this is much less straightforward than a simple correlation of color and altitude. During our forays into color science we've learned some really neat things, and we are excited to include them in the paper. The RGB values will also be converted into HSV (hue-saturation-value) so that they are accessible for other types of applications.
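As an illustration of the RGB-to-HSV conversion mentioned above, here is a minimal sketch using Python's standard `colorsys` module. It assumes the Photoshop readings are 8-bit values (0-255) per channel; the function name `rgb_to_hsv_degrees` and the choice to report hue in degrees and saturation/value as percentages are my own assumptions for this example, not necessarily how the project's spreadsheet stores them.

```python
import colorsys


def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB (0-255 per channel) to HSV.

    Returns (hue in degrees 0-360, saturation in %, value in %),
    each rounded to one decimal place.
    """
    # colorsys works on floats in the range 0.0-1.0, so scale first.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 360.0, 1), round(s * 100.0, 1), round(v * 100.0, 1)


# Pure blue sits at a hue of 240 degrees with full saturation and value.
print(rgb_to_hsv_degrees(0, 0, 255))
```

Hue is the component most likely to matter when asking whether a flower is "blue" or "purple," since it separates the color itself from how vivid or bright the photograph is.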

Another concern is the definition of 'high altitude.'  While medical science has definitions based on human tolerances, these aren't as important in botanical and environmental terms.  By comparing studies of alpine flowers, we've been able to decide on a range, which will require correlation with our locality data.  A good graphic example can be found in Alpine Plant Life by Christian Körner.

8/31/2013 - Data Collection complete!

The time for data analysis has arrived: all flowers selected from EOL for image analysis have been examined in Photoshop for RGB data and geolocated for elevation. As we look at the data for possible trends in flower color, there are a few different directions we can go. Does blue really follow a pattern based on elevation? Is what we as scientists and public users of EOL call blue the same as blue in a color spectrum? Does this affect our study of color? Perhaps this will end up being more a study of our definitions of blue and a new color ontology than a straightforward blue-to-elevation correlation.

The next few months will be filled with graphs, charts, coding, and math, but I am optimistic that we will identify some interesting trends.  In the meantime, stay tuned to the Tumblr for more pretty pictures, and take a look at the current dataset in the documents here.

7/8/2013 - Major update!

Through discussion, we have decided to make the data collection for this project more focused and streamlined, because of time constraints and the fact that some species on EOL have many more images than others. The new criteria for image selection are as follows:

- 10 images of each species analyzed, taken from either the beginning or the end of the image list, with the starting point chosen at random (coin toss)

- different localities for each, unless there are fewer than 10 images of flowers, in which case doubles can be chosen

- NA localities favored, but if fewer than 10 images of NA specimens are listed, then introduced examples can be analyzed

- skip dried and botanical garden specimens
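The selection criteria above can be sketched in code. This is only one possible reading of the rules, not the procedure actually used (which was done by hand): the record fields (`locality`, `north_america`, `dried`, `garden`), the function name `select_images`, and the order in which the fallbacks are applied are all assumptions made for this example.

```python
import random


def select_images(images, target=10, rng=None):
    """Pick up to `target` images for one species.

    `images` is a list of dicts with the hypothetical keys
    'locality', 'north_america', 'dried', and 'garden'.
    """
    rng = rng or random.Random()

    # Coin toss: start from the beginning or the end of the image list.
    ordered = list(images)
    if rng.random() < 0.5:
        ordered.reverse()

    # Skip dried and botanical garden specimens entirely.
    usable = [im for im in ordered if not (im["dried"] or im["garden"])]

    chosen, seen_localities = [], set()
    # Assumed fallback order: North American images first, then others;
    # within each group, new localities first, then repeated localities.
    for want_na in (True, False):
        pool = [im for im in usable if im["north_america"] == want_na]
        for allow_repeat in (False, True):
            for im in pool:
                if len(chosen) >= target:
                    return chosen
                if any(im is c for c in chosen):
                    continue  # already picked in an earlier pass
                if not allow_repeat and im["locality"] in seen_localities:
                    continue  # hold repeated localities for the next pass
                chosen.append(im)
                seen_localities.add(im["locality"])
    return chosen
```

Passing a seeded generator, for example `select_images(records, rng=random.Random(1))`, makes the coin toss repeatable, which would help if the selection ever needed to be audited or rerun.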

It would be interesting in a separate project to see if color loss in dried specimens follows any trend, but we determined that this question is outside the scope of the current EOL project.

With this newly streamlined method, nearly half the species chosen have been analyzed in Photoshop.  This will leave more time to identify possible trends in the total data upon completion, and to produce a paper.

Please continue to follow the Tumblr for images of all the species, from EOL!

5/10/2013 - Image analysis data has been doubled. 

As I have read locality and specimen data associated with the images, I have been making a note of what color the collector/photographer thought the flower was at the time of capture.  It might be interesting to see if what they call "blue" tracks with the analysis.

4/28/2013 - Such a great update!  Photoshop analysis has been going well, and the preliminary data is in a spreadsheet under "Documents," in the left-hand menu.  A few points:

- unless they are provided with the photo, georeferencing for lat/long/elevation will take place after the color analysis, for ease of workflow

- I am developing definitions as I go along, and a new section is open for those in the menu

- a spreadsheet with links to maps and references used to determine the range of the plants can also be found under "Documents"

The Tumblr has also been updated; images from the species investigated so far are in the posting queue.

4/7/2013 - All 485 results for "blue flower" have been researched on the EOL site, and those of North American flowering plant species have been added to the master list.  The total result is 190 species to be analyzed.

My first thoughts are "wow, that's a lot of Lupinus species!" and that grass gets described as 'blue' quite often, but this is not the same thing as flowers... With this master list, the Photoshop analysis can begin in earnest. After these species have been analyzed, further searches will be done for 'alpine flower' and 'mountain flower' to see if any more species can be uncovered on EOL for analysis.

3/31/2013 - Fifty more species have been added to the Master List. Some species have variable color - will this correlate to variation along an altitude gradient within a species?

3/25/2013 - The Master List of species of flowering plants to investigate and analyze for the project has been created, and the first 100 species have been selected.  The list can be found under 'Documents' under Navigation, to the left on this site.

The species were selected by using the search phrase "blue flower" on EOL in the main search page.  All search hits were then narrowed down to those of flowering plants (many butterfly species naturally come up when searching for flowers), and then narrowed down to those species in North America, which is where this project will focus.  Luckily most species pages included images.  Species pages without images, or without sufficient images, will have images selected from EOL partner sites.  Species selected will also be compared with specimens at the New York Botanical Garden when possible.

Interestingly, not all species with flowers described as "blue" in the literature are ones that I would pick out as blue merely using my own subjective standards.  Many look "purple" to me.  The spectrum analysis of the amount of blue in a flower, and whether there is any correlation between that and location and altitude, will be data points investigated during this project. 
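To make the idea of "amount of blue" versus altitude concrete, here is a rough sketch of how the two could be compared. Both pieces are assumptions for illustration only: `blue_fraction` (the blue channel's share of the total RGB signal) is just one possible way to quantify blueness, and a Pearson correlation is just one possible test; the project's actual analysis may use different measures.

```python
import math


def blue_fraction(r, g, b):
    """Share of the total RGB signal carried by the blue channel (0.0-1.0)."""
    total = r + g + b
    return b / total if total else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)
```

In use, each analyzed image would contribute one blue fraction and one elevation, and `pearson_r(blue_fractions, elevations)` would return a value between -1 and 1. A value near zero would suggest no simple linear relationship, which is itself a result worth reporting.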

Finally, a side question that occurred to me as I collected these species accounts - are more intensely blue-colored flowers more likely to receive a common name containing 'blue'?