Methods

Putting Public Data to Work

The August 2023 edition of the eBird Basic Dataset was downloaded for the province of Alberta. This dataset contains approximately 3.5M records ("checklists"), each of which includes observer ID, date, time, coordinate location, duration, and a species abundance matrix. Pond information was obtained from the Drainage - Storm Water Management layer from the City of Edmonton's Open Data Portal and processed in ArcGIS (Figure 4), resulting in a working dataset of 2,913 checklists from 2015 to 2023. Due to simplifications in the stormpond layer, ponds were redrawn over basemap imagery for more accurate estimates of shape and size and were then intersected with the eBird data to select 40 suitable ponds (Figure 5). eBird data was minimally cleaned to remove known outliers, as recommended in the eBird Best Practices Guide and Johnston et al. (2021).

Sampling Design

Checklist data was filtered to the peak migratory months of May, June and July for the years 2015-2023.
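The date filter can be sketched as follows. This is a minimal illustration only: the record layout and the field names (checklist_id, obs_date) are hypothetical and are not the column names used in the eBird Basic Dataset.

```python
from datetime import date

# Hypothetical checklist records; field names are illustrative only.
checklists = [
    {"checklist_id": "S1", "obs_date": date(2014, 6, 1)},
    {"checklist_id": "S2", "obs_date": date(2019, 5, 12)},
    {"checklist_id": "S3", "obs_date": date(2021, 8, 3)},
    {"checklist_id": "S4", "obs_date": date(2023, 7, 30)},
]

STUDY_YEARS = range(2015, 2024)  # 2015 to 2023 inclusive
STUDY_MONTHS = {5, 6, 7}         # May, June, July


def in_study_window(obs_date):
    """Return True if the date falls in a study year and a study month."""
    return obs_date.year in STUDY_YEARS and obs_date.month in STUDY_MONTHS


kept = [c for c in checklists if in_study_window(c["obs_date"])]
kept_ids = [c["checklist_id"] for c in kept]
```

Here S1 is dropped for its year and S3 for its month, leaving S2 and S4.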

The sampling unit is defined as a pond during a single month of a single year, as a form of time-for-space substitution. All subsamples (checklists) within one sampling unit are aggregated to produce one measurement for that pond-month-year, which reduces the spatial and temporal sampling biases inherent in eBird data (Figure 6). Samples are stratified by effort (duration in minutes) before comparisons between factorial categories to allow "like-to-like" comparison.
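One way to picture the pond-month-year sampling unit is as a grouping key built from each checklist. The sketch below assumes each checklist has already been assigned to a pond by the spatial intersection; the identifiers (pond_id, checklist_id, obs_date) are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical checklists already matched to a pond.
checklists = [
    {"checklist_id": "S1", "pond_id": "P01", "obs_date": date(2019, 5, 4)},
    {"checklist_id": "S2", "pond_id": "P01", "obs_date": date(2019, 5, 20)},
    {"checklist_id": "S3", "pond_id": "P01", "obs_date": date(2019, 6, 2)},
    {"checklist_id": "S4", "pond_id": "P02", "obs_date": date(2019, 5, 4)},
]

# Each key is one sampling unit: (pond, year, month).
# Each value lists the subsamples (checklists) that fall in that unit.
units = defaultdict(list)
for c in checklists:
    key = (c["pond_id"], c["obs_date"].year, c["obs_date"].month)
    units[key].append(c["checklist_id"])

n_units = len(units)
```

In this example the two May 2019 checklists at pond P01 collapse into a single sampling unit, so four checklists yield three samples.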

Since all factors are observed and not manipulated, summaries were calculated to investigate study balance and to find limitations and data gaps (Table 1).

Predictors and Covariate

Wetland vs. Conventional Stormponds

Categorization of wetland/conventional stormponds was completed using a combination of satellite imagery and the previously cited EPCOR dataset. This is a nominal predictor variable with two levels: "Conventional" / "Wetland".

Perimeter/Area Ratio (PAR)

Perimeter/Area Ratio was calculated for each pond using the perimeter length and area of manually drawn overlays in ArcGIS. Ponds were split into two classes by the median value. This metric is both scale and shape dependent. This is an ordinal predictor variable with two levels: "High PAR" / "Low PAR".
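The PAR calculation and median split can be sketched as below. The pond names and measurements are invented for illustration, and the treatment of ties (a pond exactly at the median is assigned to "Low PAR") is an assumption, since the text does not specify it.

```python
from statistics import median

# Hypothetical pond measurements (metres and square metres).
ponds = {
    "pond_a": {"perimeter_m": 400.0, "area_m2": 10000.0},
    "pond_b": {"perimeter_m": 900.0, "area_m2": 15000.0},
    "pond_c": {"perimeter_m": 300.0, "area_m2": 2500.0},
    "pond_d": {"perimeter_m": 1200.0, "area_m2": 60000.0},
}

# Perimeter/Area Ratio for each pond (units: 1/m).
par = {name: p["perimeter_m"] / p["area_m2"] for name, p in ponds.items()}

# Split at the median; ties go to "Low PAR" (an assumption).
par_median = median(par.values())
par_class = {
    name: "High PAR" if value > par_median else "Low PAR"
    for name, value in par.items()
}
```

The scale dependence is easy to see with squares: a 100 m square has a PAR of 400/10000 = 0.04, while a 200 m square of identical shape has a PAR of 800/40000 = 0.02.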

Internal Features

Classification of ponds into those with features (islands and peninsulas) or no features was completed visually with ESRI satellite imagery. This is a nominal predictor variable with two levels: "Feature" / "No Feature".

Checklist Effort

Checklists can range from 1 minute long to several hours. Given that richness will be correlated with effort, I have applied four effort strata so that comparisons between categories can be made between samples of similar effort. This will also allow insight into the amount of effort needed to reach satisfactory sampling completeness, which can inform future study design. Checklist Effort is an ordinal categorical covariate with four levels (in minutes):
(0, 30)  [30, 60)  [60, 90)  [90, 120]
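The interval boundaries above can be made explicit with a small helper. This is a sketch: the function name and stratum labels are hypothetical, and returning None for durations outside the four strata (zero or longer than 120 minutes) is an assumption about how such checklists are handled.

```python
def effort_stratum(minutes):
    """Assign a checklist duration (minutes) to one of four effort strata.

    A lower bound is inclusive except at 0, and 120 is inclusive,
    so a 30-minute checklist falls in the second stratum.
    Returns None for durations outside the strata (an assumption).
    """
    if minutes <= 0 or minutes > 120:
        return None
    if minutes < 30:
        return "0-30"
    if minutes < 60:
        return "30-60"
    if minutes < 90:
        return "60-90"
    return "90-120"


durations = [1, 29, 30, 59, 60, 90, 120, 150]
strata = [effort_stratum(m) for m in durations]
```

Checklists on a boundary (30, 60 or 90 minutes) always move up to the next stratum, so each checklist belongs to exactly one stratum.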

Response

Avian Species Richness

Species richness is the continuous response variable. Each sample contains a richness estimate calculated as the mean richness of all checklists available for that sampling unit. Each subsample (checklist) contains a discrete integer measure of richness that represents the number of species recorded on that checklist.
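The relationship between subsample and sample richness can be illustrated with a single sampling unit. In this sketch the checklist contents are invented, and each checklist is assumed to list only the species that were actually detected.

```python
from statistics import mean

# Hypothetical checklists for one pond-month-year sampling unit.
# Each maps species name to the count reported on that checklist.
unit_checklists = {
    "S1": {"Mallard": 4, "Canada Goose": 12, "Red-winged Blackbird": 3},
    "S2": {"Mallard": 2, "American Coot": 1},
    "S3": {"Mallard": 1, "Canada Goose": 2, "American Coot": 5, "Sora": 1},
    "S4": {"Canada Goose": 6},
}


def checklist_richness(counts):
    """Subsample richness: number of species listed on one checklist."""
    return len(counts)


# Integer richness for each subsample (checklist).
subsample_richness = {
    cid: checklist_richness(counts) for cid, counts in unit_checklists.items()
}

# Sample richness: mean of the subsample values for the sampling unit.
sample_richness = mean(subsample_richness.values())
```

The four checklists have richness 3, 2, 4 and 1, so the sample value is 2.5. This shows why the response is treated as continuous even though each checklist contributes an integer.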

Figure 4. GIS process map for the creation of study dataset.

Figure 5. Selected locations from the Edmonton Stormwater Management Layer and eBird Basic Dataset point counts. Colour intensity from blue to red indicates low to high numbers of checklists at that location.

Figure 6. The sampling design and subsample, sample, strata, factor hierarchy.

Table 1. Number of available samples by pond type, Perimeter/Area Ratio (PAR) class, and effort strata for eBird checklists in the Edmonton region in May-July, 2015-2023.