Data

Data Management:

The dataset used in this study is part of a larger dataset being actively developed in the Rueppell lab at the University of Alberta. As such, it had to be cleaned and summarized for analysis. Much of the data was stored in a central Excel file, though some auxiliary files contained data on variables such as honey production and queen longevity. Queen longevity was known only to the nearest month and had to be inferred from various datasets related to different aspects of the project. Because of the uncertainty surrounding time of death for queens that perished over winter, and the resulting odd distribution of points in longevity scatter plots (Figure 3), we opted to use overwintering success as the response variable corresponding to overall colony success. Winter is also the period in which colonies are most at risk of death, and overwintering failure is a relevant concern for Canadian beekeepers. However, we did include longevity in our analysis of differences among honey bee stocks. Our other metric of colony success is the reason that beekeepers manage colonies in the first place: honey production. We used 2023 honey output, measured in kilograms.

Figure 3. Example scatter plot of approximate queen longevity following field placement plotted against a predictor variable (here geotaxis score). Note the large gap in point distribution over winter months.

Figure 4. An example line graph depicting the type of error prevalent in the temperature data.

Temperature and humidity readings were stored in individual files for each queen. We encountered considerable sensor error in these readings, with the temperature for most queens stagnating at a single value for long periods of time, as can be seen in Figure 4. This tended to occur later in the recording period rather than earlier, which led us to focus on August 2022 for our summary statistics. Our analysis for this aspect of the project focused on temperature variance as a metric of a colony's ability to thermoregulate. After summarizing the sensor data, we entered the results into a master datasheet, along with the data from the central file, honey production, and longevity discussed above. A portion of this sheet can be seen below in Table 1.
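The temperature summary described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes each queen's file has already been parsed into (timestamp, temperature) pairs, and the function names, the `max_run` threshold, and the step that discards flatlined runs are all hypothetical additions. The source states only that the summary was restricted to August 2022.

```python
from datetime import datetime
from statistics import variance


def drop_flatlines(readings, max_run=3):
    """Discard runs of more than `max_run` identical consecutive temperatures.

    `max_run` is an illustrative threshold (an assumption, not from the study).
    """
    kept, run = [], []
    for ts, temp in readings:
        if run and temp == run[-1][1]:
            run.append((ts, temp))          # same value: extend the current run
        else:
            if len(run) <= max_run:
                kept.extend(run)            # short run: treat as real data
            run = [(ts, temp)]              # start a new run
    if len(run) <= max_run:
        kept.extend(run)                    # handle the final run
    return kept


def august_2022_variance(readings, max_run=3):
    """Sample variance of temperature for August 2022, or None if too few points."""
    window = [(ts, t) for ts, t in readings if ts.year == 2022 and ts.month == 8]
    temps = [t for _, t in drop_flatlines(window, max_run)]
    return variance(temps) if len(temps) >= 2 else None
```

`statistics.variance` returns the sample variance (n - 1 denominator); `statistics.pvariance` would be the substitute if the population variance were wanted instead.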

Table 1. Sample of the data table used in analysis, containing predictor, response, and identifying variables.

Figure 5. Box plots showing the distribution of A) head width, B) body weight, C) temperature variance, D) geotaxis score, E) number of turns, F) sperm count, G) sperm viability, H) ovary weight, and I) ovariole count by genetic stock.

There are many variables present in the dataset for this project, and we had to select a handful from each variable category for our analysis. Our predictor variables (and reasons for their selection) in this study include:

Queen body measurements
- Body weight: Measured in grams. Larger queens are frequently presumed to have greater reproductive success and to be of higher "quality". Some studies have suggested that heavier queens are more readily accepted by worker bees upon introduction to new colonies (Masry et al. 2015).
- Head width: Measured to scale from photographs in arbitrary measurement units (AMU). An alternative measure of queen body size that is not influenced by internal (i.e. ovary, sperm) metrics.
Queen behavioural test scores
- Geotaxis running score: Measures the tendency of a queen to move up (positive values) or down (negative values) when placed in a tilted tube for a 30-second period. Honey bees should naturally orient themselves upwards (hence the vertical arrangement of honeycomb).
- Number of turns in a flat tube: How often a queen "changes her mind" about which direction to travel when placed in a flat tube for a 30-second period. Agitated behaviour by queens is not well received by worker bees during initial queen acceptance (Robinson 1984).
Colony Attributes
- Temperature variance: As recorded by in-hive sensors. Relates to a colony's ability to thermoregulate, which is important for overwinter survival at the colony level.
- Varroa mite count: Number of varroa mites (a major bee pest) in a sample of approximately 100 bees. Varroa mites are the only factor ranked above queen quality in surveys of Albertan beekeepers on leading causes of overwintering hive loss (CAPA 2022). There is also a genetic link to the practice of hygienic, varroa-reducing behaviours in worker bees (Tsuruda et al. 2012).

In addition to being evaluated as predictors of colony success metrics (overwintering success and honey output), the above variables were also used as dependent variables in the analysis of differences in queen and colony attributes by stock. The following internal queen traits, which required destructive sampling, were also evaluated for differences by stock:

- Ovary weight (g): Size of ovaries may relate to reproductive efficacy and therefore queen quality (Gilley et al. 2003).
- Ovariole count: Number of ovarioles in a single ovary. In other insects, ovariole numbers have been tied to egg production, though the relevance of this in honey bee contexts is unclear (Bouletreau-Merle 1978, Jackson et al. 2011).
- Sperm count: Number of sperm stored in the spermatheca, in units of 100,000. As queens do not mate after the first few weeks of their lives, this number relates to their lifelong fertility.
- Sperm viability: Proportion (%) of stored sperm that are alive. Indicates how successfully a queen mated and her likely future fertility.

Figure 5 provides a preliminary look at the distribution of some of these variables within each genetic stock tested. It allows us to see outliers that may need to be accounted for in further analysis, such as queen #134, who is more than twice as heavy as any other queen (Figure 5B). She was removed when calculating statistics for body weight, as her recorded weight was likely the result of a data entry error. Control colonies (sensors placed in empty brood boxes) also needed to be removed from the temperature data, as shown in Figure 5C.
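The screening steps described above could be expressed along the following lines. This is a sketch under stated assumptions rather than the study's actual workflow: the record layout, the field names (`queen_id`, `is_control`), the function names, and the "more than twice the next largest value" rule are hypothetical, chosen only to mirror the description of queen #134 and the control sensors.

```python
from statistics import mean


def flag_extreme_max(records, key, ratio=2.0):
    """Return ids whose value exceeds `ratio` times the largest other value.

    Assumes positive measurements (e.g. weights); `ratio` is an illustrative
    threshold, not one taken from the study.
    """
    values = {r["queen_id"]: r[key] for r in records if r.get(key) is not None}
    flagged = []
    for queen_id, value in values.items():
        others = [v for q, v in values.items() if q != queen_id]
        if others and value > ratio * max(others):
            flagged.append(queen_id)
    return flagged


def mean_excluding(records, key, excluded_ids):
    """Mean of `key` over records not in `excluded_ids`, skipping missing values."""
    kept = [
        r[key]
        for r in records
        if r["queen_id"] not in excluded_ids and r.get(key) is not None
    ]
    return mean(kept)


def drop_controls(records):
    """Remove control sensors (hypothetical `is_control` flag) before summarizing."""
    return [r for r in records if not r.get("is_control", False)]
```

Flagging first and excluding second keeps the original records intact, so a flagged queen can still contribute to analyses of other variables where her values are plausible.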