Raw Data

Data Table

Table 1 shows the data from the first five plots at the Beaverlodge site, as indicated in the "LOCATION" column. Each treatment listed specifies the soil temperature at seeding (°C) and the cultivar used. The last column gives the yield in kg/ha.

For this study, the predictor variables are the soil temperature at seeding and the cultivar used. To simplify working with combinations of these two variables, a treatment number was created; although this is not visible in Table 1, there are twenty treatments in total (1-20). Both predictor variables are categorical, and both were manipulated in this study: we set the soil temperature at seeding and chose the cultivar for each plot. The response variables for this experiment are yield and several variables that have yet to be analyzed: protein content, thousand kernel weight, and test weight.
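The treatment numbering described above can be sketched as a lookup from each (soil temperature, cultivar) pair to a number. This is only an illustration in Python, not the code used in the study. The labels `temp_A`, `temp_B`, and `cultivar_1` to `cultivar_10` are hypothetical placeholders. The count of ten cultivars is inferred from twenty treatments divided by two soil temperatures, and the order in which numbers are assigned is an assumption.

```python
from itertools import product

# Placeholder labels: the actual temperatures and cultivar names are not listed here.
soil_temperatures = ["temp_A", "temp_B"]
# Ten cultivars is an inference (20 treatments / 2 soil temperatures).
cultivars = [f"cultivar_{i}" for i in range(1, 11)]

# Number every (soil temperature, cultivar) combination from 1 to 20.
# The numbering order is assumed, not taken from the study.
treatment_numbers = {
    combo: number
    for number, combo in enumerate(product(soil_temperatures, cultivars), start=1)
}

print(len(treatment_numbers))
print(treatment_numbers[("temp_B", "cultivar_10")])
```

A single treatment number is convenient for labeling plots, but keeping the two underlying factors as separate columns still allows each one to be examined on its own.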


Table 1. The first five plots at the Beaverlodge site, including the predictor variables (soil temperature at seeding and cultivar) and yield, which is the response variable.

Exploratory Data

To check the data for errors and outliers, I used the summary() and boxplot() functions in RStudio to confirm that all values appeared reasonable. Although the yield data covered a wide range, there were only two far outliers, both at the Edmonton location. These values are still below the maximum yield recorded in the trial and should be retained for analysis, because such yields are achievable under the irrigated conditions that the Edmonton plots experienced. The minimum yield reported was 585 kg/ha and the maximum was 5802 kg/ha. Although the minimum is very low, it is not impossible: it came from Lethbridge, which experienced extreme drought conditions that may have reduced yield. The maximum of 5802 kg/ha is very high; however, it came from central Alberta, which often has higher yields than the north and south.
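The outlier screen can be illustrated with a short sketch. It assumes that "far outlier" follows the common convention of a value more than 3 times the interquartile range (IQR) beyond the quartiles, while values beyond 1.5 times the IQR (the default whisker range in R's boxplot()) count as mild outliers. The sketch is written in Python rather than R, the function name `classify_outliers` is hypothetical, and the yields are invented for illustration; they are not data from the trial.

```python
from statistics import quantiles

def classify_outliers(values):
    """Split values into mild and far outliers using 1.5*IQR and 3*IQR fences."""
    # Quartiles by linear interpolation; other quartile rules (such as hinges)
    # can give slightly different fences on small samples.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    far = [v for v in values if v < outer_low or v > outer_high]
    # Mild outliers lie outside the inner fences but inside the outer fences.
    mild = [
        v for v in values
        if outer_low <= v <= outer_high and (v < inner_low or v > inner_high)
    ]
    return {"mild": sorted(mild), "far": sorted(far)}

# Hypothetical yields in kg/ha, invented for illustration only.
hypothetical_yields = [2500, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 9000]
result = classify_outliers(hypothetical_yields)
print(result)  # {'mild': [2500], 'far': [9000]}
```

A flagged value is only a prompt to check the record. As with the Edmonton plots, a value can be statistically unusual and still agronomically plausible, in which case it stays in the analysis.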





The boxplots below show the yield data grouped by several different factors. Figure 6 shows yield for the two soil temperature at seeding treatments; these data appear close to normally distributed, with no outliers.

Figure 6. Boxplots of yield (kg/ha) for each soil temperature at seeding treatment, regardless of cultivar or location.



Figure 7 shows the distribution of yield data from each of the four study sites. Note that central Alberta typically has higher yielding crops than northern and southern Alberta due to different growing conditions.

Figure 8 shows yield as a function of cultivar. For the most part, the yields appear approximately normally distributed.

Figure 7. Distribution of yield (kg/ha) at each location, regardless of treatment.
Figure 8. Yield for each cultivar in kg/ha regardless of location or soil temperature at seeding.


Lastly, Figure 9 shows the yields for the combined cultivar and soil temperature at seeding treatments. All of the boxplots in Figures 6-9 show approximately normal distributions.

Figure 9. Yield (kg/ha) of each cultivar at each soil temperature at seeding, regardless of location.