Method & Data

Method

The dataset I used comprises 66 studies, which together contribute 316 sets of study data. These studies were conducted in 15 different countries (Figure 1).

Figure 1: Countries of Data Collection

Figure 2: Comparison of Organic and Conventional Farming Yields

Yield Difference between Organic and Conventional Farming

This study aims to quantitatively compare the yields of organic and conventional farming systems and to assess the impact of various agricultural practices on these yields. First, to compare the yields of organic and conventional agriculture, descriptive statistics were computed to summarize the central tendency and dispersion of the yield data. Paired t-tests were then conducted to determine whether the yields of the two agricultural systems differed to a statistically significant degree.
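The two steps described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used for the analysis. The function names and the yield values are hypothetical and were invented for the example. The sketch returns the t statistic and the degrees of freedom; obtaining a p-value would additionally require the t distribution.

```python
import math
import statistics


def describe(values):
    """Summarize central tendency and dispersion of a list of yields."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


def paired_t(organic, conventional):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(organic) != len(conventional):
        raise ValueError("paired samples must have the same length")
    # The test works on the within-pair differences.
    diffs = [o - c for o, c in zip(organic, conventional)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1


# Hypothetical paired yields in Mg/ha (illustration only, not study data).
organic = [3.1, 4.0, 2.8, 5.2, 3.9]
conventional = [3.8, 4.6, 3.0, 6.1, 4.5]

summary_organic = describe(organic)
summary_conventional = describe(conventional)
t_stat, df = paired_t(organic, conventional)
```

A negative t statistic here would indicate that the organic yield is lower than the conventional yield on average within pairs.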

Data

Data collection sites

Figure 3 consolidates the number of sites reported by each study. Each bar represents one study, identified by a number on the x-axis (ranging from 0 to over 60), and the y-axis shows the number of sites (ranging from 0 to over 20). The studies with the largest number of sites are studies 9, 15, 34, and 45, each with more than 20 sites. The remaining studies have varying numbers of sites, and many have fewer than 10. When studying yields, more sites usually means that a wider range of growing conditions is covered and the investigation is more comprehensive. The figure is therefore helpful in showing the scope and focus of the agricultural studies captured in the dataset. Note that the bar colors carry no meaning; they are used only to make the individual studies easier to tell apart.


Figure 3: Number of Sites per Study

Units

In the original dataset, the authors used many different units. Therefore, in this project, I harmonized the units of the original dataset so that all crop yield data are expressed in megagrams per hectare (Mg/ha). The original units included tons per hectare (t/ha), kilograms per hectare (kg/ha), decitonnes per hectare (dt/ha), and bushels per acre (bu/ac). I ensured data consistency and comparability by reviewing the literature and applying standard conversion methods.
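The conversion can be sketched as below. This is a Python illustration, not the code used in the project, and it rests on assumptions that should be checked against the source studies: "t/ha" is taken to mean metric tonnes, and the bushel weights are the commonly used US values for a few example crops, because a bushel is a volume measure whose weight depends on the crop. The function and dictionary names are hypothetical.

```python
# Factors to Mg/ha for units that do not depend on the crop.
FACTORS_TO_MG_PER_HA = {
    "Mg/ha": 1.0,
    "t/ha": 1.0,     # assumption: metric tonne, 1 t = 1 Mg
    "kg/ha": 0.001,  # 1000 kg = 1 Mg
    "dt/ha": 0.1,    # 1 decitonne = 100 kg = 0.1 Mg
}

LB_TO_KG = 0.45359237
ACRE_TO_HA = 0.40468564224

# Assumed bushel weights in pounds per bushel (illustrative subset).
BUSHEL_WEIGHT_LB = {"wheat": 60.0, "maize": 56.0, "soybean": 60.0}


def to_mg_per_ha(value, unit, crop=None):
    """Convert a yield to Mg/ha, or return None if no reliable factor exists."""
    if unit in FACTORS_TO_MG_PER_HA:
        return value * FACTORS_TO_MG_PER_HA[unit]
    if unit == "bu/ac" and crop in BUSHEL_WEIGHT_LB:
        kg_per_ha = value * BUSHEL_WEIGHT_LB[crop] * LB_TO_KG / ACRE_TO_HA
        return kg_per_ha / 1000.0
    # Units such as lb/plant or boxes/ha cannot be converted without
    # extra information, so they are reported as unavailable.
    return None


wheat_yield = to_mg_per_ha(50, "bu/ac", crop="wheat")
potato_yield = to_mg_per_ha(250, "dt/ha")
unknown = to_mg_per_ha(3, "lb/plant")
```

Returning None for unsupported units keeps questionable conversions out of the dataset instead of guessing a factor.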


However, some data were in units such as pounds per plant (lb/plant) or boxes per hectare (boxes/ha), which could not be converted accurately because the information needed for the conversion was not available. For these data, I adopted a conservative strategy: I marked them as unavailable ("-") and excluded them from subsequent analysis. In RStudio, I converted these flags to NA (R's marker for missing values) so that the numeric columns in the dataset contained only numeric values.
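The flag handling described above was done in R. An equivalent step can be sketched in Python as follows, under the assumption that yields arrive as text cells and that "-" is the only flag in use. The function name and the sample column are hypothetical.

```python
import math


def parse_yield(cell):
    """Parse one raw cell; the "-" flag and blank cells become NaN (missing)."""
    text = str(cell).strip()
    if text in {"-", ""}:
        return math.nan
    return float(text)


# Hypothetical raw column mixing numbers and the "-" flag.
raw_column = ["3.2", "-", "4.75", " 1.9 "]
clean_column = [parse_yield(cell) for cell in raw_column]

# Missing values are skipped in later calculations.
usable = [value for value in clean_column if not math.isnan(value)]
```

After this step, every entry in the column is a float, so numeric operations no longer fail on the text flag.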


Given the large size of the dataset and the variability of agricultural conditions, the dataset contains many extreme values. To minimize the potential impact of these extreme values on the results, I calculated the 5th and 95th percentiles of each numerical column and kept only the values between them. This data-cleaning step makes the statistical analysis more robust to outliers and provides a more reliable basis for the findings.
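A minimal Python sketch of this filter is shown below. It assumes that values outside the 5th to 95th percentile range are dropped, and it uses linear interpolation between ordered values to compute the percentiles. The interpolation rule actually used in the analysis is not stated, so this choice is an assumption, as are the function names and the sample values.

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation over the non-missing values."""
    data = sorted(v for v in values if not math.isnan(v))
    if not data:
        raise ValueError("no non-missing values")
    position = q * (len(data) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return data[lower] + fraction * (data[upper] - data[lower])


def trim_extremes(values, low=0.05, high=0.95):
    """Keep only non-missing values within the [low, high] quantile range."""
    lower_cut = quantile(values, low)
    upper_cut = quantile(values, high)
    return [
        v for v in values
        if not math.isnan(v) and lower_cut <= v <= upper_cut
    ]


# Hypothetical yields in Mg/ha with one extreme value at each end.
yields = [0.2, 2.9, 3.1, 3.4, 3.6, 3.8, 4.0, 4.1, 4.3, 4.6, 19.5]
trimmed = trim_extremes(yields)
```

In this example, the two extreme values (0.2 and 19.5) fall outside the cut-offs and are removed, while the nine central values are kept.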

Crop Type

Crop type can also affect yield. The crop types in the dataset are vegetables, cereals, fruits, sugar crops, fibers, oilseeds, pulses, tree nuts, roots and tubers, fodder, and others. The chart of crop types for each study illustrates the diversity and number of crop types covered: some studies cover a variety of crop types, while others focus on only a few. In the final analysis, I will remove the crop types that do not have enough data to support conclusions, namely fibers, fodder, oilseeds, and tree nuts.
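Averaging by crop type while excluding sparse crop types can be sketched as follows. This is a Python illustration, not the analysis code. The minimum number of observations is a hypothetical cut-off, since the threshold for "not enough data" is not specified, and the records are invented.

```python
from collections import defaultdict
from statistics import mean

MIN_OBSERVATIONS = 3  # hypothetical cut-off for a crop type to be kept


def mean_yield_by_crop(records, min_n=MIN_OBSERVATIONS):
    """Average yield per crop type, skipping types with fewer than min_n rows."""
    groups = defaultdict(list)
    for crop_type, yield_mg_ha in records:
        groups[crop_type].append(yield_mg_ha)
    return {
        crop: mean(values)
        for crop, values in groups.items()
        if len(values) >= min_n
    }


# Hypothetical (crop type, yield in Mg/ha) records.
records = [
    ("cereals", 4.0), ("cereals", 5.0), ("cereals", 6.0),
    ("vegetables", 20.0), ("vegetables", 22.0), ("vegetables", 24.0),
    ("fibers", 1.5),
]
averages = mean_yield_by_crop(records)
```

Here the single "fibers" record falls below the cut-off, so that crop type is left out of the averages.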

Figure 4: Partial Data After Integration and Unit Conversion
Figure 5: Different Crop Types for Each Study
Figure 6: Average Yields by Crop Type

Fertilizers Used

Subsequent analyses focused on how agricultural practices affect yields for each crop type under organic and conventional farming. The variables considered included the type of organic fertilizer used. As with the crop type analysis, the final analysis excludes the sparse data, namely fibers, fodder, oilseeds, tree nuts, and others, which are crop types that do not have enough data to support conclusions.

Figure 7: Average Yield by Organic Fertilizer Type for Vegetables
Figure 8: Average Yield by Organic Fertilizer Type for Cereals
Figure 9: Average Yield by Organic Fertilizer Type for Pulses
Figure 10: Average Yield by Organic Fertilizer Type for Roots and Tubers
Figure 11: Average Yield by Organic Fertilizer Type for Fruits