Precision agriculture is revolutionizing modern farming by enabling farmers to make data-driven decisions about their farming strategies. This dataset provides valuable information for building predictive models to recommend the most suitable crops for specific farms based on various parameters.
Original Dataset: Crop Recommendation Dataset (kaggle.com)
Categorical Variables:
Crop: Represents the type of crop. This is a categorical variable because it categorizes the data into distinct groups.
Season: Represents the growing season of the crop (Kharif, Rabi, Zaid). This is a categorical variable.
Soil Type: Represents the type of soil (Clay, Loamy, Sandy, etc.). This is a categorical variable.
Numerical Variables:
Nitrogen Ratio: Represents the nitrogen content in the soil.
Phosphorous Ratio: Represents the phosphorus content in the soil.
Potassium Ratio: Represents the potassium content in the soil.
Temperature: Represents the temperature in degrees Celsius.
Humidity: Represents the humidity percentage.
PH: Represents the pH level of the soil.
Rainfall: Represents the amount of rainfall in mm.
Nominal Scale:
Crop: Different categories of crops without any inherent order.
Season: Different categories of seasons without any inherent order.
Soil Type: Different categories of soil types without any inherent order.
Interval Scale:
Temperature: Measured in degrees Celsius, which is an interval scale. The intervals are equal, but 0 degrees Celsius is an arbitrary reference point rather than an absence of temperature, so ratios such as "twice as hot" are not meaningful.
PH: Best treated as an interval scale. pH is a logarithmic measure of hydrogen-ion concentration, and a pH of 0 does not mean an absence of acidity.
Ratio Scale:
Nitrogen Ratio: Measured on a ratio scale as it has a meaningful zero point and equal intervals.
Phosphorous Ratio: Measured on a ratio scale as it has a meaningful zero point and equal intervals.
Potassium Ratio: Measured on a ratio scale as it has a meaningful zero point and equal intervals.
Humidity: Measured on a ratio scale as it has a meaningful zero point (0% humidity) and equal intervals.
Rainfall: Measured on a ratio scale as it has a meaningful zero point (0 mm means no rain) and equal intervals.
Discrete Variables:
Nitrogen Ratio: While this can take many values, it is often measured in discrete units.
Phosphorous Ratio: Similar to Nitrogen Ratio, it is measured in discrete units.
Potassium Ratio: Similar to Nitrogen Ratio, it is measured in discrete units.
Continuous Variables:
Temperature: This can take any value within a range and is thus continuous.
Humidity: This can also take any value within a range and is continuous.
PH: This can take any value within a range and is continuous.
Rainfall: This can take any value within a range and is continuous.
Categorical vs. Numerical: Categorical variables classify the data into distinct groups, while numerical variables represent measurable quantities.
Scales of Measurement: The scales of measurement help understand the nature of the data. Nominal scales classify without order, interval scales have equal intervals but no true zero, and ratio scales add a true zero point so that ratios between values are meaningful.
Discrete vs. Continuous: Discrete variables take distinct values, often integers, while continuous variables can take any value within a range, allowing for more precise measurements.
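The distinctions above can be sketched as a simple heuristic. The function name classify_column and the sample values are assumptions made for illustration. They are not taken from the dataset, and a real column may need a closer look (for example, whole numbers stored as floats).

```python
def classify_column(values):
    """Classify a column as 'categorical', 'discrete', or 'continuous'.

    Heuristic only: text values -> categorical, whole numbers -> discrete,
    any other numeric values -> continuous.
    """
    if not values:
        raise ValueError("cannot classify an empty column")
    if any(isinstance(v, str) for v in values):
        return "categorical"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    return "continuous"


# Made-up sample values, for illustration only (not taken from the dataset).
sample_columns = {
    "Crop": ["Rice", "Wheat", "Maize"],
    "Season": ["Kharif", "Rabi", "Zaid"],
    "Soil Type": ["Loamy", "Clay", "Sandy"],
    "Nitrogen Ratio": [90, 40, 60],
    "Temperature": [24.5, 18.2, 30.1],
    "PH": [6.5, 7.1, 5.8],
}

column_kinds = {name: classify_column(vals) for name, vals in sample_columns.items()}
for name, kind in column_kinds.items():
    print(f"{name}: {kind}")
```

A type-based check like this only reflects how values are stored. Whether a variable is conceptually discrete or continuous still depends on how it was measured.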
Count of Soil Type
Reason for Choosing: A bar chart is effective for comparing the count of different categories. It allows for a clear visual representation of how many times each soil type appears in the dataset.
Interpretation: The chart shows the frequency of each soil type. If a particular soil type dominates, it indicates that this soil type is more common in the dataset. The bar chart reveals that Loamy soil is the most frequent, suggesting its prevalence in the region studied.
Crop Count
Reason for Choosing: Similar to the soil type count, a bar chart is useful for displaying the number of occurrences of each crop type, making it easy to compare their frequencies.
Interpretation: This chart reveals which crops are most and least common. High counts for certain crops, such as Rice and Wheat, may suggest they are more prevalent or preferred in the studied region.
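Both bar charts above come down to counting how often each category occurs. A minimal sketch of that counting step, with a text rendering standing in for the chart, might look like the following. The helper names and the sample list are assumptions for illustration, and the values are made up rather than read from the dataset.

```python
from collections import Counter


def category_counts(values):
    """Return (category, count) pairs sorted from most to least frequent."""
    return Counter(values).most_common()


def text_bar_chart(values):
    """Render the counts as one text bar per category."""
    counts = category_counts(values)
    width = max(len(category) for category, _ in counts)
    lines = []
    for category, n in counts:
        lines.append(f"{category.ljust(width)} | {'#' * n} {n}")
    return "\n".join(lines)


# Made-up sample, for illustration only (not taken from the dataset).
soil_types = ["Loamy", "Clay", "Loamy", "Sandy", "Loamy", "Clay"]

soil_counts = category_counts(soil_types)
print(text_bar_chart(soil_types))
```

The same two functions apply unchanged to the Crop column, since both charts compare category frequencies.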
Season Distribution
Reason for Choosing: A pie chart is effective for showing the proportion of each category within a whole. It provides a clear visual of the distribution of seasons.
Interpretation: The pie chart illustrates the proportion of data points belonging to each season. This can highlight seasonal trends and potential biases in data collection. The chart shows that Kharif season is the most common, indicating it might be the primary growing season.
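The slices of a pie chart are each category's share of the total. As a rough sketch, those shares can be computed as below. The function name category_shares and the sample list are hypothetical, and the values are made up for illustration.

```python
from collections import Counter


def category_shares(values):
    """Return each category's share of the total, as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no values to summarize")
    return {category: 100.0 * n / total for category, n in counts.items()}


# Made-up sample, for illustration only (not taken from the dataset).
seasons = ["Kharif", "Kharif", "Rabi", "Zaid", "Kharif", "Rabi", "Kharif", "Zaid"]

season_shares = category_shares(seasons)
for season, share in season_shares.items():
    print(f"{season}: {share:.1f}%")
```

A share far above the others, as with Kharif in this made-up sample, is the kind of imbalance that may point to a dominant season or to uneven data collection.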
Histogram of Nitrogen Ratio
Reason for Choosing: A histogram is ideal for showing the distribution of a single numerical variable. It helps in understanding the spread and central tendency of the nitrogen ratio in the dataset.
Interpretation: The histogram shows how nitrogen ratios are distributed. The data appears to have a slight right skew, indicating that higher nitrogen levels are less common but do occur.
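Skew can also be checked numerically rather than read off the plot. The sketch below uses the population (Fisher-Pearson) skewness coefficient. A positive value means a longer right tail, where most values are low and a few are high. A negative value means a longer left tail. The sample values are made up for illustration and are not taken from the dataset.

```python
def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Made-up nitrogen values with a few high readings (illustration only).
nitrogen = [20, 25, 30, 30, 35, 40, 40, 45, 60, 90]

nitrogen_skew = skewness(nitrogen)
print(f"skewness = {nitrogen_skew:.2f}")  # positive: right-skewed
```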
Histogram of Temperature
Reason for Choosing: Like the nitrogen ratio histogram, this plot shows the distribution of temperature values, helping to identify common temperature ranges and outliers.
Interpretation: This histogram helps in understanding the typical temperature range in the dataset. The distribution is roughly bell-shaped (approximately normal), with moderate temperatures being most common.
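A histogram is built by splitting the value range into equal-width bins and counting the values in each. The sketch below shows that binning step under simple assumptions. The function name histogram_counts, the choice of four bins, and the temperature values are illustrative only and are not taken from the dataset.

```python
def histogram_counts(values, bins):
    """Split [min, max] into equal-width bins and count the values in each."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot form bins")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Made-up temperatures in degrees Celsius (illustration only).
temperatures = [15.0, 18.5, 21.0, 22.5, 24.0, 25.5, 26.0, 28.5, 31.0, 35.0]

temp_edges, temp_counts = histogram_counts(temperatures, bins=4)
for left, right, n in zip(temp_edges, temp_edges[1:], temp_counts):
    print(f"{left:5.1f} to {right:5.1f}: {'#' * n}")
```

The number of bins changes how the distribution looks, so it is worth trying a few values before drawing conclusions about shape.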
Histogram of Rainfall
Reason for Choosing: To visualize the distribution of rainfall amounts, a histogram is appropriate as it shows the frequency of different rainfall levels.
Interpretation: This plot helps in understanding how much rainfall is common and how often extreme rainfall occurs. It shows a right skew: lower rainfall amounts are more common, with a long tail of less frequent high-rainfall values.
Temperature vs. Humidity by Soil Type
Reason for Choosing: This scatter plot shows the relationship between temperature and humidity while distinguishing points by soil type, revealing whether particular soil types tend to occur under particular climatic conditions.
Interpretation: The plot indicates if certain soil types are associated with specific temperature and humidity conditions, which can help in determining suitable soil for different crops under varying climatic conditions. Sandy soil types appear to be associated with a wider range of temperatures.
Temperature vs. Humidity by Season
Reason for Choosing: Similar to the previous scatter plot, this one shows the interaction between temperature and humidity but highlights seasonal variations.
Interpretation: It helps in understanding how different seasons impact temperature and humidity levels, and how these factors might affect crop growth differently in each season. Kharif season shows higher humidity levels compared to Rabi and Zaid.
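The patterns read from the two grouped scatter plots can be cross-checked with per-group summaries. The sketch below groups rows by a chosen column and reports the temperature span and mean humidity for each group. The function name summarize_by and the rows are assumptions for illustration. The values are made up to mirror the patterns described above and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def summarize_by(rows, key):
    """Group rows by rows[key]; report temperature span and mean humidity."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    summary = {}
    for category, members in groups.items():
        temps = [r["Temperature"] for r in members]
        humidities = [r["Humidity"] for r in members]
        summary[category] = {
            "temp_min": min(temps),
            "temp_max": max(temps),
            "mean_humidity": mean(humidities),
        }
    return summary


# Made-up rows, for illustration only (not taken from the dataset).
rows = [
    {"Soil Type": "Sandy", "Season": "Kharif", "Temperature": 34.0, "Humidity": 80.0},
    {"Soil Type": "Loamy", "Season": "Kharif", "Temperature": 28.0, "Humidity": 84.0},
    {"Soil Type": "Sandy", "Season": "Rabi", "Temperature": 16.0, "Humidity": 52.0},
    {"Soil Type": "Clay", "Season": "Rabi", "Temperature": 20.0, "Humidity": 56.0},
    {"Soil Type": "Loamy", "Season": "Zaid", "Temperature": 30.0, "Humidity": 60.0},
    {"Soil Type": "Clay", "Season": "Zaid", "Temperature": 26.0, "Humidity": 64.0},
]

by_season = summarize_by(rows, "Season")
by_soil = summarize_by(rows, "Soil Type")
print(by_season["Kharif"])
print(by_soil["Sandy"])
```

Passing "Season" or "Soil Type" as the key corresponds to the two ways the scatter plots are colored.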
Line Graph of PH
Reason for Choosing: A line graph is usually used to show a trend over a continuous or ordered variable such as time. Here it is used to compare pH values across soil categories. Because soil types have no inherent order, the connecting line is a visual aid for comparison rather than a trend.
Interpretation: The line graph shows how pH values vary across soil categories, indicating which soils hold a steadier pH and which tend toward lower pH values. It reveals that certain soils maintain a more stable pH, which is crucial for crop health.
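One way to put a number on "a more stable pH" is the spread of pH values within each soil type. The sketch below computes the mean and population standard deviation per soil type. Using standard deviation as the stability measure is an assumption of this example, as are the function name ph_profile and the made-up readings.

```python
from collections import defaultdict
from statistics import mean, pstdev


def ph_profile(records):
    """Return {soil_type: (mean_pH, pH_std_dev)} from (soil_type, pH) pairs."""
    by_soil = defaultdict(list)
    for soil, ph in records:
        by_soil[soil].append(ph)
    return {soil: (mean(values), pstdev(values)) for soil, values in by_soil.items()}


# Made-up readings, for illustration only (not taken from the dataset).
records = [
    ("Loamy", 6.4), ("Loamy", 6.5), ("Loamy", 6.6),
    ("Sandy", 5.5), ("Sandy", 6.5), ("Sandy", 7.5),
    ("Clay", 7.0), ("Clay", 7.2),
]

profile = ph_profile(records)
for soil, (avg, spread) in profile.items():
    print(f"{soil}: mean pH {avg:.2f}, std dev {spread:.2f}")
```

A smaller standard deviation means the readings for that soil type cluster more tightly around their mean, which is one reasonable reading of "stable".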
Line Graph of Humidity
Reason for Choosing: A line graph makes it easy to follow how humidity changes from one season to the next. Because the seasons are categories rather than a continuous variable, the connecting line is a visual aid for comparison.
Interpretation: The line graph shows how humidity levels vary across different seasons. It indicates which seasons have higher or lower humidity, helping to determine which crops might be more suitable for planting in each season. It highlights that Kharif season has higher humidity, beneficial for certain crops like rice.
Soil and Crop Types: The bar charts reveal the distribution and frequency of different soil and crop types, indicating the common varieties in the dataset.
Nutrient Levels: Histograms of nutrient ratios (like nitrogen) and environmental factors (temperature, rainfall) show their distributions, highlighting common ranges and any skewness.
Seasonal Effects: The pie chart for seasons and scatter plots by season provide insights into how different seasons affect temperature, humidity, and overall agricultural conditions.
Environmental Relationships: Scatter plots showing temperature vs. humidity by season and soil type highlight interactions between these factors, which are crucial for understanding optimal growing conditions and potential challenges.
PH and Humidity Trends: Line graphs for pH and humidity reveal how these factors vary across different soils and seasons, respectively, providing valuable information for crop management.
The visualizations provide a comprehensive overview of agricultural conditions, highlighting patterns and trends within the dataset. Bar charts for soil type and crop counts reveal the frequency of each category, and the pie chart shows how the records are split across seasons. Histograms for nitrogen ratio, temperature, and rainfall display their distributions, while scatter plots of temperature vs. humidity, grouped by season and by soil type, show how these variables interact. Line graphs for pH and humidity show how these factors vary across soil types and seasons. Together, these visualizations offer insight into optimal growing conditions, seasonal effects, and the distribution of essential nutrients, aiding agricultural planning and decision-making.