Digital Advisory Services

Digital Advisory Services for European Fields

Delineated image of a Polish field

Process and Workflow

Our development journey moved from exploratory analysis to a sophisticated segmentation and labeling pipeline.

1. Data Selection & Retrieval

Satellite Choice: After comparing Landsat and ESA options, we selected Sentinel-2. Its Near-Infrared (NIR) band is better suited to detecting contiguous crops and delineating field boundaries.
Retrieval Logic: We adapted techniques from the EveryField master’s thesis to manage image bands and download protocols.

2. Field Delineation

To solve initial boundary issues, we integrated Meta’s "Segment Anything" (SAM) model.

The Masking Process: We use SAM to generate field masks, followed by logic checks to prevent overlapping or faulty polygons.
Stability: This ensures clean separation between adjacent fields, which is critical for accurate data attribution.
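The kind of logic check described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each mask is available as a set of (row, col) pixels, and both the tie-breaking rule (contested pixels go to the smaller mask) and the `min_pixels` threshold are hypothetical choices.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, col) position in the image subset


def resolve_overlaps(masks: List[Set[Pixel]], min_pixels: int = 4) -> List[Set[Pixel]]:
    """Return pairwise-disjoint masks, dropping fragments that end up too small.

    Assumed rule: smaller masks are processed first, so a small field that
    sits inside a larger mask keeps its pixels.
    """
    claimed: Set[Pixel] = set()
    cleaned: List[Set[Pixel]] = []
    for mask in sorted(masks, key=len):
        remaining = mask - claimed      # remove pixels already assigned
        if len(remaining) >= min_pixels:
            cleaned.append(remaining)
            claimed |= remaining
    return cleaned


if __name__ == "__main__":
    small = {(0, 0), (0, 1), (1, 0), (1, 1)}
    large = {(r, c) for r in range(3) for c in range(3)}
    for field in resolve_overlaps([large, small]):
        print(sorted(field))
```

After this step no pixel belongs to two fields, so any per-field statistic is computed from that field's pixels only.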

3. Current Pipeline Phases

We are currently training a Long Short-Term Memory (LSTM) model to classify crop types. The training data is prepared in the following steps:

Identify Area of Interest (AOI): Locate a 10km x 10km subset with the highest density of LUCAS survey points.
Centroid Calculation: Determine the center point of these survey locations.
Time-Series Download: Retrieve Copernicus images for the entire growing season.
Subset Selection: Focus on a 5km x 5km square centered on the survey centroid.
Field Segmentation: Process the RGB subset through the "Segment Anything" algorithm to isolate fields from non-agricultural features (rivers, cities, etc.).
Feature Engineering:
- Calculate NDVI (Normalized Difference Vegetation Index) and NDWI (Normalized Difference Water Index).
- Formula for NDVI:
  $$\text{NDVI} = \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}$$
- Aggregate these scores as monthly averages per field.
Labeling: Assign crop types (e.g., Sugar Beet, Wheat) from the LUCAS dataset to the segmented polygons.
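The AOI, centroid and subset steps above might look like the sketch below. It assumes the survey points are already in a projected coordinate system measured in metres (for example UTM), uses a simple fixed grid to find the densest cell, and all function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]  # (easting, northing) in metres, assumed projected


def densest_cell(points: List[Point], cell_m: float = 10_000.0) -> Tuple[int, int]:
    """Index of the grid cell (cell_m on a side) holding the most survey points."""
    counts = Counter((int(x // cell_m), int(y // cell_m)) for x, y in points)
    return counts.most_common(1)[0][0]


def points_in_cell(points: List[Point], cell: Tuple[int, int],
                   cell_m: float = 10_000.0) -> List[Point]:
    """Survey points that fall inside the given grid cell."""
    return [(x, y) for x, y in points
            if (int(x // cell_m), int(y // cell_m)) == cell]


def centroid(points: List[Point]) -> Point:
    """Arithmetic mean of the point coordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def square_around(center: Point, side_m: float = 5_000.0) -> Tuple[float, float, float, float]:
    """Bounding box (min_x, min_y, max_x, max_y) of a square centred on a point."""
    half = side_m / 2
    cx, cy = center
    return (cx - half, cy - half, cx + half, cy + half)
```

A fixed grid is the simplest way to approximate "highest density"; a sliding window would find denser placements at a higher cost.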

Output: A comprehensive data frame where each row represents a specific field, featuring monthly vegetation indices as inputs and crop types as targets for the LSTM model.
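A minimal sketch of the index calculation and the monthly aggregation is shown below. The NDVI function follows the formula given above; for NDWI the text does not say which variant is used, so the green/NIR form (Green - NIR) / (Green + NIR) is an assumption here. The input layout (tuples of field id, month, value) and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    total = a + b
    return 0.0 if total == 0 else (a - b) / total


def ndvi(nir: float, red: float) -> float:
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    # Assumed variant: green/NIR form of NDWI.
    return normalized_difference(green, nir)


def monthly_means(observations: Iterable[Tuple[str, int, float]]) -> Dict[str, Dict[int, float]]:
    """Average index values per field and month.

    observations: (field_id, month, index_value) tuples, one per image date.
    Returns {field_id: {month: mean_value}}, i.e. one row per field.
    """
    buckets = defaultdict(list)
    for field_id, month, value in observations:
        buckets[(field_id, month)].append(value)
    table: Dict[str, Dict[int, float]] = {}
    for (field_id, month), values in buckets.items():
        table.setdefault(field_id, {})[month] = mean(values)
    return table
```

Each inner dictionary corresponds to one row of the data frame described above, with the crop label from LUCAS added as the target column.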

Results & Demo

This demonstration focuses on a single subset to showcase the logic required for a future large-scale batch processing pipeline.

| Feature | Observation |
| --- | --- |
| Input | RGB Subset (Left) |
| Segmentation | Meta’s PyTorch Segment Anything output (Right) |
| Target Labels | Red dots (e.g., Common Wheat, Sugar Beets, Unknown Cereal) |
| Regional Performance | High accuracy in Southern Poland due to smaller field sizes. |

Performance Note

While the 5km x 5km subset provides excellent segmentation, we observed that increasing the subset size to 10km x 10km leads to sub-optimal results. Our current focus remains on maintaining high precision at the 5km scale to ensure data integrity for the LSTM training phase.


Seasonal Crop Signatures

Our analysis leverages the unique temporal "signatures" of different crops to feed the LSTM (Long Short-Term Memory) model.

Wheat: Characterized by two distinct cycles: summer wheat (spring cycle) and winter wheat (fall cycle).
Root Crops & Others: Crops like sugar beets follow a single-cycle growth pattern, typically harvested in late September or early October.

By identifying these specific temporal patterns, we can establish a robust signature analysis to classify various European crops and derive secondary agricultural factors.
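To make the idea of a temporal signature concrete, the sketch below extracts two simple descriptors from a monthly NDVI series: the month of peak greenness and the month following the sharpest drop, which can serve as a rough proxy for harvest. This is only an illustrative heuristic, not the LSTM itself, and the NDVI values in the example are invented.

```python
from typing import Dict


def peak_month(series: Dict[int, float]) -> int:
    """Month with the highest NDVI value."""
    return max(series, key=series.get)


def sharpest_drop_month(series: Dict[int, float]) -> int:
    """Month that ends the largest month-to-month NDVI decrease.

    Requires at least two months of data.
    """
    months = sorted(series)
    drops = [(series[prev] - series[cur], cur)
             for prev, cur in zip(months, months[1:])]
    return max(drops)[1]


if __name__ == "__main__":
    # Invented single-cycle series, loosely shaped like a root crop.
    example = {4: 0.2, 5: 0.3, 6: 0.5, 7: 0.7, 8: 0.8, 9: 0.75, 10: 0.3}
    print(peak_month(example), sharpest_drop_month(example))
```

An LSTM learns such patterns from the full sequence, but hand-computed descriptors like these are useful for sanity-checking the monthly inputs.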

Future Work & Technical Requirements

The current workflow serves as a proof-of-concept. To reach a production-ready state, we must scale our data processing and infrastructure.

1. Scaling Infrastructure

Each satellite tile ($110\text{km} \times 110\text{km}$) occupies 1.2 GB. To train a high-accuracy LSTM, we must process hundreds of these tiles into $5\text{km} \times 5\text{km}$ subsets.

Estimated Monthly Cost: ~$150 for a dedicated Virtual Machine.
Storage Requirement: Minimum of 3 TB of cloud storage.
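The figures above imply the following back-of-the-envelope numbers. The sketch assumes decimal units (1 TB = 1,000,000 MB), non-overlapping subsets, and counts raw tile downloads only, ignoring derived subsets and masks.

```python
TILE_SIDE_KM = 110       # tile edge length, from the text
SUBSET_SIDE_KM = 5       # subset edge length, from the text
TILE_SIZE_MB = 1_200     # 1.2 GB per tile, decimal units assumed
STORAGE_MB = 3_000_000   # 3 TB, decimal units assumed


def subsets_per_tile(tile_km: int = TILE_SIDE_KM, subset_km: int = SUBSET_SIDE_KM) -> int:
    """Number of non-overlapping square subsets that fit in one tile."""
    per_side = tile_km // subset_km
    return per_side * per_side


def tiles_that_fit(storage_mb: int = STORAGE_MB, tile_mb: int = TILE_SIZE_MB) -> int:
    """Number of raw tile downloads that fit in the storage budget."""
    return storage_mb // tile_mb


if __name__ == "__main__":
    print(subsets_per_tile())  # 22 per side, 484 per tile
    print(tiles_that_fit())    # 2500 raw tile downloads
```

Because each location needs images for the whole growing season, the number of distinct tile locations that fit is this total divided by the number of acquisition dates kept per tile.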

2. Immediate Roadmap

Yield Data Acquisition: Find a yield estimation dataset with field-level granularity.
LUCAS Synchronization: Download contiguous tiles matching the LUCAS dataset coordinates (two priority areas identified).
Model Training: Train the LSTM using improved masks from the "Segment Anything" model once storage is upgraded.
Crop Code Integration: Expand the model to recognize and classify all relevant European crop codes with high accuracy.

3. Economic & Ecological Analysis

Once the fields are classified and delimited, we can move toward predictive analytics:

Yield Extrapolation: Calculate expected harvest based on polygon surface area and historical data.
Input Requirements: Determine the necessary amount of fertilizer and seed.
Financial Planning: Help farmers calculate the short-term loans required for seeds, which are typically repaid post-harvest.
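Yield extrapolation from polygon area can be sketched as below. The area uses the shoelace formula and assumes polygon vertices in projected metres; the per-hectare yield is a placeholder that would come from historical data such as the provincial statistics, and the function names are hypothetical.

```python
from typing import List, Tuple

Vertex = Tuple[float, float]  # (x, y) in metres, assumed projected coordinates


def polygon_area_m2(vertices: List[Vertex]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def expected_harvest_tonnes(vertices: List[Vertex], yield_t_per_ha: float) -> float:
    """Field area in hectares multiplied by a historical yield in t/ha."""
    hectares = polygon_area_m2(vertices) / 10_000
    return hectares * yield_t_per_ha


if __name__ == "__main__":
    field = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0)]
    # 6.0 t/ha is a placeholder, not a figure from the project.
    print(expected_harvest_tonnes(field, yield_t_per_ha=6.0))
```

The same area figure can feed seed and fertilizer estimates by swapping the yield for an application rate per hectare.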

Conclusion

We have demonstrated the feasibility of segmenting and monitoring agricultural fields at a micro-level using free and open-source tools. While commercial APIs like ResUNet SentinelHub (approx. €300) offer similar capabilities, our open-source protocol provides a scalable, cost-effective stepping stone for the agricultural community.

Key Benefits

Precision: Field delineation allows for exact surface area calculations.
Integration: Masks can be tied to secondary LSTMs for fertilizer and health monitoring.
Insight: Yields can be calculated using a combination of field size and NDVI (Normalized Difference Vegetation Index).

Data Resources

| Dataset | Description |
| --- | --- |
| CORINE Land Cover | Pan-European raster data for land usage. |
| Global Land Cover | Pixel-based data masks for Europe. |
| GUS (Stat.gov.pl) | Crop yield data by Polish provinces. |
| LUCAS Dataset | Primary land usage survey information. |

Project Links

https://github.com/OmdenaAI/cracow-poland-rural-farmers

Phase III Omdena IFAD project opening

This project is our singular pipeline of an IFAD project to create digital advisory services for rural farmers.
