The data infrastructure is built on FAIR principles (Findable, Accessible, Interoperable, Reusable) to support data-intensive science. The team adopted the OGC SensorThings API (STA) standard as the primary framework, utilizing the Fraunhofer Open Source SensorThings server (FROST) implementation.
Comprehensive load testing was conducted to assess FROST's performance and scalability:
Data ingestion: Sustainable performance for expected bulk data loads
Data retrieval: Some API constraints identified for large-scale data management
Scalability confirmation: Architecture aligns with project requirements
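For orientation, the ingestion path exercised by these tests amounts to creating Observation entities through the SensorThings API. The following is a minimal sketch of one such payload, not the project's ingestion code. The base URL, Datastream id, and helper names are hypothetical; only the entity fields (`phenomenonTime`, `result`, and the `Datastream` link by `@iot.id`) come from the STA standard.

```python
import json
from datetime import datetime, timezone

# Hypothetical deployment URL; the real FROST endpoint is not given in the source.
BASE_URL = "http://localhost:8080/FROST-Server/v1.1"


def build_observation(datastream_id, timestamp, value):
    """Return an STA Observation payload linked to an existing Datastream."""
    return {
        # STA expects ISO 8601 timestamps; UTC is used here for simplicity.
        "phenomenonTime": timestamp.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        ),
        "result": value,
        # Link to the Datastream by id instead of embedding the full entity.
        "Datastream": {"@iot.id": datastream_id},
    }


def observations_endpoint(base_url=BASE_URL):
    """Collection URL to which new Observations are POSTed."""
    return f"{base_url}/Observations"


# Illustrative values only (datastream 1, a reading of 21.4).
payload = build_observation(
    1, datetime(2023, 7, 1, 12, 0, tzinfo=timezone.utc), 21.4
)
print("POST", observations_endpoint())
print(json.dumps(payload, indent=2))
```

Sending one HTTP request per Observation is the simplest approach. Whether the tested pipeline batched its requests is not stated here.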
HDF5 layer: Ensures scientific-standard format and ingestion-ready file structure
Quality control processing: Complete validation of time series data
Dedicated API access: Standardized data retrieval interface
Docker containerization: Ensures portability and resource isolation
GEOSS compatibility: Deep integration feasible through GEO DAB middleware
Copernicus services: Static dataset provision via periodic data freeze (API limitations noted)
Standard formats: JSON-based data exchange ensuring broad compatibility
Registration process initiation with GEOSS development team
Exploration of Copernicus data upload solutions
Interoperability deliverable scheduled for Month 45
Current hosting: Virtual Machine at Fondazione Edmund Mach
Future migration: Planned relocation to Nature 4.0 facilities
Architecture: Docker containers for enhanced portability and development efficiency
Figure: Docker architecture
Using test data from 10 devices over 70 days (July-September 2023):
Data volume: 6.2 MB input → 5.5 MB HDF5 → 98,000 Observation entities
Storage requirements: 22 MB (Layer 0) + 146 MB (Layer 1)
Ingestion time: approximately 5 hours
Scalability projection: one month of data from 500 devices ≈ 5 days of processing time
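The projection can be sanity-checked with a simple linear extrapolation from the test figures. This sketch assumes that ingestion time scales linearly with device-days, and that "500 devices/month" means one 30-day month of data from 500 devices. Both are assumptions made for illustration, not statements from the load tests.

```python
# Figures reported for the test run.
TEST_DEVICES = 10
TEST_DAYS = 70
TEST_INGESTION_HOURS = 5.0   # reported as "~5 hours", so approximate
TEST_OBSERVATIONS = 98_000


def projected_ingestion_hours(devices, days):
    """Linearly scale the measured ingestion time to another workload."""
    hours_per_device_day = TEST_INGESTION_HOURS / (TEST_DEVICES * TEST_DAYS)
    return hours_per_device_day * devices * days


# Assumed reading of the projection: 500 devices, one 30-day month.
hours = projected_ingestion_hours(devices=500, days=30)
print(f"Projected ingestion: {hours:.0f} h = {hours / 24:.1f} days")

# Average throughput implied by the test run.
rate = TEST_OBSERVATIONS / (TEST_INGESTION_HOURS * 3600)
print(f"Observed throughput: {rate:.1f} observations/s")
```

Under these assumptions the estimate is about 107 hours, or roughly 4.5 days, which agrees with the stated figure of about 5 days.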
The interactive dashboard is built with the Streamlit framework and features:
Dynamic sensor data querying via SensorThings API
Custom filtering and time range selection
CSV data export functionality
Sensor fault detection and anomaly identification
Web-based interface for real-time monitoring
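The querying, time-range filtering, and CSV export listed above can be sketched with standard SensorThings query options (`$filter`, `$orderby`, `$top`). This is an illustrative sketch, not the dashboard's code. The base URL, Datastream id, and function names are hypothetical, and the sample response is made-up data in the STA response shape.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical deployment URL; the real FROST endpoint is not given in the source.
BASE_URL = "http://localhost:8080/FROST-Server/v1.1"


def observations_url(datastream_id, start, end, top=1000, base_url=BASE_URL):
    """Build an STA query for Observations of one Datastream in a time range."""
    time_filter = f"phenomenonTime ge {start} and phenomenonTime le {end}"
    query = "&".join([
        "$filter=" + quote(time_filter),
        "$orderby=" + quote("phenomenonTime asc"),
        f"$top={top}",
    ])
    return f"{base_url}/Datastreams({datastream_id})/Observations?{query}"


def observations_to_csv(response):
    """Convert an STA Observations response (parsed JSON) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["phenomenonTime", "result"])
    for obs in response.get("value", []):
        writer.writerow([obs["phenomenonTime"], obs["result"]])
    return buffer.getvalue()


url = observations_url(1, "2023-07-01T00:00:00Z", "2023-09-30T23:59:59Z")
print(url)

# Made-up sample in the shape of an STA collection response.
sample = {"value": [{"phenomenonTime": "2023-07-01T12:00:00Z", "result": 21.4}]}
print(observations_to_csv(sample))
```

STA servers page large result sets and include an `@iot.nextLink` in the response when more entities are available, so a complete export would follow that link until it is absent.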
Figure: Example of a sensor fault displayed on the interactive dashboard
Source code: Private GitHub repository for consortium collaboration
Distribution: Docker image availability for cross-platform testing
Access: VPN-secured prototype available to internal devices at Fondazione Edmund Mach (FEM)
The prototype serves as the foundation for production-ready environment implementation, with continued monitoring and optimization planned.
Lead: Edmund Mach Foundation
Contact: claudio.donati@fmach.it