Data Management

The TAHMO stations measure at a 5-minute interval and write each observation to their internal memory. Under normal circumstances the dataloggers are configured to send their data once per hour. We lower the upload frequency for stations with low battery levels, which can occur when there is a poor cellular signal at the station location.


Depending on the country in which the station is installed, the data might first be sent to a server managed by the national meteorological agency. At this stage the datalogger sends small UDP packets that use a custom compression developed by the datalogger manufacturer (Meter Group). Dataloggers that do not initially report to a local server send their data directly to the Meter Group server.


The TAHMO infrastructure retrieves the data from the Meter Group server through APIs, which differ for EM50 and EM60 dataloggers. For EM50 dataloggers we get raw 32-bit port values for each port of the datalogger, while for EM60 dataloggers (3rd generation stations) we receive the observation values themselves. The observation values retrieved at this stage are marked as "raw measurements" and are stored in our database under this name. As soon as this data is available within our system, we run classical quality control on it, consisting of sensor range, climate range, temporal step, temporal delta and temporal sigma tests. This data is available to our data consumers through our API, and hourly aggregated data is also available through the datahmo data platform.
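
As a rough illustration of what two of the classical tests named above could look like, the Python sketch below implements a simple range test and a simple temporal step test on a series of 5-minute values. It is not the production implementation: the function names, the thresholds and the sample temperatures are all hypothetical, and the temporal delta and temporal sigma tests are not shown.

```python
from typing import List, Optional


def range_test(values: List[Optional[float]], lower: float, upper: float) -> List[bool]:
    """Return True for each value that lies within [lower, upper].

    Missing values (None) are reported as failing. The same function can
    serve as a sensor range or climate range test, depending on the
    limits passed in (both sets of limits are assumptions here).
    """
    return [v is not None and lower <= v <= upper for v in values]


def step_test(values: List[Optional[float]], max_step: float) -> List[bool]:
    """Return True for each value whose jump from the previous
    non-missing value is at most max_step.

    Note that a single spike fails twice in this simple form: once on
    the jump up and once on the jump back down.
    """
    passed = []
    previous = None
    for v in values:
        if v is None:
            passed.append(False)
            continue
        passed.append(previous is None or abs(v - previous) <= max_step)
        previous = v
    return passed


# Hypothetical 5-minute air temperature values in degrees Celsius.
temperatures = [21.0, 21.2, 35.0, 21.3, None, 70.0]

in_range = range_test(temperatures, lower=-10.0, upper=60.0)
smooth = step_test(temperatures, max_step=5.0)
```

In this sketch each test returns one boolean per observation, so the results of several tests can be stored next to the raw measurement instead of overwriting it.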


The classical quality control lacks the ability to properly control the precipitation observations in the TAHMO dataset. The SensorDX project, which is led by Prof. Tom Dietterich, is actively working on methods to improve this specific aspect of the quality control. The team is now working on incorporating spatial quality control for neighbouring stations into the SensorDX system. However, this system cannot be applied in regions with a very low density of TAHMO stations (e.g. Congo, Madagascar, Mozambique, Zimbabwe). In addition, there were few stations during the first years of the TAHMO initiative, so alternative methods need to be developed to provide adequate quality control for this part of the dataset. We are therefore also looking at using various other data sources to quality control the precipitation dataset, so that we can at least remove periods in which rain gauges may have been clogged or sensors were structurally malfunctioning.
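
To make the idea of checking a rain gauge against another data source more concrete, the Python sketch below flags runs of days on which a station reports no rain while a reference series reports substantial rain. This is only one possible approach under stated assumptions, not a method we have adopted: the function name, the 5 mm threshold, the 3-day minimum run length and the sample values are all hypothetical, and the reference series stands in for whatever external source is eventually chosen.

```python
from typing import List


def flag_suspect_dry_runs(
    station_mm: List[float],
    reference_mm: List[float],
    wet_threshold_mm: float = 5.0,  # assumed threshold for a "wet" reference day
    min_run_days: int = 3,          # assumed minimum length of a suspect run
) -> List[bool]:
    """Flag days inside runs of at least min_run_days consecutive days on
    which the station reports zero rain while the reference reports at
    least wet_threshold_mm. Such runs could indicate a clogged gauge."""
    if len(station_mm) != len(reference_mm):
        raise ValueError("station and reference series must have equal length")

    n = len(station_mm)
    suspect = [False] * n
    run_start = None
    # Iterate one step past the end so that a run reaching the last day is closed.
    for i in range(n + 1):
        mismatch = (
            i < n
            and station_mm[i] == 0.0
            and reference_mm[i] >= wet_threshold_mm
        )
        if mismatch:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run_days:
                for j in range(run_start, i):
                    suspect[j] = True
            run_start = None
    return suspect


# Hypothetical daily rainfall totals in millimetres.
station = [0.0, 0.0, 0.0, 0.0, 12.0, 0.0]
reference = [8.0, 10.0, 6.0, 0.0, 11.0, 7.0]

flags = flag_suspect_dry_runs(station, reference)
```

Requiring several consecutive mismatching days is a deliberate choice in this sketch: a single dry day at the station while the reference shows rain is common for localised showers and should not by itself mark the gauge as suspect.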