In this topic I will try to add information from the Spanish weather service (AEMET) to my climate change environment.
The first workflow below extracts the weather station information and loads it into the table aemet_stations.
If I run SELECT count(DISTINCT province) FROM public.aemet_stations against this table, I get 54. This is a problem, because officially there are only 50 provinces in Spain, so we have 4 values too many.
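As a quick sanity check, the sketch below counts the distinct provinces and, separately, the stations that have no province at all. It assumes PostgreSQL (the FILTER clause needs version 9.4 or later), and the column aliases are my own. COUNT(DISTINCT ...) skips NULLs, so if missing provinces are stored as real NULLs they are not part of the 54.

```sql
-- Distinct province names versus stations with no province.
-- COUNT(DISTINCT ...) ignores NULLs, so the two numbers are independent.
SELECT
    count(DISTINCT province)                 AS distinct_provinces,
    count(*) FILTER (WHERE province IS NULL) AS stations_without_province
FROM public.aemet_stations;
```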
Let's try to spot them:
One obvious 'error' shows up right at the beginning of the table: 'BALEARES' and 'ILLES BALEARS' are obviously the same province.
Another one is that some stations have a null value for province. We will examine these later on.
Another error seems to be that we have both "SANTA CRUZ DE TENERIFE" and "STA. CRUZ DE TENERIFE", which are obviously the same.
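One more error left to detect... Keep in mind that CEUTA and MELILLA are autonomous cities rather than provinces, so they are not part of the official 50.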
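To look at the suspects found so far, something like the following could help. This is a minimal sketch, assuming PostgreSQL and that a missing province is stored as a real NULL; if the load wrote the literal text 'null' instead, the first condition would have to be province = 'null'. The alias stations is my own.

```sql
-- Stations that have no province assigned.
SELECT *
FROM public.aemet_stations
WHERE province IS NULL;

-- How many stations use each of the duplicated spellings.
SELECT province, count(*) AS stations
FROM public.aemet_stations
WHERE province IN ('BALEARES', 'ILLES BALEARS',
                   'SANTA CRUZ DE TENERIFE', 'STA. CRUZ DE TENERIFE')
GROUP BY province
ORDER BY province;
```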
We need to clean this up. Below is the result of running SELECT DISTINCT province FROM public.aemet_stations:
"province"
"A CORUÑA"
"ALBACETE"
"ALICANTE"
"ALMERIA"
"ARABA/ALAVA"
"ASTURIAS"
"AVILA"
"BADAJOZ"
"BALEARES"
"BARCELONA"
"BIZKAIA"
"BURGOS"
"CACERES"
"CADIZ"
"CANTABRIA"
"CASTELLON"
"CEUTA"
"CIUDAD REAL"
"CORDOBA"
"CUENCA"
"GIPUZKOA"
"GIRONA"
"GRANADA"
"GUADALAJARA"
"HUELVA"
"HUESCA"
"ILLES BALEARS"
"JAEN"
"LA RIOJA"
"LAS PALMAS"
"LEON"
"LLEIDA"
"LUGO"
"MADRID"
"MALAGA"
"MELILLA"
"MURCIA"
"NAVARRA"
"OURENSE"
"PALENCIA"
"PONTEVEDRA"
"SALAMANCA"
"SANTA CRUZ DE TENERIFE"
"SEGOVIA"
"SEVILLA"
"SORIA"
"STA. CRUZ DE TENERIFE"
"TARRAGONA"
"TERUEL"
"TOLEDO"
"VALENCIA"
"VALLADOLID"
"ZAMORA"
"ZARAGOZA"