Environmental processes and human activities: capturing their interactions via statistical methods (EphaStat)

The three-year EphaStat project, funded by the Italian Ministry of Education, Universities and Research, aims to improve existing statistical methods, fostering the appropriate consideration of uncertainty in environmental and ecological research. Its principal tool is the development of probability-based models that support decision making in complex systems and properly account for the sources and propagation of uncertainty. The project studies the interactions between humans and the environment, in terms of both the impact of human activities on the environment and the effects of environmental change on human well-being.

The EphaStat project starts from several interrelated motivating examples, among them ecosystem status evaluation, air pollution, and the study of the relationship between pollution and human health. All these examples can be handled with the hierarchical modelling approach. Hierarchical modelling partitions a complex process into a number of simpler conditional processes, yielding more manageable probabilistic representations that facilitate inference from both the analytical and the computational point of view. Within this approach, parametric, semi-parametric and non-parametric models are important tools, well suited to representing complex space-time causal relationships. Moreover, hierarchical models find a natural inferential framework in Bayesian statistics, which facilitates the inclusion of expert opinion and other external forms of knowledge in the model. In this way, collaboration with stakeholders is encouraged, supported and enhanced. Further, the approach allows the development of models that can be easily generalized and extended to similar problems.
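As an illustration of how a process can be split into simpler conditional pieces, the following minimal sketch uses a two-level normal model: a latent pollution level at each monitoring site is drawn from a common distribution, and the observations at a site are noisy measurements of that level. The model, the function names and all parameter values are assumptions chosen for illustration, not methods taken from the project. With the variances treated as known, the Bayesian update for each site has a closed form.

```python
import random
import statistics


def simulate(n_sites, n_obs, prior_mean, prior_sd, noise_sd, seed=0):
    """Draw data from the hierarchy, one conditional layer at a time."""
    rng = random.Random(seed)
    # Process layer: latent level at each site, given the shared parameters.
    latent = [rng.gauss(prior_mean, prior_sd) for _ in range(n_sites)]
    # Data layer: noisy measurements, given each site's latent level.
    data = [[rng.gauss(mu, noise_sd) for _ in range(n_obs)] for mu in latent]
    return latent, data


def posterior_site_level(obs, prior_mean, prior_sd, noise_sd):
    """Conjugate normal-normal update for one site's latent level.

    Assumes prior_sd and noise_sd are known (an illustrative simplification).
    Returns the posterior mean and posterior standard deviation.
    """
    prior_prec = 1.0 / prior_sd ** 2        # precision of the process layer
    data_prec = len(obs) / noise_sd ** 2    # precision contributed by the data
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (
        prior_prec * prior_mean + data_prec * statistics.fmean(obs)
    )
    return post_mean, post_var ** 0.5


if __name__ == "__main__":
    latent, data = simulate(n_sites=3, n_obs=5, prior_mean=40.0,
                            prior_sd=5.0, noise_sd=10.0)
    for mu, obs in zip(latent, data):
        mean, sd = posterior_site_level(obs, 40.0, 5.0, 10.0)
        print(f"true={mu:.1f} sample={statistics.fmean(obs):.1f} "
              f"posterior={mean:.1f} (sd {sd:.1f})")
```

The posterior mean is a precision-weighted average of the shared prior mean and the site's sample mean, so sites with few or noisy observations are pulled towards the common level. In a full analysis the shared parameters would themselves receive prior distributions, which is where expert opinion can enter.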

The EphaStat project will focus on three broad methodological areas.

  • Uncertainty over data production, which concerns the sensible statistical procedures needed to obtain statistical data from raw observations of environmental phenomena.
  • Spatial and space-time modelling of environmental phenomena, for which appropriate extensions to handle geostatistical, lattice and cylindrical data are proposed.
  • Probability models for multivariate count and compositional data, which concern methods for modelling biomonitoring data and ecosystem status.

The proposed methods require sophisticated computational tools that must be made available to the large audience of non-statisticians who fit complex models. Introducing new user-friendly tools for these users requires the development of fast computational algorithms, possibly relying on parallel computing resources.