Past Seminar Presentations
This page contains past presentations, slides, videos and relevant material.
Wed., May 15, 2024 (3pm Europe time)
Speaker: Jake Peter Amooti Grainger (EPFL, Switzerland)
Title: Spectral estimation for spatial point processes and random fields
Abstract: Spatial data can come in a variety of different forms, but two of the most common generating models for such observations are random fields and point processes. Whilst it is known that spectral analysis can unify these two different data forms, specific methodology for the related estimation is yet to be developed. In this talk, we will discuss how multitaper estimation can be extended to this setting, enabling us to estimate coherence, a measure of cross dependence between processes. In particular, we discuss how to deal with different kinds of sampling mechanism, handle the effect of non-zero mean, and extend estimation to observations on non-rectangular sampling domains.
Links: [#Link to paper] [#recording]
Wed., April 17, 2024 (3pm Europe time)
Speaker: Fan Bu (University of Michigan, USA)
Title: Inferring HIV Transmission Patterns from Viral Deep-Sequence Data via Latent Spatial Poisson Processes
Abstract: Viral deep-sequencing technologies play a crucial role in understanding disease transmission patterns, because the higher resolution of these data provides evidence on transmission direction. To better utilize these data and account for uncertainty in phylogenetic analysis, we propose a spatial Poisson process model to uncover HIV transmission flow patterns at the population level. We represent pairings of two individuals with viral sequence data as typed points, with coordinates representing covariates such as sex and age, and the point type representing the unobserved transmission statuses (linkage and direction). Points are associated with deep-sequence phylogenetic analysis summary scores that reflect the strength of evidence for each transmission status. Our method jointly infers the latent transmission status for all pairings and the transmission flow surface on the source-recipient covariate space. In contrast to existing methods, our framework does not require pre-classification of the transmission statuses of data points, instead learning them probabilistically through fully Bayesian inference. By directly modeling continuous spatial processes with smooth densities, our method enjoys significant computational advantages over previous methods that discretize the covariate space. In an HIV transmission study from Rakai, Uganda, we demonstrate that our framework can capture age structures in HIV transmission at high resolution and bring valuable insights.
Links: [#recording]
Wed., Mar 20, 2024 (9pm Europe time - NOTE THE DIFFERENT TIME)
Speaker: Tilman Davies (University of Otago, Dunedin, New Zealand)
Title: The Inhomogeneous World of a Spatial Statistician
Abstract: Statistical tools for the analysis of spatial data have often been developed as either special cases or generalisations of non-spatial analogues, such as multivariate density estimation or time series models. In this seminar, I discuss my own path to becoming an applied spatial statistician, and cover recent developments in two key areas of my research -- spatially adaptive kernel smoothing with a view to estimation of point process intensity; and spatial autoregressions with a view to modelling physiological trends in mammalian muscle fibres.
Links: [#recording] [#slides] [#Link to first paper] [#Link to second paper]
Wed., February 21, 2024 (3pm Europe time)
Speaker: Diala Hawat (LPSM, Sorbonne université, Paris, France)
Title: Repelled point processes with application to numerical integration.
Abstract: Linear statistics of point processes yield Monte Carlo estimators of integrals. While the simplest approach relies on a homogeneous Poisson point process (PPP), more regularly spread point processes yield estimators with fast-decaying variance. Following the intuition that more regular configurations result in lower integration error, we introduce the repulsion operator, which reduces clustering by slightly pushing the points of a configuration away from each other. Our empirical findings show that applying the repulsion operator to a PPP and, intriguingly, to regular point processes reduces the variance of the corresponding Monte Carlo method and thus enhances the method. This variance reduction phenomenon is substantiated by our theoretical result when the initial point process is a PPP. On the computational side, the complexity of the operator is quadratic and the corresponding algorithm can be parallelized without communication across tasks.
Links: [#paper] [#code] [#recording] [#slides]
Wed., December 13, 2023 (3pm Europe time)
Speaker: Radu Stoica (Univ. Lorraine, Institut Élie Cartan)
Title: Random structures and patterns in spatio-temporal data: probabilistic modelling and statistical inference
Abstract: The useful information carried by spatio-temporal data is often outlined by geometric structures and patterns. Filaments or clusters induced by galaxy positions in our Universe are such an example. Two situations are to be considered. First, the pattern of interest is hidden in the data set, hence the pattern should be detected. Second, the structure to be studied is observed, so it should be characterized appropriately. Probabilistic modelling is one of the approaches that can furnish answers to these questions. This is done by developing unified methodologies embracing simultaneously three directions: modelling, simulation and inference. This talk presents the use of marked point processes for the detection and characterization of such structures. Practical examples are also shown.
Links: [#recording]
Wed., November 15, 2023 (3pm Europe time)
Speaker: Francesco Sanna Passino (Department of Mathematics, Imperial College London)
Title: On mutually exciting node-based and edge-based models for point processes on networks
Abstract: This talk discusses node-based and edge-based models for temporal point processes on networks. In particular, two classes of models are presented: GB-MEP (Graph-Based Mutually Exciting Processes) and MEG (Mutually Exciting Graphs). GB-MEP is a node-based model that incorporates known relationships between nodes in a graph within the intensity function of a node-based multivariate Hawkes process. This approach reduces the number of parameters to a quantity proportional to the number of nodes in the network, resulting in significant advantages for computational scalability when compared to traditional methods. The model is applied on event data observed on the Santander Cycles network in central London, demonstrating that exploiting network-wide information related to geographical location of the stations is beneficial to improve the performance of node-based models for applications in bike-sharing systems. The proposed GB-MEP framework is more generally applicable to any network point process where a distance function between nodes is available, demonstrating wider applicability. On the other hand, MEG is a scalable edge-based network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection when assessing the significance of future events, including previously unobserved connections between nodes. The model combines mutually exciting point processes to estimate dependencies between events and latent space models to infer relationships between the nodes. The intensity functions for each network edge are characterised exclusively by node-specific parameters, which allows information to be shared across the network. This construction enables estimation of intensities even for unobserved edges, which is particularly important in real world applications, such as computer networks arising in cyber-security. 
A recursive form of the log-likelihood function for MEG is obtained, which is used to derive fast inferential procedures via modern gradient ascent algorithms. An alternative EM algorithm is also derived. The model and algorithms are tested on simulated graphs and real world datasets, demonstrating excellent performance.
Links: [#slides] [#recording]
Wed., October 18, 2023 (3pm Europe time)
Speaker: Christophe Biscio (Aalborg University, Denmark)
Title: Non-parametric intensity estimation of spatial point processes by random forests
Abstract: We will present a non-parametric intensity estimator for spatial point patterns inspired by random forests. Our estimator can handle both a large number of covariates as well as complicated shapes for observation windows. After presenting its definition and the methodology, which includes a way of measuring the importance of each covariate, we will illustrate its performance on simulated data sets. Then, we establish its asymptotic behaviour in three asymptotic regimes: increasing domain, infill, and intermediate regime. This is joint work with Frédéric Lavancier.
Links: [recording] [#slides]
Wed., September 20, 2023 (3pm Europe time)
Speaker: Ying Sun (KAUST, Saudi Arabia)
Title: Spatio-temporal DeepKriging for Interpolation and Probabilistic Forecasting
Abstract: Gaussian processes (GP) and Kriging are widely used in traditional spatio-temporal modeling and prediction. These techniques typically presuppose that the data are observed from a stationary GP with a parametric covariance structure. However, processes in real-world applications often exhibit non-Gaussianity and nonstationarity. Moreover, likelihood-based inference for GPs is computationally expensive and thus prohibitive for large datasets. In this paper, we propose a deep neural network (DNN) based two-stage model for spatio-temporal interpolation and forecasting. Interpolation is performed in the first step, which utilizes a dependent DNN with the embedding layer constructed with spatio-temporal basis functions. For the second stage, we use Long Short-Term Memory (LSTM) and convolutional LSTM to forecast future observations at a given location. We adopt the quantile-based loss function in the DNN to provide probabilistic forecasting. Compared to Kriging, the proposed method does not require specifying covariance functions or making stationarity assumptions and is computationally efficient. Therefore, it is suitable for large-scale prediction of complex spatio-temporal processes. We apply our method to monthly PM2.5 data at more than 200,000 space-time locations from January 1999 to December 2022 for fast imputation of missing values and forecasts with uncertainties.
Links: [#recording] [#slides]
Wed., May 3, 2023 (3pm Europe time)
Speaker: Renaud Alie (McGill University, Canada)
Title: Thinning, Data Augmentation and Bayesian Inference
Abstract: Some well-known examples of point process distributions are defined via the thinning of a base point process. The thinning procedure can provide the model with interesting properties such as inhomogeneity or repulsiveness. However, such processes will generally have intractable densities, which can make likelihood-based or Bayesian inference a challenge. In this talk, we only discuss the latter, but some of the results we present apply more broadly. An interesting idea that arose in the recent literature is to rely on data augmentation to circumvent any intractability caused by the selection procedure. Basically, if thinned locations can be instantiated, then the complete data likelihood of discarded and observed locations will be tractable provided the base point process and the thinning procedure are not too “outlandish”. We begin this talk by reviewing some traditional examples of thinning procedures. Next, we recap some notions on point process densities and present an important result pertaining to joint densities under any form of thinning. Our modeling examples will be focused on the so-called sigmoidal Gaussian Cox process. We showcase how this model can be used to conduct non-parametric inference on the intensity function of a non-homogeneous Poisson process. Also, we introduce a multitype extension that allows us to measure cross-species dependence in the position of trees. If time allows, we will discuss further applications that may be expanded upon in the future.
(joint work with Alexandra M. Schmidt and David A. Stephens)
Wed., April 5, 2023 (3pm Europe time)
Speaker: Joseph Yukich (Professor of Mathematics, Lehigh University, USA)
Title: Asymptotic Analysis of Statistics of Point Processes with Interacting Marks
Abstract: Consider a spatial point process in Euclidean space, where points are equipped with interacting marks, possibly time evolving. This gives a system with two sources of randomness, one from the random set of locations of sites, the other from the dynamically interacting states of the sites. When the site locations exhibit asymptotic de-correlation and when the marks satisfy a bounded Lipschitz stabilization criterion, a geometric localization criterion weaker than classical stabilization, then it is shown that statistics of the marked point process restricted to windows are asymptotically normal when the window size tends to infinity. This yields central limit theorems, as well as weak laws of large numbers, for statistics of point processes with geostatistical marking, for statistics of empirical random fields, as well as for statistics of interacting particle systems and interacting diffusions on spatial point processes. The talk is based on joint work with B. Błaszczyszyn (INRIA) and D. Yogeshwaran (ISI Bangalore).
Links: [#recording]
Wed., March 8, 2023 (3pm Europe time)
Speaker: Jonatan Gonzalez (King Abdullah University of Science and Technology (KAUST), Saudi Arabia)
Title: Recent developments in spatio-temporal point process methodologies
Abstract: Spatio-temporal point process data has become very popular in some scientific fields trying to understand the intrinsic mechanisms that govern the time evolution of points (events) in a planar observation window. In recent years, methodological developments have accelerated thanks to the proliferation of spatio-temporally indexed datasets.
We present some statistical descriptive methods and models to analyse point process observations when questions of scientific interest concern their spatial and temporal behaviour. We review recent advances in first and second-order characteristics and some models used in practice. We present some exciting real-data examples to illustrate the most relevant techniques.
Links: [#recording] [#slides] [#paper]
Wed., February 8, 2023 (3pm Europe time)
Speaker: Bartłomiej (Bartek) Błaszczyszyn (Inria and ENS Paris, France)
Title: Particle gradient descent model for point process generation
Abstract: This paper introduces a generative model for planar point processes in a square window, built upon a single realization of a stationary, ergodic point process observed in this window. Inspired by recent advances in gradient descent methods for maximum entropy models, we propose a method to generate similar point patterns by jointly moving particles of an initial Poisson configuration towards a target counting measure. The target measure is generated via a deterministic gradient descent algorithm, so as to match a set of statistics of the given, observed realization. Our statistics are estimators of the multi-scale wavelet phase harmonic covariance, recently proposed in image modeling. They allow one to capture geometric structures through multi-scale interactions between wavelet coefficients. Both our statistics and the gradient descent algorithm scale better with the number of observed points than the classical k-nearest neighbour distances previously used in generative models for point processes, based on rejection sampling or simulated annealing. The overall quality of our model is evaluated on point processes with various geometric structures through spectral and topological data analysis, compared in particular to [Tscheschel, Stoyan (2006) Statistical reconstruction of random point patterns]. Joint work with Antoine Brochard, Stéphane Mallat and Sixin Zhang, Statistics and Computing, 2022, 32 (3).
Links: [#recording] [#paper]
Wed., January 11, 2023 (3pm Europe time)
Speaker: Daniela Flimmel (Charles University, CZ)
Title: On the hyperuniformity of short range Gibbs point processes
Abstract: A stationary point process is called hyperuniform if the number of points that fall into a given bounded set fluctuates in lower order than the volume of the set. Usually, proving hyperuniformity rigorously is a rather difficult task and therefore, a lot of attention is given to estimating the so-called structure factor instead. We demonstrate a set of simple assumptions implying that Gibbsian models having short range of interaction are not hyperuniform. This class includes, for instance, pair potentials with finite range but also many non-pairwise interactions depending on the geometrical structure, such as Voronoi interactions.
Links: [#recording] [#slides] [#preprint]
Wed., December 7, 2022 (3pm Europe time)
Speaker: Ed Cohen (Imperial College, United Kingdom)
Title: Testing for complete spatial randomness on 3D convex bounded shapes
Abstract: The development of statistical methods designed for analysing spatial point patterns has typically focused on Euclidean data and planar surfaces. However, with recent advances in 3D biological imaging technologies targeting protein molecules on cellular membranes, spatial point patterns are now being observed on complex shapes and manifolds whose geometry must be respected. Consequently, there is now a demand for tools that can analyse these data for important scientific studies in cellular and micro-biology. For this purpose, we extend the classical functional summary statistics for spatial point patterns to general convex bounded shapes. Using the Mapping Theorem, a Poisson process can be transformed from any convex shape to a Poisson process on the unit sphere where existing theory and methods can be leveraged. We present the first and second order properties of such summary statistics and demonstrate how they can be used to construct test statistics to determine whether an observed pattern exhibits complete spatial randomness on the original convex space.
Links: [#recording] [#paper]
Wed., November 9, 2022 (3pm Europe time)
Speaker: Giada Adelfio (University of Palermo, Italy)
Title: Weighted second-order statistics for Diagnostics and Inference of complex spatio-temporal point processes
Abstract: In Adelfio et al. (2020), we defined an approach to assess the goodness-of-fit of spatio-temporal models, weighting by the inverse of the conditional intensity function. The method accounts for the local weighted second-order statistics, providing a quite general approach for individual diagnostics. Starting from these results, we are now developing an estimation method based on the local second-order characteristics of the weighted process, which also provides local estimates. The method does not rely on any particular model assumption on the data, and thus it can be applied whatever the generating model of the (even complex) process.
Links: [#recording] [#paper 1] [#paper 2]
Wed., October 12, 2022 (3pm Europe time)
Speaker: Yongtao Guan (University of Miami, USA)
Title: Point Process Methods for Modeling Human Activity Data
Abstract: In this talk, we consider using point processes to model human activity data. Firstly, we propose a general framework of multi-level log-Gaussian Cox processes for repeatedly observed point processes and use that to model individual investors' stock trading activities. A novel nonparametric approach is developed to efficiently and consistently estimate the covariance functions of the latent Gaussian processes at all levels. We further extend our procedure to the bivariate point process case in which potential correlations between the processes can be assessed. Secondly, we propose a group network Hawkes process model, where the network structure is observed and fixed, and use that to model the posting patterns of Sina Weibo users. We introduce a latent group structure among individuals to account for the heterogeneous user-specific characteristics. A maximum likelihood approach is proposed to simultaneously cluster individuals in the network and estimate model parameters. A fast EM algorithm is subsequently developed by utilizing the branching representation of the proposed model.
Links: [recording]
Wed., September 14, 2022 (3pm Europe time)
Speaker: Peter Craigmile (Ohio State University, USA)
Title: Optimal Design Emulators: A Point Process Approach
Abstract: (This is joint research with Matt Pratola, The Ohio State University and Chunfang Devon Lin, Queen's University.) Design of experiments is a fundamental topic in applied statistics with a long history. Yet its application is often limited by the complexity and costliness of constructing experimental designs, which involve searching a high-dimensional input space and evaluating computationally expensive criterion functions. In this work, we introduce a novel approach to the challenging design problem. We will take a probabilistic view of the problem by representing the optimal design as being one element (or a subset of elements) of a probability space. Given a suitable distribution on this space, a generative point process can be specified from which stochastic design realizations can be drawn. In particular, we describe a scenario where the classical entropy-optimal design for Gaussian Process regression coincides with the mode of a particular point process. We conclude with outlining an algorithm for drawing such design realizations, its extension to sequential designs, and applying the techniques developed to constructing designs for Stochastic Gradient Descent and Gaussian process regression.
Wed., April 27, 2022 (3pm Europe time)
Speaker: Thomas Opitz (Biostatistics and Spatial Processes, INRAE, Avignon, France)
Title: Modeling at the interface of point processes and extreme values
Abstract: Statistical methodology for the two application domains of point processes and of extreme values has been developed by different scientific communities, but there are important theoretical connections between these two fields. The purpose of this talk is to highlight some of these connections and to show recent work at the interface of stochastic geometry and extreme-value theory. A major difference between these two fields is that extreme-value theory is mostly concerned with continuous random variables while point processes are used in the context of discrete events. However, both approaches deal with rare events.
Classical extreme-value limit theory for independent samples of increasing size n is concerned with the behavior of sample maxima tending towards a max-stable limit. Equivalently, this convergence can be expressed through point-process representations, which is quite natural due to the law of rare events: a Poisson distribution arises asymptotically for extreme event counts as n increases if the occurrence probability p of more and more extreme events decreases at a certain rate. The theory also extends to dependent extremes (e.g., weather extremes with spatial correlation), where max-stable limit processes arise and can be used to model observations of componentwise maxima extracted from temporal blocks (for instance, annual location-wise maxima of daily temperatures).
In this talk, I will review some general theory and present two specific results. First, I will explain how the mark distribution in marked point processes can be chosen when marks are extreme, i.e., when they exceed a high threshold. This problem is illustrated with spatiotemporal wildfire modeling, where points represent ignitions and marks represent burnt areas. A few very extreme wildfires contribute the same burnt area as all the remaining ones and must be modeled with great care. Second, I will show how statistical models pertaining to the classes of max-stable limit processes and the more general max-infinitely divisible processes can be constructed through point process representations. Use of such models will be illustrated for modeling nonstationarity in temperature extremes over Europe.
Wed., April 6, 2022 (3pm Europe time)
Speaker: Samuel Soubeyrand (INRAE Avignon, France)
Title: Estimation of spatio-temporal networks from trajectory data, with an application to tropospheric networks
Abstract: Tropospheric movements can be represented by three-dimensional trajectories of air masses connecting distant areas of the Earth. Such connection between distant areas is essential to infer and predict the dispersal of entities that air masses may carry, e.g., volcanic ash, dust, radioactive chemical elements and micro- or small-biological organisms. We proposed a mathematical formalism to construct spatial and spatiotemporal networks where the nodes represent the subsets of a partition of a geographical area and the links between them are inferred from sampled trajectories of air masses passing over and across them. In this talk I will introduce this mathematical formalism, propose diverse bio-physical hypotheses for modeling the links based on trajectories and present different options and sampling schemes for estimating the intensity of links. I will also discuss properties that can be derived from the inferred networks, and how these networks can be used as input data in models of propagation of airborne pathogens for example. Beyond the construction of tropospheric networks, our approach could be applied to other types of trajectories, such as animal trajectories, to characterize connectivity between different components of the landscape hosting the animals.
Links: [#slides] [#recording]
Wed., March 23, 2022 (3pm Europe time)
Speaker: Eliza O'Reilly (Caltech, USA)
Title: Random Tessellation Forests
Abstract: Random forests are a popular class of algorithms used for regression and classification. The original algorithm introduced by Breiman in 2001 and many of its variants are ensembles of randomized decision trees built from axis-aligned partitions of the feature space. One such variant, called Mondrian random forests, was proposed to handle the online setting and is the first class of random forests for which minimax rates were obtained in arbitrary dimension. However, the restriction to axis-aligned splits fails to capture dependencies between features, and oblique splits have shown improved empirical performance for many tasks. By viewing the Mondrian as a special case of the stable under iteration (STIT) process in stochastic geometry, we resolve some open questions about the generalization of split directions. In particular, we utilize the theory of stationary random tessellations to show that STIT random forests achieve minimax rates for Lipschitz and C^2 functions and adapt to sparsity in high dimensional feature space. This work opens many new questions at the intersection of stochastic geometry and machine learning. Based on joint work with Ngoc Tran.
Links: [#slides] [#recording] [#paper1] [#paper2]
Wed., March 2, 2022 (3pm Europe time)
Speaker: Yao Xie (Georgia Institute of Technology, USA)
Title: Non-stationary spatio-temporal point process modeling for high-resolution COVID-19 data
Abstract: Most COVID-19 studies report figures of the overall infection at a state- or county-level, reporting the aggregated number of cases in a particular region at one time. This aggregation tends to miss out on fine details of the propagation patterns of the virus. This paper is motivated by analyzing a high-resolution COVID-19 dataset in Cali, Colombia, that provides every confirmed case's exact location and time information, offering vital insights for the spatio-temporal interaction between individuals concerning the disease spread in a metropolis. We develop a non-stationary spatio-temporal point process, assuming that previously infected cases trigger newly confirmed ones, and introduce a neural network-based kernel to capture the spatially varying triggering effect. The neural network-based kernel is carefully crafted to enhance expressiveness while maintaining the interpretability of the results. We also incorporate some exogenous influences imposed by city landmarks. The numerical results on real data demonstrate the good predictive performance of our method compared to the state-of-the-art as well as its interpretable findings. This is joint work with Zheng Dong, Shixiang Zhu, Jorge Mateu, and Francisco J. Rodriguez-Cortes.
Links: [#recording] [#slides]
Wed., February 9, 2022 (3pm Europe time)
Speaker: Tomáš Mrkvička (Univ. of South Bohemia, Czech Republic)
Title: Bayesian MCMC inference for complex cluster models
Abstract: The stationary Neyman-Scott point process can be extended for inhomogeneity in many ways. The center points, cluster sizes or cluster spread can be inhomogeneous. A combination of these types can also be of interest. Further, the distribution of cluster sizes can be non-Poisson, although the Poisson distribution is usually assumed. We consider all these models and propose Bayesian MCMC algorithms to estimate their parameters. The Bayesian MCMC approach is tractable for all these models, and in cases where the faster methods can be applied it gives more precise results. We developed an R package binspp available here, which contains these algorithms.
Links: [#slides] [#recording]
Wed., January 19, 2022 (3pm Europe time)
Speaker: Frédéric Lavancier (Univ. Nantes, France)
Title: Spatial birth-death-move processes: basic properties and estimation of their intensity functions.
Abstract: Various spatio-temporal data record the time of birth and death of individuals, along with their spatial trajectories during their lifetime, whether through continuous-time observations or discrete-time observations. The data at hand can be viewed as a random set of points, the cardinality and the position of which evolve stochastically through time. Natural applications include epidemiology, individual-based modelling in ecology, spatio-temporal dynamics observed in bio-imaging, and computer vision. To model this kind of data, we introduce spatial birth-death-move processes, where the birth and death dynamics depend on the current spatial state of all alive individuals and where individuals can move during their lifetime according to a continuous Markov process. We present some of the basic probabilistic properties of these processes and we consider the non-parametric estimation of their birth and death intensity functions. We prove the consistency of kernel estimators in the presence of continuous-time or discrete-time observations, under fairly simple conditions. We moreover discuss how we can take advantage in practice of structural assumptions made on the intensity functions and we explain how data-driven bandwidth selection can be conducted, despite the unknown (and sometimes undefined) second order moments of the estimators. We finally apply our statistical method to the analysis of the spatio-temporal dynamics of proteins involved in exocytosis in cells.
This is joint work with Ronan Le Guével (Rennes 2).
Links: [#paper] [#preprint] [slides] [#recording]
Wed., December 1, 2021 (3pm Europe time)
Speaker: Marc Genton (King Abdullah University of Science and Technology (KAUST), Saudi Arabia) [stsds]
Title: Large-Scale Spatial Data Science with ExaGeoStat
Abstract: Spatial data science aims at analyzing the spatial distributions, patterns, and relationships of data over a predefined geographical region. For decades, the size of most spatial datasets was modest enough to be handled by exact inference. Nowadays, with the explosive increase of data volumes, High-Performance Computing (HPC) can serve as a tool to handle massive datasets for many spatial applications. Big data processing becomes feasible with the availability of parallel processing hardware systems such as shared and distributed memory, multiprocessors and GPU accelerators. In spatial statistics, parallel and distributed computing can alleviate the computational and memory restrictions in large-scale Gaussian process inference and prediction. In this talk, we will describe cutting-edge HPC techniques and their applications in solving large-scale spatial problems with the new software ExaGeoStat and its R version ExaGeoStatR.
Links: [#recording] [#slides]
Wed., November 10, 2021 (3pm Europe time)
Speaker: Ute Hahn (Aarhus University, Denmark)
Title: Mark and pair correlation function can characterize parents of cluster point processes in space-time
Abstract: Cluster point processes in space-time can be seen as spatial cluster point processes with marks in time. For such processes, we derive a relation between Stoyan's mark correlation function and the pair correlation function that can be used to retrieve information on the spatial parent point process and on the daughter clusters. Research was motivated by data from Photoactivated Localization Microscopy (PALM), a technique that promises to reconstruct locations of single molecules. PALM data are spatial point patterns marked with a time mark. These space-time points correspond to localizations of photons emitted at a given time from proteins that were labelled with fluorophores. Fluorophores emit multiple photons, yielding a cluster process in space-time, where protein locations constitute the spatial parent point process. Protein locations being the data of interest, the clusters themselves are considered as artifacts. The mark-pair-correlation device can be used to fit a physically motivated space-time cluster model to such data and thus to correct for artifacts. Joint work with Louis Gammelgaard Jensen.
Links: [#paper] [#recording] [#slides]
Wed., October 27, 2021 (3pm Europe time)
Speaker: Abdollah Jalilian (Isfahan University, Iran)
Title: Assessing similarities between spatial point patterns with a Siamese neural network discriminant model
Abstract: We discuss identifying and quantifying structural similarities among observed point patterns from several known types. Consider, for example, spatial point patterns of more than 100 species in a tropical rainforest study plot that are observed at different time instances. It is important to make inference about the underlying similarities among these species and distinguish between random and structural differences among observed point patterns from different species. To this end, we use deep convolutional neural networks and employ a Siamese framework to build a discriminant model for distinguishing structural differences between observed spatial point patterns of different types. Through a simulation study and data analysis, we illustrate the adequacy and generality of a Siamese network discriminant model in one-shot learning classification of spatial point patterns, compared with common dissimilarities based on intensity and K functions. This is based on joint work with Jorge Mateu.
Links: [#recording] [#slides]
Wed., October 6, 2021 (3pm Europe time)
Speaker: Claudia Redenbach (TU Kaiserslautern, Germany)
Title: Anisotropy analysis of spatial point patterns
Abstract: A spatial point pattern is called anisotropic if its spatial structure depends on direction. Several methods for anisotropy analysis have been introduced in the literature. This talk gives an overview of nonparametric methods for anisotropy analysis of (stationary) point patterns in 2D and 3D. We discuss methods based on nearest neighbour and second order summary statistics as well as spectral and wavelet analysis. Typical statistical tasks include testing for isotropy, estimating preferred directions, and estimating the transformation matrix under the assumption of geometric anisotropy. The talk is based on joint work with Tuomas Rajala, Martina Sormani, and Aila Särkkä.
Links: [#paper1, #paper2, #paper3] [#recording] [#slides]
Wed., September 15, 2021 (3pm Europe time)
Speaker: Jesper Møller (Aalborg Univ., Denmark)
Title: Two short talks: 1. Should we condition on n? 2. Repulsive point process priors for mixture models: where MCMC for doubly-intractable distributions become tractable!
Abstract 1: We discuss the practice of directly or indirectly assuming a model for the number of points when modelling spatial point patterns even though it is rarely possible to validate such a model in practice since most point pattern data consist of only one pattern. We therefore explore the possibility of conditioning on the number of points instead when fitting and validating spatial point process models. In a simulation study with different popular spatial point process models, we consider model validation using global envelope tests based on functional summary statistics. We find that conditioning on the number of points will, for some functional summary statistics, lead to narrower envelopes and thus stronger tests, and that it can also be useful for correcting for some conservativeness in the tests when testing composite hypotheses. However, for other functional summary statistics, it makes little or no difference to condition on the number of points. When estimating parameters in popular spatial point process models, we conclude that for mathematical and computational reasons it is convenient to assume a distribution for the number of points.
Abstract 2: Repulsive mixture models have recently gained popularity for Bayesian cluster detection. Compared to more traditional mixture models, repulsive mixture models produce a smaller number of well separated clusters. The most commonly used methods for posterior inference either require fixing the number of components a priori or are based on reversible jump MCMC computation. We present a general framework for mixture models, when the prior of the 'cluster centres' is a finite repulsive point process depending on a hyperparameter, specified by a density which may depend on an intractable normalizing constant. By investigating the posterior characterization of this class of mixture models, we derive an MCMC algorithm which avoids the well-known difficulties associated with reversible jump MCMC computation. In particular, we use an ancillary variable method, which eliminates the problem of having intractable normalizing constants in the Hastings ratio. The ancillary variable method relies on a perfect simulation algorithm, and we demonstrate this is fast because the number of components is typically small. In several simulation studies and an application on sociological data, we illustrate the advantage of our new methodology over existing methods, and we compare the use of a determinantal or a repulsive Gibbs point process prior model.
Wed., May 5, 2021 (3pm Europe time)
Speaker: Anne Marie Svane (Aalborg University, Denmark)
Title: Central limit theorems for Ripley’s K-function and persistence diagrams
Abstract: Ripley’s K-function is a standard tool for summarizing the second order structure of a stationary point process. More advanced geometric structures of the point process are captured by the persistence diagram from topological data analysis. In this talk, we present functional central limit theorems for both the K-function and persistence diagram that apply for a class of point processes that are m-dependent when conditioning on a suitable sigma-algebra, as well as certain Gibbs point processes. This is joint work with Christophe Biscio, Nicholas Chenavier, and Christian Hirsch.
Wed., April 21, 2021 (3pm Europe time)
Speaker: Édith Gabriel (INRAE, Avignon, France)
Title: Predicting the intensity function of censored point processes
Abstract: Seismic networks provide data that are used as a basis both for public safety decisions and for scientific research. Their configuration affects the data completeness, which in turn critically affects several seismological scientific targets (e.g., earthquake prediction, seismic hazard...). How can earthquake density be mapped in seismogenic areas from censored data, or even in areas that are not covered by the network? We propose to predict the spatial distribution of earthquakes from the knowledge of presence locations and geological relationships, taking into account any interactions between records. Namely, in a more general setting, we aim to estimate the intensity function of a point process, conditional on its censored realization, as in geostatistics for continuous processes. We define a predictor as the best linear unbiased combination of the observed point pattern. We show that the weight function associated with the predictor is the solution of a Fredholm equation of the second kind. Both the kernel and the source term of the Fredholm equation are related to the second order characteristics of the point process through the pair correlation function. Results are presented and illustrated on simulated nonstationary processes, using continuous covariates or the realization of additional point processes, and real data for mapping Greek Hellenic seismicity in a region with unreliable and incomplete records.
Links: [relevant paper: ] [recording]
Wed., April 7, 2021 (3pm Europe time)
Speaker: Christian Hirsch (University of Groningen, Netherlands)
Title: Maximum likelihood estimation in stochastic channel models
Abstract: In this talk, I will present Monte Carlo maximum likelihood estimation as a novel approach in the context of calibration and selection of stochastic channel models in signal processing of wireless networks. First, considering a Turin channel model with an inhomogeneous arrival rate as a prototypical example, I will explain how the general statistical methodology is adapted and refined for the specific requirements and challenges of stochastic multipath channel models. Then, I will illustrate the advantages and pitfalls of the method on the basis of simulated data and apply the calibration method to wideband signal data from indoor channels. Finally, I will outline possible extensions to Bayesian inference and clustered arrival models. Based on joint work with Ayush Bharti, Troels Pedersen, Rasmus Waagepetersen.
Links: [relevant paper: #paper] [recording] [slides]
Wed., March 24, 2021 (3pm Europe time)
Speaker: Clément Dombry (Univ. Franche-Comté, France)
Title: Probabilities of concurrent extremes
Abstract: The statistical modeling of spatial extremes has been an active area of recent research with a growing domain of applications. Much of the existing methodology, however, focuses on the magnitudes of extreme events rather than on their timing. To address this gap, this article investigates the notion of extremal concurrence. Suppose that daily temperatures are measured at several synoptic stations. We say that extremes are concurrent if record maximum temperatures occur simultaneously, that is, on the same day for all stations. It is important to be able to understand, quantify, and model extremal concurrence. Under general conditions, we show that the finite sample concurrence probability converges to an asymptotic quantity, deemed extremal concurrence probability. Using Palm calculus, we establish general expressions for the extremal concurrence probability through the max-stable process emerging in the limit of the component-wise maxima of the sample. Explicit forms of the extremal concurrence probabilities are obtained for various max-stable models and several estimators are introduced. In particular, we prove that the pairwise extremal concurrence probability for max-stable vectors is precisely equal to Kendall’s τ. The estimators are evaluated from simulations and applied to study temperature extremes in the United States. Results demonstrate that concurrence probability can be used to study, for example, the effect of global climate phenomena such as the El Niño Southern Oscillation (ENSO) or global warming on the spatial structure and areal impact of extremes. Joint work with M. Ribatet and S. Stoev.
Links: [relevant papers: #paper1, #paper2] [slides] [recording]
Wed., March 10, 2021 (3pm Europe time)
Speaker: Jiří Dvořák (Department of Probability and Mathematical Statistics, Charles University, Prague, CZ)
Title: Nonparametric testing of the dependence structure among points-marks-covariates in spatial point patterns
Abstract: We investigate testing of the hypothesis of independence between a covariate and the marks in a marked point process. It would be rather straightforward if the (unmarked) point process were independent of the covariate and the marks. In practice, however, such an assumption is questionable, and possible dependence between the point process and the covariate or the marks may lead to incorrect conclusions. Hence we propose to investigate the complete dependence structure in the triangle points-marks-covariates together. We take advantage of the recent development of the nonparametric random shift methods, namely the new variance correction approach, and propose tests of the null hypothesis of independence between the marks and the covariate and between the points and the covariate. Joint work with T. Mrkvička, J. Mateu and J. González.
Links: [recording, slides, relevant papers: #paper1, #paper2]
Wed., February 24, 2021 (3pm Europe time)
Speaker: Peter Diggle (Lancaster Medical School, Lancaster University)
Title: Geostatistics, point process and Covid-19 monitoring
Abstract: In epidemiology, it is often the case that data presented as a set of measurements at each of a set of fixed locations have been derived from an underlying point process that is never explicitly defined. In this talk, I will point out this connection, discuss some practical implications and describe how we are using point process models to monitor aspects of the Covid-19 epidemic in the UK.
Links: [slides]
Wed., February 10, 2021 (3pm Europe time)
Speaker: Mari Myllymäki (Luke, Natural Resources Institute Finland)
Title: New applications and software of global envelopes
Abstract: Global envelopes are nowadays quite often used in testing null models for spatial processes by means of different summary functions, because they provide a formal test and provide suggestions for alternative models through graphical interpretation of the test results. In this talk, I discuss the global envelopes for functional test statistics, which are discretized to m highly correlated hypotheses, and the control of multiple testing. While the global envelopes were first developed to control the family-wise error rate, also control of false discovery rate can be introduced. I also illustrate the methodology on various applications and discuss the R package GET that implements global envelopes.
Links: [relevant papers: #paper1, #paper2, #paper3, #paper4, #paper5] [recording] [slides]
Wed., January 27, 2021 (3pm Europe time)
Speaker: Alan Gelfand (Duke University, USA)
Title: Bayesian Analysis of Spatial Point Patterns
Abstract: Spatial point patterns arise in many contexts. Examples include: ecological processes, e.g., the pattern of trees in a forest; spatial epidemiology, pattern in disease cases, perhaps different patterns for cases vs. controls; syndromic surveillance to identify disease outbreaks, e.g., clustering of cases; the evolution/growth of a city, i.e., urban development. There is a rich probabilistic literature on stochastic process modeling for point patterns. With regard to spatial point patterns, more recently, there has been consequential attention paid to the applied side - fitting of models and data analysis. From the Bayesian perspective, this area of spatial analysis has lagged in development behind geostatistical/point referenced effort using Gaussian processes and the lattice/areal unit effort using Markov random fields. However, recently, we have seen substantial progress in Bayesian fitting of spatial point pattern models. Given such models can be fitted, the contribution here concerns development of a strategy for Bayesian inference with regard to model adequacy, model comparison, and full inference under a selected model. Our approach rests on posterior simulation, as in most of contemporary Bayesian analysis. Here, we need posterior simulation of point patterns from which we can extract posterior inference for features. For us, the bottom line is: if you can simulate point patterns, you can carry out the foregoing inference. Much of our work draws upon variants of the Georgii-Nguyen-Zessin (GNZ) formula which, for us, immediately supplies posterior Monte Carlo integrations for many features of interest. We illustrate the above using examples involving nonhomogeneous Poisson processes, log Gaussian Cox processes, and Strauss processes.
Links: [relevant paper: #monograph] [slides] [recording]
Wed., January 13, 2021 (3pm Europe time)
Speaker: JF Coeurjolly (Univ. Grenoble Alpes, France)
Title: Regularization methods and point processes.
Abstract: First-order analysis of a point pattern usually consists in estimating the intensity function or the (Papangelou) conditional intensity function. Standard parametric estimation methods rely upon composite likelihood, quasi-likelihood, pseudo-likelihood, or logistic regression likelihood. These problems have been widely treated in the literature and can be considered as solved in many situations for (moderately) inhomogeneous point patterns, both from practical and theoretical perspectives. In recent years, high-dimensional point patterns have emerged in various applications. By high dimension we mean in this talk that we observe a single point process with a large number of spatial covariates. Given the standard literature on high-dimensional statistics, it appears natural to penalise likelihood-type estimating equations to simultaneously estimate and select parameters. Such ideas have already been considered in the literature. I will discuss how these ideas apply to the problem of estimating intensity and conditional intensity functions, how the methods can be efficiently implemented and will present some theoretical results.
Joint work with I. Ba, A. Choiruddin, F. Letué, F. Cuevas Pacheco, M.-H. Descary and R. Waagepetersen.
Wed., December 16, 2020 (3pm Europe time)
Speaker: Janine Illian (University of Glasgow, UK) [recording]
Title: The corona crisis in Glasgow — a statistician’s perspective
Abstract: Epidemiological modelling and statistical data analysis have played a central role in the handling of the Covid-19 crisis world-wide. In the talk I will discuss how local statistical expertise within Glasgow University has been used to help us learn about the virus. This includes highlighting issues of data quality, involvement in modelling approaches, international networking activities as well as providing modelling expertise both locally and internationally.
Links: [relevant paper: ]
Wed., December 2, 2020 (3pm Europe time)
Speaker: Ottmar Cronie (University of Gothenburg, Sweden)
Title: Cross-validation for point processes
Abstract: Cross-validation (CV) is ubiquitous in statistics and data science. The essential idea behind CV is to split the dataset into two parts some k times, using some suitable splitting/partitioning mechanism. One of the split parts is referred to as ‘training data’, and is used to carry out the estimation, whereas the other part is referred to as ‘validation data’, and is used as hold-out data. The fit of the training data-generated model is evaluated (in some suitable way) using the validation data. Typical examples of CV methods in classical statistics include leave-one-out CV and k-fold CV.
In this talk we propose different approaches to CV in the context of point patterns, which respect the dependence structure of the underlying point process. Having introduced our CV approaches, we proceed by looking closer at how they may be exploited in different statistical settings. More specifically, we introduce general frameworks for i) estimation through prediction error minimisation, and ii) penalised fitting. Each of these approaches is based on a sequence of k training and validation data pairs, which have been generated using our CV approaches. We then illustrate how these two approaches may be applied in various statistical settings.
This is work in progress. This is joint work with C. Biscio, A. Choiruddin and M. Moradi.
Wed., November 18, 2020 (3pm Europe time)
Speaker: Rick Paik Schoenberg (University of California at Los Angeles, USA) [recording]
Title: Nonparametric estimation of space-time Hawkes and recursive processes, with applications to earthquakes and epidemics
Abstract: Following a review of space-time Hawkes processes, this talk will explore the extension to cases where the productivity is variable. We focus especially on the recursive model, where the productivity is inversely related to the conditional intensity, and the completely unparameterized case where each point may have its own productivity. Properties of estimators for these models are explored, and the methods are applied to seismological and epidemic datasets where variable productivity is observed.
Links: [relevant papers: #paper1, #paper2, #paper3] [slides]
Wed., November 4, 2020 (3pm Europe time)
Speaker: Aila Särkkä (Chalmers University of Technology and University of Gothenburg, Sweden)
Title: Spatial point process models for sweat glands observed with noise
Abstract: Peripheral neuropathy is damage to nerves or disease affecting nerves and is expected to affect some fibers, which innervate sweat glands. Therefore, to evaluate the severity of nervous system impairment, one can measure the sweat gland function. Dr. Kennedy’s group at the University of Minnesota has developed a sensitive sweat test that measures the secretion of sweat from individual sweat glands and shows the locations and distribution of active sweat glands. The test is performed by recording a one-minute video at the rate of one frame/sec. We have access to 15 such videos, five from healthy control subjects, five from subjects with suspected peripheral neuropathy, and five from subjects with diagnosed peripheral neuropathy. The sweat patterns are regarded as realizations of spatial point processes. We present two point process models for the activation of sweat glands by using the videos mentioned above. Several image analysis steps were needed to extract the point patterns from the videos and as a result, some incorrectly identified sweat gland locations may be present in the data. We discuss how to take such errors into account either by including an error term in the point process model or using an estimation procedure that is robust with respect to the errors. (Joint work with Mikko Kuronen and Mari Myllymäki, Natural Resources Institute Finland, and Adam Loavenbruck, University of Minnesota.)
Wed., October 21, 2020 (3pm Europe time)
Speaker: Ganggang Xu (Miami Herbert Business School, USA)
Title: Semi-parametric Learning of Structured Temporal Point Processes
Abstract: We propose a general framework of using multi-level log-Gaussian Cox processes to model repeatedly observed point processes with complex structures; such data have become increasingly available in various areas including medical research, social sciences, economics and finance due to technological advances. A novel nonparametric approach is developed to estimate the covariance functions of the latent Gaussian processes efficiently and consistently at all levels. To predict the functional principal component scores, we propose a consistent estimation procedure by maximizing the conditional likelihood of super-positions of point processes. We further extend our procedure to the bivariate point process case in which potential correlations between the processes can be assessed. Asymptotic properties of the proposed estimators are investigated, and the effectiveness of our procedures is illustrated through a simulation study and an application to a stock trading dataset.
Wed., October 7, 2020 (3pm Europe time)
Speaker: Marie-Colette Van Lieshout (Centre for Mathematics and Computer Science (CWI) and University of Twente)
Title: Bandwidth selection for kernel estimators of spatial intensity functions
Abstract: The analysis of a spatial point pattern usually involves estimating the intensity function, that is, the likelihood of finding a point as a function of location. Sometimes the scientific context suggests a parametric form for the intensity function, perhaps in terms of covariate information. More often, non-parametric estimation is called for. In this case, kernel estimators can be used. They involve one crucial parameter: the bandwidth.
In this talk, I will describe a new bandwidth selection method and compare its efficacy to that of existing methods by means of simulation. The new method is based on an optimality criterion motivated by the Campbell formula applied to the reciprocal intensity function. It is fully non-parametric, does not require knowledge of higher order moments, and is not restricted to a specific class of point process. Also, it is computationally straightforward and does not require numerical approximation of integrals.
Next, I will discuss asymptotic expansions of the mean squared error when independent copies of the point process are superposed. I will show that the optimal bandwidth is of the order $n^{-1/(d+4)}$ under appropriate smoothness conditions on the kernel and true intensity function. Moreover, the Abramson principle can be applied to define adaptive kernel estimators. The optimal adaptive bandwidth turns out to be of the order $n^{-1/(d+8)}$ under appropriate smoothness conditions.
This talk is partially based on joint work with Ottmar Cronie.
Wed., September 23, 2020 (3pm Europe time)
Speaker: Denis Allard (INRA Avignon)
Title: Simulating space-time random fields with nonseparable Gneiting-type covariance functions
Abstract: Two algorithms are proposed to simulate space-time Gaussian random fields with a covariance function belonging to an extended Gneiting class. In both cases, the simulated random field is constructed as a weighted sum of space-time cosine waves, with a Gaussian spatial frequency vector and a uniform phase. The difference lies in the way the temporal component is handled. The first algorithm relies on a spectral decomposition in order to simulate a temporal frequency conditional upon the spatial one, while in the second algorithm the temporal frequency is replaced by an intrinsic random field whose variogram is proportional to the conditionally negative definite function associated with the temporal structure. Both algorithms are scalable as their computational cost is proportional to the number of space-time locations, which may be unevenly spaced in space and/or in time. They are illustrated and validated through synthetic examples.
Joint work with Xavier Emery (U. Chile), Céline Lacaux (Avignon Université) and Christian Lantuéjoul (Mines ParisTech).
Wednesday, September 9, 2020
Speaker: Adrian Baddeley (Curtin University, Australia)
Title: Diffusion smoothing for spatial point patterns
Abstract: Our problem is to estimate the intensity function of a spatial point process when all the points are known to lie inside a bounded region. The traditional kernel methods exhibit artefacts which are physically unrealistic. These artefacts can be avoided by using "diffusion smoothing", in which the smoothing kernel is the heat kernel on the spatial domain. We develop diffusion smoothing into a practical statistical methodology for two-dimensional spatial point pattern data. We clarify the advantages and disadvantages of diffusion smoothing over Gaussian kernel smoothing. Adaptive smoothing, where the smoothing bandwidth is spatially-varying, can be performed by adopting a spatially-varying diffusion rate, or using lagged arrival times. Practical applications are demonstrated.
Links: [recording (starting 10 minutes late, sorry for the inconvenience)] [slides]
Tuesday, July 7, 2020
Speaker: Rasmus Waagepetersen (Aalborg University, Denmark)
Title: Globally intensity-reweighted estimators for K- and pair correlation functions
Abstract: We introduce new estimators of the inhomogeneous $K$-function and the pair correlation function of a spatial point process as well as the cross $K$-function and the cross pair correlation function of a bivariate spatial point process under the assumption of second-order intensity-reweighted stationarity. These estimators rely on a 'global' normalization factor which depends on an aggregation of the intensity function, whilst the existing estimators depend 'locally' on the intensity function at the individual observed points. The advantages of our new global estimators over the existing local estimators are demonstrated by theoretical considerations and a simulation study.
Links: [recording] [slides] [relevant paper: #paper1]