14 Oct 2025 / 16:00 (UTC)
Barak Zackay (Weizmann Institute, Rehovot, Israel)
Advanced statistics and gravitational wave astrophysics
Abstract: Gravitational wave astrophysics has opened a new observational window into the universe, allowing us to probe cataclysmic events such as binary black hole mergers with unprecedented precision. Since the landmark detections by LIGO and Virgo, the field has rapidly evolved, not only in instrumentation but also in the data analysis frameworks that underlie discovery. In this talk, I will begin with an accessible introduction to gravitational wave detection, focusing on the physical principles and the experimental challenges of extracting minute signals from noisy data. I will then delve into how advanced, and occasionally novel, applications of statistical methodology have led to a substantial increase in sensitivity. Notably, these methods have enabled a twofold increase in the number of confidently identified binary black hole mergers. Next, I will explore the computational tools and algorithmic strategies that have streamlined parameter estimation for compact binary mergers. By finding a convenient sufficient statistic, we reduced the parameter estimation runtime from weeks to minutes. As increasingly rich physical models were incorporated into the analysis, the runtime grew back to weeks; with further computational techniques, we have once again reduced it to minutes, opening up new possibilities. Time permitting, I will conclude with a critical discussion of hierarchical Bayesian population analyses. While widely adopted, these methods have failed in subtle and instructive ways when applied to gravitational wave catalogs.
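As a concrete illustration of the basic noise-weighted statistic at the heart of such searches (an illustrative sketch, not the speaker's pipeline; the function and variable names below are assumptions of the example), the following numpy snippet evaluates the matched-filter signal-to-noise ratio of a template in data at a single time lag:

```python
import numpy as np

def snr_at_zero_lag(data, template, psd, dt):
    """Noise-weighted matched-filter SNR of `template` in `data` at zero time lag.

    data, template : real strain time series of equal length
    psd            : one-sided noise PSD evaluated at np.fft.rfftfreq(n, dt)
    dt             : sample spacing in seconds
    """
    n = len(data)
    df = 1.0 / (n * dt)
    d_f = np.fft.rfft(data) * dt        # approximate continuous Fourier transform
    h_f = np.fft.rfft(template) * dt
    # noise-weighted inner product <a|b> = 4 Re sum a b* / S df
    inner = lambda a, b: 4.0 * np.real(np.sum(a * np.conj(b) / psd)) * df
    return inner(d_f, h_f) / np.sqrt(inner(h_f, h_f))
```

Real searches evaluate this overlap at every time lag at once with an inverse FFT and maximise over the template phase; the statistical innovations described in the talk go well beyond this textbook form.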
11 Nov 2025 / 16:00 (UTC)
Junming Diao (Mississippi State Univ., USA)
TBI (Advanced IoT for Radio Telescope)
Abstract: TBI
9 September 2025 / 8:00 (UTC)
Arwa Dabbech (Heriot-Watt U., Edinburgh, UK)
Imaging the radio sky beyond CLEAN: from optimisation theory to deep learning
Abstract: Owing to the sheer amounts of data they provide, modern radio telescopes can map the radio sky over large fields of view and wide frequency bandwidths with unprecedented sensitivity and resolution. The underlying image formation is a challenging inverse problem, calling for efficient imaging algorithms that are able to scale to large data volumes and image dimensions and deliver high-precision reconstruction. For over five decades, the CLEAN algorithm has been the standard for radio-interferometric imaging thanks to its simplicity and computational efficiency, albeit at the expense of limited resolution and dynamic range of its reconstruction. This talk provides an overview of modern image reconstruction algorithms, from optimisation theory to deep learning, which have demonstrated superior imaging precision to CLEAN both in simulation and on real data. These include proximal algorithms with handcrafted regularisation, plug-and-play algorithms using learned denoisers, and the most recent R2D2 paradigm, which can be seen as a learned version of CLEAN with minor cycles substituted with a deep neural network (DNN) whose training is iteration-specific. R2D2 not only improves imaging precision but also accelerates reconstruction, opening the door for fast, high-resolution, high-dynamic range imaging in radio astronomy.
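For readers unfamiliar with the baseline being improved upon, here is a toy Högbom CLEAN minor cycle in plain numpy (a schematic of the classical image-domain deconvolution only; array shapes, parameter names and the stopping rule are illustrative choices of this example, not the algorithms discussed in the talk):

```python
import numpy as np

def hogbom_clean(dirty_image, dirty_beam, gain=0.1, n_iter=500, threshold=0.0):
    """Toy Hogbom CLEAN minor cycle.

    dirty_image : 2D array (inverse Fourier transform of the visibilities)
    dirty_beam  : 2D point spread function, peak-normalised, with shape
                  (2*ny-1, 2*nx-1) so it can be shifted to any pixel
    Returns the CLEAN component image and the final residual.
    """
    residual = dirty_image.copy()
    components = np.zeros_like(dirty_image)
    ny, nx = dirty_image.shape
    cy, cx = np.array(dirty_beam.shape) // 2          # beam centre pixel
    for _ in range(n_iter):
        iy, ix = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
        peak = residual[iy, ix]
        if np.abs(peak) <= threshold:
            break
        components[iy, ix] += gain * peak
        # subtract a scaled copy of the dirty beam centred on the peak
        beam_patch = dirty_beam[cy - iy:cy - iy + ny, cx - ix:cx - ix + nx]
        residual -= gain * peak * beam_patch
    return components, residual
```

In the R2D2 paradigm described above, it is essentially this minor-cycle update that is replaced by an iteration-specific deep neural network.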
8 July 2025 / 16:00 (UTC)
Abby Azari (University of Alberta/Alberta Machine Intelligence Institute, Alberta, Canada)
Integrating Machine Learning for Planetary Science: Perspectives for the Next Decade
Abstract: Planetary science has seen roughly half as many applications of machine learning as other fields, such as Earth science and astrophysics.¹ This presentation will introduce the challenges in machine learning for planetary science that are potentially related to these limited applications, before delving into a highlighted example of addressing these challenges. This example focuses on the development of the first continuous estimate of the solar wind upstream of Mars. This virtual solar wind monitor, or vSWIM,² was trained and assessed on NASA MAVEN spacecraft data and is based on Gaussian process regression, a type of machine learning. vSWIM provides predictions, and uncertainties on these predictions, at various temporal resolutions. I will conclude with a discussion of scientific studies enabled by vSWIM, including comparative magnetospheric studies, and provide an outlook for the future of machine learning in planetary science.
10 June 2025 / 8:00 (UTC)
Dattaraj Dhuri (New York University, Abu Dhabi)
Machine Learning Approaches for Improved Forecasting of Space Weather Phenomena and Uncovering Underlying Physics
Abstract: Solar magnetic fields are most prominently characterized by the solar cycle — an approximately 11-year periodic variation in the total number/area of sunspots. However, the physical processes driving the emergence of new sunspots or active regions (ARs), as well as the evolution of AR magnetic fields that lead to major solar flares and coronal mass ejections (CMEs), remain insufficiently understood. These events are primary drivers of severe space weather, posing significant risks to both Earth-based and spaceborne infrastructure. Additionally, the continuously emitted solar wind may crucially influence the atmospheres of Earth as well as Mars. Modern solar missions provide high-cadence, high-resolution observations of photospheric magnetic fields, velocity fields, and atmospheric intensities, enabling unprecedented data-driven analyses via advanced Machine Learning (ML) algorithms. In this talk, I will present the application of diverse ML techniques for forecasting solar flares, solar wind speed, and the emergence of new sunspots, along with the development of a data-driven model for proton auroras on Mars. Furthermore, I will discuss the use of model interpretation and explainability methods to extract physically meaningful insights from these ML models, including identifying pre-emergence magnetic field signatures, triggers for solar eruptions, and evaluating the robustness of the Martian aurora model under varying solar wind and seasonal conditions.
13 May 2025 / 16:00 (UTC)
David Hogg (New York University, USA)
Can machine learning be used to make measurements?
Abstract: Machine learning (ML) is now a critical tool in the astrophysics toolkit. That said, ML methods, which often involve enormously over-parameterized models, which are optimized to the sole objective of matching the data, and which contain strong implicit priors, can strongly bias scientific results. On the other hand, most measurements in astrophysics involve some or many nuisance components, where the flexibility of ML methods becomes an intellectual strength; the flexibility can be used to make causal separations and causal arguments more conservative. I will give examples on both sides of this, showing places where ML regressions produce extremely biased scientific results, and places where ML methods successfully remove biases introduced by instruments and foregrounds. I will mention examples from cosmology, stellar astrophysics, and exoplanets.
8 April 2025 / 8:00 (UTC)
Louise Mousset & Erwan Allys (ENS, Paris, France)
Generative models and component separation with Scattering Transforms for astrophysical images
Abstract: New statistical descriptions related to the so-called Scattering Transforms have recently achieved attractive results for several astrophysical applications. These statistics share ideas with convolutional neural networks, but do not need to be learned, allowing very efficient characterization of non-Gaussian processes from a very small amount of data. In addition, they can be used to build generative models of data and form the basis of new non-Gaussian component separation techniques. In this seminar we will give a general introduction to these statistics and show some applications developed in our group, mostly on 2D planar data. In particular, we will discuss recent results obtained with Scattering Transforms on spherical images, which are all the more relevant for upcoming cosmological studies, such as LiteBIRD for the polarization of the cosmic microwave background, or the Vera C. Rubin Observatory and the Euclid Space Telescope for the study of large-scale structures in the Universe.
11 Mar 2025 / 16:00 (UTC)
Henry Leung (Toronto U., Canada)
Estimating Probability Densities with Transformer and Denoising Diffusion
Abstract: Transformers are often the go-to architecture to build foundation models that ingest a large amount of training data. But these models do not estimate probability densities when trained on regression problems, even though full probabilistic outputs are crucial in many fields of science, where the probability distribution of the answer can be non-Gaussian and multimodal. In this talk, I will discuss our work on training a probabilistic model using a denoising diffusion head on top of the Transformer that provides reasonable probability density estimation even for high-dimensional inputs. I will show that our Transformer+Denoising Diffusion model can infer labels accurately with reasonable distributions in a variety of inference tasks by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy.
11 Feb 2025 / 8:00 (UTC)
Jessica Whitney (Univ. Coll. London, UK)
Generative modelling for mass-mapping with fast uncertainty quantification
Abstract: Understanding the nature of dark matter in the Universe is an important goal of modern cosmology. A key method for probing this distribution is via weak gravitational lensing mass-mapping - a challenging ill-posed inverse problem where one infers the convergence field from observed shear measurements. Upcoming stage IV surveys, such as those made by the Vera C. Rubin Observatory and Euclid satellite, will provide a greater quantity and precision of data for lensing analyses, necessitating high-fidelity mass-mapping methods that are computationally efficient and that also provide uncertainties for integration into downstream cosmological analyses. In this talk I will discuss MMGAN, a novel mass-mapping method based on a regularised conditional generative adversarial network (GAN) framework, which generates approximate posterior samples of the convergence field given shear data. I will discuss the model, how the architecture was built to prevent mode collapse, and finally show results applied to real observational data. MMGAN significantly outperforms the Kaiser-Squires technique and achieves similar reconstruction fidelity to alternative state-of-the-art deep learning approaches.
10 Dec 2024 / 8:00 (UTC)
Suzanne Aigrain (Oxford U., UK)
Gaussian process models for stellar variability in exoplanet searches
Abstract: Over the past decade, Gaussian Process regression has emerged as a powerful tool to characterise stellar signals in photometric and spectroscopic searches for exoplanets. In particular, multi-output latent Gaussian Process models (sometimes referred to simply as “multi-GPs”) are among the most powerful tools to separate the reflex motion caused by planets from the more complex signatures of stellar activity in radial velocity surveys. Doing this is key to enabling these surveys to reach their full potential and detect planets similar to the Earth around nearby, Sun-like stars, paving the way for biosignature searches. During my talk I will outline the physical and statistical basis of these models, and illustrate their strengths and weaknesses using practical applications to data from the Sun and from more active stars, as well as outline the potential for future improvements in the context of ground-based radial velocity and space-based transit searches. Such models potentially have a much wider field of application, wherever we need to model contemporaneous observations of multiple, noisy variables that are related to each other via affine transformations such as addition, multiplication, translation, integration or differentiation.
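A minimal single-output sketch of the idea, assuming scikit-learn's GaussianProcessRegressor and a quasi-periodic kernel (a common choice for rotation-driven activity signals); the data, kernel hyperparameters and time scales below are illustrative, and this is not the multi-output latent GP framework described in the talk:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, WhiteKernel, ConstantKernel

# Quasi-periodic kernel: a periodic term (stellar rotation) damped by an
# evolution length scale (active-region lifetimes), plus white noise.
kernel = (ConstantKernel(1.0)
          * ExpSineSquared(length_scale=0.5, periodicity=25.0)   # rotation period ~25 d
          * RBF(length_scale=100.0)                              # activity evolution time
          + WhiteKernel(noise_level=0.5))

t = np.sort(np.random.uniform(0, 200, 80))[:, None]              # observation epochs [days]
rv = np.sin(2 * np.pi * t[:, 0] / 25.0) + 0.3 * np.random.randn(80)  # fake activity signal

gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(t, rv)                                     # maximises the GP marginal likelihood
t_pred = np.linspace(0, 200, 1000)[:, None]
mu, sd = gp.predict(t_pred, return_std=True)      # predictive mean and uncertainty
```

The multi-output latent GP models discussed in the talk extend this idea by modelling several observables jointly through shared latent processes related by the affine transformations mentioned above.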
12 Nov 2024 / 16:00 (UTC)
Mario Pasquato (Montreal University, Canada)
Causal discovery: what it can do for us astronomers
Abstract: Astronomy is an observational science. As such it cannot rely on experiments to settle debates about the direction of causation whenever a correlation between variables is found. Thanks to recent advances in the field of causal discovery it is however possible to infer causal relations from observational data alone, at least when certain conditions are met. I will cover a few algorithmic approaches to causal discovery and show that the application of these new algorithms can shed light on long-standing conundrums in astronomy. As an example I will discuss the relation between supermassive black holes and their host galaxies, in which the direction of the causal connection has been debated for decades.
8 Oct 2024 / 8:00 (UTC)
Julia Lascar (CEA, Saclay, France)
Hyperspectral fusion and source separation with spectral variabilities for X-ray astrophysics
Abstract: In astrophysics, X-ray telescopes can collect cubes of data called hyperspectral images, which have two spatial dimensions and one spectral dimension. The analysis of such data is key to understanding the physics of high-energy extended sources, such as supernova remnants. However, several challenges make this process difficult, such as the presence of Poisson noise, or the high spectral variability of the data. This seminar will present two algorithms to better exploit X-ray hyperspectral images. The first separates mixed sources while including the spectral variability of each source. The second fuses data from two generations of X-ray telescopes. Results will be shown on toy models built from supernova remnant simulations, as well as on real data of supernova remnants (Cassiopeia A, the Crab Nebula).
10 Sep 2024 / 16:00 (UTC)
Alex Szalay (Johns Hopkins U., Baltimore, USA)
AI-ready data in astrophysics
Abstract: Astronomy has been leading the open data movement. This is even more relevant today, as the value of data is increasing exponentially. Companies are spending hundreds of millions on training large language models and building computational facilities containing hundreds of thousands of GPUs. At the same time, academic institutions are increasingly struggling to identify areas where they can remain competitive. The talk will discuss these issues and how astronomy (and astronomers) need to become more agile, focus on increasingly scalable experiments, and make the collected data AI-ready.
13 Aug 2024 / 08:00 (UTC)
Yuan-Sen Ting (The Australian National University, Australia)
Expediting Astronomical Discovery with Large Language Models: Progress, Challenges, and Future Directions
Abstract: The vast and interdisciplinary nature of astronomy, coupled with its open-access ethos, makes it an ideal testbed for exploring the potential of Large Language Models (LLMs) in automating and accelerating scientific discovery. In this talk, we present our recent progress in applying LLMs to tackle real-life astronomy problems. We demonstrate the ability of LLM agents to perform end-to-end research tasks, from data fitting and analysis to iterative strategy improvement and outlier detection, mimicking human intuition and deep literature understanding. However, the cost-effectiveness of closed-source solutions remains a challenge for large-scale applications involving billions of sources. To address this issue, we introduce our ongoing work at AstroMLab on training lightweight, open-source specialized models and our effort to benchmark these models with carefully curated astronomy benchmark datasets. We will also discuss our effort to construct the first LLM-based knowledge graph in astronomy, leveraging citation-reference relations. The open-source specialized LLMs and knowledge graph are expected to guide more efficient strategy searches in autonomous research pipelines. While many challenges lie ahead, we explore the immense potential of scaling up automated inference in astronomy, revolutionizing the way astronomical research is conducted, ultimately accelerating scientific breakthroughs and deepening our understanding of the Universe.
9 July 2024 / 16:00 (UTC)
Elena Cuoco (European Gravitational Observatory, Pisa, Italy)
Machine learning for Gravitational Wave Transient Astrophysics
Abstract: Gravitational wave transient astrophysics has entered an exciting era with the advent of advanced gravitational wave detectors. These detectors have opened a new window to the cosmos, allowing us to observe and study astrophysical events with unprecedented precision. Artificial Intelligence (AI) has emerged as a groundbreaking technology with the potential to revolutionize various scientific fields. In this seminar, we will explore the impact of AI on gravitational wave transient astrophysics. By training machine learning models to recognize subtle patterns and signals, researchers can improve the efficiency and accuracy of detection algorithms, leading to rapid identification and categorization of transient events. The seminar will present how Machine Learning can aid the detection and classification of gravitational wave transient signals.
11 June 2024 / 08:00 (UTC)
Sarvesh Gharat (Indian Institute of Technology Bombay, India)
Gamma Ray AGNs: Estimating Redshifts and Blazar Classification using Neural Networks with smart initialization and self-supervised learning
Abstract: Redshift estimation and the classification of gamma-ray AGNs represent crucial challenges in the field of gamma-ray astronomy. Recent efforts have been made to tackle these problems using traditional machine-learning methods. However, the simplicity of existing algorithms, combined with their basic implementations, underscores an opportunity and a need for further advancement in this area. Our approach begins by implementing a Bayesian model for redshift estimation, which can account for uncertainty while providing predictions with the desired confidence level. Subsequently, we address the classification problem by leveraging intelligent initialization techniques and employing soft voting. Additionally, we explore several potential self-supervised algorithms in their conventional form. Lastly, in addition to generating predictions for data with missing outputs, we ensure that the theoretical assertions put forth by both algorithms mutually reinforce each other.
14 May 2024 / 16:00 (UTC)
Javier Carrón Duque (Instituto de Física Teórica, Madrid, Spain)
Going Beyond Gaussianity and Isotropy in Cosmology: Minkowski Functionals and other statistics
Abstract: Homogeneous and isotropic Gaussian fields are fully described by their 2pt correlation function. However, these assumptions are not always met in the study of cosmological fields. In order to extract further information, the cosmological community has adopted a rich variety of statistics. Among the most interesting are Minkowski Functionals (MFs), powerful statistical tools used to describe the geometry and topology of observable fields. They have seen diverse applications, such as blind tests of non-Gaussianity in the Cosmic Microwave Background (CMB), enhancing parameter constraints, and analyzing Large Scale Structure (LSS). In this talk, I will demystify MFs, focusing on their mathematical underpinnings and demonstrating their versatility in cosmological research. I will particularly highlight recent advancements in applying MFs to CMB polarization, which opens new avenues for testing Gaussianity and isotropy. Finally, I'll introduce 'Pynkowski,' an accessible public Python package we developed for computing MFs and other statistics across various data types, also including theoretical predictions. This talk aims to elucidate the role of MFs and similar higher-order statistics in modern cosmology and encourage further exploration of their potential.
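Since MFs can seem abstract, here is a minimal numerical check of the simplest of them (illustrative only; the smoothing scale and grid size are arbitrary choices of this example): the excursion-set area fraction V0 of a Gaussian field, compared against its analytic Gaussian expectation. The remaining functionals (boundary length and Euler characteristic) require gradient and curvature information, which is what dedicated packages such as Pynkowski compute.

```python
import numpy as np
from scipy.special import erfc
from scipy.ndimage import gaussian_filter

# Toy Gaussian random field: smoothed white noise, normalised to unit variance
rng = np.random.default_rng(0)
field = gaussian_filter(rng.standard_normal((512, 512)), sigma=4.0)
field = (field - field.mean()) / field.std()

# First Minkowski functional V0(nu): area fraction of the excursion set {field > nu}
thresholds = np.linspace(-3, 3, 25)
v0_measured = [(field > nu).mean() for nu in thresholds]
v0_gaussian = 0.5 * erfc(thresholds / np.sqrt(2.0))   # Gaussian-field expectation

for nu, m, g in zip(thresholds, v0_measured, v0_gaussian):
    print(f"nu={nu:+.2f}  measured={m:.3f}  Gaussian={g:.3f}")
```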
09 Apr 2024 / 08:00 (UTC)
Ming-Zhe Han (Purple Mountain Observatory, China)
Parametrized Neutron Star Equation of State with Neural Networks
Abstract: The neutron star (NS) equation of state (EoS) is key to studying the properties of cold dense matter. However, first-principles calculations of the EoS have very large uncertainties and are model dependent. Therefore, to constrain the EoS with multi-messenger NS data, phenomenological models are usually used to describe NS EoSs. Parametric models with a specific functional form are easy to handle, but they struggle to describe some special EoSs. I will introduce a so-called nonparametric EoS model based on a feed-forward neural network, which can cover more of the EoS parameter space than the parametric ones. Then I will describe how to use these models to constrain the NS EoS from multi-messenger NS data within a Bayesian framework.
12 Mar 2024 / 16:00 (UTC)
François Lanusse (CEA/CNRS, France)
A New Era of Multi-Modal Self-Supervised Learning for Astrophysics
Abstract: Deep Learning has seen a recent shift in paradigm, from training specialized models on dedicated datasets, to so-called Foundation Models, trained in a self-supervised manner on vast amounts of data and then adapted to solve specific tasks with state-of-the-art performance. This new paradigm has been exceptionally successful not only for large language models (LLMs) but in other domains such as vision models. However, applications of this new approach in astrophysics are still very scarce, for reasons ranging from the need for new architectures to the (surprising) lack of suitable large-scale datasets. In this talk, I will discuss our recent work on deploying such a Foundation Model approach in the context of representation learning for astronomical photometric and spectroscopic observations of galaxies. Our aim is to embed these inhomogeneous observations (e.g. different types of measurements, different instruments, etc...) into a shared latent space, in a completely self-supervised manner. These embeddings can then be used for a variety of downstream applications (e.g. redshift estimation, morphology classification) with very simple machine learning methods and reach near optimal performance. More specifically, I will present our AstroCLIP method which allows us to align embeddings between data modalities, and our more recent results on building highly effective image embeddings based on a vision transformer architecture. I will also comment on these results from an information theory point of view, and conclude on future prospects for this approach in the context of upcoming large scale galaxy surveys.
13 Feb 2024 / 08:00 (UTC)
Alan Heavens (Imperial College London)
Extreme Lossless Data Compression for Simulation-Based Inference
Abstract: Simulation-based inference (SBI) is a technique that is growing in popularity in astrostatistics, as the complexities of systematic errors, selection effects and so on often make a likelihood-based approach unfeasible. SBI (or likelihood-free inference, LFI) has its challenges, since the basic premise to obtain the Bayesian posterior is to run many simulations of the data, and record the parameters of those simulations that match the observed data. With M parameters and N data points, each simulation provides a point in an (M+N)-dimensional space, and if N is large, there is little hope of generating a simulation that closely matches all N data points (e.g. N is tens of millions for a Planck CMB map). Hence the data have to be massively compressed, and even the standard compressions, to power spectrum estimates for example, are not sufficient, and SBI may be impossible. N needs to be reduced as much as possible, ideally to M. I will show how analytic techniques of Extreme Data Compression such as MOPED, and neural-network-based compression, can effectively reduce N to M, and show that SBI can then do essentially loss-free inference from this typically very small set of numbers. For example, Type Ia supernova cosmology can be done with just 3 numbers.
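A minimal sketch of the idea behind MOPED-style compression, assuming a Gaussian likelihood with a parameter-independent covariance; it forms one score-like linear compression per parameter and omits the Gram-Schmidt orthogonalisation step of the full MOPED algorithm, so the function and variable names are illustrative only:

```python
import numpy as np

def moped_like_compression(mu_derivs, cov, data):
    """Compress N data points to one number per parameter (simplified MOPED).

    mu_derivs : (M, N) array of d(mean model)/d(theta_alpha) at a fiducial model
    cov       : (N, N) data covariance, assumed parameter-independent
    data      : (N,) observed data vector
    """
    cinv = np.linalg.inv(cov)
    weights = []
    for dmu in mu_derivs:                      # one weight vector per parameter
        b = cinv @ dmu
        b /= np.sqrt(dmu @ cinv @ dmu)         # unit Fisher normalisation
        weights.append(b)
    B = np.array(weights)                      # (M, N) compression matrix
    return B @ data                            # (M,) compressed summaries
```

The compressed summaries (M numbers) can then be fed to an SBI pipeline in place of the full N-dimensional data vector.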
09 Jan 2024 / 16:00 (UTC)
Reed Essick (Canadian Institute for Theoretical Astrophysics)
Adventures with Directed Acyclic Graphs in Gravitational Wave Astrophysics
Abstract: Hierarchical Bayesian inference has become a standard tool within Gravitational-Wave astrophysics, and the conditional dependencies assumed within the hierarchy can be conveniently expressed as a directed acyclic graph (DAG). I will introduce DAGs as useful shorthand for such models and show how they can clarify conflicting assumptions that are sometimes made in the literature. In particular, I will discuss the implications of the fact that all physical detection processes can only depend on latent variables through the observed data. Time permitting, I will also discuss a coarse-graining procedure and its application to outlier searches.
12 Dec 2023 / 08:00 (UTC)
Shay Zucker (Tel Aviv University, Israel)
New Approach to Periodicity Detection
Abstract: Periodicity is pervasive in astronomy in many forms: from rotation of asteroids, through the orbits of binary stars, to stellar orbits around the Galactic Center. Thus, detecting periodicity in astronomical data, which are often sparse and unevenly sampled, has always been a staple of astronomical research. Astronomers use an arsenal of methods to perform the task; none of them is perfect, and most of them rely on some arbitrary assumptions or parameters. The most popular ones are usually inspired by Fourier theory and essentially search for sinusoidal periodicities. This talk will present a novel approach to periodicity detection, the Phase Distance Correlation periodogram (PDC), which is nonparametric, model-independent and computationally elegant. PDC is an application of some very recent developments in statistics, and it opens up new horizons in the field of periodicity detection. It can easily be extended to detect periodicities of new and unknown types, in various modalities of data, not necessarily in Astronomy. The talk will introduce the basic ideas of PDC, highlight its novelty, and demonstrate its advantages in some types of data.
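A schematic sketch of a distance-correlation periodogram in numpy is given below. It uses the standard double-centred distance correlation with a shorter-arc phase distance; the published PDC definition differs in its choice of phase distance and in computational details, so this is an illustration of the idea rather than the method itself, and all names are choices of the example:

```python
import numpy as np

def distance_correlation(dx, dy):
    """Distance correlation from precomputed pairwise distance matrices."""
    A = dx - dx.mean(0) - dx.mean(1)[:, None] + dx.mean()   # double centering
    B = dy - dy.mean(0) - dy.mean(1)[:, None] + dy.mean()
    dcov2 = max((A * B).mean(), 0.0)
    dvar_x, dvar_y = (A * A).mean(), (B * B).mean()
    return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y)) if dvar_x * dvar_y > 0 else 0.0

def pdc_like_periodogram(t, y, trial_periods):
    """Schematic phase-distance-correlation periodogram (not the published PDC)."""
    dy = np.abs(y[:, None] - y[None, :])                    # distances between measurements
    power = []
    for period in trial_periods:
        phase = (t / period) % 1.0
        dphi = np.abs(phase[:, None] - phase[None, :])
        dphi = np.minimum(dphi, 1.0 - dphi)                 # shorter arc on the phase circle
        power.append(distance_correlation(dphi, dy))
    return np.array(power)
```

High values of the statistic at a trial period indicate statistical dependence between phase and measurement, without assuming a sinusoidal (or any other) signal shape.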
14 Nov 2023 / 16:00 (UTC)
Johannes Buchner (MPE, Germany)
Exploring the space of parametric spaces in the space sciences
Abstract: Today, Monte Carlo-based Bayesian inference engines are the workhorses of astronomical research spanning cosmology, exoplanet characterization, and transients such as supernovas. In large astrophysical surveys, such as the eROSITA all-sky survey, visual inspection of millions of fits is infeasible. Therefore, robust inference engines are paramount. I give a brief introduction to nested sampling and an overview of the current research into nested sampling methods, drawn from a systematic literature review. Then, I will present recent theoretical analyses on the reliability of one particularly robust nested sampling algorithm, MLFriends, implemented in the popular UltraNest package. Armed with confidence from theory and practice, we explore recent novel applications in high-energy astrophysics that take advantage of the unique properties of nested sampling. These include the study of heavily obscured black holes, hierarchical Bayesian models for over-dispersion to measure intrinsic source variability with eROSITA, and black hole demographics from multi-wavelength surveys. I close with an outlook on future research and the intersection of Bayesian inference and machine learning.
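A minimal usage sketch of nested sampling with the UltraNest package mentioned above, on a toy two-parameter line fit; the interface shown follows UltraNest's documented ReactiveNestedSampler workflow, and the model, priors and data are illustrative assumptions of this example:

```python
import numpy as np
import ultranest

# Toy problem: fit a line y = a*x + b to data with known Gaussian noise (sigma = 0.1)
x = np.linspace(0, 1, 50)
y_obs = 2.0 * x + 1.0 + 0.1 * np.random.randn(50)

def prior_transform(cube):
    # map the unit hypercube to the prior: a ~ U(-10, 10), b ~ U(-10, 10)
    return 20.0 * cube - 10.0

def loglike(params):
    a, b = params
    return -0.5 * np.sum(((y_obs - (a * x + b)) / 0.1) ** 2)

sampler = ultranest.ReactiveNestedSampler(["a", "b"], loglike, prior_transform)
result = sampler.run()          # posterior samples plus the evidence logZ
sampler.print_results()
```

The evidence returned alongside the posterior is what makes nested sampling particularly convenient for the model comparison problems discussed in the talk.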
10 Oct 2023 / 08:00 (UTC)
Arman Shafieloo (Korea Astronomy and Space Science Institute, Korea)
On model selection, validation and reconstruction in the context of physical cosmology
Abstract: I revisit the old subject of comparing Bayesian and frequentist approaches to model selection and validation in the context of physical cosmology. I discuss caveats of both approaches in dealing with unknowns, and how using reliable non-parametric methods of reconstruction can help validate models without comparing them with each other. I discuss the importance of this subject in the era of big precision data and how inaccuracies due to systematics or wrong assumptions can lead to spurious discoveries.
12 Sep 2023 / 16:00 (UTC)
Stefano Rinaldi (Pisa University, Italy)
One mixture to rule them all: the infinite Gaussian mixture model applied to gravitational-wave astrophysics
Abstract: An astrophysical black hole is the compact object left after the explosion of a massive star. Stellar objects undergo several processes during their life and are influenced by the environment in which they reside, e.g. dense clusters or isolated binary systems - the so-called formation channels. Similarly, the poorly understood physics regulating mass loss in stellar winds, or common envelope evolution, all leave an imprint on the black hole mass, spin and redshift distribution. In this talk, I will describe how we can use Bayesian non-parametric methods, which are powerful tools to perform inference without the need to specify a model, to infer the black hole population. In particular, I will present (H)DPGMM, a non-parametric model based on the Dirichlet Process Gaussian Mixture Model, and FIGARO, its implementation. Using such methods, features in the black hole distribution arise naturally without the need to include them in the model, leaving astrophysicists tasked with explaining them in terms of formation channels and astrophysical processes.
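For orientation, scikit-learn ships a truncated variational Dirichlet process Gaussian mixture, a much simpler relative of the (H)DPGMM/FIGARO machinery described in the talk; the toy data and settings below are illustrative only:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Toy "black hole mass" samples drawn from two overlapping populations
rng = np.random.default_rng(1)
masses = np.concatenate([rng.normal(10, 2, 500), rng.normal(35, 5, 200)])[:, None]

# Truncated Dirichlet-process mixture: start with many components and let the
# stick-breaking prior switch off the ones the data do not need.
dpgmm = BayesianGaussianMixture(
    n_components=20,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,
    max_iter=1000,
).fit(masses)

print("effective components:", np.sum(dpgmm.weights_ > 0.01))
```

The number of components used is inferred from the data rather than fixed in advance, which is the sense in which features of the distribution "arise naturally" in such nonparametric models.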
11 July 2023 / 08:00 (UTC)
Yu-Yen Chang (National Chung Hsing University, Taiwan)
Unveiling Galaxies and AGNs by Machine Learning
Abstract: Modern astrophysicists use multi-wavelength observations to investigate galaxy evolution in our universe. With large surveys, we have investigated stellar masses, star formation rates, dust properties, AGN contributions, as well as morphology and shapes of galaxies. Recently, the increasing amount of astronomical data has led to a need for machine-learning methods. In this talk, I will show our recent work on classifying AGN host galaxies with the HSC survey, as well as galaxy merger stages with MaNGA data, by machine learning. These results can be applied to future all-sky surveys and help us understand galaxy formation and evolution over cosmic time.
13 June 2023 / 16:00 (UTC)
Christian Robert (Paris Dauphine University, Paris)
Bayesian approaches to inferring the number of components in a mixture
Abstract: Estimating the model evidence - or marginal likelihood of the data - is a notoriously difficult task for finite and infinite mixture models, and we reexamine here different Bayesian approaches and Monte Carlo techniques advocated in the recent and not so recent literature, as well as novel approaches based on Geyer's (1994) reverse logistic regression technique, Chib's (1995) algorithm, and Sequential Monte Carlo (SMC). Applications are numerous. In particular, testing for the number of components in a finite mixture model or against the fit of a finite mixture model for a given dataset has long been and still is an issue of much interest, albeit yet missing a fully satisfactory resolution. Using a Bayes factor to find the right number of components K in a finite mixture model is known to provide a consistent procedure. We furthermore establish the consistency of the Bayes factor when comparing a parametric family of finite mixtures against the nonparametric 'strongly identifiable' Dirichlet Process Mixture (DPM) model.
9 May 2023 / 08:00 (UTC)
Jesus Torrado (University of Padua & INFN, Italy)
Machine-learning Bayesian inference with Gaussian Processes
Abstract: Machine learning can make Bayesian inference possible for extremely slow pipelines, which would require months with traditional Monte Carlo. I will present the GPry algorithm for fast Bayesian inference of general (non-Gaussian) posteriors with a moderate number of parameters. GPry is intended as a drop-in replacement for traditional likelihood-based Monte Carlo methods, without the need for pre-training or GPUs. Our algorithm is based on generating a Gaussian Process surrogate model of the log-posterior, aided by a Support Vector Machine classifier that excludes extreme or non-finite values, and a GP prior that accounts for the expected dynamical range of the posterior in different dimensionalities. An active learning scheme allows us to reduce the number of required posterior evaluations by two orders of magnitude compared to traditional Monte Carlo inference.
11 April 2023 / 16:00 (UTC)
Daniel Mortlock (Imperial College London & Stockholm University)
Bayesian inference of astronomical populations
Abstract: Much of astrophysics is based on the study of objects at the population level, for which constraints are provided by catalogues selected from astronomical surveys. The complicated combination of measurement noise, sample selection, catalogue compression and (possibly) follow-up observations means that the inference of astronomical populations is a rich statistical problem. In this talk I look at the Bayesian approach to this task, focussing on a range of examples, including: high-redshift quasar selection; redshift distributions from galaxy surveys; and Hubble constant measurements from both supernovae and gravitational wave observations of compact mergers.
14 March 2023 / 08:00 (UTC)
Takahiro Nishimichi (Yukawa Institute, Kyoto University, Japan)
Cosmological inference based on emulators
Abstract: Extracting cosmological parameters from large scale structure is a central problem in observational cosmology. Unlike the cosmic microwave background radiation, the large scale structure is a nonlinear process, and hence one needs to resort to expensive numerical simulations to generate accurate forward models, with which one can perform Bayesian inference. Given that one typically has to explore a multi-dimensional cosmological parameter space, a naive use of numerical simulations in inference problems is prohibitively costly. Emulation is one solution to this problem. The idea is that after accumulating a large, but realistic, number of simulations with the cosmological parameters varied, simple statistical models, such as Gaussian Processes or Neural Networks, can be built so that they work as a cheaper surrogate model. In the talk, I will cover our recent attempts to analyze observational data from the Sloan Digital Sky Survey as well as Subaru Hyper Suprime-Cam using emulators. I will also mention our ongoing attempts to couple the process of emulator building and statistical inference to efficiently accumulate knowledge constrained by observational data.
14 February 2023 / 16:00 (UTC)
Francisco Villaescusa-Navarro (Simons Foundation, USA)
Simulation based inference for cosmology
Abstract: In this talk I will discuss the importance of performing parameter inference in cosmology, together with the problems associated with such a task. I will then motivate the use of deep learning to overcome the problems associated with traditional inference methods. Methods such as simulation-based inference require very large datasets to properly train the models and capture the full distribution. I will then present the simulations of the CAMELS project, the largest set of state-of-the-art cosmological hydrodynamic simulations run to date. Next, I will show some examples of how these methods allow us to extract cosmological and astrophysical information from very small scales at the field level without knowing the likelihood of the data. I will then discuss how these methods may open a new window that will allow us to study cosmology and astrophysics in a unified and more accurate way.
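A minimal sketch of the simulation-based inference workflow, using the sbi package's documented SNPE interface (API details can differ between versions) and a toy simulator standing in for an expensive hydrodynamic simulation; the parameter names and settings are assumptions of the example:

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

# Toy simulator: two "cosmological" parameters mapped to a 10-dimensional summary vector
def simulator(theta):
    noise = 0.05 * torch.randn(theta.shape[0], 10)
    return theta[:, :1] * torch.linspace(0.1, 1.0, 10) + theta[:, 1:2] + noise

prior = BoxUniform(low=torch.zeros(2), high=torch.ones(2))
theta = prior.sample((5000,))
x = simulator(theta)

inference = SNPE(prior=prior)
density_estimator = inference.append_simulations(theta, x).train()
posterior = inference.build_posterior(density_estimator)

x_obs = simulator(torch.tensor([[0.3, 0.7]]))      # pretend observation
samples = posterior.sample((2000,), x=x_obs)       # posterior over the two parameters
```

The neural density estimator learns the posterior directly from simulated (parameter, data) pairs, which is why no explicit likelihood of the data is ever needed.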
10 January 2023 / 08:00 (UTC)
Yuan-Sen Ting (Australian National University, Australia)
Galaxy Merger Reconstruction with Generative Graph Neural Networks
Abstract: A key yet unresolved question in modern-day astronomy is how galaxies formed and evolved. The quest to understand how galaxies evolve has led many semi-analytic models to infer the galaxy properties from their merger history. However, most classical approaches rely on studying the global connection between dark matter haloes and galaxies, often reducing the study to crude summary statistics. The recent advancement in graph neural networks might open up many new possibilities; graphs are a natural descriptor of galaxy progenitor systems – any progenitor system at a high redshift can be regarded as a graph, with individual progenitors as nodes on the graph. In this presentation, I will discuss the power of generative graph neural networks to connect high-redshift progenitor systems with local observables. We showed that based on equivariant graph normalizing flow, our model could robustly recover the progenitor systems, including their masses, merging redshifts and pairwise distances at redshift z = 2 conditioned on their z = 0 properties. In addition, the probabilistic nature of our model enables other downstream tasks, including detecting anomalies in galaxy configuration and identifying subtle correlations of the progenitor features.
13 December 2022 / 16:00 (UTC)
Jessi Cisewski Kehe (University of Wisconsin, USA)
Getting something out of nothing: topological data analysis for cosmology
Abstract: The transference from data to information is a key component of many areas of research in astronomy and cosmology. This process can be challenging when data exhibit complicated spatial structures, such as the large-scale structure (LSS) of the Universe. Methods that target shape-related features may be helpful for summarizing qualitative properties that are not retrieved with standard techniques. Topological data analysis (TDA) provides a framework for quantifying shape-related properties of data. Persistent homology is a popular TDA tool that offers a procedure to represent, visualize, and interpret complex data by extracting topological features which may be used to infer properties of the underlying structures. Persistent homology is used to find different dimensional holes in a dataset across different scales, where zero-dimensional holes are clusters, one-dimensional holes are closed loops, two-dimensional holes are voids, and so on. The information is summarized in a persistence diagram, which may be used for further analysis such as visualization, inference, or classification. I will give an overview of persistent homology and discuss its use in some cosmology applications, such as discriminating LSS under varying cosmological assumptions.
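As a minimal illustration of persistent homology in practice, the sketch below uses the ripser package on a toy point cloud (a noisy circle), for which a single long-lived one-dimensional hole is expected; data and parameter choices are illustrative:

```python
import numpy as np
from ripser import ripser

# Toy point cloud: a noisy circle embedded in 3D
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
points = np.column_stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)])
points += 0.05 * rng.standard_normal(points.shape)

# Persistent homology up to dimension 2: clusters (H0), loops (H1), voids (H2)
diagrams = ripser(points, maxdim=2)["dgms"]
for dim, dgm in enumerate(diagrams):
    lifetimes = dgm[:, 1] - dgm[:, 0]
    lifetimes = lifetimes[np.isfinite(lifetimes)]
    longest = lifetimes.max() if lifetimes.size else 0.0
    print(f"H{dim}: {len(dgm)} features, longest finite lifetime = {longest:.3f}")
```

Each diagram records the birth and death scale of every feature; long-lived features are the shape-related summaries that can then be used for visualization, inference, or classification as described above.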
8 November 2022 / 08:00 (UTC)
Aaron Robotham (University of Western Australia, Australia)
Exploring the Limits of the Bayesian Universe: How to Tackle Breadth and Depth
Abstract: In the last 10 years it is notable that students are much more enthused about projects involving “machine learning”, but it is important we do not lose perspective on the scientific insights still offered by a comprehensive and pragmatic application of Bayesian principles. Here I will discuss the work my group has undertaken over the last 7 years to build up a fully generative model of galaxies that has culminated in the Bayesian modelling software ProFuse (Robotham+ 2022). The positive is that encoding our knowledge and ignorance in a Bayesian manner has opened up new insights to physical processes that form galaxies, the negative is that this approach has a high barrier of entry which can be a poor fit to a modern ~3 year PhD.
11 October 2022 / 16:00 (UTC)
Jason McEwen (University College London, UK)
Bayesian model selection for likelihood-based and simulation-based inference
Abstract: In the study of cosmology, where we seek to uncover an understanding of the fundamental physical processes underlying the origin, content, and evolution of our Universe, we are not blessed with the ability to perform experiments -- rather, we have only one Universe to observe. In this scenario, while we are of course interested in estimating the parameters of models describing the physical processes observed, we are often most interested in selecting the best underlying model, which has given rise to the prevalence of Bayesian model selection in cosmology and astrophysics. While I will motivate recent developments in Bayesian model selection from problems in cosmology and astrophysics, I will mostly focus on new methodological advances. I will discuss new approaches that leverage ideas across statistics, optimization and machine learning to bring to bear the respective strengths of these paradigms to the highly computationally challenging problem of Bayesian model selection. In particular, I will review the learnt harmonic mean estimator for both likelihood-based and simulation-based inference and the proximal nested sampling framework for high-dimensional model selection.
13 September 2022 / 08:00 (UTC)
Shiro Ikeda (Institute of Statistical Mathematics, Japan)
Data Science and Imaging the Black Hole Shadow
Abstract: In April 2019, the EHTC (Event Horizon Telescope collaboration) released the first image of the M87 black hole shadow and this May, the black hole shadow image of our Milky Way galaxy was released. The EHTC has more than 300 members from different backgrounds and countries. I have been involved in this project as a data scientist for more than 8 years and collaborated with EHTC members to develop a new imaging method. The EHT is a huge VLBI (very long baseline interferometer), which is different from optical telescopes in that a lot of computation is required to obtain a single image. Black hole imaging is also very interesting from the data scientific viewpoint. In this talk, I will explain how the new imaging technique has been developed and the final images were created through our discussions.
9 August 2022 / 8:00 (UTC)
Eric Thrane (Monash University, Australia)
The population properties of merging compact binaries from gravitational waves
Abstract: With the publication of the third gravitational-wave transient catalog (GWTC-3), the LIGO and Virgo Collaborations have confidently identified 90 signals from merging compact binaries. By analysing the morphology of each gravitational waveform, we are able to work out the masses and spins of the black holes and neutron stars that source these signals. By studying the distributions of black-hole mass, spin, and distance, we are painting a picture of the population properties of compact mergers, providing clues about the fate of massive stars and telling us how and where binary black holes are assembled. In this talk, I describe how we use Bayesian hierarchical modelling to study merging black holes. I emphasise the importance of model checking to avoid faulty conclusions from model misspecification.
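For readers unfamiliar with the structure of such analyses, the hierarchical likelihood with selection effects used throughout this literature is commonly written as follows (a generic statement of the framework, not the specifics of any one analysis):

```latex
\mathcal{L}\!\left(\{d_i\} \,\middle|\, \Lambda\right)
  \;\propto\; \prod_{i=1}^{N_{\mathrm{obs}}}
  \frac{\int \mathcal{L}(d_i \mid \theta)\, \pi(\theta \mid \Lambda)\, \mathrm{d}\theta}
       {\alpha(\Lambda)},
\qquad
\alpha(\Lambda) \;=\; \int P_{\mathrm{det}}(\theta)\, \pi(\theta \mid \Lambda)\, \mathrm{d}\theta ,
```

where the $\theta$ are per-event parameters (masses, spins, distance), $\Lambda$ are the population hyper-parameters, and $\alpha(\Lambda)$ is the detectable fraction that corrects for selection effects; in practice the per-event integrals are evaluated by reweighting single-event posterior samples.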
14 June 2022 / 16:00 (UTC)
Roberto Trotta (SISSA, Italy)
A general-purpose method for supervised learning under covariate shift with applications to observational cosmology [PDF] [VIDEO]
Abstract: Supervised machine learning will be central in the analysis of upcoming large-scale sky surveys. However, selection bias for astronomical objects yields labelled training data that are not representative of the unlabelled target data distribution. This degrades predictive performance, yielding unreliable target predictions and poor generalization. I will present StratLearn, a novel and statistically principled method to improve supervised learning under such covariate shift conditions, based on propensity score stratification. In StratLearn, learners are trained on subgroups ("strata") of the data conditional on the propensity scores, leading to improved covariate balance and much-reduced bias in the model fit. This general-purpose method has promising applications in observational cosmology, improving upon existing conditional density estimation of galaxy redshift from Sloan Digital Sky Survey (SDSS) data; in the classification of Supernovae (SNe) type Ia from photometric data, it obtains the best reported AUC on the SNe photometric classification challenge. If time allows, I'll discuss the embedding of such a classification into a full analysis of SNe data to estimate cosmological parameters.
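A schematic of propensity score stratification in scikit-learn, in the spirit of (but not identical to) StratLearn: fit a propensity model for source-versus-target membership, stratify both sets on the estimated scores, and train one learner per stratum. The estimators, stratum handling and function names below are illustrative assumptions of this sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestRegressor

def stratified_fit_predict(X_source, y_source, X_target, n_strata=5):
    """Schematic propensity-score stratification for covariate shift."""
    X_all = np.vstack([X_source, X_target])
    is_source = np.concatenate([np.ones(len(X_source)), np.zeros(len(X_target))])
    propensity = LogisticRegression(max_iter=1000).fit(X_all, is_source)
    e_source = propensity.predict_proba(X_source)[:, 1]
    e_target = propensity.predict_proba(X_target)[:, 1]

    # Strata boundaries from the pooled propensity scores
    edges = np.quantile(np.concatenate([e_source, e_target]),
                        np.linspace(0, 1, n_strata + 1))
    y_pred = np.full(len(X_target), np.nan)
    for k in range(n_strata):
        in_s = (e_source >= edges[k]) & (e_source <= edges[k + 1])
        in_t = (e_target >= edges[k]) & (e_target <= edges[k + 1])
        if in_s.sum() == 0 or in_t.sum() == 0:
            continue
        # One learner per stratum: training and prediction happen on
        # better-balanced covariate distributions
        model = RandomForestRegressor(n_estimators=200).fit(X_source[in_s], y_source[in_s])
        y_pred[in_t] = model.predict(X_target[in_t])
    return y_pred
```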
10 May 2022 / 8:00 (UTC)
Takahiko Matsubara (KEK, Japan)
Weakly non-Gaussian formulas of cosmological random fields
Abstract: In cosmology, various kinds of random fields play important roles, including 3D distributions of galaxies and other astronomical objects, 2D distributions of cosmic microwave background radiations and weak lensing fields, etc. The features of non-Gaussianity in these fields contain a lot of cosmological information. In this talk, I will present a method to analytically describe the effects of weak non-Gaussianity in field statistics, such as the peak abundance, peak correlations, Minkowski functionals, etc.
12 April 2022 / 16:00 (UTC)
Josh Speagle (Toronto University, Canada)
Statistical Challenges in Stellar Parameter Estimation from Theory and Data
Abstract: Understanding how the Milky Way fits into the broader galaxy population requires studying the properties of other galaxies as well as our own. While it is possible to observe the structure of other galaxies directly, understanding the structure of our own Galaxy from within requires inferring the 3-D positions, velocities, and other properties of billions of stars. In this talk, I will discuss some of the statistical challenges in inferring stellar parameters from modern photometric surveys such as Gaia and SDSS, focusing in particular on issues with existing theoretical stellar models, the complex nature of parameter uncertainties, and scalability to large datasets. I will then describe some ongoing work trying to solve these problems using a combination of physics-inspired but data-driven calibrations along with a host of inference approaches including gradient-based optimization, grid-based searches, importance sampling, and nested sampling.
8 March 2022 / 8:00 (UTC)
Renate Meyer (University of Auckland, New Zealand)
Bayesian Nonparametric Spectral Analysis for Gravitational Wave Astronomy
Abstract: The new era of gravitational wave astronomy truly began on September 14, 2015 with the sensational first direct observation of gravitational waves, when LIGO recorded the signature of the merger of two black holes. In the subsequent three observing runs of the LIGO/Virgo network, gravitational waves from 90 compact binary mergers have been announced. Moreover, the future space-based observatory LISA will open the low-frequency window on gravitational waves and will be sensitive to a vast range of sources including the white dwarf binaries in our Milky Way and mergers of supermassive black holes at the centre of galaxies. Beyond signal detection, a major challenge has been the development of statistical methodology for estimating the physical waveform parameters and quantifying their uncertainties. Bayesian methods and MCMC have played a key role in this new era of astrophysics. I will review the statistical methods that enabled the estimation of the waveform parameters. This challenge has also been a key driver for new theoretical and methodological advancements in statistics. The call for a more robust instrumental noise characterization aiming at a simultaneous estimation of noise characteristics and gravitational wave parameters has triggered ongoing research into Bayesian nonparametric analysis of time series. Starting with nonparametric Bayesian approaches to spectral density estimation of univariate Gaussian stationary time series, I will review novel extensions to multivariate, non-Gaussian, and locally stationary time series.
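A central ingredient in such nonparametric spectral density estimation is the Whittle likelihood, which approximates the time-domain Gaussian likelihood in the frequency domain (stated here generically, with constants independent of the spectrum dropped; the methods in the talk place flexible priors, e.g. Bernstein polynomial mixtures, on the spectral density):

```latex
\ln \mathcal{L}\!\left(\mathbf{y} \,\middle|\, S\right)
  \;\approx\; -\sum_{k}
  \left[ \ln S(f_k) + \frac{I_n(f_k)}{S(f_k)} \right],
```

where $I_n(f_k)$ is the periodogram of the observed series at the Fourier frequencies $f_k$ and $S$ is the spectral density to be estimated jointly with the gravitational-wave signal parameters.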
8 February 2022 / 16:00 (UTC)
Dan Foreman-Mackey (Flatiron Institute, USA)
Methods for scalable probabilistic inference
Abstract: Most data analysis pipelines in astrophysics now have some steps that require detailed probabilistic modeling. As datasets get larger and our research questions get more ambitious, we are often pushing the limits of what our statistical frameworks are capable of. In this talk, I will discuss recent (and not so recent) developments in the field of probabilistic programming that enable rigorous Bayesian inference with large datasets, and high-dimensional or computationally expensive models. In particular, I will highlight some scalable methods for time series analysis using Gaussian Processes, and some of the open source tools and computational techniques that have the potential to be broadly useful for accelerating inference in astrophysics.
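One example of such scalable GP machinery is the celerite family of kernels, which admits likelihood evaluations that scale linearly with the number of data points; the sketch below assumes the celerite2 package's documented interface, with toy data and hyperparameters chosen purely for illustration:

```python
import numpy as np
import celerite2
from celerite2 import terms

# Irregularly sampled light curve with a quasi-periodic signal
t = np.sort(np.random.uniform(0, 100, 400))
yerr = 0.1 * np.ones_like(t)
y = np.sin(2 * np.pi * t / 7.5) + yerr * np.random.randn(len(t))

# Stochastically driven, damped harmonic oscillator term: a flexible model for
# quasi-periodic variability with O(N) likelihood evaluations
kernel = terms.SHOTerm(sigma=1.0, rho=7.5, Q=5.0)
gp = celerite2.GaussianProcess(kernel, mean=0.0)
gp.compute(t, yerr=yerr)

print("log likelihood:", gp.log_likelihood(y))
t_pred = np.linspace(0, 100, 1000)
mu, var = gp.predict(y, t=t_pred, return_var=True)   # predictive mean and variance
```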
11 January 2022 / 8:00 (UTC)
Makoto Uemura (Hiroshima University, Japan)
Follow-up observations of galactic transients with astroinformatics
Abstract: Methods such as Bayesian inference and machine learning have recently become readily available, and are used not only on state-of-the-art data, but also in various aspects of astronomical research. Our group has a 1.5-m optical telescope in Hiroshima, Japan, which is used for time-domain astronomy. I will talk about the applications of astroinformatics tools for the follow-up observations of galactic transients. The topics include the discriminative model of transients, decision making based on the information theory, and reconstruction of the geometrical structure of the accretion disk.
14 December 2021 / 16:00 (UTC)
Torsten Enßlin (Max-Planck-Institute for Astrophysics, Germany)
Information field theory, from astronomical imaging to artificial intelligence
Abstract: Turning the raw data of an instrument into high-fidelity pictures of the Universe is a central theme in astronomy. Information field theory (IFT) describes probabilistic image reconstruction from incomplete and noisy data exploiting all available information. Astronomical applications of IFT include galactic tomography, gamma- and radio-astronomical imaging, and the analysis of cosmic microwave background data. This talk introduces the basic ideas of IFT, highlights its astronomical applications, and explains its relation to contemporary artificial intelligence.
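As a concrete anchor for the simplest (linear, Gaussian) case of IFT: for a measurement $d = R\,s + n$ with Gaussian signal prior covariance $S$ and Gaussian noise covariance $N$, the posterior mean of the signal is the generalised Wiener filter, a standard IFT result stated here for illustration:

```latex
m \;=\; D\, j,
\qquad
D \;=\; \left( S^{-1} + R^{\dagger} N^{-1} R \right)^{-1},
\qquad
j \;=\; R^{\dagger} N^{-1} d ,
```

where $D$ is the posterior covariance (the information propagator) and $j$ the information source; nonlinear and non-Gaussian models generalise this result perturbatively or variationally.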
9 November 2021 / 8:00 (UTC)
Ilya Mandel (Monash University, Australia)
Astrostatistics in Gravitational-wave Astronomy
Abstract: Modern astronomical data sets often raise challenges associated with selection biases, accounting for confusion between backgrounds and foregrounds, and performing inference on big data with complex, multi-parameter models. I will discuss some of the techniques that we used to attack these problems, illustrating them with results from gravitational-wave observations of merging black holes … and a bit further afield.
12 October 2021 / 16:00 (UTC)
Jeffrey D. Scargle (NASA Ames Research Center, US)
Adventures in Astronomical Time Series Analysis
Abstract: Welcome to a tour of the volatile, highly active Universe, in stark contrast to earlier serene "clockwork" visions. Innovative data analysis techniques have illuminated explosive physical processes animating these systems. Examples include a Fourier transform suited to the irregular sampling characteristic of much astronomical data, but time domain techniques will be emphasized for these applications: gamma-ray activity in the Crab Nebula, gamma-ray bursts, active galactic nuclei, and gravitational waves. I hope this talk will change some of the ways you carry out statistical data analysis.
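The Fourier-style analysis for unevenly sampled data alluded to above is embodied in the Lomb-Scargle periodogram, for which astropy provides a standard implementation; a minimal usage sketch on synthetic data (the signal period and noise level are arbitrary choices of the example):

```python
import numpy as np
from astropy.timeseries import LombScargle

# Irregularly sampled light curve with a 3.7-day periodic signal
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 100, 300))            # days, uneven sampling
y = 0.5 * np.sin(2 * np.pi * t / 3.7) + 0.2 * rng.standard_normal(t.size)
dy = 0.2 * np.ones_like(t)

ls = LombScargle(t, y, dy)
frequency, power = ls.autopower()                # automatic frequency grid
best = frequency[np.argmax(power)]
print(f"best period: {1 / best:.3f} d")
print(f"false alarm probability: {ls.false_alarm_probability(power.max()):.2e}")
```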