The Center for Time Domain Informatics hosts a series of talks under the auspices of the NSF CDI grant “Real-time Classification of Massive Time-series Data Streams.” The seminar series brings in speakers who work at the interface of astronomy, statistics, and computer science. Many speakers stay at UC Berkeley for several days. If you would like to arrange to meet a speaker, contact Joey Richards.
Upcoming Talks:
Title: Wavelet Spectral Analysis for Irregularly Sampled Time Series
Speaker: Debashis Mondal, University of Chicago, Dept. of Statistics
Tuesday, April 10, 2012 | 2:00 - 3:00 PM | 1011 Evans Hall
Abstract:
Examples of irregularly sampled time series abound in many areas of science, but their analyses introduce numerous statistical challenges. For example, the standard wavelet variance analysis, which has emerged as an accepted statistical approach for studying the variability of time series, is intended to be applied only to regularly sampled time series, and cannot easily cope with irregular or unevenly sampled data. After a brief review of the existing approaches to analysis of irregularly sampled time series, we will explore two new statistical approaches to this problem. First, we will discuss approximate scale-based analysis of variance for time series based upon the so-called Slepian wavelets. In many ways, this approach is comparable to the multitaper spectral approach based on the notion of generalized Slepian sequences and others. Slepian wavelets arise as eigenfunctions of an energy maximization problem in a pass band of frequencies. For irregularly sampled time series data, we will extend the notion of dyadic scales, and derive corresponding statistical theory for Slepian-based wavelet variances. We will show via a simulation study how our method adapts to sampling times with mild irregularities. Second, we will consider a general framework for estimating wavelet variances for irregularly sampled time series. Here, we will extend the work of Mondal and Percival (2010), and propose new inference procedures. We will demonstrate potential use of our methods on light curve data from variable stars. If time permits, we will discuss situations where wavelet approaches might have an edge over more traditional spectral approaches, such as the famous Lomb-Scargle periodogram, the multitaper spectral analysis, and the work of Masry.
This is joint work with Don Percival.
Title: Measuring the Undetectable: Finding faint rare objects in large astronomical surveys
Speaker: David Hogg, Associate Professor of Physics, NYU
Friday, April 13, 2012 | 11:00 AM - 12:00 PM | 1011 Evans Hall
Standard astronomical practice (make catalogs, search in catalogs, follow up with image analysis or new data) prevents us from making many important kinds of discoveries in large archived astronomical surveys. I will show that we can measure the proper motions or variabilities of sources that are too faint to be detected at any of the imaging epochs in a multi-epoch survey (like SDSS Stripe 82 or LSST). I will show methodologies we are pursuing to measure the properties of stellar populations that are unresolved (because of confusion) or gravitational lenses that are unresolved (because of poor PSF). If we can find ways to avoid the lossy step of making catalogs, we might be able to enormously amplify the scientific return from the next generation of astronomical imaging surveys. Warning: Some of my proposals may appear unrealistic!
Title: Uncovering the Morphological Properties of Galaxies at High Redshift
Speaker: Peter Freeman, Carnegie Mellon University, Dept. of Statistics
Tuesday, March 13, 2012 | 1:00 - 2:00 PM | 1011 Evans Hall
Abstract:
A thorough investigation of cosmological theories of hierarchical structure formation requires the accurate and precise identification of galaxy morphologies as a function of redshift. One aspect of any such investigation is the determination of the galaxy merger rate and its time evolution. Astronomers identify mergers by finding complex substructures within a galaxy's projected brightness profile, such as double nuclei. Because visual classification is time-consuming both in development of infrastructure and implementation, and because working with full images is computationally inefficient, astronomers fall back on using nonparametric summary statistics in their attempt to detect mergers. However, they are finding that established statistics that work well in the local Universe do not work as efficiently at high redshift. In this talk, I will discuss new summary statistics that we have developed at Carnegie Mellon that markedly improve merger detection, as well as new avenues for morphological description that we are beginning to explore.
(Special Mini-CDI Seminar) Title: Exploring the Dark Universe: Computational, Statistical, and Data Challenges
Speaker: Katrin Heitmann, Intelligence & Space Research Division, Los Alamos Nat. Lab
Wednesday, March 14, 2012 | 1:00 - 2:00 PM | 1011 Evans Hall
Abstract:
Cosmology -- the study of the origin, evolution, and constituents of the Universe -- is in a scientifically very exciting phase. Two decades of surveying the sky have culminated in the celebrated “Cosmological Standard Model”. Yet, two of its key pillars, dark matter and dark energy -- together accounting for 95% of the mass-energy of the Universe -- remain mysterious. Deep fundamental questions demand answers: What is dark matter made of? Why is the Universe's expansion rate accelerating? Should general relativity be modified? What is the nature of primordial fluctuations? What is the exact geometry of the Universe? To address these burning questions, survey capabilities are being exponentially improved. Next-generation observatories will open new routes to understand the true nature of the “Dark Universe”. These observations will pose tremendous challenges on many fronts -- from the sheer size of the data that will be collected (more than a hundred Petabytes) to its modeling and interpretation. The interpretation of the data requires sophisticated simulations on the world's largest supercomputers. The cost of these simulations, the uncertainties in our modeling abilities, and the fact that we have only one Universe that we can observe, as opposed to carrying out controlled experiments, all come together to create a major test for computational, statistical, and data analysis methods. In this talk I will give a very brief introduction to the Dark Universe and outline the challenges ahead. To combat these challenges, close cross-disciplinary collaborations between physicists, statisticians, and computer scientists will be crucial. I will discuss two examples of successful collaborative work and propose new tasks where cosmologists urgently need help from the data and statistics community.
Title: Weighing the Dark Sky
Speaker: Ethan Anderes, Assistant Professor, UC Davis Dept. of Statistics
Thursday, March 8, 2012 | 12:30 - 1:30 PM | 1011 Evans Hall
Abstract:
This talk presents a new estimation method for mapping dark matter density from observed CMB intensity and polarization fields. Our method uses Bayesian techniques to estimate the average curvature of the lensing gravitational potential over small local regions. These local curvatures are then used to construct an estimate of a low pass filter of the projected dark matter density. By utilizing Bayesian/likelihood methods one can easily overcome problems with missing and/or non-uniform pixels and problems with partial sky observations (E and B mode mixing, for example). Moreover, our methods are local in nature, which allows us to easily model spatially varying beams, and they are highly parallelizable. We note that our estimates do not rely on the typical Taylor approximation that is used to construct estimates of the gravitational potential by Fourier coupling. This work is based on collaboration with Lloyd Knox (Physics, UC Davis) and Alexander van Engelen (Physics, McGill).
Title: Machine Learning Methods for Real Time and Archival Classification of Astronomical Transients and Variables
Speaker: Umaa Rebbapragada, Principal Investigator, JPL Machine Learning and Instrument Autonomy (MLIA) Group
Thursday, March 1, 2012 | 12:30 PM - 2:00 PM | 1011 Evans Hall
Abstract:
This talk presents machine learning techniques for archival and real time classification of astronomical transients and variables. These methods were developed as part of collaborations with the Australian Square Kilometre Array Pathfinder's (ASKAP) Variable and Slow Transients (VAST) survey and the Palomar Transient Factory. VAST is an unprecedented wide-field survey that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Archival (offline) classification occurs in the data archive in order to enable source type queries from end users. Real time (online) classification occurs during real time processing in order to trigger appropriate follow up when transient phenomena are detected. Both tasks require automated methods to classify sources in the time domain. In order to estimate classification performance in both settings, and determine best practices prior to the launch of ASKAP's BETA in 2012, we performed a study of machine learning techniques on simulated VAST light curves. Through this study, we identify candidate light curve characterizations and classification algorithms, and study performance under different observing strategies and levels of noise in both the offline and online settings. Our results show that the choice of light curve characterization influences classification performance more strongly than learning algorithm selection, and that a combination of feature sets yields best performance. The Palomar Transient Factory (PTF) is a fully-automated synoptic sky survey that has demonstrated real-time discovery of astronomical transient events. I will briefly discuss preliminary results on the binary classification of optical transient and variable sources from PTF as real or bogus.
Talk Slides
Title: Data Mining to Perform Novel Science on Large Astronomical Datasets
Speaker: Nick Ball, Assistant Research Officer, Herzberg Institute for Astrophysics
Tuesday, February 14, 2012 | 11:00 AM - 12:30 PM | 1011 Evans Hall
Abstract:
I will give an overview of my work since 2004 on using data mining to perform novel science on large astronomical datasets, focusing on (1) Morphological galaxy classification in the Sloan Digital Sky Survey (SDSS) using artificial neural networks; (2) Star-galaxy separation in the SDSS using decision trees; (3) Photometric redshifts of SDSS and Galaxy Evolution Explorer quasars using k nearest neighbors; and (4) Separation of galaxies that are Virgo members from those in the background using unsupervised clustering in the Next Generation Virgo Cluster Survey. Several of these represent somewhat pioneering studies that have much relevance to the current and future era of terascale and petascale data. For each study, I will provide a brief review of the result, then relate the result to more recent developments and possible future directions. Finally, I will provide a few general remarks on the current state of Astroinformatics, and its future prospects, from the point of view of an astronomer who utilizes data mining.
Title: Mapping the Galactic Halo in the Era of Wide-Area Surveys 
Speaker: Branimir Sesar, Postdoctoral Scholar, Astronomy Department, Caltech
Tuesday, January 24, 2012 | 1:00 PM - 2:30 PM | 1011 Evans Hall
Studies of the Galactic stellar halo can help constrain the formation history of the Milky Way and galaxy formation processes in general. In the past few years, these studies have benefited greatly from the wealth of data provided by wide-area, multi-wavelength, and multi-epoch surveys such as SDSS, LINEAR, and PTF. I will present an analysis of Galactic halo structure and substructure traced by main-sequence and RR Lyrae stars selected from these wide-area surveys and will outline some of the challenges and solutions to handling such large data sets in astronomy.
Title: Modeling stellar variability and correlated noise in photometric time-series with Gaussian processes