Past Seminars - Spring 2018

4pm-5pm Thursday January 18, 2018 TBA

Speaker: Dr. Amy Braverman, Jet Propulsion Laboratory, California Institute of Technology

Title: An Uncertainty Quantification Framework for Remote Sensing Retrievals

Abstract:

Remote sensing data sets produced by NASA and other space agencies are the result of complex algorithms that infer geophysical state from observed radiances using retrieval algorithms. The processing must keep up with the downlinked data flow, and this necessitates computational compromises that affect the accuracies of retrieved estimates. The algorithms are also limited by imperfect knowledge of physics and of ancillary inputs that are required. All of this contributes to uncertainties that are generally not rigorously quantified by stepping outside the assumptions that underlie the retrieval methodology. In this talk we discuss a practical framework for uncertainty quantification that can be applied to a variety of remote sensing retrieval algorithms. Ours is a statistical approach that uses Monte Carlo simulation to approximate the sampling distribution of the retrieved estimates. We will discuss the strengths and weaknesses of this approach, and provide a case-study example from the Orbiting Carbon Observatory 2 mission.
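The Monte Carlo idea mentioned in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the speaker's framework: the linear `forward_model`, the deliberately misspecified `retrieve` function, the gain values, and the noise level are all hypothetical. The sketch simulates many noisy radiances from a known true state, runs the retrieval on each, and summarizes the resulting empirical distribution of estimates.

```python
import random
import statistics


def forward_model(state, gain=2.0):
    """Hypothetical 'true' physics: radiance is linear in the state."""
    return gain * state


def retrieve(radiance, assumed_gain=2.1):
    """Hypothetical retrieval that inverts the model with a slightly wrong gain."""
    return radiance / assumed_gain


def monte_carlo_retrievals(true_state, noise_sd, n_draws, seed=0):
    """Simulate noisy radiances and return the retrieved estimate for each draw."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_draws):
        radiance = forward_model(true_state) + rng.gauss(0.0, noise_sd)
        estimates.append(retrieve(radiance))
    return estimates


true_state = 400.0  # illustrative value only
estimates = monte_carlo_retrievals(true_state, noise_sd=1.0, n_draws=5000)

# Summaries of the simulated sampling distribution of the retrieved estimate.
bias = statistics.mean(estimates) - true_state
spread = statistics.stdev(estimates)
print(f"bias: {bias:.2f}, spread: {spread:.2f}")
```

Because the true state is known inside the simulation, the bias caused by the wrong gain shows up directly, which is the sense in which such a simulation steps outside the retrieval's own assumptions.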


About the speaker: Dr. Braverman is a Principal Statistician at the Jet Propulsion Laboratory. Her research interests include information-theoretic approaches for the analysis of massive data sets, data fusion methods for combining heterogeneous, spatial and spatio-temporal data, and statistical methods for the evaluation and diagnosis of climate models, particularly by comparison to observational data. She is an elected ASA Fellow and has published articles in top journals including JASA and Technometrics.

Past Seminars - Fall 2017

UC-OBAIS Seminar Series: 11:00am-12:30pm Friday November 17, 2017 SAP Theater (in Lindner Hall basement level)

Speaker: Dr. Yuhong Yang, Department of Statistics, University of Minnesota Twin Cities

Title: Cross-Validation for Optimal and Reproducible Statistical Learning

Abstract:

In data mining and statistical learning, we frequently encounter the task of comparing different methods/algorithms to reach a final choice for pure prediction or a scientific understanding/interpretation of a regression relationship. Cross-validation provides a powerful tool to address the matter. Unfortunately, there are seemingly widespread misconceptions on its use, which can lead to unreliable conclusions. In this talk, we will address the subtle issues involved and present results of minimax optimal regression learning and consistent selection of the best method for the data. In addition, we will propose proper cross-validation tools for model selection diagnostics that will cry foul at an impressive-looking but not really reproducible outcome from a sparse-pattern-hunting method in the wild west of learning with a huge number of covariates.

This presentation is based on:

Yongli Zhang and Yuhong Yang (2015). Cross-validation for selecting a model selection procedure. Journal of Econometrics, vol. 187, 95-112.

Ying Nan and Yuhong Yang (2014). Variable Selection Diagnostics Measures for High-Dimensional Regression. Journal of Computational and Graphical Statistics, vol. 23, 636-656.
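As a rough illustration of the method-comparison task described in the abstract, the sketch below uses plain K-fold cross-validation to choose between two simple predictors on synthetic data. It is a generic textbook procedure, not the procedures from the papers above; the function names, the synthetic data, and the choice of five folds are all assumptions made for the example.

```python
import random


def fit_mean(xs, ys):
    """Baseline method: always predict the training mean of y."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Simple least-squares line y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cv_error(fit, xs, ys, k=5, seed=0):
    """Mean squared prediction error estimated by K-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - model(xs[i])) ** 2 for i in fold)
    return total / len(xs)


# Synthetic data with a genuine linear signal (illustrative only).
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

errors = {"mean": cv_error(fit_mean, xs, ys), "line": cv_error(fit_line, xs, ys)}
best = min(errors, key=errors.get)
print(errors, "selected:", best)
```

Both methods are scored on the same folds (same seed), so the comparison is paired. The abstract's warning still applies: choices such as the number of folds affect how reliable the selection is.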

About the speaker: Yuhong Yang is a Professor of Statistics at the University of Minnesota Twin Cities. His research interests include model selection and model combining, high-dimensional data analysis, multi-armed bandits with covariates, forecast combinations, and statistics in personalized medicine. He has served as an associate editor for the Annals of Statistics, the Journal of Statistical Planning and Inference, the Annals of the Institute of Statistical Mathematics, and Statistics Surveys. He is a fellow of the Institute of Mathematical Statistics.



SPECIAL EVENT 9am-10am Thursday Oct 19, 2017 Rm 120 WCharlton

Speaker: Dr. Grant Weller, Savvysherpa

Title: Data Science Jobs in Industry: What to Look For (and what Savvysherpa can offer)

Abstract:

“Data scientist” was recently named the Top Job in America by recruiting website Glassdoor.com, and demand for these positions is projected to exceed the supply of talent by over 50% in 2018. However, data science positions are not all created equal: there is high variability in job function, skill requirements, and culture among industry data scientists. In this talk, I hope to provide guidance on the industry job search process by posing a set of questions to identify if a given role is a good fit for you.

The second part of this presentation will turn the focus to Savvysherpa. I will give background on our company, provide some examples of our data science team’s work, and attempt to answer the questions I previously posed, from our perspective.

About the speaker: Dr. Weller is a Senior Scientist at Savvysherpa, Inc. in Minneapolis, Minnesota. He received his PhD in Statistics from the Department of Statistics at Colorado State University at Fort Collins. Prior to joining Savvysherpa, he was a Visiting Assistant Professor in the Department of Statistics at Carnegie Mellon University. He is broadly interested in the use of statistical and computational methodology to solve problems in science, business, and the intersection of the two.


11:00am-12:00pm Tuesday Oct 3, 2017 Rm 757 Baldwin

Speaker: Dr. Vishesh Karwa, Department of Statistics, Ohio State University

Title: Differential Privacy and Statistical Inference

Abstract:

Differential privacy has emerged as a powerful tool to reason rigorously about privacy and confidentiality issues. In its purest form, differential privacy limits direct access to raw data, allowing interaction only through a noisy interface. This requires modified approaches to statistical inference, even for basic problems such as estimating confidence intervals for a one dimensional distribution.

In this talk, I will introduce the definition of differential privacy, followed by some of its key properties. I will then present a framework for performing statistical inference under the constraint of differential privacy and its connections to measurement error and missing data methods. The primary focus will be on constructing conservative finite sample differentially private confidence intervals for the mean of a normal population. These intervals serve as important building blocks for more complex inference and data analysis problems. If time permits, I will also describe examples of sharing social network data under the constraint of differential privacy.
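The "noisy interface" described in the abstract can be made concrete with the standard Laplace mechanism applied to a sample mean. This is a minimal textbook sketch and not the confidence-interval construction from the talk. It assumes the sample size is public, that neighboring data sets differ by replacing one record, and that values are clipped to a known range; the function name `private_mean` and all parameter values are hypothetical.

```python
import random


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of values clipped to [lower, upper] with Laplace noise.

    Assumes n is public and one record may be replaced, so the clipped mean
    changes by at most (upper - lower) / n between neighboring data sets.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # is a Laplace variate with location 0 and that scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return sum(clipped) / n + noise


rng = random.Random(42)
data = [rng.random() for _ in range(1000)]  # synthetic values in [0, 1)
true_mean = sum(data) / len(data)
released = private_mean(data, lower=0.0, upper=1.0, epsilon=1.0, rng=rng)
print(f"true mean: {true_mean:.4f}, released: {released:.4f}")
```

An analyst sees only `released`, so any interval for the population mean has to account for the added noise as well as the sampling variability, which is why the usual formulas need modification.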

About the speaker: Dr. Karwa received his PhD in Statistics from the Pennsylvania State University in 2014 and joined the Department of Statistics at Ohio State University in 2017. Prior to joining Ohio State, he spent two years at Harvard in the Department of Statistics and the Department of Computer Science as a postdoctoral fellow and one year at CMU as a research scientist. His research addresses the challenges in performing statistical inference using complex and/or massive data such as networks, high-dimensional contingency tables, and data that are missing or incomplete.


11:30am-12:30pm Tuesday September 26, 2017 Rm 309 Braunstein

Speaker: Dr. Qingcong Yuan, Department of Statistics, Miami University

Title: A New Class of Measures for Testing Independence with Its Applications in High Dimensional Data

Abstract:

A new class of measures for testing independence between two random vectors, based on characteristic functions, is proposed. By choosing a particular weight function in the class, we study a new independence index and its properties. Sample versions and their asymptotic properties under different estimation approaches are developed. We demonstrate the advantage of our methods via simulations and real data analysis. In particular, we develop a two-stage sufficient variable screening method, which works especially well for categorical responses. Simulation examples and real data analysis are provided to illustrate the effective use of our method in high dimensional data analysis.
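For readers unfamiliar with independence measures built from characteristic functions, the sketch below computes the sample distance correlation, a well-known measure of this general type. It is swapped in purely for illustration and is not the new index proposed in the talk. The data are synthetic, the inputs are scalar rather than vector-valued, and the function names are hypothetical.

```python
import math
import random


def centered_distances(values):
    """Double-centered matrix of pairwise absolute distances."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # equals column means by symmetry
    grand = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(xs, ys):
    """Sample distance correlation between two scalar samples."""
    a = centered_distances(xs)
    b = centered_distances(ys)
    n = len(xs)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = max(mean_product(a, b), 0.0)  # guard against rounding below zero
    dvar = mean_product(a, a) * mean_product(b, b)
    return math.sqrt(dcov2 / math.sqrt(dvar)) if dvar > 0 else 0.0


rng = random.Random(7)
x = [rng.uniform(-1, 1) for _ in range(150)]
y_dependent = [v * v for v in x]                       # nonlinear function of x
y_independent = [rng.uniform(-1, 1) for _ in range(150)]

dep = distance_correlation(x, y_dependent)
ind = distance_correlation(x, y_independent)
print(f"dependent: {dep:.3f}, independent: {ind:.3f}")
```

The dependent pair here is constructed so that an ordinary linear correlation would be close to zero, which is the kind of dependence such measures are designed to detect.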

About the speaker: Dr. Yuan received her PhD in Statistics from the University of Kentucky in 2017. Her research focuses on informational measures to reflect the dependency between two random vectors, sufficient variable selection, and sufficient dimension reduction for high dimensional data.