The Schedule

"When something is important enough, you do it even if the odds are not in your favor."

~ Elon Musk, Engineer

Day 1

November 30th

13:30 - 14:30

Veridical data science: the practice of responsible data analysis and decision making

Veridical data science aims at responsible, reliable, reproducible, and transparent data analysis and decision-making. Predictability, computability, and stability (PCS) are three core principles of veridical data science. They embed the scientific principles of prediction and replication in data-driven decision making while recognizing the central role of computation. Based on these principles, the PCS framework consists of a workflow and documentation (in R Markdown or Jupyter Notebook) for the entire data science life cycle, from problem formulation, data collection, and data cleaning to modeling, interpretation of results, and conclusions.

Employing the PCS framework in causal inference and analyzing data from the VIGOR clinical trial, we developed staDISC for stable discovery of interpretable subgroups via calibration for precision medicine. The subgroups discovered by staDISC using the VIGOR data are validated to a good extent with the APPROVe study.

18:00 - 19:00

Dimension Reduction and Variable Selection for High-Dimensional Multivariate Linear Regression

Narayanaswamy Balakrishnan

This talk will consist of two parts. In the first part, I will discuss reduced rank regression with matrix projections for high-dimensional multivariate linear regression, and present some technical results, a simulation study, and a case study illustrating the results and methods. In the second part, I will discuss envelope-based reduced rank regression for high-dimensional multivariate linear regression, present the corresponding results, and make some comparisons with the first part.

Short Talks and Posters

Day 2

December 1st

18:00 - 19:00

Vision Data Science

Victor Patrangenaru

At each point in time we live in a 3D space, so data records of natural scenes should be 3D, yet in practice data are stored as 1D or 2D images. Examples include satellite pictures (which show that we are living in a thin layer of air), first- and second-generation DNA sequences (initially stored as images), and digital camera images. Emulating bilateral colored human vision, machine vision is based on 3D projective shape retrieval of scenes from their RGB camera images. Once the 3D information is extracted, data may be represented on certain metric spaces that often have a smooth structure, or that of a stratified space, thus opening the formidable doors to the realm of geometric and algebraic topological data analysis of 3D scenes extracted from image data. A few basic examples of 3D machine vision analysis are presented here; this is joint work with Rob Paige (MST), Daniel Osborne (FAMU), Mingfei Qiu, Ruite Guo, K. David Yao, David Lester, Yifang Deng, Seunghee Choi and Michael Crane.

Short Talks and Posters

Day 3

December 2nd

18:00 - 19:00

Making Sense of Noisy Data: Why and How?

Grace Y. Yi

Thanks to advances in modern data-acquisition technology, massive data with diverse features and large volume are becoming more accessible than ever. The impact of big data is significant. While the abundant volume of data presents great opportunities for researchers to extract useful information for new knowledge gain and sensible decision making, big data also present great challenges. A very important, sometimes overlooked challenge is the quality and provenance of the data. Big data are not automatically useful; big data are often raw and involve considerable noise.

Typically, the challenges presented by noisy data with measurement error, missing observations, and high dimensionality are particularly intriguing. Noisy data with these features arise ubiquitously in various fields, including health sciences, epidemiological studies, environmental studies, survey research, economics, and so on. In this talk, I will discuss some issues induced by noisy data and how these features may challenge inferential procedures.

Short Talks and Posters

Day 4

December 3rd

18:00 - 19:30

Do you trust this computer?

Look for the YouTube link in Gather "Data" Town!

Short Talks and Posters

Day 5

December 4th

12:30 - 13:30

Panel Discussion

Andriy Bezuhlyy

Prof. Ronald Friedman

Jim Kunce

Robert E. Neher

Tammy Toscos

Prof. Yvonne Zubovic

Prof. Mark Daniel Ward

Short Talks and Posters

Contact us: aselvite@pfw.edu & klfoster@bsu.edu | 2101 E. Coliseum Blvd., Fort Wayne IN, USA 4680 | 260-481-6475