Talk Abstracts

Opening Keynote Talk: Data Science: Endless Possibilities

Dr. Hoda Eldardiry (Virginia Tech)

There is an abundance of data available in various application domains. Over the past two decades, I have seen how learning from data can provide useful insights across a broad range of applications. In this talk, I will share some examples. I will also discuss, in simple terms, how one can start from a real-world problem and formulate it as a data science problem. This includes exploring "what-is-needed" and "what-is-possible" throughout the process.

Closing Keynote Talk: Accelerating Medical Research with Real-World Data and Scientific Insight

Dr. Rebecca Hubbard (University of Pennsylvania)

The availability of vast databases of health information derived from real-world data (RWD), such as electronic health records (EHR), has generated enormous enthusiasm for conducting research with RWD. As illustrated during the current pandemic, RWD have the potential to provide timely answers to urgent scientific questions. However, RWD have important limitations, including the lack of gold-standard outcomes for many conditions and heterogeneity in data quality across patients, which can result in erroneous findings if not handled carefully. In this presentation, we consider both the potential and the challenges of conducting research using RWD, focusing on the motivating example of early-life risk factors for pediatric type 2 diabetes. Past studies investigating early-life risk factors for type 2 diabetes have used longitudinal cohort data painstakingly collected over the course of decades. In contrast, EHR data from routine clinical care can provide longitudinal information on early-life risk factors as well as subsequent health outcomes for large cohorts of children without requiring any additional data collection. In this study, we introduce a Bayesian joint phenotyping and BMI trajectory model to address data quality challenges in an EHR-based study of early-life BMI and type 2 diabetes in adolescence. We demonstrate that RWD coupled with modern methodologic approaches can improve the efficiency and timeliness of studies of childhood exposures and rare health outcomes. The overarching objective of this presentation is to illustrate the use of RWD to produce timely and reliable evidence, guided by knowledge of the scientific context and rigorous methodology.

Tutorial Talk: ADEPT: A Framework to Explain Complex Data Science Concepts to Non-Data Scientists

Dr. Jennifer Van Mullekom (Virginia Tech, Director of SAIG)

Are you ADEPT at explaining complex data science concepts to non-data scientists? If you want to improve your communication skills, then this is the workshop for you. Last week I was in a meeting where Virginia Tech data scientists were presenting results to a company that is very early in its data science journey. The comment from one of the project leaders was “Well, we barely understand some of the concepts you are discussing, so there is no way that our lower-level operations team will get this, especially since English is their second language.” This comment illustrates the importance of communication skills in team-based data science. The end users of your modeling and analysis have to understand how the information you provide them was derived and how it should and shouldn’t be used. You will have ongoing team dialogues throughout projects that rely on some level of shared understanding of data science concepts.

But aren’t great communicators “just born with it”? Certainly, some people are naturally talented at communication, but everyone can learn tips and techniques to improve their communication skills. ADEPT is one such framework. ADEPT stands for analogy, diagram, example, plain English, and technical definition. Data scientists excel at the technical definition element of ADEPT but often have little understanding of how to develop the other four components. This workshop includes an introduction to the framework, examples, and a group activity where you get to design your own ADEPT framework for the data science concept of your choice.

Tutorial Talk: Introduction to R for Data Science

Dr. Leanna House (Virginia Tech)

Dr. House will present foundations for coding in R via RStudio. At a pace that is accessible to the audience, the tutorial will start with how to open, navigate, save, and close R. It will then continue, as time permits, with the remaining concepts: creating objects in R and observing both the differences among object classes and their importance; importing data; subsetting data; identifying, replacing, and/or removing missing data values; summarizing data quantitatively; and creating basic graphs, such as bar graphs, scatterplots, and parallel plots.


Poster session topics by Virginia Tech students:

(Check back; these topics will be added.)