Seminars

The Purdue University Fort Wayne LDS Team will deliver seminars throughout the year on topics of interest to students, researchers, and local and global communities. Seminars are regularly held on Wednesdays from 1:30pm to 2:30pm.

Title: The Laboratory of Data Science: Visiting Programs, Academic Programs, and Community Programs

Date: Tuesday 3rd October 2023 @ 1pm

Alessandro Maria Selvitella

Department of Mathematical Sciences

Purdue University Fort Wayne


Title: Introduction to Business Intelligence

Date: Thursday 19th October 2023 @ 2pm

Adolfo Coronado

Department of Computer Science

Purdue University Fort Wayne


Title: Thresholds and dynamical changes indicated in historical food web data from hydrologically modified rivers

Date: Tuesday 31st October 2023 @ 1pm - KT 226

Jeff Anderson

Department of Mathematical Sciences

Purdue University Fort Wayne

Abstract. Constructed to maintain navigable channel depths, extensive systems of locks and dams have essentially converted the upper Mississippi River (UMR) and lower Ohio River (LOR) into sequences of managed reservoirs. Amidst resulting significant changes to both rivers in the past 100 years, LOR remains mostly a constricted main channel, with few islands, backwaters, and floodplain lakes, while UMR retains such complexity with fewer side channels. Studies of stable carbon isotopes from museum samples provide insights on possible impacts to the balance of food web reliance on bottom (benthic) and water column (pelagic) resources. A previously introduced model on alternate states of shallow lakes, from clear and weedy to turbid and weedless, is adapted for pre- and post-dam data on stable carbon isotopes from fish, snails, and mussels versus stage height recorded at the downstream end of both study areas. Analysis of equilibria and corresponding dynamics suggests decreases in variability, stability, and thresholds of potential ecological importance. While the numerical work has been conducted through a time of dramatic increases in computing power, these results have been consistent through successive model updates.

Title: An Effective Meaningful Way to Evaluate Survival Models

Date: Tuesday 20th February 2024 @ 2pm EST

Shi-ang Qi

Department of Computing Science

University of Alberta

Abstract. One straightforward metric to evaluate a survival prediction model is based on the Mean Absolute Error (MAE) – the average of the absolute difference between the time predicted by the model and the true event time, over all subjects. Unfortunately, this is challenging because, in practice, the test set includes (right) censored individuals, meaning we do not know when a censored individual actually experienced the event. In this paper, we explore various metrics to estimate MAE for survival datasets that include (many) censored individuals. Moreover, we introduce a novel and effective approach for generating realistic semi-synthetic survival datasets to facilitate the evaluation of metrics. Our findings, based on the analysis of the semi-synthetic datasets, reveal that our proposed metric (MAE using pseudo-observations) is able to rank models accurately based on their performance, and often closely matches the true MAE – in particular, it is better than several alternative methods.
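To make the censoring problem described in this abstract concrete, the following is a minimal Python sketch of two simple ways to estimate MAE when some subjects are right-censored. It is not the pseudo-observation metric proposed by the speaker. The function names and the toy data are assumptions made for illustration only.

```python
import numpy as np

def mae_uncensored(pred, time, event):
    """MAE over subjects whose event was observed; censored subjects are dropped."""
    pred = np.asarray(pred, dtype=float)
    time = np.asarray(time, dtype=float)
    observed = np.asarray(event).astype(bool)
    return float(np.mean(np.abs(pred[observed] - time[observed])))

def mae_hinge(pred, time, event):
    """MAE with a hinge rule for censored subjects.

    For a censored subject we only know the event happened at or after the
    censoring time, so an error is counted only when the prediction is
    earlier than that censoring time.
    """
    pred = np.asarray(pred, dtype=float)
    time = np.asarray(time, dtype=float)
    observed = np.asarray(event).astype(bool)
    err = np.abs(pred - time)
    err[~observed] = np.maximum(time[~observed] - pred[~observed], 0.0)
    return float(np.mean(err))

# Toy data (hypothetical): event = 1 means observed, 0 means right-censored.
pred = [5, 8, 12, 3]
time = [6, 10, 9, 7]
event = [1, 1, 0, 0]

print(mae_uncensored(pred, time, event))  # 1.5
print(mae_hinge(pred, time, event))       # 1.75
```

Dropping censored subjects discards information and can bias the estimate, while the hinge rule never penalizes a prediction that is later than the censoring time. Both are rough stand-ins for the true MAE, which is the gap the abstract addresses.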

Title: Predicting Individual Survival Distributions using ECG: A Deep Learning Approach utilizing Features Extracted by a Learned Diagnostic Model

Date: Tuesday 12th March 2024 @ 2pm EDT

Weijie Sun

Department of Computing Science

University of Alberta

Abstract. In the field of healthcare, individual survival prediction is important for personalized treatment planning. This study presents machine learning algorithms for predicting Individual Survival Distributions (ISD) using electrocardiography (ECG) data in two different formats. The models, which predict time until death, are developed and evaluated on a large, population-based cohort from Alberta, Canada. Our results demonstrate that models trained on raw ECG waveforms significantly outperform those trained on traditional ECG measurements in several metrics, including concordance index, hinge L1 loss, margin L1 loss, and margin truncated L1 loss. Additionally, the integration of predicted probabilities from wide-range diagnostic tasks not only enhances our ISD models' performance but also makes them significantly superior to other models across all evaluation metrics in individual survival prediction tasks. This innovative approach highlights the potential to leverage insights from diagnostic models for prognostic tasks, such as individual survival prediction. These findings could have far-reaching implications for the development of personalized treatment plans and open new avenues for future research in survival prediction using ECGs.

Title: On the Speed and Memory Scalability of Spectral Clustering

Date: Tuesday 12th March 2024 @ 3:30pm EDT

Room: Kettler Hall - KT G52

Gabriel Chen

Department of Mathematics and Statistics

Hope College

Abstract. Spectral clustering has emerged as a very effective clustering approach; however, it is computationally expensive when applied to large data sets. As a result, there has been considerable effort in the machine learning community to develop fast, approximate spectral clustering algorithms that are scalable in time and/or memory. Notably, most of those methods use a small set of landmark points selected from the given data. In this talk we present two landmark-based scalable spectral clustering algorithms that are developed based on novel document-term and bipartite graph models. We demonstrate their superior performance on some benchmark data sets. Finally, we also mention some recent work in the setting of massive data sets which cannot be fully loaded into computer memory.
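The general landmark idea mentioned in this abstract can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the speaker's algorithms: landmarks are chosen uniformly at random, affinities use a Gaussian kernel with a hand-picked `sigma`, and the function name `landmark_spectral_embedding` is hypothetical. The point of the sketch is that only an n-by-m affinity matrix between the n data points and m landmarks is built, instead of the full n-by-n matrix.

```python
import numpy as np

def landmark_spectral_embedding(X, n_landmarks, n_clusters, sigma=1.0, seed=0):
    """Spectral embedding from a point-to-landmark (bipartite) affinity matrix."""
    rng = np.random.default_rng(seed)
    # Assumption: landmarks are a uniform random subset of the data.
    idx = rng.choice(len(X), size=n_landmarks, replace=False)
    landmarks = X[idx]

    # Gaussian affinity between every point and every landmark (n x m).
    sq_dist = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # Normalize by row and column degrees, as for a bipartite graph.
    d_rows = np.maximum(A.sum(axis=1), 1e-12)  # guard against isolated points
    d_cols = np.maximum(A.sum(axis=0), 1e-12)
    A_norm = A / np.sqrt(d_rows)[:, None] / np.sqrt(d_cols)[None, :]

    # The leading left singular vectors give one embedding row per data point.
    U, _, _ = np.linalg.svd(A_norm, full_matrices=False)
    emb = U[:, :n_clusters]
    norms = np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return emb / norms

# Toy data (hypothetical): two well-separated Gaussian blobs in the plane.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(10.0, 0.5, (100, 2))])
emb = landmark_spectral_embedding(X, n_landmarks=20, n_clusters=2)
print(emb.shape)  # (200, 2)
```

In a full pipeline the rows of `emb` would then be grouped with a standard method such as k-means. That step is omitted here to keep the sketch short.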

Title: Policing Mental Illness on Social Media: A case study in data science ethics

Date: Wednesday 13th March 2024 @ noon EDT

Room: Kettler Hall - KT 241

Nina Atanasova

Department of Philosophy and Religious Studies

Cleveland State University

Title: Building and Exploring the Human Reference Atlas with Virtual Reality

Date: Wednesday 10th April 2024 @ noon EDT

Room: Kettler Hall - KT 247 & WebEx

Andreas "Andi" Bueckle

Cyberinfrastructure for Network Science Center & Luddy School of Informatics, Computing, and Engineering

Indiana University Bloomington

Abstract. The Human Reference Atlas (HRA, https://humanatlas.io), funded by the NIH Human Biomolecular Atlas Program (HuBMAP, https://commonfund.nih.gov/hubmap) and other projects, engages 17 international consortia to create a spatial reference of the healthy adult human body at single-cell resolution. The specimen, biological structure, and spatial data that define the HRA are disparate in nature and benefit from a visually explicit method of data integration. Virtual reality (VR) offers unique means to enable users to explore complex data structures in a three-dimensional (3D) immersive environment. On a 2D desktop application, the 3D spatiality and real-world size of the 3D reference organs of the atlas are hard to understand. If viewed in VR, the spatiality of the organs and tissue blocks mapped to the HRA can be explored in their true size and in a way that goes beyond traditional 2D user interfaces. Added 2D and 3D visualizations can then provide data-rich context. In this paper, we present the HRA Organ Gallery, a VR application to explore the atlas in an integrated VR environment. Presently, the HRA Organ Gallery features 65 3D reference organs, 729 published and mapped tissue blocks from 307 demographically diverse donors and 19 providers that link to 6,000+ datasets; it also features prototype visualizations of cell type distributions and 3D protein structures. We present two HRA User Stories (Improving Cell Type Annotations and Predicting Spatial Origin of Tissue), and we outline our plans to support two biological use cases with the HRA Organ Gallery: on-ramping novice and expert users to HuBMAP data available via the Data Portal (https://portal.hubmapconsortium.org), and quality assurance/quality control (QA/QC) for HRA data providers. A third use case for telling Embedded Data Stories is in preparation. We welcome expert feedback at this presentation. This abstract is adapted from the publication at https://doi.org/10.3389/fbinf.2023.1162723.


Title: How "Data Science, AI, and Digital Twins" could help us predict the future?

Date: Thursday 25th April 2024 @ 7pm EDT

Room: Neff Hall - Room 101

Guang Lin

Department of Mathematics

Purdue University

Abstract. During this talk, Dr. Lin will discuss examples of data science, AI, and digital twins, and their applications in life science and engineering.

Title: TBA

Date: TBA

Reza Nemati

Department of Computer and Data Sciences

Case Western Reserve University

Title: TBA

Date: TBA

Bernd Buldt

Department of Mathematical Sciences

Purdue University Fort Wayne


Title: TBA

Date: TBA

Todor Cooklev

Department of Electrical and Computer Engineering

Purdue University Fort Wayne


Title: TBA

Date: TBA

Carl Drummond

Department of Physics

Purdue University Fort Wayne


Title: TBA

Date: TBA

Derek Brown

Department of Mathematical Sciences

Purdue University Fort Wayne


Title: TBA

Date: TBA

Jemila Hamid

Department of Mathematics and Statistics

University of Ottawa

Coming Soon!