Seminars
Throughout the year, the Purdue University Fort Wayne LDS Team delivers seminars on topics of interest to students, researchers, and local and global communities. Seminars are regularly held on Wednesdays from 1:30pm to 2:30pm.
Title: The Laboratory of Data Science: Visiting Programs, Academic Programs, and Community Programs
Date: Tuesday 3rd October 2023 @ 1pm
Alessandro Maria Selvitella
Department of Mathematical Sciences
Purdue University Fort Wayne
Title: Introduction to Business Intelligence
Date: Thursday 19th October 2023 @ 2pm
Adolfo Coronado
Department of Computer Science
Purdue University Fort Wayne
Title: Thresholds and dynamical changes indicated in historical food web data from hydrologically modified rivers
Date: Tuesday 31st October 2023 @ 1pm
Room: Kettler Hall - KT 226
Jeff Anderson
Department of Mathematical Sciences
Purdue University Fort Wayne
Abstract. Constructed to maintain navigable channel depths, extensive systems of locks and dams have essentially converted the upper Mississippi River (UMR) and lower Ohio River (LOR) into sequences of managed reservoirs. Amidst resulting significant changes to both rivers in the past 100 years, LOR remains mostly a constricted main channel, with few islands, backwaters, and floodplain lakes, while UMR retains such complexity with fewer side channels. Studies of stable carbon isotopes from museum samples provide insights on possible impacts to the balance of food web reliance on bottom (benthic) and water column (pelagic) resources. A previously introduced model on alternate states of shallow lakes, from clear and weedy to turbid and weedless, is adapted for pre- and post-dam data on stable carbon isotopes from fish, snails, and mussels versus stage height recorded at the downstream end of both study areas. Analysis of equilibria and corresponding dynamics suggests decreases in variability, stability, and thresholds of potential ecological importance. While the numerical work has been conducted through a time of dramatic increases in computing power, these results have been consistent through successive model updates.
Title: An Effective Meaningful Way to Evaluate Survival Models
Date: Tuesday 20th February 2024 @ 2pm EST
Shi-ang Qi
Department of Computing Science
University of Alberta
Abstract. One straightforward metric to evaluate a survival prediction model is based on the Mean Absolute Error (MAE): the average of the absolute difference between the time predicted by the model and the true event time, over all subjects. Unfortunately, this is challenging because, in practice, the test set includes (right) censored individuals, meaning we do not know when a censored individual actually experienced the event. In this paper, we explore various metrics to estimate MAE for survival datasets that include (many) censored individuals. Moreover, we introduce a novel and effective approach for generating realistic semi-synthetic survival datasets to facilitate the evaluation of metrics. Our findings, based on the analysis of the semi-synthetic datasets, reveal that our proposed metric (MAE using pseudo-observations) is able to rank models accurately based on their performance, and often closely matches the true MAE; in particular, it is better than several alternative methods.
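To make the censoring problem in this abstract concrete, here is a minimal Python sketch of two simple ways to compute MAE on survival data. It is an illustration only and is not the pseudo-observation metric proposed in the talk: the function names, the toy data, and the choice of the "uncensored-only" and "hinge" variants are all assumptions made for this example.

```python
from typing import Sequence


def mae_uncensored(predicted: Sequence[float],
                   observed: Sequence[float],
                   event: Sequence[bool]) -> float:
    """MAE over subjects with a known event time; censored subjects are dropped."""
    errors = [abs(p - t) for p, t, e in zip(predicted, observed, event) if e]
    if not errors:
        raise ValueError("no uncensored subjects to score")
    return sum(errors) / len(errors)


def mae_hinge(predicted: Sequence[float],
              observed: Sequence[float],
              event: Sequence[bool]) -> float:
    """MAE where a censored subject is penalized only when the predicted
    time falls before the censoring time (the event is known to be later)."""
    errors = []
    for p, t, e in zip(predicted, observed, event):
        if e:
            errors.append(abs(p - t))        # true event time is known
        else:
            errors.append(max(t - p, 0.0))   # t is only a lower bound
    if not errors:
        raise ValueError("no subjects to score")
    return sum(errors) / len(errors)


# Hypothetical toy data: observed is the event time when event is True,
# and the censoring time otherwise.
predicted = [5.0, 8.0, 3.0, 10.0]
observed = [6.0, 7.0, 9.0, 4.0]
event = [True, True, False, False]

print(mae_uncensored(predicted, observed, event))
print(mae_hinge(predicted, observed, event))
```

Dropping censored subjects discards information, while the hinge variant can only penalize predictions that are too early, so both are likely to misstate the true MAE when many subjects are censored. That gap is the motivation the abstract gives for exploring better estimates.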
Title: Predicting Individual Survival Distributions using ECG: A Deep Learning Approach utilizing Features Extracted by a Learned Diagnostic Model
Date: Tuesday 12th March 2024 @ 2pm EDT
Weijie Sun
Department of Computing Science
University of Alberta
Abstract. In the field of healthcare, individual survival prediction is important for personalized treatment planning. This study presents machine learning algorithms for predicting Individual Survival Distributions (ISD) using electrocardiography (ECG) data in two different formats. The models, which predict time until death, are developed and evaluated on a large, population-based cohort from Alberta, Canada. Our results demonstrate that models trained on raw ECG waveforms significantly outperform those trained on traditional ECG measurements in several metrics, including concordance index, hinge L1 loss, margin L1 loss, and margin truncated L1 loss. Additionally, the integration of predicted probabilities from wide-range diagnostic tasks not only enhances our ISD models' performance but also makes them significantly superior to other models across all evaluation metrics in individual survival prediction tasks. This innovative approach highlights the potential to leverage insights from diagnostic models for prognostic tasks, such as individual survival prediction. These findings could have far-reaching implications for the development of personalized treatment plans and open new avenues for future research in survival prediction using ECGs.
Title: On the Speed and Memory Scalability of Spectral Clustering
Date: Tuesday 12th March 2024 @ 3:30pm EDT
Room: Kettler Hall - KT G52
Gabriel Chen
Department of Mathematics and Statistics
Hope College
Abstract. Spectral clustering has emerged as a very effective clustering approach; however, it is computationally expensive when applied to large data sets. As a result, there has been considerable effort in the machine learning community to develop fast, approximate spectral clustering algorithms that are scalable in time and/or memory. Notably, most of those methods use a small set of landmark points selected from the given data. In this talk we present two landmark-based scalable spectral clustering algorithms that are developed based on novel document-term and bipartite graph models. We demonstrate their superior performance on some benchmark data sets. Finally, we also mention some recent work in the setting of massive data sets which cannot be fully loaded into computer memory.
Title: Policing Mental Illness on Social Media: A case study in data science ethics
Date: Wednesday 13th March 2024 @ noon EDT
Room: Kettler Hall - KT 241
Nina Atanasova
Department of Philosophy and Religious Studies
Cleveland State University
Title: Building and Exploring the Human Reference Atlas with Virtual Reality
Date: Wednesday 10th April 2024 @ noon EDT
Room: Kettler Hall - KT 247 & WebEx
Andreas "Andi" Bueckle
Cyberinfrastructure for Network Science Center & Luddy School of Informatics, Computing, and Engineering
Indiana University Bloomington
Abstract. The Human Reference Atlas (HRA, https://humanatlas.io), funded by the NIH Human Biomolecular Atlas Program (HuBMAP, https://commonfund.nih.gov/hubmap) and other projects, engages 17 international consortia to create a spatial reference of the healthy adult human body at single-cell resolution. The specimen, biological structure, and spatial data that define the HRA are disparate in nature and benefit from a visually explicit method of data integration. Virtual reality (VR) offers unique means to enable users to explore complex data structures in a three-dimensional (3D) immersive environment. On a 2D desktop application, the 3D spatiality and real-world size of the 3D reference organs of the atlas are hard to understand. If viewed in VR, the spatiality of the organs and tissue blocks mapped to the HRA can be explored in their true size and in a way that goes beyond traditional 2D user interfaces. Added 2D and 3D visualizations can then provide data-rich context. In this paper, we present the HRA Organ Gallery, a VR application to explore the atlas in an integrated VR environment. Presently, the HRA Organ Gallery features 65 3D reference organs, 729 published and mapped tissue blocks from 307 demographically diverse donors and 19 providers that link to 6,000+ datasets; it also features prototype visualizations of cell type distributions and 3D protein structures. We present two HRA User Stories (Improving Cell Type Annotations and Predicting Spatial Origin of Tissue), and we outline our plans to support two biological use cases with the HRA Organ Gallery: on-ramping novice and expert users to HuBMAP data available via the Data Portal (https://portal.hubmapconsortium.org), and quality assurance/quality control (QA/QC) for HRA data providers. A third use case for telling Embedded Data Stories is in preparation. We welcome expert feedback at this presentation. This abstract is adapted from the publication at https://doi.org/10.3389/fbinf.2023.1162723.
Title: How "Data Science, AI, and Digital Twins" could help us predict the future?
Date: Thursday 25th April 2024 @ 7pm EDT
Room: Neff Hall - Room 101
Guang Lin
Department of Mathematics
Purdue University
Abstract. In this talk, Dr. Lin will discuss examples of data science, AI, and digital twins and their applications in life science and engineering.
Title: TBA
Date: TBA
Reza Nemati
Department of Computer and Data Sciences
Case Western Reserve University
Title: TBA
Date: TBA
Bernd Buldt
Department of Mathematical Sciences
Purdue University Fort Wayne
Title: TBA
Date: TBA
Todor Cooklev
Department of Electrical and Computer Engineering
Purdue University Fort Wayne
Title: TBA
Date: TBA
Carl Drummond
Department of Physics
Purdue University Fort Wayne
Title: TBA
Date: TBA
Derek Brown
Department of Mathematical Sciences
Purdue University Fort Wayne
Title: TBA
Date: TBA
Jemila Hamid
Department of Mathematics and Statistics
University of Ottawa
Coming Soon!