Math & Stat Student Research Presentation Day

Image Source: Direct Numerical Simulation of Shear Flow Turbulence

History: The Math & Stat Department will host Student Research Presentation Day in May 2025 to showcase and celebrate the research of its undergraduate and graduate students. Going forward, the event will be held toward the end of each spring semester. All students are welcome to attend.

Day: Saturday, May 10, 2025

Time: 9:00am-3:00pm

Location: Hunter West Faculty Dining Room, 8th Floor

(Accessible from 7th Floor, stairs near escalator)

Conference Schedule

8:30-9:00am Opening Remarks (Breakfast)

9:00-10:00am Session 1

9:00-9:20 Austin Ho

9:20-9:40 Randy Lu

9:40-10:00 Connor Song

10:00-10:30am Break

10:30-11:30am Session 2

10:30-10:50 Michael Pallante

10:50-11:10 Jonathan Jaimangal

11:10-11:30 Gabe Levine

11:30am-1:00pm Poster Presentations (Lunch)

Proma Ahmed (Topological Hydrodynamics)

Wenxuan Yu (Analog Forecasting Methods for Dynamical Systems)

Louis Pearson (Stability Theorems in Bayesian Estimation)

Danielle Enterman (Spatial-Temporal Statistics of NYC Pollution)

1:00-1:30pm Plenary Talk (by Thomas Mathew)

1:30-2:30pm Session 3

1:30-1:50 Luca Benga

1:50-2:10 Ross Lauterbach

2:10-2:30 Ryan Vaz

2:30-2:45pm Closing Remarks

Plenary Talk

Thomas Mathew (Tech Lead, Cisco Systems)

Title: Mixture Models and Network Threats

Abstract: Security teams managing large computer networks can face upwards of hundreds of security alerts in a day. The volume of alerts leads to alert fatigue and an inability to properly triage them. Security teams would like a system that helps prioritize and rank the importance of alerts. An algorithm that segments alerts based on behavioral characteristics provides the starting point to solve the security team's dilemma. Today's talk presents one such method to segment network traffic via a multinomial mixture model. We'll go over the modelling choices that led to selecting a mixture model for the data, as well as an analysis of the parameter estimation required to build the model. The multinomial mixture model readily captures the count-based nature of the alerts and identifies segments that represent behavioral patterns of interest. We identify six behavioral patterns that are present in all large corporate networks and represent different types of network threats (botnets, ransomware, etc.).

Session 1

Austin Ho (Pure Math MA, Graduating Spring 2026)

Title: “New” Shell Models for Turbulence

Abstract: The Navier-Stokes equation (NSE) describes the motion of viscous, incompressible fluids. In this talk, we will discuss a class of toy models, called "shell models", that aim to reduce the complexity of the NSE. Specifically, shell models truncate the Fourier-transformed version of the NSE while still preserving some of its physical properties. We present two new shell models and the methodology by which they were obtained. We will also probe their structural properties and compare the new models against the existing variants.

Randy Lu (Math/Stat & Computer Science BA/MA, Applied Math, Graduating Spring 2025)

Title: Generalizing a Dyadic Euler Model

Abstract: The Euler equations are partial differential equations modeling the motion of incompressible, inviscid fluids, including their turbulence. Because they are difficult to study, 'simpler' models have been devised; one such class of models is an infinite system of ordinary differential equations called a dyadic model. We investigate one example that preserves some important properties of the Euler equations, such as non-linearity and energy conservation, and show one way to generalize it while maintaining some of its key properties.

Yuang Connor Song (Statistics MA, Alumni Winter 2021)

Title: A Speed-based Estimator of Signal-to-Noise Ratios

Abstract: We present an innovative method to measure the signal-to-noise ratio (SNR) in a Brownian motion model, that is, the ratio of the mean to the standard deviation of the Brownian motion. Our method is based on method-of-moments estimation of the drawdown and drawup speeds in a Brownian motion model, where the drawdown process is defined as the current drop of the process from its running maximum and the drawup process is the current rise of the process above its running minimum. The speed of a drawdown of K units (or a drawup of K units) is then the time between the last maximum (or minimum) of the process and the time the drawdown (or drawup) process hits the threshold K. Our estimator only requires the record values of the process and the times at which deviations from the record values exceed a certain threshold, whereas the uniformly minimum-variance unbiased estimator (UMVUE) requires the entire path of the process. We derive the asymptotic distributions of both estimators and compare them.
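The drawdown and drawup definitions in this abstract can be illustrated on a discretely sampled path. The following is only a minimal sketch under stated assumptions: the function names `drawdown_speed` and `drawup_speed`, the discrete sampling, and the example numbers are all hypothetical, and the code does not implement the speed-based SNR estimator from the talk.

```python
def drawdown_speed(times, values, k):
    """Return (time of last maximum, hitting time, elapsed time) for the
    first time the drawdown (running maximum minus current value) reaches k.
    Returns None if the threshold is never reached on the sampled path."""
    running_max = values[0]
    t_max = times[0]
    for t, x in zip(times, values):
        # ">=" keeps the latest time at which the running maximum was attained
        if x >= running_max:
            running_max, t_max = x, t
        if running_max - x >= k:
            return t_max, t, t - t_max
    return None


def drawup_speed(times, values, k):
    """The drawup of a path equals the drawdown of the negated path."""
    return drawdown_speed(times, [-x for x in values], k)


# Hypothetical sample path observed at integer times
times = [0, 1, 2, 3, 4, 5, 6]
values = [0.0, 1.0, 3.0, 2.0, 3.0, 1.0, 0.5]
print(drawdown_speed(times, values, 2.0))  # (4, 5, 1)
print(drawup_speed(times, values, 2.0))    # (0, 2, 2)
```

Note that only the record values and the threshold-crossing times enter the computation, which matches the abstract's remark that the estimator does not need the entire path.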

Session 2

Michael Pallante (Pure Math MA, Graduating Spring 2025)

Title: Parameter Recovery in Linear Dynamical Systems

Abstract: This is a presentation of part of a thesis on parameter identifiability and recovery in linear dynamical systems. Here we present the chapter which analyzes a nudging algorithm that recovers all n-squared parameters of an n-dimensional homogeneous system using continuous observations of the system state along multiple trajectories. We prove convergence of the algorithm, and in the case n=2 we provide numerical simulations that support our results.

Jonathan Jaimangal (Applied Math MA, Graduating Spring 2025)

Title: Discrete Prolate Spheroidal Sequences

Abstract: In many real-world settings (such as communications, radar, and audio processing) we often seek signals that are well-concentrated in both time and frequency. While the Uncertainty Principle prevents perfect localization, Discrete Prolate Spheroidal Sequences (DPSS) provide the best possible trade-off. This talk introduces DPSS through the Discrete Fourier Transform and the construction of the Prolate Matrix, formed by combining time-limiting and band-limiting operations. The eigenvectors of this matrix (DPSS) achieve near-optimal concentration and have broad applications in real-world problems. We explore the mathematical structure behind these sequences, the role of the time-bandwidth product, and how eigenvalue behavior remains stable across dimensions. This work was carried out under the supervision of Professor Azita Mayeli (The Graduate Center and QCC), in collaboration with Luis Gomez-Reyes and Proma Tanjim Ahmed (Hunter College).
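As a rough illustration of the eigenvector idea in this abstract, the sketch below builds the classical sinc-kernel prolate matrix and approximates its leading eigenvector by power iteration. This is an assumption-laden sketch: the talk constructs the Prolate Matrix through the Discrete Fourier Transform, which may give a different matrix from the classical form used here, and the names `prolate_matrix` and `leading_dpss` and the parameters `n` (sequence length) and `w` (half-bandwidth in cycles per sample) are hypothetical.

```python
import math


def prolate_matrix(n, w):
    """Classical n-by-n prolate matrix with entries
    sin(2*pi*w*(i-j)) / (pi*(i-j)) and 2*w on the diagonal."""
    def entry(k):
        return 2 * w if k == 0 else math.sin(2 * math.pi * w * k) / (math.pi * k)
    return [[entry(i - j) for j in range(n)] for i in range(n)]


def leading_dpss(n, w, iters=200):
    """Approximate the dominant eigenpair of the prolate matrix
    by power iteration, starting from the constant vector."""
    b = prolate_matrix(n, w)
    v = [1.0] * n
    for _ in range(iters):
        u = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient of a unit vector
    return lam, v


# Time-bandwidth product n * w = 1 (illustrative choice)
lam, v = leading_dpss(16, 1 / 16)
print(round(lam, 4))
```

The eigenvalue `lam` measures the fraction of the sequence's energy that lies inside the band [-w, w], so a value close to 1 indicates strong concentration. A small time-bandwidth product is used here because power iteration converges slowly when several eigenvalues cluster near 1; a dedicated eigenvalue solver would be the more practical choice in general.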

Gabe Levine (Applied Math MA, Graduating Spring 2025)

Title: Ensemble Machine Learning Models for Singing Voice Deepfake Detection

Abstract: Tools to create music with generated vocals (e.g. prompt-based music generation, vocal swapping) have greatly improved in both quality and ease of use, leading to an explosion of music with synthetic vocals. While there is robust research on detecting synthetic speech and vocal spoofing, identifying synthetic singing voices presents a unique set of challenges and is a growing area of research. In this presentation, we evaluate existing synthetic speech detection models on a new generated vocal dataset and also propose a Machine Learning ensemble approach as a solution.

Session 3

Luca Benga (Hunter High School, Class of 2025)

Title: Mathematical models for speed climbing applied to data collected on competitors in recent World Cup events

Abstract: Speed climbing is one of the newest Olympic sports, debuting at the 2020 Tokyo Olympics. With many races decided by hundredths of a second, speed climbing quickly gained recognition as the fastest sport at the Paris 2024 Olympics. Speed climbing appeals to data scientists since it uses a standardized 15-meter wall, making it easy to compare times and strategies across a vast array of competitions and competitors. Surprisingly, however, to the best of our knowledge, there has been little rigorous analysis of a professional-level race. In this paper, we model data compiled from the 2023 World Cup events in Wujiang, China and Salt Lake City, USA, analyzing both numerical and categorical variables. Examples of quantitative variables include the reaction time displayed in the video for each athlete, along with the total time, or split times, obtained by running the recording for each athlete frame by frame and estimating the exact point at which each section is reached. An example of a binary variable is the skip strategy, which draws attention to the holds each athlete omits on their run. Another categorical variable is the round designation, either Round 1 or Round 2, which refers to the order of athletes' runs. We explored these variables extensively, built several general linear models for athlete performance and used model selection to determine the best predictive models. We found that reaction times are normally distributed and appear to be very weakly correlated from one race to another. Counter-intuitively, however, they appear to have minimal bearing on the race result, despite making up a portion of the overall time. Another interesting observation is that many athletes attempt a more aggressive skip strategy in their second run, omitting a greater number of holds. This is either because they already recorded a viable time for qualification in Round 1 and can afford the risk, or because they felt the need for substantial improvement. In ongoing work, we have been focusing on expanding the analysis, using data from additional World Cup events for both men and women.

Ross Lauterbach (Statistics MA, Graduating Spring 2025)

Title: Quantifying and Comparing NBA Player Career Momentum Using Statistical Methods

Abstract: Momentum is one of the most widely referenced yet poorly defined concepts in sports. In the NBA, commentators and fans routinely describe players as “heating up” or “catching fire,” often attributing shifts in performance to an intangible momentum factor. Despite its prominence in narrative and analysis, momentum is a measure that has been hard to verify empirically. This paper introduces a statistical approach to capture player momentum over the course of an NBA career using smoothed performance trajectories. By constructing game-by-game momentum data and powerful visualizations, we aim to identify sustained periods of elevated or diminished performance and quantify the uncertainty around them. We also take a deep dive into methods of calculation and modeling using momentum.

Ryan Vaz (Applied Math MA, Graduating Spring 2026)

Title: Optimizing the Analysis of Missing Data in Medical Datasets with a Temporal Dependence: A Topological Data Analysis Approach

Abstract: Incomplete and missing data pose significant challenges for developing reliable machine learning models, particularly in temporal contexts. This paper examines the complexities of missing data mechanisms and their impact on predictive modeling accuracy. Traditional imputation methods, while commonly used, often introduce biases that can compromise model performance, especially in non-stationary datasets characterized by trends and seasonality. We present a novel methodology that synergistically integrates Topological Data Analysis (TDA) with machine learning and connectionist algorithms to improve the management of missing data, free from the constraints of stationarity assumptions. This approach is aimed at significantly enhancing the performance of predictive analytics.
