Organizers: Peter Craigmile, Bora Ferlengez, Vincent Martinez, Indranil SenGupta
Time: Thursday, 4:20-5:20pm EST
Location: Hunter East 1042 (and Zoom)
Zoom: link (Passcode: HMSC2025)
Current Schedule (Autumn 2025)
September 25 (Meet the Faculty) (Recording)
Be'eri Greenfeld, Hunter College (Department of Mathematics and Statistics)
Title: How Complicated Can Words Be?
Abstract: Consider an infinite sequence of digits, bits, or letters of the alphabet. Can we efficiently quantify how complicated it is? We will discuss asymptotic measures of complexity for infinite words and formal languages, a fundamental topic with deep connections to computer science, dynamical systems, number theory, algebra, and more. Time permitting, we will highlight recent research projects involving students and discuss potential directions for future work, with an open invitation for students to get involved.
October 2
(Yom Kippur)
October 9
Max Weinreich, Harvard University (Department of Mathematics)
Title: Chaos and structure in mathematical billiards
Abstract: Mathematical billiards is a mathematical model for anything that bounces: light, molecules, or the cue ball in the game of pool. This talk will explore how dynamicists quantify the intuitive notion of "chaos", or entropy, for billiard systems. Classical conjectures of Birkhoff and Ivrii predict that most billiards are chaotic. I will present my results towards these conjectures in the case of billiard tables bounded by algebraic curves, where the dynamics can even be studied over the complex numbers.
October 16 (Recording)
William Christensen, Brigham Young University (Department of Statistics)
Title: The Marginal Mahalanobis Method for Detecting Cellwise Outliers
Abstract: In many modern applications, there is a growing need to identify specific problematic entries within a dataset, referred to as cellwise outliers. These differ from the more commonly studied casewise outliers, which focus on identifying entire rows in the dataset as anomalous. While numerous statistical methods exist for detecting casewise outliers (also called anomaly detection or exception mining), relatively few methods address the challenge of pinpointing problematic values within individual observations. We propose a Mahalanobis distance-based chi-squared test statistic designed to detect cellwise outliers. Using Monte Carlo simulations, we evaluate the performance of our method against existing approaches across datasets generated from various multivariate distributions. Our results demonstrate that the proposed method is computationally efficient and often outperforms competing techniques in accurately identifying cellwise outliers under a wide range of conditions.
October 23 (Recording)
Jonathan Stanfill, Ohio State University (Department of Mathematics)
Title: Contour integrals and zeta-functions
Abstract: Zeta-functions often encode surprising information about the structure of number systems and even nature. Because of this, the study of zeta-functions is important in both mathematics and physics. The most famous zeta-function is the Riemann zeta-function, which appears to encode deep information about the prime numbers. However, zeta-functions can more generally be associated with sequences of complex numbers. This talk will address how contour integrals can be used to study zeta-functions and why this perspective is so powerful. This is based on recent joint work with Guglielmo Fucci (East Carolina University) and Mateusz Piorkowski (KTH Royal Institute of Technology).
October 30
Blanca Marmolejo, Bluetab, an IBM Company (Data Engineer)
Title: TBA
Abstract: TBA
November 6
Shuhan Tang, Novartis (Global Drug Development, Advanced Quantitative Sciences)
Title: From Design to Decision: The Biostatistician’s Role in Drug Development
Abstract: Biostatisticians play a pivotal role in the design, conduct, analysis, and interpretation of clinical trials. This presentation explores the multifaceted responsibilities of biostatisticians throughout the clinical development lifecycle. From protocol development and statistical analysis planning to data monitoring and regulatory submission, biostatisticians ensure scientific rigor, data integrity, and compliance with regulatory standards. Emphasis will be placed on the application of ICH E9(R1) principles, including the estimand framework and sensitivity analyses, as well as the integration of innovative trial designs and real-world evidence. Through case examples and best practices, this session highlights how biostatisticians contribute to evidence generation, decision-making, and the development of new therapies for patients worldwide.
November 13
Marc Scott, New York University (Department of Applied Statistics, Social Science, and Humanities)
Title: The use of history in life course studies: new ideas for an old problem
Abstract: The life course perspective considers the entire history of individuals as the primary unit of analysis. There is empirical evidence and socio-behavioural theory supporting the notion that history matters for later outcomes. The sequence analysis community of researchers has established robust methods for organising individual historical pathways into typologies that inform narratives of the life course. These are often linked to social position, providing deeper understanding of the constraints and variation in how lives evolve. Yet these "types" are also commonly viewed as rough proxies for something more subtle and potentially predictive of later life outcomes, such as health, income or labour force attachment. In this study, we approach the question of "what matters" in life course histories using two methods: clustering to produce typologies and a mathematical projection based on categorical functional data analysis (CFDA). We compare and contrast the performance of these methods on a known dataset to uncover strengths and limitations of each methodology, informing best practices and new methodological research.
November 20
Abhijit Champanerkar, College of Staten Island (Department of Mathematics)
Title: Graphs, growth and geometry
Abstract: We study the growth rate of the number of spanning trees of a sequence of planar graphs that diagrammatically converge to a planar lattice graph. A surprising fact about the spanning tree entropy for many planar lattice graphs is that its value is closely related to hyperbolic geometry. We conjecture sharp upper and lower bounds for the spanning tree entropy of any planar lattice graph. We explain the context and recent progress for our conjecture, which lies at the intersection of hyperbolic geometry, knot theory, number theory, probability, and graph theory.
November 27
(Thanksgiving)
December 4
David Newstein, Statistical Consultant (Freelance)
Title: A Comparison of Regression Models: An Example from Cardiovascular Disease Epidemiology
Abstract: Electron beam computed tomography (EBCT) has enabled the rapid imaging of the coronary arteries, revealing calcium deposits in these arteries. The amount of this coronary artery calcium (CAC) can be quantified numerically. The presence and amount of CAC deposits are highly correlated with the amount of atherosclerotic plaque present in the coronary arteries. In addition to having prognostic value for individual patients, these scans can be used for epidemiologic purposes, with the aim of identifying novel risk factors for coronary heart disease (CHD) and cardiovascular disease (CVD). In this case, multiple regression models have been utilized to remove the effects of confounders and thus identify “independent” risk factors for CHD and CVD. The regression models generally used have entailed log-transforming the calcium score and then fitting linear OLS regression models, with potential risk factors and covariates considered as independent variables. These models are also known as lognormal regression models. It is known that the standard deviation of the measurement error in the calcium score measured by EBCT is proportional to the true value of the calcium score, with the exception of small scores, mainly due to the partial volume effect of EBCT. In other words, as the true amount of CAC increases, the error in the EBCT-measured value of CAC increases in proportion to this true amount. Because of this, another type of regression model that also relies on a constant coefficient of variation assumption, the gamma log link model, may be more effective than the lognormal regression model in identifying novel risk factors for CAC and CVD. In addition to analyses of epidemiologic data, simulations are used to assess the validity of this statistical conclusion.
December 11
Saad Mouti, University of New Haven (Department of Mathematics)
Title: TBA
Abstract: TBA