Association of Women in Mathematics at UCSD

COLLOQUIUM

Upcoming Talks

Every talk is scheduled for Thursday at 4:00pm PST, unless stated otherwise. Please email Ruth Luo at ruluo(at)ucsd.edu to join the mailing list.


Optimal transport in machine learning: In this talk, I will give an introduction to optimal transport, which has evolved as one of the major frameworks to meaningfully compare distributional data. The focus will mostly be on machine learning, and how optimal transport can be used efficiently for clustering and supervised learning tasks. Applications of interest include image classification as well as medical data such as gene expression profiles.


Past Talks


From Queueing Theory to Modern Stochastic Networks: A Mathematical Perspective: Waiting in a queue (or a line) for some type of service is a commonplace experience. People do this at a grocery store, bank, amusement park or DMV, to give just a few examples. These customer service systems feature inherent randomness, which impacts performance. Modern computer and communications systems, manufacturing processes, transportation systems and even biological networks experience similar stochastic effects. Stochastic network theory is the study of the performance and optimal control of such systems. At the beginning of the talk, I will show a few simple examples of where mathematics plays an integral role in illuminating system behavior. Following that I will discuss some of the mathematical challenges associated with analyzing the performance of modern networks. Finally, I will end by discussing work-in- progress related to modern call centers that is joint with Amy Ward (U. Chicago Booth) and Yueyang Zhong (U. Chicago Booth).



Learning survival from electronic medical/health records (EMR/EHR) data using high dimensional claims codes: Our work was motivated by the analysis projects using the linked US SEER-Medicare database to study mortality in men of age 65 years or older who were diagnosed with prostate cancer. Such data sets contain up to 100,000 human subjects and over 20,000 claim codes. For studying mortality the number of deaths are the ‘effective' sample size, so here we are in the situation of 'p>n' which is referred to as having high-dimensional predictors. In addition, a patient might die of cancer, or of other causes such as heart disease etc. These are referred to as competing risks. How to best perform prediction which inevitably involves variable selection for this type of complex survival data had not been previously investigated. Interest may also lie in comparing treatments such as radical prostatectomy versus conservative treatment. In this case the data were obviously not randomized with regard to the treatment assignments, and confounding most likely exists, possibly even beyond the commonly captured clinical variables in the SEER database. We will showcase research work done by our former PhD students from the UCSD Math Dept to account for such unobserved confounding, as well as efforts to make use of the high dimensional claims codes which have been shown to contain rich information about the patients’ survival.




Effective results in number theory: Ever since ancient Greece people have been fascinated by prime numbers, which ultimately led to the prime number theorem. We will talk about how L-functions got involved in the proof of PNT and what it would take to make the result effective. After that, we will talk about other counting problems in number theory.




Some fun facts about cubics: Cubic hypersurfaces are the zero sets of homogeneous polynomials of degree 3. They have been, still are, and probably will be for quite some time, the subject of a lot of research. I will survey a few well-known and fun facts about cubic hypersurfaces and will also mention some open problems.



Stochastic Networks: Bottlenecks, Entrainment and Reflection Stochastic models of complex networks with limited resources arise in a wide variety of applications in science and engineering, e.g., in manufacturing, transportation, telecommunications, computer systems, customer service facilities, and systems biology. Bottlenecks in such networks cause congestion, leading to queueing and delay. Sharing of resources can lead to entrainment effects. Understanding the dynamic behavior of such modern stochastic networks present challenging mathematical problems.


This talk will describe some developments in this area. A key feature will be dimension reduction, resulting from entrainment due to resource sharing. An example of bandwidth sharing in a data network will be featured.



Branching Brownian motion and the evolution of populations undergoing selection: Branching Brownian motion (BBM) is a random particle system which incorporates both the tree-like structure and the diffusion process. BBM has a natural interpretation as a population model. In this talk, we will see how one variant model of BBM, BBM with an inhomogeneous branching rate can be used to study the evolution of populations undergoing selection. We will provide a mathematically rigorous justification for the biological observation that the distribution of the fitness levels of individuals in a population evolves over time like a traveling wave with a profile defined by the Airy function. This talk is based on joint work with Jason Schweinsberg.


What is a moduli space?: Often, a goal of mathematics is to classify objects of a particular type. In algebraic geometry, the objects are usually of some geometric interest: manifolds, varieties, vector bundles, etc; and after fixing several discrete invariants (like the dimension of the object), we try to classify all the objects with those invariants. This leads to a notion of moduli space, i.e. a space parametrizing all of these objects. We will do several examples and mention both the usefulness and difficulty of these problems! No background in algebraic geometry is required.


Lattice Enumeration for Tower NFS: a 521-bit Discrete Logarithm Computation: The Tower variant of the Number Field Sieve (TNFS) is known to be asymptotically the most efficient algorithm to solve the discrete logarithm problem in finite fields of medium characteristics, when the extension degree is composite. A major obstacle to an efficient implementation of TNFS is the collection of algebraic relations, as it happens in dimensions greater than 2. This requires the construction of new sieving algorithms which remain efficient as the dimension grows. In the work I will present, we overcome this difficulty by considering a lattice enumeration algorithm which we adapt to this specific context. We also consider a new sieving area, a high-dimensional sphere, whereas previous sieving algorithms for the classical NFS considered an orthotope. Our new sieving technique leads to a much smaller running time, despite the larger dimension of the search space, and even when considering a larger target, as demonstrated by a record computation we performed in a 521-bit finite field GF(p^6). The target finite field is of the same form as finite fields used in recent zero-knowledge proofs in some blockchains. This is the first reported implementation of TNFS.


  • 3/31/22 at 12:00pm PST - Teresa Rexin

From Trees to Forests: Decision Tree Based Models Explained: Decision tree-based models are a popular tool for use in prediction and regression machine learning problems. In this talk, we will provide an overview of decision tree models and ensemble methods, including (but not limited to) random forests and XGBoost. We'll also discuss considerations of building such models and some applications. This talk does not require any background knowledge in machine learning.


  • 4/28/22 at 12:00pm PST - Mingjie Chen

Orienteering with one endomorphism: Supersingular isogeny-based cryptosystems are strong contenders for post-quantum cryptography standardization. Such cryptosystems rely on the hardness of path-finding on supersingular isogeny graphs. The path-finding problem is known to reduce to the endomorphism ring problem. Can path-finding be reduced to knowing just one endomorphism? In this talk, we give explicit classical and quantum algorithms for path-finding to an initial curve using the knowledge of one endomorphism. An endomorphism gives an orientation of a supersingular elliptic curve. We use the theory of oriented supersingular isogeny graphs and algorithms for taking ascending/descending/horizontal steps on such graphs.


On convergence of the cavity and Bolthausen’s TAP iterations to the local magnetization: The cavity and TAP equations are high-dimensional systems of nonlinear equations of the local magnetization in the Sherrington-Kirkpatrick model. In the seminal work, Bolthausen introduced an iterative scheme that produces an asymptotic solution to the TAP equations if the model lies inside the Almeida-Thouless transition line. However, it was unclear if this asymptotic solution coincides with the true local magnetization. In this work, motivated by the cavity equations, we introduce a new iterative scheme and establish a weak law of large numbers. We show that our new scheme is asymptotically the same as the so-called Approximate Message Passing (AMP) algorithm that has been popularly adapted in compressed sensing, Bayesian inferences, etc. Based on this, we confirm that our cavity iteration and Bolthausen’s scheme both converge to the local magnetization as long as the overlap is locally uniformly concentrated. This is a joint work with Wei-Kuo Chen (University of Minnesota).