Title: Using Bernoulli processes to find frequent items in a data stream using low memory
Abstract: In the 'frequent items' problem one sees a sequence of items in a stream (e.g. a stream of words coming into a search query engine like Google) and wants to report a small list of items containing all frequent items. We would like algorithms for this problem that use memory substantially sublinear in the length of the stream.
We describe a new state-of-the-art solution to both problems. We make use of chaining methods to control the suprema of Rademacher processes to develop an algorithm BPTree with provably near-optimal memory consumption.
Based on joint works with Vladimir Braverman, Stephen Chestnut, Nikita Ivkin, Zhengyu Wang, and David P. Woodruff.
Time: 10:00 a.m.
Title: Some nonlinear random matrix models.
Abstract: We consider $M= \frac{1}{m} YY^*$ with $Y=f(WX)$ where $W$ and $X$ are random rectangular matrices with i.i.d. centered entries. The function $f$ is applied pointwise and can be seen as an activation function in (random) neural networks.
We compute the asymptotic empirical distribution of this ensemble in the case where $W$ and $X$ have sub-Gaussian tails and $f$ is real analytic. This extends a result of \cite{pennington2017nonlinear} where the case of Gaussian matrices $W$ and $X$ is considered.
This is joint work with L. Benigni.