Title: Sustainable Frequency Estimation
Abstract: Modern data science systems rely on streaming analytics to monitor datacenters, analyze network traffic, and power real-time pipelines. Frequency estimation sketches such as Count-Min are a key building block, offering compact summaries that reduce memory, computation, and data movement, all major sources of energy consumption. Yet, in practice, these sketches remain wasteful: their uniformly sized counters over-allocate memory under skew, and their error grows with stream length, so unbounded streams force costly over-provisioning to maintain accuracy.
In this talk, we present Sublime, a framework that rethinks frequency estimation. Sublime uses dynamically extensible counters to eliminate wasted space under workload skew, and expands the sketch over time to keep error in check as streams grow. The result is a new class of sketches whose memory footprint and error both scale sublinearly with the stream length, delivering tighter accuracy with fewer resources.
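
For readers unfamiliar with the baseline, the following is a minimal sketch of a textbook Count-Min sketch, not of Sublime itself. The class name, the parameters `width` and `depth`, and the use of salted BLAKE2b as the per-row hash are illustrative assumptions. It shows the two properties the abstract criticizes: every counter has the same fixed size regardless of skew, and the table never grows, so collisions (and therefore overestimation) accumulate as the stream gets longer.

```python
import hashlib


class CountMinSketch:
    """Textbook Count-Min sketch: depth rows of width equally sized counters."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        # Every counter is a full integer slot, whether it ends up holding
        # a heavy hitter or staying near zero. The table size is fixed.
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting BLAKE2b with the row id
        # (an illustrative choice, not prescribed by the abstract).
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in each row.
        for r in range(self.depth):
            self.rows[r][self._index(item, r)] += count

    def estimate(self, item):
        # Collisions only ever add to a counter, so the minimum across rows
        # is an upper bound on the true count that never underestimates.
        return min(self.rows[r][self._index(item, r)] for r in range(self.depth))


# A skewed stream: one hot key and many keys seen exactly once.
sketch = CountMinSketch(width=64, depth=4)
for _ in range(1000):
    sketch.add("hot")
for i in range(200):
    sketch.add(f"cold-{i}")

hot_estimate = sketch.estimate("hot")      # at least 1000
cold_estimate = sketch.estimate("cold-0")  # at least 1, possibly inflated by collisions
```

In this baseline, the only way to tighten the error for a longer stream is to allocate a larger `width` up front, which is the over-provisioning the abstract describes.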