networking

  • Summary cache: a scalable wide-area web cache sharing protocol. Li Fan, Pei Cao, Jussara Almeida, Andrei Z. Broder. SIGCOMM 98.
     Introduces the idea of a counting Bloom filter, to support dynamic insertions and deletions in a Bloom filter. This usage is a special case of the Count-Min sketch, but the algorithm for the maintaining the sketch is quite similar.

  • Robust Aggregation in Sensor Networks. George Kollios, John Byers, Jeffrey Considine, Marios Hadjieleftheriou, and Feifei Li.  IEEE Data Eng Bull, 28(1), 05.
Combines Count-Min sketch with approximate distinct element counters to allow summaries to be merged multiple times and count only the number of distinct occurrences.
"We show how the Count-Min sketch can be made duplicate-insensitive for exploiting multi-path routing in sensor networks. Even though other frequency counting techniques have been proposed in the past, the Count-Min sketch is robust, small in size, and provides error guarantees."
Uses sketch techniques to identify (volume) anomalies in network data.  "Our proposed framework is based on 2 data summary architecture: a Multi-Layer Reversible Sketch (MLRS) and a Count-Min Sketch (CMS)..."
Proposes a framework for sharing summaries within a sensor network (when summaries may be duplicated, retransmitted, or travel along multiple paths to the destination), and observes that Count-Min sketch can be applied in this setting:
"Although the Count-Min Sketch has been proposed in the context of a single stream, it can be extended to be used in the synopsis diffusion framework by re-placing the duplicate-sensitive counter of each array cell with an ODI Sum synopsis..."
"Our online flow duration estimation is similar to the Count-Min sketch devised by [2] for heavy hitter detection (minimum value of the hashed entries), but our (Bloom Filter) update uses the
maximum of its current hash entry and an estimate from past, which is different from Count-Min sketch and standard CBF."
Defines techniques for private computation of complex statistics, based on private computation of sums and counts, and argues that since Count-Min sketch is based on such primitives, the whole structure can be used as part of private computations.
Uses sketch data structures within an algorithm to identify network users with the largest deviation from the norm: "Within the bucket, we use a sketch, such as the Count-Min sketch, to keep track of the number of items belonging to different streams."
Comments