- Spectral Bloom filters. Saar Cohen, Yossi Matias. SIGMOD 03. This paper independently introduces a Count-Min-like sketch for multisets (under the name "spectral Bloom filters") and presents a number of applications in databases. It also discusses some heuristics for improving the accuracy of the sketch.
- Statistical Analysis of Sketch Estimators. Florin Rusu, Alin Dobra. SIGMOD 07.
- Graph-based synopses for relational selectivity estimation. Joshua Spiegel, Neoklis Polyzotis. SIGMOD 06.
- Approximate Counts and Quantiles over Sliding Windows. Arvind Arasu, Gurmeet Singh Manku. PODS 04.
- Finding global icebergs over distributed data sets. Qi (George) Zhao, Mitsunori Ogihara, Haixun Wang, Jun (Jim) Xu. PODS 06.
- False Positive or False Negative: Mining Frequent Itemsets from High Speed Transactional Data Streams. Wei Wang, Haifeng Jiang, Hongjun Lu, Jeffrey Xu Yu, VLDB 2004.
- Power-Conserving Computation of Order-Statistics over Sensor Networks, Michael Greenwald, Sanjeev Khanna, PODS 2004.
Observes that Count-Min sketch can be used to approximate quantiles of data within a distributed sensor (tree) network.
- Mining Top-K Frequent Itemsets Through Progressive Sampling, Andrea Pietracaprina, Matteo Riondato, Eli Upfal, Fabio Vandin. Data Mining and Knowledge Discovery, Volume 21, Number 2, 2010.
Uses a Count-Min sketch to build an approximation of the distribution of the frequencies of the itemsets in a dataset. The innovative contribution is a method for building this approximation without querying the sketch for each itemset.