PAPER DIGEST
Most Influential SIGMOD 2016 Paper · 2026-03 edition

Augmented Sketch: Faster And More Accurate Stream Processing

Pratanu Roy; Arijit Khan; Gustavo Alonso

Venue
ACM SIGMOD Conference (SIGMOD) 2016
Recognition
Most Influential SIGMOD 2016 Paper (Rank No. 11)
Edition
2026-03
Impact factor
4
Certificate ID
1787a108453087aa

Abstract

Approximate algorithms are often used to estimate the frequency of items in high-volume, fast data streams. The most common are variations of the Count-Min sketch, which use sub-linear space for the counts but can produce errors in the counts of the most frequent items and can misclassify low-frequency items. In this paper, we improve the accuracy of sketch-based algorithms by increasing the frequency estimation accuracy of the most frequent items and reducing the possible misclassification of low-frequency items, while also improving the overall throughput. Our solution, called Augmented Sketch (ASketch), is based on a pre-filtering stage that dynamically identifies and aggregates the most frequent items. Items overflowing the pre-filtering stage are processed using a conventional sketch algorithm, which makes the solution general and applicable in a wide range of contexts. The pre-filtering stage can be implemented efficiently with SIMD instructions on multi-core machines and can be further parallelized through pipeline parallelism, where the filtering stage runs on one core and the sketch algorithm runs on another.
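
The two-stage design described in the abstract can be illustrated with a minimal, single-threaded Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class names, the default sizes, the hash construction, and the exact rule for exchanging items between the filter and the sketch are all assumptions made here, and the SIMD and pipeline-parallel aspects are left out entirely.

```python
import hashlib


class CountMinSketch:
    """Plain Count-Min sketch: depth rows of width counters."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent-looking hash per row, derived by salting with the row id.
        digest = hashlib.blake2b(
            str(item).encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum over rows never underestimates the true count.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


class AugmentedSketch:
    """Small exact filter in front of a conventional sketch (illustrative only)."""

    def __init__(self, filter_size=4, width=1024, depth=4):
        self.filter_size = filter_size
        # item -> [current_count, count_already_in_sketch]  (assumed layout)
        self.filter = {}
        self.sketch = CountMinSketch(width, depth)

    def add(self, item, count=1):
        if item in self.filter:
            # Frequent items are aggregated in the filter and never touch the sketch.
            self.filter[item][0] += count
            return
        if len(self.filter) < self.filter_size:
            self.filter[item] = [count, 0]
            return
        # Filter is full: the item overflows into the sketch.
        self.sketch.add(item, count)
        estimate = self.sketch.estimate(item)
        victim = min(self.filter, key=lambda key: self.filter[key][0])
        victim_count, victim_in_sketch = self.filter[victim]
        if estimate > victim_count:
            # Assumed exchange rule: promote the item, demote the smallest filter entry.
            del self.filter[victim]
            if victim_count > victim_in_sketch:
                self.sketch.add(victim, victim_count - victim_in_sketch)
            self.filter[item] = [estimate, estimate]

    def estimate(self, item):
        if item in self.filter:
            return self.filter[item][0]
        return self.sketch.estimate(item)
```

In this version, an item that stays in the filter is counted without hash collisions, while everything else falls back to the usual Count-Min overestimate. The second counter per filter entry records how much of the count already sits in the sketch, so that a demoted item is not added twice.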
