PAPER DIGEST
Most Influential SIGMOD 2003 Paper · 2026-03 edition

Spectral Bloom Filters

Saar Cohen; Yossi Matias

Venue: ACM SIGMOD Conference (SIGMOD) 2003
Recognition: Most Influential SIGMOD 2003 Paper (Rank No. 11)
Edition: 2026-03
Impact factor: 7
Certificate ID: 7aed5f3ed33cbc72

Abstract

A Bloom Filter is a space-efficient randomized data structure allowing membership queries over sets with certain allowable errors. It is widely used in applications that take advantage of its ability to compactly represent a set and to effectively filter out any element that does not belong to the set, with small error probability. This paper introduces the Spectral Bloom Filter (SBF), an extension of the original Bloom Filter to multi-sets, allowing the filtering of elements whose multiplicities are below a threshold given at query time. Using memory only slightly larger than that of the original Bloom Filter, the SBF supports queries on the multiplicities of individual keys with a guaranteed, small error probability. The SBF also supports insertions and deletions over the data set. We present novel methods for reducing the probability and magnitude of errors. We also present an efficient data structure and algorithms to build it incrementally and maintain it over streaming data, as well as over materialized data with arbitrary insertions and deletions. The SBF does not assume any a priori filtering threshold and effectively and efficiently maintains information over the entire data set, allowing for ad-hoc queries with arbitrary parameters and enabling a range of new applications.
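
The idea described in the abstract can be illustrated with a minimal Python sketch. It assumes the simplest formulation: the bit vector of a Bloom Filter is replaced by an array of counters, and the multiplicity of a key is estimated as the minimum of the counters it hashes to. The class name, the blake2b-based hashing, and the parameters m and k are illustrative assumptions; the sketch does not reproduce the paper's compact data structure or its error-reduction methods.

```python
import hashlib


class SpectralBloomFilterSketch:
    """Illustrative counter-based filter for multi-sets (not the paper's encoding)."""

    def __init__(self, m: int, k: int):
        self.m = m                # number of counters (assumed parameter)
        self.k = k                # number of hash functions (assumed parameter)
        self.counters = [0] * m

    def _positions(self, key: str) -> set:
        # Derive k positions by salting one hash with the function index.
        # This hashing scheme is an assumption made for the example.
        positions = set()
        for i in range(self.k):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=i.to_bytes(8, "little"),
            ).digest()
            positions.add(int.from_bytes(digest, "little") % self.m)
        return positions

    def insert(self, key: str) -> None:
        # Each occurrence of the key increments all of its counters.
        for p in self._positions(key):
            self.counters[p] += 1

    def delete(self, key: str) -> None:
        # Refuse to delete a key whose estimate is zero, so counters stay non-negative.
        positions = self._positions(key)
        if min(self.counters[p] for p in positions) == 0:
            raise ValueError("key is not present in the filter")
        for p in positions:
            self.counters[p] -= 1

    def estimate(self, key: str) -> int:
        # The minimum counter never undercounts a key (given only valid deletions),
        # but collisions with other keys can make it overcount.
        return min(self.counters[p] for p in self._positions(key))

    def passes_threshold(self, key: str, threshold: int) -> bool:
        # The threshold is supplied at query time, not at construction time.
        return self.estimate(key) >= threshold
```

A short usage example under the same assumptions:

```python
sbf = SpectralBloomFilterSketch(m=1024, k=4)
for _ in range(3):
    sbf.insert("apple")
sbf.insert("pear")

print(sbf.estimate("apple"))             # at least 3; collisions can only inflate it
print(sbf.passes_threshold("apple", 2))  # True, since the estimate is at least 3
sbf.delete("apple")                      # the estimate for "apple" drops by one
```

Because the estimate can only err upward, a threshold query in this sketch may admit a key whose true multiplicity is below the threshold, but it never rejects a key whose multiplicity meets it.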
