PAPER DIGEST
Most Influential SIGMOD 1997 Paper · 2026-03 edition

Beyond Market Baskets: Generalizing Association Rules to Correlations

Sergey Brin; Rajeev Motwani; Craig Silverstein

Venue: ACM SIGMOD Conference (SIGMOD) 1997
Recognition: Most Influential SIGMOD 1997 Paper (Rank No. 2)
Edition: 2026-03
Impact factor: 9
Certificate ID: 5fe760b6d55f7bde

Abstract

One of the most well-studied problems in data mining is mining for association rules in market basket data. Association rules, whose significance is measured via support and confidence, are intended to identify rules of the type, “A customer purchasing item A often also purchases item B.” Motivated by the goal of generalizing beyond market baskets and the association rules used with them, we develop the notion of mining rules that identify correlations (generalizing associations), and we consider both the absence and presence of items as a basis for generating rules. We propose measuring significance of associations via the chi-squared test for correlation from classical statistics. This leads to a measure that is upward closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between correlated and uncorrelated itemsets in the lattice. We develop pruning strategies and devise an efficient algorithm for the resulting problem. We demonstrate its effectiveness by testing it on census data and finding term dependence in a corpus of text documents, as well as on synthetic data.
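
To make the chi-squared idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the function name `chi_squared` and the representation of baskets as Python sets are assumptions made for illustration. It computes the classical chi-squared statistic over the full presence/absence contingency table of an itemset, comparing the observed count in each cell with the count expected if the items were independent.

```python
from itertools import product


def chi_squared(baskets, items):
    """Chi-squared statistic for independence of `items` across `baskets`.

    Hypothetical helper for illustration. `baskets` is a list of sets and
    `items` is the itemset to test. Every cell of the 2^k contingency table
    is counted, so the absence of an item matters as much as its presence.
    """
    n = len(baskets)
    items = list(items)

    # Marginal probability that each item is present in a basket.
    p = [sum(1 for b in baskets if it in b) / n for it in items]

    # Observed count for each presence/absence pattern.
    observed = {}
    for b in baskets:
        key = tuple(it in b for it in items)
        observed[key] = observed.get(key, 0) + 1

    stat = 0.0
    for pattern in product([True, False], repeat=len(items)):
        # Expected count under independence: n times the product of marginals.
        expected = float(n)
        for present, pi in zip(pattern, p):
            expected *= pi if present else (1.0 - pi)
        if expected == 0:
            continue  # degenerate cell: an item is always or never present
        o = observed.get(pattern, 0)
        stat += (o - expected) ** 2 / expected
    return stat


# Toy data (invented for this example, not taken from the paper).
independent = [{"tea", "coffee"}, {"tea"}, {"coffee"}, set()]
correlated = [{"tea", "coffee"}, {"tea", "coffee"}, set(), set()]

print(chi_squared(independent, ["tea", "coffee"]))
print(chi_squared(correlated, ["tea", "coffee"]))
```

For a pair of items the table is 2×2 with one degree of freedom, where the standard 95% critical value is roughly 3.84. A statistic above that cutoff would flag the pair as correlated. The second toy dataset crosses it, but with only four baskets the expected counts are far too small for the test to be trusted in practice.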
