PAPER DIGEST
Most Influential SIGIR 2004 Paper · 2026-03 edition

Cluster-based Retrieval Using Language Models

Xiaoyong Liu; W. Bruce Croft

Venue: ACM SIGIR Conference (SIGIR) 2004
Recognition: Most Influential SIGIR 2004 Paper (Rank No. 5)
Edition: 2026-03
Impact factor: 7
Certificate ID: 1a1e341b583fec85

Abstract

Previous research on cluster-based retrieval has been inconclusive as to whether it improves retrieval effectiveness over document-based retrieval. Recent developments in the language modeling approach to IR have motivated us to re-examine this problem within this new retrieval framework. We propose two new models for cluster-based retrieval and evaluate them on several TREC collections. We show that cluster-based retrieval can perform consistently across collections of realistic size, and that significant improvements over document-based retrieval can be obtained in a fully automatic manner, without relevance information provided by humans.
