PAPER DIGEST
Most Influential SIGIR 2009 Paper · 2026-03 edition

Reducing Long Queries Using Query Quality Predictors

Giridhar Kumaran; Vitor R. Carvalho

Venue
ACM SIGIR Conference (SIGIR) 2009
Recognition
Most Influential SIGIR 2009 Paper (Rank No. 13)
Edition
2026-03
Impact factor
5
Certificate ID
e2e13eda2b765b6a

Abstract

Long queries frequently contain many extraneous terms that hinder retrieval of relevant documents. We present techniques to reduce long queries to more effective shorter ones that lack those extraneous terms. Our work is motivated by the observation that perfectly reducing long TREC description queries can lead to an average improvement of 30% in mean average precision. Our approach involves transforming the reduction problem into a problem of learning to rank all sub-sets of the original query (sub-queries) based on their predicted quality, and selecting the top sub-query. We use various measures of query quality described in the literature as features to represent sub-queries, and train a classifier. Replacing the original long query with the top-ranked sub-query chosen by the ranker results in a statistically significant average improvement of 8% on our test sets. Analysis of the results shows that query reduction is well-suited for moderately-performing long queries, and a small set of query quality predictors is well-suited for the task of ranking sub-queries.
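The reduction step described in the abstract (enumerate every sub-query, score each by predicted quality, keep the top one) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy collection statistics, and the use of mean inverse document frequency as the only quality predictor are all assumptions made for illustration. The paper instead represents sub-queries with several predictors and ranks them with a trained model.

```python
from itertools import combinations
import math


def sub_queries(terms):
    """Yield every non-empty subset of the query terms, preserving term order."""
    for k in range(1, len(terms) + 1):
        for combo in combinations(terms, k):
            yield combo


def avg_idf(sub_query, doc_freq, num_docs):
    """Toy quality predictor (an assumption): mean inverse document frequency."""
    return sum(math.log(num_docs / doc_freq[t]) for t in sub_query) / len(sub_query)


def reduce_query(terms, score):
    """Return the sub-query with the highest predicted quality."""
    return max(sub_queries(terms), key=score)


# Hypothetical collection statistics, invented for this example.
NUM_DOCS = 1000
DOC_FREQ = {
    "what": 900, "are": 950, "effects": 120,
    "of": 990, "ozone": 15, "depletion": 20,
}

query = ["what", "are", "effects", "of", "ozone", "depletion"]
best = reduce_query(query, lambda sq: avg_idf(sq, DOC_FREQ, NUM_DOCS))
print(best)
```

A query of n terms has 2^n - 1 non-empty sub-queries (63 for the six-term query here), so exhaustive enumeration is only practical for moderately long queries. A single averaged predictor like this one always prefers the one-term sub-query with the rarest term, which is one reason to combine several predictors in a learned ranker as the abstract describes.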
