The Automatic Generation of Extended Queries
Abstract
In the extended vector space model, each document vector consists of a set of subvectors representing the multiple concepts or concept classes present in the document. Typical information concepts, in addition to the usual content terms or descriptors, include author names, bibliographic links, etc. The extended vector space model is known to improve retrieval effectiveness. However, a major impediment to the use of the extended model is the construction of an extended query. In this paper, we describe a method for automatically extending a query containing only content terms (a single concept class) to a representation containing multiple concept classes. No relevance feedback is involved. Experiments using the CACM collection resulted in an average precision 34% better than that obtained using the standard single-concept term vector model.
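To make the representation concrete, the following is a minimal sketch of a document stored as one subvector per concept class and scored against a query. It assumes that the overall similarity is a weighted sum of per-class cosine similarities, which is one common way to combine subvectors and is not stated in this abstract. The class names, weights, and sample entries (`author_a`, `doc_17`) are hypothetical, and the sketch does not implement the query extension method itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    # An empty subvector contributes no similarity.
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def extended_similarity(query, doc, class_weights):
    """Weighted sum of per-class cosine similarities (assumed combination rule)."""
    return sum(
        weight * cosine(query.get(cls, {}), doc.get(cls, {}))
        for cls, weight in class_weights.items()
    )


# Hypothetical document with three concept classes.
doc = {
    "terms": {"retrieval": 1.0, "query": 1.0},
    "authors": {"author_a": 1.0},
    "links": {"doc_17": 1.0},
}

# A query with content terms only, and the same query with an author subvector added.
term_only_query = {"terms": {"query": 1.0}}
extended_query = {"terms": {"query": 1.0}, "authors": {"author_a": 1.0}}

# Hypothetical class weights; in practice these would need tuning.
weights = {"terms": 0.6, "authors": 0.2, "links": 0.2}

print(extended_similarity(term_only_query, doc, weights))
print(extended_similarity(extended_query, doc, weights))
```

Under these assumptions, a query with only a term subvector can match a document on the term class alone, while an extended query can also collect evidence from the author and link classes. This is why constructing the extra subvectors matters for using the extended model.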