Relevance Score Normalization For Metasearch
Abstract
Given the ranked lists of documents returned by multiple search engines in response to a query, the problem of <i>metasearch</i> is to combine these lists in a way that optimizes the performance of the combination. This problem can be naturally decomposed into three subproblems: (1) <i>normalizing</i> the relevance scores given by the input systems, (2) <i>estimating</i> relevance scores for unretrieved documents, and (3) <i>combining</i> the newly acquired scores for each document into one improved score. Research on the problem of metasearch has historically concentrated on algorithms for <i>combining</i> (normalized) scores. In this paper, we show that the techniques used for <i>normalizing</i> relevance scores and <i>estimating</i> the relevance scores of unretrieved documents can have a significant effect on the overall performance of metasearch. We propose two new normalization/estimation techniques and demonstrate empirically that the performance of well-known metasearch algorithms can be significantly improved through their use.
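The three-step decomposition can be illustrated with a minimal Python sketch. It is not the method proposed in this paper: the abstract does not specify the new techniques, so the sketch assumes commonly used baselines instead, namely min-max normalization for step (1), a fixed score of zero for unretrieved documents in step (2), and a CombSUM-style sum of scores for step (3). The function names `min_max_normalize` and `metasearch` and the parameter `unretrieved_score` are hypothetical and chosen only for illustration.

```python
def min_max_normalize(scores):
    """Step 1 (assumed baseline): map one engine's raw scores linearly onto [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Degenerate case: every document received the same raw score.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def metasearch(result_lists, unretrieved_score=0.0):
    """Fuse several {doc_id: raw_score} dicts into one ranked list.

    Step 2 (assumed baseline): a document missing from an engine's list
    is assigned `unretrieved_score` for that engine.
    Step 3 (assumed baseline): the per-engine scores are summed (CombSUM-style).
    """
    # Skip empty result lists, which carry no scores to normalize.
    normalized = [min_max_normalize(r) for r in result_lists if r]
    all_docs = set().union(*normalized)
    fused = {
        doc: sum(n.get(doc, unretrieved_score) for n in normalized)
        for doc in all_docs
    }
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Under these assumptions, swapping in a different normalization or a different estimate for unretrieved documents changes only `min_max_normalize` or `unretrieved_score`, while the combination step stays fixed. This is the sense in which the first two subproblems can be studied independently of the third.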