PAPER DIGEST
Most Influential SIGIR 1992 Paper · 2026-03 edition

A System For Retrieving Speech Documents

Ulrike Glavitsch; Peter Schäuble

Venue: ACM SIGIR Conference (SIGIR) 1992
Recognition: Most Influential SIGIR 1992 Paper (Rank No. 15)
Edition: 2026-03
Impact factor: 4
Certificate ID: 4098569e2306457c

Abstract

An information retrieval model is presented for the retrieval of speech documents, i.e., audio recordings containing speech. The indexing vocabulary consists of indexing features that have the following characteristics. First, they are easy to recognize by speech recognition methods. Second, the number of different indexing features is small, so that a reasonable amount of training data is sufficient to train the hidden Markov models used by the speech recognition process. Third, the retrieval method based on such indexing features achieves an acceptable retrieval effectiveness, as shown by experiments on text collections. Fourth, these indexing features can be identified not only in speech documents but also in text documents. It follows from the last characteristic that speech documents and text documents can be retrieved simultaneously. Analogously, the queries may contain either speech or text. Thus, we have a simple multimedia retrieval model in which two different media are indexed coherently. We also describe a prototype retrieval system under development.
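
The idea of one small feature vocabulary shared by text and speech can be illustrated with a minimal sketch. The abstract does not say what the indexing features are, so character trigrams are used here purely as a stand-in; the tf-idf weighting, the cosine ranking, the function names, and the sample documents are likewise assumptions for illustration, not the authors' method. The speech document is represented by a hypothetical, error-prone recognizer output.

```python
import math
from collections import Counter


def features(text, n=3):
    """Overlapping character n-grams: a stand-in for the paper's indexing features."""
    s = "".join(ch for ch in text.lower() if ch.isalnum())
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def build_index(docs):
    """docs maps a document id to its feature list; returns tf-idf vectors and idf."""
    df = Counter()
    for feats in docs.values():
        df.update(set(feats))
    n_docs = len(docs)
    idf = {f: math.log(1 + n_docs / c) for f, c in df.items()}
    vectors = {}
    for doc_id, feats in docs.items():
        tf = Counter(feats)
        vectors[doc_id] = {f: tf[f] * idf[f] for f in tf}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query_feats, vectors, idf):
    """Rank all documents, text and speech alike, against one query."""
    tf = Counter(f for f in query_feats if f in idf)
    q = {f: tf[f] * idf[f] for f in tf}
    scores = {doc_id: cosine(q, v) for doc_id, v in vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Text documents are indexed from their characters. The speech document is
# indexed from a hypothetical recognizer output that contains errors.
docs = {
    "text-1": features("Weather forecast for the alpine region"),
    "speech-1": features("wether forcast for zurich"),  # assumed noisy recognition
    "text-2": features("Parliament debates the new budget"),
}
vectors, idf = build_index(docs)
ranking = search(features("weather forecast"), vectors, idf)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Because both kinds of document are mapped into the same feature space, a single ranking covers them, and sub-word features tolerate some recognition errors that would defeat whole-word matching. A spoken query would pass through the same recognizer and then through search unchanged.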
