PAPER DIGEST
Most Influential SIGIR 1992 Paper · 2026-03 edition

Latent Semantic Indexing Is An Optimal Special Case Of Multidimensional Scaling

Brian T. Bartell; Garrison W. Cottrell; Richard K. Belew

Venue
ACM SIGIR Conference (SIGIR) 1992
Recognition
Most Influential SIGIR 1992 Paper (Rank No. 12)
Edition
2026-03
Impact factor
5
Certificate ID
01490cb5e85f57f4

Abstract

Latent Semantic Indexing (LSI) is a technique for representing documents, queries, and terms as vectors in a multidimensional real-valued space. The representations are approximations to the original term space encoding, and are found using the matrix technique of Singular Value Decomposition. In comparison, Multidimensional Scaling (MDS) is a class of data analysis techniques for representing data points as points in a multidimensional real-valued space. The objects are represented so that inter-point similarities in the space match inter-object similarity information provided by the researcher. We illustrate how the document representations given by LSI are equivalent to the optimal representations found when solving a particular MDS problem in which the given inter-object similarity information is provided by the inner product similarities between the documents themselves. We further analyze a more general MDS problem in which the interdocument similarity information, although still in inner product form, is arbitrary with respect to the vector space encoding of the documents.
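The equivalence described in the abstract can be checked numerically on a small example. The sketch below is not taken from the paper; it assumes that the MDS objective is the Frobenius-norm mismatch between the given document inner products and the inner products of the low-dimensional points, and that this objective is solved by the standard eigendecomposition of the inner product matrix. The function names `lsi_document_vectors` and `mds_document_vectors`, the random matrix `X`, and the choice `k = 2` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def lsi_document_vectors(X, k):
    """Rank-k LSI document vectors (k x n_docs) from a term-by-document matrix X."""
    # Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Document coordinates in the k-dimensional latent space: diag(s_k) @ Vt_k.
    return s[:k, None] * Vt[:k, :]


def mds_document_vectors(G, k):
    """k-dimensional points whose inner products best match G in Frobenius norm.

    Assumes G is a symmetric inner product matrix (n_docs x n_docs).
    """
    evals, evecs = np.linalg.eigh(G)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]        # indices of the k largest eigenvalues
    lam = np.clip(evals[order], 0.0, None)     # guard against tiny negative round-off
    return np.sqrt(lam)[:, None] * evecs[:, order].T


# Illustrative data only: 8 terms, 5 documents, random weights.
rng = np.random.default_rng(0)
X = rng.random((8, 5))
k = 2

# Inter-document similarities given by the documents' own inner products.
G = X.T @ X

Y_lsi = lsi_document_vectors(X, k)
Y_mds = mds_document_vectors(G, k)

# The coordinates can differ by a sign flip or rotation, so compare the
# inner products they induce rather than the coordinates themselves.
gram_gap = np.max(np.abs(Y_lsi.T @ Y_lsi - Y_mds.T @ Y_mds))
print("max difference between induced inner products:", gram_gap)
```

Under these assumptions the two sets of document vectors induce the same inner products up to floating-point error, which is the sense in which the LSI representation solves this particular MDS problem. The more general case in the abstract corresponds to passing `mds_document_vectors` a matrix `G` that is not derived from `X`.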
