PAPER DIGEST
Most Influential NEURIPS 2008 Paper · 2026-03 edition

A Scalable Hierarchical Distributed Language Model

Andriy Mnih; Geoffrey E. Hinton

Venue: NEURIPS 2008
Recognition: Most Influential NEURIPS 2008 Paper (Rank No. 5)
Edition: 2026-03
Impact factor: 9
Certificate ID: 015cea153967542f

Abstract

Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely used n-gram language models. The main drawback of NPLMs is their extremely long training and testing times. Morin and Bengio have proposed a hierarchical language model built around a binary tree of words that was two orders of magnitude faster than the non-hierarchical language model it was based on. However, it performed considerably worse than its non-hierarchical counterpart in spite of using a word tree created using expert knowledge. We introduce a fast hierarchical language model along with a simple feature-based algorithm for automatic construction of word trees from the data. We then show that the resulting models can outperform non-hierarchical models and achieve state-of-the-art performance.
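
The speedup from a binary word tree can be illustrated with a small sketch. It is not the authors' model: the tree, the node vectors, and the context vector below are hypothetical toy values, and the names PATHS, NODE_VECS, and word_prob are invented for illustration. The idea it shows is the general one behind hierarchical models: each word is a leaf, and its probability is the product of the binary left/right decisions on the path from the root, so scoring one word costs about log2(V) decisions instead of a normalization over all V words.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b) -> float:
    """Dot product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy tree over four words. Each word is described by its
# path from the root: a list of (internal_node_id, go_left) decisions.
PATHS = {
    "the": [(0, True), (1, True)],
    "cat": [(0, True), (1, False)],
    "sat": [(0, False), (2, True)],
    "mat": [(0, False), (2, False)],
}

# Hypothetical feature vectors for the three internal nodes (assumed values).
NODE_VECS = {0: [0.5, -0.2], 1: [-0.3, 0.8], 2: [0.1, 0.4]}


def word_prob(word: str, context_vec) -> float:
    """Probability of `word` as a product of binary decisions along its path."""
    p = 1.0
    for node, go_left in PATHS[word]:
        # Probability of taking the left branch at this internal node.
        p_left = sigmoid(dot(NODE_VECS[node], context_vec))
        p *= p_left if go_left else 1.0 - p_left
    return p


# Assumed context representation; a real model would predict it from the
# preceding words.
context = [1.0, 2.0]
probs = {w: word_prob(w, context) for w in PATHS}
total = sum(probs.values())  # sums to 1 up to rounding, with no explicit normalization
```

Because the two branch probabilities at every internal node sum to one, the leaf probabilities form a valid distribution without ever summing over the vocabulary. With four words each path has two decisions; for a balanced tree over a large vocabulary the path length grows only logarithmically.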
