PAPER DIGEST
Most Influential SIGMOD 2012 Paper · 2026-03 edition

Probase: A Probabilistic Taxonomy For Text Understanding

Wentao Wu; Hongsong Li; Haixun Wang; Kenny Q. Zhu

Venue
ACM SIGMOD Conference (SIGMOD) 2012
Recognition
Most Influential SIGMOD 2012 Paper (Rank No. 1)
Edition
2026-03
Impact factor
8
Certificate ID
175982d0186ba25d

Abstract

Knowledge is indispensable to understanding. The ongoing information explosion highlights the need to enable machines to better understand electronic text in human language. Much work has been devoted to creating universal ontologies or taxonomies for this purpose. However, none of the existing ontologies has the needed depth and breadth for universal understanding. In this paper, we present a universal, probabilistic taxonomy that is more comprehensive than any existing one. It contains 2.7 million concepts harnessed automatically from a corpus of 1.68 billion web pages. Unlike traditional taxonomies that treat knowledge as black and white, it uses probabilities to model the inconsistent, ambiguous, and uncertain information it contains. We present details of how the taxonomy is constructed, its probabilistic modeling, and its potential applications in text understanding.
