PAPER DIGEST
Most Influential CIKM 2004 Paper · 2026-03 edition

Hierarchical Document Categorization With Support Vector Machines

Lijuan Cai; Thomas Hofmann

Venue: ACM Conference on Information and Knowledge Management (CIKM) 2004
Recognition: Most Influential CIKM 2004 Paper (Rank No. 3)
Edition: 2026-03
Impact factor: 6
Certificate ID: 66774779d58a47ad

Abstract

Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques like Support Vector Machines and related large margin methods have been successfully applied to this task, even though they ignore the inter-class relationships. In this paper, we propose a novel hierarchical classification method that generalizes Support Vector Machine learning and that is based on discriminant functions that are structured in a way that mirrors the class hierarchy. Our method can work with arbitrary, not necessarily singly connected taxonomies and can deal with task-specific loss functions. All parameters are learned jointly by optimizing a common objective function corresponding to a regularized upper bound on the empirical loss. We present experimental results on the WIPO-alpha patent collection to show the competitiveness of our approach.
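
The idea of a discriminant function that mirrors the class hierarchy, paired with a task-specific loss, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy taxonomy, the weight values, the helper names (`ancestors`, `score`, `predict`, `tree_loss`), the choice to sum per-node linear scores along the path to a label, and the use of path length as the loss. The sketch also assumes a simple tree, whereas the abstract covers more general taxonomies, and it shows only scoring and loss, not the joint large-margin training.

```python
# Hypothetical toy taxonomy (illustration only): child -> parent, None marks the root.
PARENT = {"root": None, "A": "root", "B": "root", "A1": "A", "A2": "A", "B1": "B"}

def ancestors(node):
    """Return the node and its ancestors, excluding the root."""
    path = []
    while node is not None and PARENT[node] is not None:
        path.append(node)
        node = PARENT[node]
    return path

def score(weights, x, label):
    """Assumed discriminant: sum of per-node linear scores along the path to label."""
    return sum(
        sum(w_i * x_i for w_i, x_i in zip(weights[z], x))
        for z in ancestors(label)
    )

def predict(weights, x, leaves):
    """Pick the leaf category with the highest hierarchical score."""
    return max(leaves, key=lambda y: score(weights, x, y))

def tree_loss(y_true, y_pred):
    """Assumed task-specific loss: number of edges between two labels in the tree."""
    a, b = ancestors(y_true), ancestors(y_pred)
    common = len(set(a) & set(b))
    return (len(a) - common) + (len(b) - common)

# Made-up weights, one vector per non-root node; siblings share their parent's vector.
WEIGHTS = {
    "A": [1.0, 0.0], "B": [0.0, 1.0],
    "A1": [0.5, 0.0], "A2": [-0.5, 0.0], "B1": [0.0, 0.0],
}
LEAVES = ["A1", "A2", "B1"]

if __name__ == "__main__":
    x = [1.0, 0.2]
    print(predict(WEIGHTS, x, LEAVES))
    print(tree_loss("A1", "A2"), tree_loss("A1", "B1"))
```

Because sibling categories share the weight vectors of their common ancestors, a confusion between siblings costs less under this loss than a confusion across top-level branches, which is the kind of inter-class relationship a flat classifier ignores.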
