PAPER DIGEST
Most Influential ICML 2004 Paper · 2026-03 edition

Integrating Constraints And Metric Learning In Semi-supervised Clustering

Mikhail Bilenko; Sugato Basu; Raymond J. Mooney

Venue
International Conference on Machine Learning (ICML) 2004
Recognition
Most Influential ICML 2004 Paper (Rank No. 8)
Edition
2026-03
Impact factor
8
Certificate ID
8795708c318580e4

Abstract

Semi-supervised clustering employs a small amount of labeled data to aid unsupervised learning. Previous work in the area has utilized supervised data in one of two approaches: 1) constraint-based methods that guide the clustering algorithm towards a better grouping of the data, and 2) distance-function learning methods that adapt the underlying similarity metric used by the clustering algorithm. This paper provides new methods for the two approaches as well as presents a new semi-supervised clustering algorithm that integrates both of these techniques in a uniform, principled framework. Experimental results demonstrate that the unified approach produces better clusters than both individual approaches as well as previously proposed semi-supervised clustering algorithms.
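To make the two ingredients named in the abstract concrete, the following is a minimal sketch of how a single clustering cost might combine them. It is not the paper's algorithm or objective: the function names, the per-feature `weights` vector (standing in for a learned metric), and the uniform `penalty` per violated constraint are all assumptions made for illustration.

```python
def weighted_sq_dist(x, y, weights):
    """Squared Euclidean distance with one weight per feature.

    The weights stand in for a learned (diagonal) metric; this is an
    illustrative assumption, not the formulation from the paper.
    """
    return sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights))


def clustering_cost(points, labels, centroids, weights,
                    must_link, cannot_link, penalty=1.0):
    """Distortion under the weighted metric plus constraint penalties.

    must_link and cannot_link are lists of (i, j) index pairs into points.
    A hypothetical uniform penalty is charged for each violated pair.
    """
    # Metric part: distance of every point to its assigned centroid.
    cost = sum(weighted_sq_dist(p, centroids[l], weights)
               for p, l in zip(points, labels))
    # Constraint part: must-link pairs placed in different clusters.
    cost += penalty * sum(1 for i, j in must_link if labels[i] != labels[j])
    # Constraint part: cannot-link pairs placed in the same cluster.
    cost += penalty * sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return cost


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
centroids = [(0.0, 0.5), (5.0, 5.0)]
must_link = [(0, 1)]
cannot_link = [(0, 2)]

# Assignment that satisfies both constraints.
good = clustering_cost(points, [0, 0, 1], centroids, (1.0, 1.0),
                       must_link, cannot_link, penalty=2.0)
# Assignment that splits the must-link pair.
bad = clustering_cost(points, [0, 1, 1], centroids, (1.0, 1.0),
                      must_link, cannot_link, penalty=2.0)
print(good, bad)
```

In this toy setup an assignment step would prefer the labeling with the lower cost, and changing `weights` changes which features dominate the distance, which is the sense in which constraints and the metric interact within one cost.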
