PAPER DIGEST
Most Influential SIGIR 2009 Paper · 2026-03 edition

Fast Nonparametric Matrix Factorization For Large-scale Collaborative Filtering

Kai Yu; Shenghuo Zhu; John Lafferty; Yihong Gong

Venue: ACM SIGIR Conference (SIGIR) 2009
Recognition: Most Influential SIGIR 2009 Paper (Rank No. 15)
Edition: 2026-03
Impact factor: 4
Certificate ID: e8fd3adb93d18ed6

Abstract

With the sheer growth of online user data, it becomes challenging to develop preference learning algorithms that are sufficiently flexible in modeling but also affordable in computation. In this paper we develop nonparametric matrix factorization methods by allowing the latent factors of two low-rank matrix factorization methods, the singular value decomposition (SVD) and probabilistic principal component analysis (pPCA), to be data-driven, with the dimensionality increasing with data size. We show that the formulations of the two nonparametric models are very similar, and that their optimizations share similar procedures. Compared to traditional parametric low-rank methods, nonparametric models are appealing for their flexibility in modeling complex data dependencies. However, this modeling advantage comes at a computational price: the models are highly challenging to scale to large problems, which hampers their use in applications such as collaborative filtering. In this paper we introduce novel optimization algorithms that are simple to implement and that make learning both nonparametric matrix factorization models highly efficient on large-scale problems. Our experiments on EachMovie and Netflix, the two largest public benchmarks to date, demonstrate that the nonparametric models make more accurate predictions of user ratings and are computationally comparable, or sometimes even faster in training, in comparison with previous state-of-the-art parametric matrix factorization models.
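
For readers unfamiliar with the parametric low-rank baseline the abstract compares against, the sketch below fits a fixed-rank factorization to a handful of observed ratings by stochastic gradient descent. It is not the paper's nonparametric algorithm, and it is not taken from the paper; the function names (`train_mf`, `predict`, `rmse`), the hyperparameters, and the toy ratings are all assumptions made for illustration.

```python
import math
import random


def train_mf(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.02,
             epochs=200, seed=0):
    """Fit a fixed-rank model r_ui ~ mean + U[u] . V[i] on observed ratings.

    Illustrative parametric baseline only: the rank is chosen up front and
    does not grow with the data, unlike the nonparametric models above.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    mean = sum(r for _, _, r in ratings) / len(ratings)
    for _ in range(epochs):
        order = ratings[:]
        rng.shuffle(order)
        for u, i, r in order:
            pred = mean + sum(a * b for a, b in zip(U[u], V[i]))
            err = r - pred
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return mean, U, V


def predict(model, u, i):
    """Predicted rating of user u for item i."""
    mean, U, V = model
    return mean + sum(a * b for a, b in zip(U[u], V[i]))


def rmse(model, ratings):
    """Root-mean-square error of the model on a list of (u, i, r) triples."""
    sq = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq / len(ratings))


# Hypothetical toy data: (user, item, rating) triples; most of the 4x4
# rating matrix is observed, a few entries are left out.
ratings = [
    (0, 0, 5), (0, 2, 1), (0, 3, 1),
    (1, 0, 5), (1, 1, 5), (1, 2, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 1, 1), (3, 2, 5), (3, 3, 5),
]

baseline = train_mf(ratings, n_users=4, n_items=4, epochs=0)  # untrained
model = train_mf(ratings, n_users=4, n_items=4)

print("baseline RMSE:", rmse(baseline, ratings))
print("trained RMSE: ", rmse(model, ratings))
print("held-out guess for user 0, item 1:", predict(model, 0, 1))
```

Only observed entries enter the loss, which is what keeps this kind of model affordable on sparse rating data; the fixed `rank` argument is the parametric restriction that the nonparametric formulations relax.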
