Exploring The Upper Limits of Text-Based Collaborative Filtering Using Large Language Models: Discoveries and Insights

Ruyu Li, Wenhao Deng, Yu Cheng, Zheng Yuan, Jiaqi Zhang, Fajie Yuan

Venue: ACM Conference on Information and Knowledge Management (CIKM) 2025
Recognition: Most Influential CIKM 2025 Paper (Rank No. 2)
Edition: 2026-03
Impact factor: 3
Certificate ID: 8e2d83f344095333

Abstract

Text-based collaborative filtering (TCF) has emerged as the prominent technique for text and news recommendation, employing language models (LMs) as text encoders to represent items. However, the current landscape of TCF models mainly relies on the utilization of relatively small or medium-sized LMs. The potential impact of using larger, more powerful language models (such as these with over 100 billion parameters) as item encoders on recommendation performance remains uncertain. Can we anticipate unprecedented results and discover new insights?To address this question, we undertake a comprehensive series of experiments aimed at exploring the performance limits of the TCF paradigm. Specifically, we progressively augment the scale of item encoders, ranging from one hundred million to one hundred billion parameters, in order to reveal the scaling limits of the TCF paradigm. Moreover, we investigate whether these exceptionally large LMs have the potential to establish a universal item representation for the recommendation task, thereby revolutionizing the traditional ID paradigm, which is considered a significant obstacle to developing transferable ''one model fits all'' recommender models. Our study not only demonstrates positive results but also uncovers unexpected negative outcomes, illuminating the current state of the TCF paradigm within the community. These findings will evoke deep reflection and inspire further research on text-based recommender systems.

Download PDF certificate