Optimization Methods for Personalizing Large Language Models Through Retrieval Augmentation

Alireza Salemi; Surya Kallumadi; Hamed Zamani

Venue: ACM SIGIR Conference (SIGIR) 2024
Recognition: Most Influential SIGIR 2024 Paper (Rank No. 11)
Edition: 2026-03
Impact factor: 4
Certificate ID: 48e800ab37cf1234

Abstract

This paper studies retrieval-augmented approaches for personalizing large language models (LLMs), which could have a substantial impact on various applications and domains. We present the first attempt to optimize the retrieval models that deliver a limited number of personal documents to LLMs for personalized generation. We develop two optimization algorithms that solicit feedback from the downstream personalized generation tasks for retrieval optimization: one based on reinforcement learning, whose reward function can be defined using any metric for personalized generation, and another based on knowledge distillation from the downstream LLM to the retrieval model. This paper also introduces a pre- and post-generation retriever selection model that decides which retriever to choose for each LLM input. Extensive experiments on diverse tasks from the language model personalization (LaMP) benchmark reveal statistically significant improvements in six out of seven datasets.
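
The abstract does not give the exact formulation of the two training signals, so the following is only a generic sketch of how such signals are commonly set up, not the authors' implementation. It assumes the retriever assigns one score per candidate personal document, that the reinforcement learning variant uses a REINFORCE-style policy gradient over a softmax of those scores, and that the distillation variant minimizes a KL divergence toward a target distribution derived from per-document LLM scores. All function and parameter names (`softmax`, `reinforce_grad`, `distillation_loss`, `baseline`, `llm_scores`) are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw per-document scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_grad(scores, sampled_idx, reward, baseline=0.0):
    """Gradient of (reward - baseline) * log p(sampled_idx) w.r.t. the scores.

    `reward` stands in for any downstream generation metric (assumption);
    `baseline` is an optional variance-reduction term (assumption).
    """
    probs = softmax(scores)
    advantage = reward - baseline
    return [
        advantage * ((1.0 if i == sampled_idx else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def distillation_loss(retriever_scores, llm_scores):
    """KL(target || retriever), with the target built from LLM-derived scores.

    `llm_scores` is assumed to hold one usefulness score per document,
    e.g. how well the LLM performs when given that document.
    """
    target = softmax(llm_scores)
    predicted = softmax(retriever_scores)
    return sum(
        t * math.log(t / q) for t, q in zip(target, predicted) if t > 0
    )
```

In this sketch, a gradient ascent step on the scores with `reinforce_grad` raises the probability of documents whose sampled use earned an above-baseline reward, while `distillation_loss` is zero when the retriever already ranks documents the same way the LLM-derived target does.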
