PAPER DIGEST
Most Influential SIGMOD 2024 Paper · 2026-03 edition

Vexless: A Serverless Vector Data Management System Using Cloud Functions

Yongye Su; Yinqi Sun; Minjia Zhang; Jianguo Wang

Venue
ACM SIGMOD Conference (SIGMOD) 2024
Recognition
Most Influential SIGMOD 2024 Paper (Rank No. 12)
Edition
2026-03
Impact factor
3
Certificate ID
264937412a925f05

Abstract

Cloud functions, exemplified by AWS Lambda and Azure Functions, are emerging as a new computing paradigm in the cloud. They provide elastic, serverless, and low-cost cloud computing, making them highly suitable for bursty and sparse workloads, which are quite common in practice. Thus, there is a new trend in designing data systems that leverage cloud functions. In this paper, we focus on vector databases, which have recently gained significant attention partly due to large language models. In particular, we investigate how to use cloud functions to build high-performance and cost-efficient vector databases. This presents significant challenges in terms of how to perform sharding, how to reduce communication overhead, and how to minimize cold-start times.In this paper, we introduce Vexless, the first vector database system optimized for cloud functions. We present three optimizations to address the challenges. To perform sharding, we propose a global coordinator (orchestrator) that assigns workloads to Cloud function instances based on their available hardware resources. To overcome communication overhead, we propose the use of stateful cloud functions, eliminating the need for costly communications during synchronization. To minimize cold-start overhead, we introduce a workload-aware Cloud function lifetime management strategy. Vexless has been implemented using Azure Functions. Experimental results demonstrate that Vexless can significantly reduce costs, especially on bursty and sparse workloads, compared to cloud VM instances, while achieving similar or higher query performance and accuracy.

Download PDF certificate