PAPER DIGEST
Most Influential SIGMOD 2009 Paper · 2026-03 edition

Scalable Join Processing On Very Large RDF Graphs

Thomas Neumann; Gerhard Weikum

Venue: ACM SIGMOD Conference (SIGMOD) 2009
Recognition: Most Influential SIGMOD 2009 Paper (Rank No. 6)
Edition: 2026-03
Impact factor: 5
Certificate ID: 5fd0d8154650b603

Abstract

With the proliferation of the RDF data format, engines for RDF query processing are faced with very large graphs that contain hundreds of millions of RDF triples. This paper addresses the resulting scalability problems. Recent prior work along these lines has focused on indexing and other physical-design issues. The current paper focuses on join processing, as the fine-grained and schema-relaxed use of RDF often entails star- and chain-shaped join queries with many input streams from index scans. We present two contributions for scalable join processing. First, we develop very light-weight methods for sideways information passing between separate joins at query run-time, to provide highly effective filters on the input streams of joins. Second, we improve previously proposed algorithms for join-order optimization by more accurate selectivity estimations for very large RDF graphs. Experimental studies with several RDF datasets, including the UniProt collection, demonstrate the performance gains of our approach, outperforming the previously fastest systems by more than an order of magnitude.
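
To make the first contribution more concrete, the following is a minimal toy sketch of the skipping idea behind sideways information passing. It is not the paper's algorithm: the abstract describes passing filters between separate joins at run time, whereas this example only shows two sorted input streams of one merge join using each other's current value as a hint to skip non-matching entries. The function name `merge_join_with_skips`, the use of in-memory Python lists in place of index scans, and the `inspected` counter are all assumptions made for illustration.

```python
from bisect import bisect_left


def merge_join_with_skips(left, right):
    """Join two sorted lists of unique ids (hypothetical stand-ins for index scans).

    Returns (matches, inspected), where `inspected` counts loop iterations,
    i.e. how many value pairs were actually compared.
    """
    i = j = 0
    matches = []
    inspected = 0
    while i < len(left) and j < len(right):
        inspected += 1
        a, b = left[i], right[j]
        if a == b:
            matches.append(a)
            i += 1
            j += 1
        elif a < b:
            # Hint from the right stream: nothing below b can match,
            # so jump the left stream directly to the first value >= b.
            i = bisect_left(left, b, i + 1)
        else:
            # Symmetric hint from the left stream.
            j = bisect_left(right, a, j + 1)
    return matches, inspected


# A large stream joined against a very selective one.
big = list(range(1000))
small = [5, 500, 999]
result, inspected = merge_join_with_skips(big, small)
print(result, inspected)
```

In this toy setting the selective stream lets the join compare only a handful of value pairs instead of stepping through all 1000 entries of the large stream, which is the kind of input filtering the abstract attributes to sideways information passing.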
