PAPER DIGEST
Most Influential SIGCOMM 2011 Paper · 2026-03 edition

Managing Data Transfers in Computer Clusters with Orchestra

Mosharaf Chowdhury; Matei Zaharia; Justin Ma; Michael I. Jordan; Ion Stoica

Venue: ACM SIGCOMM Conference (SIGCOMM) 2011
Recognition: Most Influential SIGCOMM 2011 Paper (Rank No. 3)
Edition: 2026-03
Impact factor: 8
Certificate ID: 70a450de6f3b6cf2

Abstract

Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this impact, there has been relatively little work on optimizing the performance of these data transfers, with networking researchers traditionally focusing on per-flow traffic management. We address this limitation by proposing a global management architecture and a set of algorithms that (1) improve the transfer times of common communication patterns, such as broadcast and shuffle, and (2) allow scheduling policies at the transfer level, such as prioritizing a transfer over other transfers. Using a prototype implementation, we show that our solution improves broadcast completion times by up to 4.5X compared to the status quo in Hadoop. We also show that transfer-level scheduling can reduce the completion time of high-priority transfers by 1.7X.
