Groupe d’études et de recherche en analyse des décisions

Optimizing Data Placement for Distributed Computation

Lukasz Golab University of Waterloo, Canada


I will discuss the following problem: given a set of data items, a set of tasks that reference these data items, and a set of servers with finite storage and processing capacities, allocate the data items to the servers in a way that minimizes the amount of data that needs to be transferred among servers during task execution. This problem arises in many practical scenarios including cloud databases. I will show that this problem can be reduced to the well-studied graph partitioning problem, which is NP-hard, but for which efficient approximation algorithms exist. I will also discuss how to handle load balancing and data replication. This is joint work with Marios Hadjieleftheriou (AT&T), Howard Karloff (Yahoo!) and Barna Saha (AT&T).

Ce séminaire est organisé conjointement avec la section Montréal de la SCRO et financièrement appuyé par le programme de conférenciers itinérants de la SCRO.