Inferring and Concealing Access Patterns in Distributed Systems

Students: Rick Skowyra, Vatche Ishakian, William Blair, Jarib Rahman
PIs: Azer Bestavros

In a distributed setting, it is often desirable to infer (and/or hide) information regarding the access patterns to specific services from potential adversaries. In this project we focus on a number of instances of this problem as they relate to cloud computing and peer-to-peer distributed applications. For cloud computing, we investigate whether it is possible for a VM to infer the characteristics of other co-tenant workloads (VMs) that are co-hosted on the same infrastructure. For peer-to-peer distributed applications, we investigate whether it is possible for a peer to identify important aspect of access patterns, such as the access frequency in content distribution and information retrieval systems. The ability to infer such characteristics could be at once useful (e.g., for capacity planning purposes) and harmful (e.g., to fingerprint collocated workloads for adversarial purposes). The research pursued as part of this project aims to develop both the inference capabilities as well as the counter-measures that make such inference provably/measurably hard. Additionally, the research aims to investigate the tradeoffs between performance and achievable level of concealment.

Angels in The Cloud:
On Demand Peer-Assisted Content Distribution Cloud Service

Students: Raymond Sweha, Vatche Ishakian
PIs: Azer Bestavros, John Byers
This project develops a Cloud Service for Internet content distribution. Our system assists Seeders (content originators) with the dissemination of content (a file) to a set of nodes using Peer-to-Peer (P2P) concepts so that this dissemination is completed in the minimum time possible. Prior results of ours suggest that minimizing content distribution time may be achieved by adding nodes that are not themselves interested in downloading the content, but rather in assisting other nodes with their download in a prescribed (provably optimal) fashion. We call such nodes Angels. The emerging cloud computing architecture offers the best mechanism to allow such Angels to be created on-the-fly. As seeders request assistance with their file distribution, our service responds by spawning virtual machines that act as Angels. In this work, we describe the design and implementation of our "Angels-on-Demand" cloud service as well as the API for invoking this service.

Generalized Centrality Measures with Applications to Cloud Facility Location

Students: Vatche Ishakian and Dora Erdos
PIs: Azer Bestavros and Evimaria Terzi

Traditionally, different measures of centrality have been defined for individual nodes and links in information networks and distributed systems. Such measures enable applications to compare nodes and links to each other with respect to a particular interesting property of the network. Increasingly, emerging applications may need to assess the centrality of groups of nodes within the network. For example, interesting questions might include ”What is the best group of nodes to carry out inspection, monitoring, sampling, aggregation, or de-duplication in a content flow network?” or ”What is the minimum group of nodes to intercept all possible communication between a specific set of sources and destinations?” To answer these and many similar questions, this project defines novel ways to express group centrality objectives for flow and access networks, and develops algorithmic techniques that can be implemented for use on large-scale networks.

Zenith Attacks and Countermeasures

Students: Rick Skowyra
PIs: Azer Bestavros and Sharon Goldberg

This project identifies and defines Zenith attacks, a new class of attacks on content-distribution systems, which seek to expose the popularity (i.e. access frequency) of individual items of content. As the access pattern to most real-world content exhibits Zipf-like characteristics, there is a small set of dominating items which account for the majority of accesses. Identifying such items enables an adversary to perform follow up adversarial actions targeting these items, including mounting denial of service attacks, deploying censorship mechanisms, and eavesdropping on or prosecution of the host or recipient. We instantiate a Zenith attack on the Kademlia and Chord structured overlay networks and quantify the cost of such an attack. As a countermeasure to these attacks we propose Crypsis, a system to conceal the lookup frequency of individual keys through aggregation over ranges of the keyspace. Crypsis provides provable security guarentees for concealment of lookup frequency while maintaining logarithmic routing and state bounds.

Colocation as a Service (CaaS)
Strategic and Operational Services for Cloud Colocation

Students: Vatche Ishakian, Raymond Sweha, Jorge Londoño
PIs: Azer Bestavros

By colocating with other tenants of an Infrastructure as a Service (IaaS) offering, IaaS users could reap significant cost savings by judiciously sharing their use of the fixed-size instances offered by IaaS providers. This work presents the blueprints of a Colocation as a Service (CaaS) framework. CaaS strategic services identify coalitions of selfinterested users that would benefit from colocation on shared instances. CaaS operational services provide the information necessary for, and carry out the reconfigurations mandated by strategic services. CaaS could be incorporated into an IaaS offering by providers; it could be implemented as a valueadded proposition by IaaS resellers; or it could be directly leveraged in a peer-to-peer fashion by IaaS users. To establish the practicality of such offerings, this paper presents XCS – a prototype implementation of CaaS on top of the Xen hypervisor. XCS makes specific choices with respect to the various elements of the CaaS framework: it implements strategic services based on a game-theoretic formulation of colocation; it features novel concurrent migration heuristics which are shown to be efficient; and it offers monitoring and accounting services at both the hypervisor and VM layers. Extensive experimental results obtained by running PlanetLab trace-driven workloads on the XCS prototype confirm the premise of CaaS – by demonstrating the efficiency and scalability of XCS, and by quantifying the potential cost savings accrued through the use of XCS.

Colocation of Real-Time Systems

Students: Vatche Ishakian
PIs: Azer Bestavros, Assaf Kfoury

Desirable application performance is typically guaranteed through the use of Service Level Agreements (SLAs) that specify fixed fractions of resource capacities that must be allocated for unencumbered use by the application. The mapping between what constitutes desirable performance and SLAs is not unique: multiple SLA expressions might be functionally equivalent. Having the flexibility to transform SLAs from one form to another in a manner that is provably safe would enable hosting solutions to achieve significant efficiencies. This work tries to demonstrate the promise of such an approach by proposing a type-theoretic framework for the representation and safe transformation of SLAs.

Embedding Games

Students: Jorge Londoño
PIs: Azer Bestavros, Shang-Hua Teng

Distributed computing infrastructures are ubiquitous: applications running on the web rely on Cloud infrastructures, scientists run resource intensive applications on Grid systems, and network architects develop next generation protocols and services on testbeds such as PlanetLab, Emulab/Netbed, and GENI. A common challenge facing all these as well as other systems is finding and allocating the resources needed in support of such applications and services. This problem is generally known as the network embedding problem.

Embedding Games arise when multiple competing agents engage in solving the network embedding problem, with each agent trying to maximize its own benefit. From a modeling and analysis perspective, important questions that arise in this context include determining whether and under what conditions do equilibria governing agent interactions exist, whether convergence to such equilibria is assured, and the extent of the inefficiencies associated with these equilibria. From a system design perspective, the main challenge is that of designing mechanisms that provide incentives that ensure that the choices of the non-cooperating, selfish agents engaged in an embedding game lead to an efficient allocation of resources from a system-wide perspective.