Algorithms and Systems for MapReduce and Beyond (BeyondMR) is a workshop for research at the frontier of large-scale computations, with topics ranging from algorithms, to computational models, to the systems themselves. BeyondMR will be held in conjunction with SIGMOD/PODS 2016 in San Francisco, USA, on Friday, July 1, 2016.

The BeyondMR workshop aims to explore algorithms, computational models, architectures, languages, and interfaces for systems that need large-scale parallelization and for systems designed to support efficient parallelization and fault tolerance. These include specialized programming and data-management systems based on MapReduce and its extensions, graph processing systems, and data-intensive workflow and dataflow systems.

We invite submissions on topics such as:
  • Frameworks for Large-Scale Analytical Processing
  • Algorithms for Large-Scale Data Processing
  • Cost Models and Optimization Techniques
  • Resource Management for Many-Task Computing

Paper submission deadline extended to: Wed. March 19, 2016 (strict)


Author: Ion Stoica, AMPLab, University of California, Berkeley
Title: Spark: Past, Present, and Future
Abstract: Almost six years ago we started the Spark project at UC Berkeley. Spark is a cluster computing engine that is optimized for in-memory processing and unifies support for a variety of workloads, including batch, interactive querying, streaming, and iterative computations. Spark is now the most active big data project in the open source community, and is already being used by over one thousand organizations. In this talk, I'll take a look back at Spark's humble beginnings and discuss its current status and the new and exciting developments that are coming up.

Author: Carlos Guestrin, University of Washington
Title: Big Data, Small Cluster: Choosing “big memory” (RAM, disks, SSDs) over big clusters
Abstract: TBA

BeyondMR is made possible with the generous support of Google.

Previous editions of BeyondMR were held in 2014 and 2015 together with EDBT/ICDT.