LAbel-Based I/O System

In the era of data-intensive computing, large-scale applications in both the scientific and BigData communities demonstrate unique I/O requirements, leading to a variety of storage solutions that are often incompatible with one another. How can we support a wide variety of conflicting I/O workloads under a single storage system?

We introduce the idea of a Label, a new data representation, and we present LABIOS, a new, distributed, Label-based I/O system. LABIOS boosts I/O performance by up to 17x via asynchronous I/O, supports heterogeneous storage resources, offers storage elasticity, and promotes in-situ analytics via data provisioning. LABIOS demonstrates the effectiveness of storage consolidation to support the convergence of HPC and BigData workloads on a single platform.

  • Applications create I/O requests, called Labels
  • A Label is practically a placeholder for an I/O job:
    • {operation + pointer to data}
  • Labels are pushed into a distributed queue
  • Workers execute Labels independently
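
The flow in the list above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the LABIOS implementation: the `Label` fields, the `run_labels` helper, the in-process `queue.Queue` standing in for the distributed queue, and the dictionary standing in for storage are all hypothetical names chosen for this example.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Label:
    """Hypothetical Label: an operation plus a reference to the data."""
    operation: str   # e.g. "write"
    target: str      # destination name (assumed field, for illustration)
    data: bytes      # Python reference to the buffer, i.e. the "pointer to data"


def worker(labels, store, errors, lock):
    """Pull Labels from the queue and execute each one independently."""
    while True:
        label = labels.get()
        if label is None:            # sentinel: no more work for this worker
            labels.task_done()
            return
        with lock:
            if label.operation == "write":
                store[label.target] = label.data
            else:
                # This toy only knows "write"; record anything else.
                errors.append(label)
        labels.task_done()


def run_labels(submitted, num_workers=4):
    """Push Labels into a queue and let a pool of workers drain it."""
    labels = queue.Queue()           # stand-in for the distributed queue
    store, errors = {}, []           # dict stands in for the storage backend
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(labels, store, errors, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for label in submitted:
        labels.put(label)            # returns at once: the caller does not wait for the I/O
    labels.join()                    # block only when completion is actually needed
    for _ in threads:
        labels.put(None)             # one sentinel per worker
    for t in threads:
        t.join()
    return store, errors


if __name__ == "__main__":
    batch = [Label("write", f"file{i}", f"data{i}".encode()) for i in range(8)]
    store, errors = run_labels(batch, num_workers=3)
    print(len(store), "labels executed,", len(errors), "rejected")
```

Because the application only enqueues Labels and the workers pick them up on their own, the number of workers can be changed without touching the code that creates the requests.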

Agile:

    • Adaptive to the environment
    • Fully decoupled architecture

Software Defined Storage (SDS):

    • Offloading computation to servers
    • Data-centric architecture

Energy-aware:

    • Power-cap I/O
    • Elastic I/O resources

Reactive:

    • Tunable I/O performance - Concurrency control
    • Guaranteed Storage QoS based on job size

Flexible:

    • POSIX, MPI-IO, HDF5, REST/Swift, Hadoop
    • Lustre, GPFS, HDFS, Hive, Object Stores

Challenges

LABIOS Architecture

LABIOS Internal Design

LABIOS Client

LABIOS Core

LABIOS Server

Anatomy of LABIOS Operations


More coming soon...