FTXS 2016 - Kyoto, Japan

The 6th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS) 2016

LAST MINUTE UPDATE: HPDC registration opens at noon, but FTXS starts at 10:50 am on Tuesday, May 31st. HPDC has said that attendees can pick up their badges during a break, so please don't let the lack of a badge stop you from coming to FTXS in the late morning.

UPDATE: Submission deadline extended to February 19th, 2016 (see below for instructions)

WHEN? Tuesday, May 31st, 2016

WHERE? Kyoto, Japan

VENUE?

ROOM?

IN ASSOCIATION WITH?

REGISTER: See HPDC web site

PAST FTXSs: See sidebar for 2010, 2012, 2013, 2014, and 2015

CALL FOR PAPERS: FTXS 2016 Call for Papers (CFP) available on Google Drive

Workshop Keynote

Fumiyoshi Shoji - RIKEN, Advanced Institute for Computational Science

Director - Operations and Computer Technologies Division

More information to be announced later

Workshop Agenda

Six high-quality papers, plus a keynote by Fumiyoshi Shoji, Director of the Operations and Computer Technologies Division at RIKEN's Advanced Institute for Computational Science, make for an exciting day at FTXS 2016!

Submission Essential Information

Submissions are expected in the following categories:

  • Regular papers presenting innovative ideas that improve the state of the art, and experience papers discussing issues seen on existing extreme-scale systems, including some form of analysis and evaluation

  • Extended abstracts proposing disruptive ideas in the field, including some form of preliminary results

Extended abstracts will be evaluated separately and given shorter oral presentations.

Authors are invited to submit papers presenting unpublished, original work, with a maximum of eight (8) pages for regular papers and four (4) to six (6) pages for extended abstracts. Please follow the US Letter guidelines for ACM Proceedings Style.

Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the IEEE digital library. Submission implies the willingness of at least one of the authors to register and present the paper.

Submit a paper.

Important Dates

Submission of papers: February 13th, 2016 (EXTENDED: February 19th, 2016)

Author notification: March 12th, 2016 (EXTENDED: March 21st, 2016, due to the submission extension)

Camera-ready papers: March 27th, 2016 (UPDATE FROM HPDC: April 19th, 2016)

Workshop: May 31st, 2016

Workshop Topics

Topics include, but are not limited to:

    • Failure data analysis and field studies

    • Power, performance, resilience (PPR) assessments / tradeoffs

    • Novel fault-tolerance techniques and implementations

    • Emerging hardware and software technology for resilience

    • Silent data corruption (SDC) detection / correction techniques

    • Advances in reliability monitoring, analysis, and control of highly complex systems

    • Failure prediction, error preemption, and recovery techniques

    • Fault-tolerant programming models

    • Models for software and hardware reliability

    • Metrics and standards for measuring, improving, and enforcing effective fault-tolerance

    • Scalable Byzantine fault-tolerance and security from single-fault and fail-silent violations

    • Atmospheric evaluations relevant to HPC systems (terrestrial neutrons, temperature, voltage, etc.)

    • Near-threshold-voltage implications and evaluations for reliability

    • Benchmarks and experimental environments including fault injection

    • Frameworks and APIs for fault-tolerance and fault management

Workshop Chairs

Nathan DeBardeleben – Los Alamos National Laboratory

Workshop Organizing Committee

Keita Teranishi – Sandia National Laboratories

Atsushi Hori – RIKEN AICS

Program Committee

Leonardo Bautista Gomez – Barcelona Supercomputing Center

Bogdan Nicolae – IBM Ireland

Aurélien Bouteiller – University of Tennessee Knoxville

Henri Casanova – University of Hawai`i at Manoa

Zizhong Chen – University of California, Riverside

Robert Clay – Sandia National Laboratories

John Daly – Department of Defense

James Elliott – Sandia National Laboratories

Christian Engelmann – Oak Ridge National Laboratory

Kurt Ferreira – Sandia National Laboratories

Qiang Guan – Los Alamos National Laboratory

Sudhanva Gurumurthi – IBM

Saurabh Hukerikar – Oak Ridge National Laboratory

Hideyuki Jitsumoto – Tokyo Institute of Technology

Zhiling Lan – Illinois Institute of Technology

Scott Levy – University of New Mexico

Naoya Maruyama – RIKEN AICS

Yves Robert – ENS Lyon

Anthony Skjellum – Auburn University

Vilas Sridharan – AMD, Inc.

Peter Strazdins – Australian National University

Abhinav Vishnu – Pacific Northwest National Laboratory