FTXS 2020 - Atlanta, GA (VIRTUAL)

Held in conjunction with SC20

The 10th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS) 2020

Virtual Presentation Guidelines

    • Based on recommendations from the SC Organizing Committee, all presentations will be pre-recorded. Authors of accepted papers must submit their pre-recorded presentation no later than October 9, 2020.

    • Recordings must be uploaded as MP4 files. The following video walks through how to record your presentation in PowerPoint: 04_How to Record Video and Audio in PowerPoint.mp4. UPDATE 10/5/20: Closed captioning is no longer required for your presentation recording. The original requirement, now withdrawn, was that closed captioning (CC) information also be uploaded as a VTT file. For presenters who still wish to add CC, the process is explained in: 05_Adding Closed Captions to Your Recording Using VTT.mp4. In a nutshell, there's an online tool that will automatically extract captions from your recording.

    • Presenters will need to sign the Video Consent and Release. The deadline for completing this form has not yet been announced, but it will likely need to be completed no later than when you upload your video files.

    • Additional information is available in: 02B_Tutorials-Workshops_Virtual Presentation Guidelines and Requirements v2.pdf. This document presents some nitty-gritty technical information, e.g., details on resolution, bitrate, aspect ratio, and filenames. Some of these details are presented in the list below, but the referenced document is the ultimate authority on the requirements.

        • FILENAME: for this workshop, video (MP4) files should be named using the following format:

          2020.11.11_1000_<submissionID>_Workshop_<Last_name>_<First_name>.mp4

          where <submissionID> is the identifier assigned to your paper when it was submitted (e.g., ws_ftxs101s1), <Last_name> is the last (or family) name of the presenter, and <First_name> is the first (or given) name of the presenter.

        • VIDEO BITRATE: 5-10 Mbps

        • AUDIO BITRATE: 160-256 Kbps

        • ASPECT RATIO: the aspect ratio of the presentation should be 16:9

    • One note of caution: the referenced document mixes requirements for presenters with requirements for organizers. Presenters can, for example, safely ignore the discussion of the Run of Show.
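
As a convenience, the filename format and the numeric limits listed above can be checked with a short script before uploading. The following is only a minimal sketch, not an official SC20 tool: the function names are hypothetical, the presenter name "Jane Doe" is a made-up example, and replacing spaces in names with underscores is an assumption that the guidelines do not state. The referenced PDF remains the authority on the requirements.

```python
import re

# Fixed date/time prefix taken from the filename format given for this workshop.
PREFIX = "2020.11.11_1000"


def presentation_filename(submission_id: str, last_name: str, first_name: str) -> str:
    """Build a video filename following the format listed in the guidelines."""
    cleaned = []
    for part in (submission_id, last_name, first_name):
        part = part.strip()
        if not part:
            raise ValueError("submission ID and presenter names must be non-empty")
        # Assumption (not stated in the guidelines): join multi-word names with "_".
        cleaned.append(re.sub(r"\s+", "_", part))
    sid, last, first = cleaned
    return f"{PREFIX}_{sid}_Workshop_{last}_{first}.mp4"


def bitrates_in_range(video_mbps: float, audio_kbps: float) -> bool:
    """Check the listed ranges: video 5-10 Mbps, audio 160-256 Kbps."""
    return 5 <= video_mbps <= 10 and 160 <= audio_kbps <= 256


def is_16_by_9(width: int, height: int) -> bool:
    """Check that a frame size has a 16:9 aspect ratio using integer arithmetic."""
    return width > 0 and height > 0 and width * 9 == height * 16


if __name__ == "__main__":
    # "ws_ftxs101s1" is the example submission ID from the guidelines;
    # the presenter name is hypothetical.
    print(presentation_filename("ws_ftxs101s1", "Doe", "Jane"))
    print(bitrates_in_range(video_mbps=8, audio_kbps=192))
    print(is_16_by_9(1920, 1080))
```

How the bitrate and frame size of a recording are measured is left open here; those values would come from whatever recording or inspection tool the presenter uses.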

Workshop Program

FTXS 2020 will be held online on Wednesday, November 11. The schedule is provided below. All times are in Eastern Standard Time (the timezone of Atlanta, GA, USA). The name of the presenter of each paper is in italics.

Essential Submission Information

UPDATES

    • The submission deadline has been EXTENDED. Submissions are now due September 7, 2020 (anywhere-on-earth). This is the final deadline.

    • FTXS will be entirely VIRTUAL this year. We will still publish proceedings, but the event will be held exclusively online. We will provide details on this web page as we get closer to the event.

Workshop Topics

Authors are invited to submit original papers on the research and practice of fault-tolerance in extreme-scale distributed systems (primarily HPC systems, but including grid and cloud systems). Resilience and fault-tolerance remain major concerns for supercomputing, and advances in this area are needed so that applications can compute accurate answers (or answers within an acceptable error tolerance) in a timely and efficient manner despite degradations or failures of platform components (both hardware and software).

Topics include, but are not limited to:

    • Failure data analysis and field studies

    • Power, performance, resilience (PPR) assessments / tradeoffs

    • Novel fault-tolerance techniques and implementations

    • Emerging hardware and software technology for resilience

    • Silent data corruption (SDC) detection / correction techniques

    • Advances in reliability monitoring, analysis, and control of highly complex systems

    • Failure prediction, error preemption, and recovery techniques

    • Fault-tolerant programming models

    • Models for software and hardware reliability

    • Metrics and standards for measuring, improving, and enforcing effective fault-tolerance

    • Scalable Byzantine fault-tolerance and security from single-fault and fail-silent violations

    • Atmospheric evaluations relevant to HPC systems (terrestrial neutrons, temperature, voltage, etc.)

    • Near-threshold-voltage implications and evaluations for reliability

    • Benchmarks and experimental environments including fault injection

    • Frameworks and APIs for fault-tolerance and fault management

Submission Details

Submissions are solicited in the following categories:

    • Regular papers presenting innovative ideas improving the state of the art or discussing the issues seen on existing extreme-scale systems, including some form of analysis and evaluation.

    • Extended abstracts proposing disruptive ideas and challenging assumptions in the field, including some form of preliminary results.

Extended abstracts will be evaluated separately and given shorter oral presentations.

Submissions shall be sent electronically and must conform to the IEEE proceedings style. Regular papers should not exceed ten (10) pages including all text, appendices, figures, and references. Extended abstract papers should not exceed six (6) pages. Please note that we have only placed a limit on the maximum number of pages that a submission may contain. Papers that are clear, coherent, and complete (with the understanding that the submission may represent a work-in-progress) but are shorter than this maximum are encouraged.

Papers should be submitted to: https://submissions.supercomputing.org. A sample submission form is available here.

Submitted papers will be peer-reviewed and will receive a minimum of three reviews.

Authors are encouraged to include reproducibility artifacts as described on the conference website:

https://sc20.supercomputing.org/submit/sc-reproducibility-initiative

Inclusion of reproducibility artifacts is optional.

Important Dates

Paper submission opens: July 1, 2020

Paper submission closes: September 7, 2020 (anywhere-on-earth); previously September 4, 2020 and August 28, 2020

Author notification: September 28, 2020 (anywhere-on-earth)

Presentation recording uploaded: October 9, 2020

Copyright agreement completed: October 18, 2020 (previously October 7, 2020)

Camera ready papers: October 18, 2020 (previously October 11, 2020)

Workshop: November 11, 2020

Workshop Co-chairs

Scott Levy - Sandia National Laboratories

Nathan DeBardeleben - Los Alamos National Laboratory

Workshop Organizing Committee

Keita Teranishi - Sandia National Laboratories

John Daly - Laboratory for Physical Sciences

Program Committee

Leonardo Bautista-Gomez - Barcelona Supercomputing Center

Aurelien Bouteiller - University of Tennessee

Chris Cantwell - Imperial College London

Sunita Chandrasekaran - University of Delaware

Florina M. Ciorba - University of Basel

James Elliott - Sandia National Laboratories

Christian Engelmann - Oak Ridge National Laboratory

Wilfried Gansterer - University of Vienna

Qiang Guan - Kent State University

Zhiling Lan - Illinois Institute of Technology

Jackson Mayo - Sandia National Laboratories

Bogdan Nicolae - Argonne National Laboratory

Yves Robert - ENS Lyon, University of Tennessee

Abhinav Vishnu - Advanced Micro Devices (AMD) Inc.

Panruo Wu - University of Houston

Diversity & Inclusivity

As part of SC20, FTXS is fully committed to addressing diversity and inclusivity at our workshop and in the larger HPC fault tolerance community (see here for more information on SC20's commitment to inclusivity and diversity). As a first step, we used an anonymous survey to collect demographic information about our Program Committee so that we can measure our progress and be held accountable by the HPC community. The results of this survey are included below. Because our committee comprises a relatively small number of people, we are not releasing exact numbers in an effort to protect their privacy.

    • Approximately 90% of our Program Committee completed our anonymous demographic survey

    • Location:

      • Approximately 3/5 of respondents work in North America

      • Approximately 1/3 of respondents work in Europe

      • Approximately 1/10 of respondents work in Asia

    • Gender:

      • Approximately 4/5 of respondents identify as male

      • Approximately 1/5 of respondents identify as female

      • No respondent identified as non-binary

    • Racial and Ethnic Groups:

      • Approximately 1/5 of respondents identify as a member of an underrepresented racial or ethnic group

      • Approximately 4/5 of respondents do not identify as a member of an underrepresented racial or ethnic group