ML-Based Data Prefetching Competition

Final Results (presented at ISCA 2021)

Call for Papers

The ML For Computer Architecture and Systems Workshop invites submissions for a new “ML Prefetching Competition.” Contestants will be provided with a set of training traces on which to build their best prefetching algorithms, and submissions will be evaluated with a common methodology on a set of undisclosed traces (details below). Given the context of the workshop, we encourage submissions that use machine-learning-based models, but this is not mandatory. Submissions will be chosen to appear in the workshop based on their performance on the undisclosed test set and their novelty.

Objective

The goal of the competition is to explore ideas from machine learning that can advance the state of the art in data prefetching. Because evaluating machine learning models for data prefetching presents many challenges, the organizers will provide a common evaluation framework, including traces for several memory-intensive benchmarks. The competition will not produce practical hardware data prefetchers (see the disclaimer below), but we hope that it will generate new ideas on how to develop high-performance, next-generation data prefetchers.

Format

Input Traces: We will provide memory traces for several memory-intensive benchmarks from the SPEC 2006, SPEC 2017, and GAP benchmark suites. Each trace contains a sequence of memory loads at 64-byte granularity, collected at the L3 level, so the stream is filtered by the L2 cache. Each memory access is associated with the following information:

  • Unique instruction ID of the load

  • CPU cycle timestamp of the load

  • Physical address of the memory request

  • Instruction pointer (PC) of the load

  • LLC hit/miss
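
As a concrete illustration, the sketch below shows one way to parse such a trace in Python. The exact file format will be specified with the trace release; the comma-separated layout, field order, and hex-encoded addresses here are assumptions for illustration only.

    import csv
    from typing import NamedTuple

    class MemoryAccess(NamedTuple):
        """One L2-filtered load, with the five fields listed above."""
        instr_id: int   # unique instruction ID of the load
        cycle: int      # CPU cycle timestamp of the load
        addr: int       # physical address of the memory request
        pc: int         # instruction pointer (PC) of the load
        llc_hit: bool   # whether the load hit in the LLC

    def read_trace(path):
        """Yield MemoryAccess records from a (hypothetical) CSV trace file."""
        with open(path) as f:
            for instr_id, cycle, addr, pc, hit in csv.reader(f):
                yield MemoryAccess(int(instr_id), int(cycle),
                                   int(addr, 16), int(pc, 16), hit == "1")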


Model Output: Your model should produce at most two prefetches for each memory access in the trace. We will provide a format for the output file. The generated prefetches will be fed into a microarchitectural simulator, which will report the coverage, accuracy, and IPC achieved by your prefetcher.
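
The official output format will be provided. As a placeholder, the sketch below (building on the MemoryAccess records from the previous sketch) emits up to two next-line prefetches per access, one hypothetical "instr_id prefetch_address" pair per line; both the prefetching policy and the line format are illustrative assumptions, not the required interface.

    BLOCK = 64  # 64-byte granularity of the traces

    def next_line_prefetches(access, degree=2):
        """Return up to two sequential prefetch addresses for one access."""
        line = access.addr & ~(BLOCK - 1)            # align to the cache line
        return [line + i * BLOCK for i in range(1, degree + 1)]

    def write_prefetches(trace, out_path):
        """Write one 'instr_id prefetch_address' pair per line (assumed format)."""
        with open(out_path, "w") as out:
            for access in trace:
                for addr in next_line_prefetches(access):
                    out.write(f"{access.instr_id} {addr:#x}\n")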


Simulator/Baselines: We will provide a microarchitectural simulator (a fork of ChampSim) with the baselines and the interface required to feed model outputs into the simulator. The simulator will report coverage, accuracy, and IPC. Winners will be chosen by ranking IPC speedups over the provided baseline.
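
For intuition, the per-benchmark speedup is simply the ratio of your prefetcher's IPC to the baseline's IPC; the sketch below aggregates such ratios with a geometric mean. The simulator reports IPC directly, and the official aggregation method is up to the organizers, so treat this only as a sanity check.

    def geomean_speedup(per_benchmark):
        """Aggregate IPC speedups over the baseline across benchmarks.

        per_benchmark maps benchmark name -> (prefetcher_ipc, baseline_ipc).
        The geometric mean is the conventional way to average speedups.
        """
        ratios = [p / b for p, b in per_benchmark.values()]
        product = 1.0
        for r in ratios:
            product *= r
        return product ** (1.0 / len(ratios))

    # e.g., geomean_speedup({"mcf": (1.30, 1.00), "bfs": (0.99, 0.90)})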


Evaluation: Since the address vocabulary is different for every program, your models will need to be retrained for every benchmark (see the vocabulary sketch after the list below). To test how submissions generalize, our test-set evaluation will have two components:

  • Undisclosed execution samples for the training traces: You can submit a pre-trained model for each benchmark in the training set, and we will evaluate it on a different execution sample of the same benchmark.

  • Undisclosed benchmarks: We will train and test your model on unseen benchmarks using the training routines that you provide.
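
One common way to cope with per-program address vocabularies is to model cache-line deltas rather than raw addresses, rebuilding the token vocabulary from each benchmark's training trace. The sketch below is a minimal, assumed approach for illustration, not a required part of a submission.

    from collections import Counter

    def build_delta_vocab(trace, max_size=50000):
        """Map the most frequent successive cache-line deltas to token IDs."""
        counts = Counter()
        prev_line = None
        for access in trace:
            line = access.addr >> 6          # 64-byte line granularity
            if prev_line is not None:
                counts[line - prev_line] += 1
            prev_line = line
        # token 0 is reserved for out-of-vocabulary deltas
        return {delta: i + 1
                for i, (delta, _) in enumerate(counts.most_common(max_size))}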


What To Submit

  • A 4-page paper (not including references): see https://sites.google.com/view/mlarchsys/isca-2021 for instructions on preparing your paper.

  • Code for both training and inference (Python >=3.6, PyTorch 1.5-1.7, TensorFlow 1.14): We will evaluate your code on Linux machines with 48 GB of RAM and 8 GB of GPU memory. Please ensure that your code can run on such a system in a reasonable amount of time.

  • README: Clear instructions on how to execute your code


Selection Criteria

We will pick the three top-performing solutions based on IPC speedup on the test set. Papers with novel or otherwise interesting ideas that are not among the top performers will also be considered for the workshop.

Important Dates

Traces/Simulator Release: March 4, 2021

Submission Deadline: May 16, 2021 (midnight, Anywhere on Earth)

Workshop: June 17-19, 2021

Disclaimer

The organizers would like to emphasize that invited submissions are likely to be impractical for hardware deployment; we therefore advise caution in directly comparing winning entries to practical hardware prefetchers. Submissions differ from practical hardware data prefetchers in several ways. First, the competition imposes no storage or computation limits on submissions, whereas hardware prefetchers are typically designed under strict storage and timing budgets. Second, the competition allows generous offline training for each benchmark, whereas hardware prefetchers are trained online while the program runs.


Contact

Join Google Group: https://groups.google.com/g/ml_prefetching_competition/


For questions/clarifications contact:

Akanksha Jain (akanksha@cs.utexas.edu)

Quang Duong (qduong@cs.utexas.edu)