DPC4 extends the ChampSim simulator* to evaluate all submitted prefetchers. Please find the source code here: https://github.com/CMU-SAFARI/ChampSim
Please follow ChampSim best practices when implementing your prefetcher: https://champsim.github.io/ChampSim/master/Modules.html#memory-prefetchers. For reference, see the existing prefetcher implementations inside the prefetch directory.
To aid prefetcher design across multiple processor configurations, the simulator provides a set of API functions, which can be found in inc/dpc_api.h. These, along with the counters at each cache level, provide a multitude of system-level information (e.g., main memory bandwidth usage, prefetcher accuracy) that may be useful for prefetcher design.
Note: Participants may choose to use the "intern_" pointer, which gives a prefetcher access to its parent cache. However, use of the cache pointer must be strictly limited to (1) injecting prefetch call(s), and (2) reading existing auxiliary stats (e.g., pf_filled, pf_useful, etc.) to make better prefetch decisions dynamically. The cache pointer must not be used to gain visibility into the cache in any way that would otherwise have required cache operations in real hardware (e.g., the pointer must not be used to check whether a cacheline is present). If you are unsure whether your use of the cache pointer is allowed, please write to us.
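As an illustration of permitted use (2), the following minimal sketch derives prefetch accuracy from two counter snapshots and picks a prefetch degree from it. Only the counter names pf_filled and pf_useful come from the text above; the PrefetchStats struct, both function names, the interpretation of the counters given in the comments, and the thresholds are hypothetical, and the code that would actually read the counters through the cache pointer is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of the two counters named in the rules.
// Assumed meaning: pf_filled counts prefetched lines filled into the cache,
// pf_useful counts prefetched lines later hit by a demand access.
struct PrefetchStats {
  std::uint64_t pf_filled;
  std::uint64_t pf_useful;
};

// Accuracy over the interval between two snapshots, in percent (0..100).
// Returns 100 when nothing was filled, so an idle prefetcher is not throttled.
inline unsigned interval_accuracy_pct(const PrefetchStats& prev,
                                      const PrefetchStats& cur) {
  assert(cur.pf_filled >= prev.pf_filled && cur.pf_useful >= prev.pf_useful);
  const std::uint64_t filled = cur.pf_filled - prev.pf_filled;
  const std::uint64_t useful = cur.pf_useful - prev.pf_useful;
  if (filled == 0) return 100;
  // Lines filled in an earlier interval may become useful in this one,
  // so the raw ratio can exceed 100 and is clamped.
  const std::uint64_t pct = (100 * useful) / filled;
  return static_cast<unsigned>(pct > 100 ? 100 : pct);
}

// Maps accuracy to a prefetch degree. The 75% and 40% thresholds are
// illustrative placeholders, not tuned values.
inline unsigned choose_degree(unsigned accuracy_pct, unsigned max_degree) {
  assert(max_degree >= 1);
  if (accuracy_pct >= 75) return max_degree;            // accurate: full degree
  if (accuracy_pct >= 40) return (max_degree + 1) / 2;  // moderate: about half
  return 1;                                             // inaccurate: minimum
}
```

A prefetcher could take one snapshot per fixed-length interval and call choose_degree on the result. Because this only reads counters that already exist, it stays within the limits described above.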
As the rules mention, every submission will be evaluated using three processor configurations, which can be found inside the dpc4 directory. These are:
1C.fullBW.baseline: single-core baseline configuration with a single-channel 4800 MTPS main memory
1C.limitBW.baseline: single-core bandwidth-limited configuration with a single-channel 800 MTPS main memory
4C.baseline: four-core configuration with a single-channel 4800 MTPS main memory
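To get a feel for how far apart the full-bandwidth and bandwidth-limited configurations are, the transfer rates can be converted into peak bandwidth. The sketch below assumes an 8-byte (64-bit) data bus per channel and 64-byte cache lines; neither value is stated above, so check the configuration files in the dpc4 directory for the actual parameters. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Peak bandwidth in MB/s = mega-transfers per second * bytes per transfer.
// bus_bytes = 8 is an ASSUMPTION (64-bit data bus), not taken from the rules.
inline std::uint64_t peak_mb_per_s(std::uint64_t mtps,
                                   std::uint64_t bus_bytes = 8) {
  return mtps * bus_bytes;
}

// Peak cache-line transfers per microsecond. 1 MB/s equals 1 byte per
// microsecond, so dividing MB/s by the line size gives lines per microsecond.
// line_bytes = 64 is likewise an ASSUMPTION.
inline std::uint64_t peak_lines_per_us(std::uint64_t mtps,
                                       std::uint64_t bus_bytes = 8,
                                       std::uint64_t line_bytes = 64) {
  assert(line_bytes > 0);
  return peak_mb_per_s(mtps, bus_bytes) / line_bytes;
}
```

Under these assumptions, 4800 MTPS corresponds to 38400 MB/s (600 lines per microsecond) and 800 MTPS to 6400 MB/s (100 lines per microsecond). The 6x ratio between the two holds regardless of the assumed bus width. In 4C.baseline the four cores share one 4800 MTPS channel.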
All three configurations already employ prefetchers: Berti at L1** and Pythia at L2, which use the same parameters across all configurations. Participants can either extend these prefetchers or design new prefetchers from scratch.
Every submission will be evaluated against the baseline of each configuration.
* Originally maintained by researchers at Texas A&M University.
** Note that we use the version of Berti that was submitted to DPC3, so that the prefetcher is implemented fairly using only the exposed API functions. The MICRO'22 version of Berti can be found here.
Participants can find a list of disclosed ChampSim traces above. Each trace contains at least 250 million instructions. These traces are meant for exploring the design space of proposed prefetchers. However, every submission will be evaluated over a superset of traces (all of the disclosed traces plus additional hidden traces, which will be released after the conclusion of the championship). This is to prevent any submitted prefetcher from being overfitted to the evaluation traces.