Paper Readings

These are the papers you need to read for each course topic. You will also need to submit a weekly review for a subset of the papers listed here.

Intro:

[1] A. J. Smith, "The Task of the Referee," IEEE Computer, 1990.

[2] M. D. Hill, S. Adve, L. Ceze, M. J. Irwin, D. Kaeli, M. Martonosi, J. Torrellas, T. F. Wenisch, D. Wood, K. Yelick, "21st Century Computer Architecture," CCC Whitepaper, 2012.

Multicores and Multiprogramming:

[3] E. Fatehi, P. V. Gratz, "ILP and TLP in Shared Memory Applications: A Limit Study," PACT, 2014.

[4] C. Bienia, S. Kumar, J. P. Singh, K. Li, "The PARSEC Benchmark Suite: Characterization and Architectural Implications," PACT, 2008.

Synchronization:

[5] M. L. Scott, "Shared-Memory Synchronization," Synthesis Lectures on Computer Architecture, Chapters 1, 4.0-4.3.3 and 5.0-5.2.5.

[6] R. Rajwar, J. R. Goodman, "Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution," MICRO, 2001.

Cache and Memory Hierarchy:

[7] D. J. Sorin, M. D. Hill, D. A. Wood, "A Primer on Memory Consistency and Cache Coherence," Synthesis Lectures on Computer Architecture, Chapter 2.

Coherence:

[7] D. J. Sorin, M. D. Hill, D. A. Wood, "A Primer on Memory Consistency and Cache Coherence," Synthesis Lectures on Computer Architecture, Chapters 6-8.

[8] G. Zhang, W. Horn, D. Sanchez, "Exploiting Commutativity to Reduce the Cost of Updates to Shared Data in Cache-Coherent Systems," MICRO, 2015.

Consistency:

[7] D. J. Sorin, M. D. Hill, D. A. Wood, "A Primer on Memory Consistency and Cache Coherence," Synthesis Lectures on Computer Architecture, Chapters 3-5.

[9] M. D. Hill, "Multiprocessors Should Support Simple Memory Consistency Models," IEEE Computer, 1998.

Transactional Memory:

[10] T. Harris, J. Larus, R. Rajwar, "Transactional Memory, 2nd Edition," Synthesis Lectures on Computer Architecture, Chapters 1 and 5.

[11] K. E. Moore, J. Bobba, M. J. Moravan, M. D. Hill, D. A. Wood, "LogTM: Log-Based Transactional Memory," HPCA, 2006.

Interconnects:

[12] N. Enright Jerger, L.-S. Peh, "On-Chip Networks," Synthesis Lectures on Computer Architecture, Chapters 3-6.

[13] T. Moscibroda, O. Mutlu, "A Case for Bufferless Routing in On-Chip Networks," ISCA, 2009.

[14] J. Kim, J. Balfour, W. Dally, "Flattened Butterfly Topology for On-Chip Networks," MICRO, 2007.

GPUs:

[15] H. Kim, R. Vuduc, S. Baghsorkhi, J. Choi, W.-M. Hwu, "Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)," Synthesis Lectures on Computer Architecture, Chapter 1.

[16] D. Wong, N. S. Kim, M. Annavaram, "Approximating Warps with Intra-Warp Operand Value Similarity," HPCA, 2016.

Accelerators:

[17] M. S. B. Altaf, D. A. Wood, "LogCA: A High-Level Performance Model for Hardware Accelerators," ISCA, 2017.

Unconventional Parallelism:

[18] M. C. Jeffrey, S. Subramanian, C. Yan, J. Emer, D. Sanchez, "A Scalable Architecture for Ordered Parallelism," MICRO, 2015.

[19] S. Aga, S. Jeloka, A. Subramaniyan, S. Narayanasamy, D. Blaauw, R. Das, "Compute Caches," HPCA, 2017.

[20] J. San Miguel, N. Enright Jerger, "The Anytime Automaton," ISCA, 2016.
