Publications

NVIDIA

Mohamed Tarek Ibn Ziad, Sana Damani, Aamer Jaleel, Stephen W. Keckler, Mark Stephenson, cuCatch: A Debugging Tool for Efficiently Catching Memory Safety Violations in CUDA Applications. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2023 (pdf).

Sana Damani, Mark Stephenson, Ram Rangan, Daniel Johnson, Rishkul Kulkarni, Stephen W. Keckler, GPU Subwarp Interleaving.  In Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA), Industry Track, 2022 (pdf).

Mark Stephenson, Ram Rangan, Stephen W. Keckler, Cooperative Profile Guided Optimization.  In the International Conference on High Performance Graphics (HPG) and the Computer Graphics Forum (CGF), 2021 (pdf, video).

Mark Stephenson and Ram Rangan, PGZ: Automatic Zero-Value Code Specialization. In the International Conference on Compiler Construction (CC), 2021 (pdf, slides, video).

Ram Rangan, Mark Stephenson, Aditya Ukarande, Shyam Murthy, Virat Agarwal, Marc Blackstein, Zeroploit: Exploiting Zero Valued Operands in Interactive Gaming Applications. In the ACM Transactions on Architecture and Code Optimization (TACO), 2020 (pdf).

Sana Damani, Daniel R. Johnson, Mark Stephenson, Stephen W Keckler, Eddie Yan, Michael McKeown, Olivier Giroux, Speculative Reconvergence for Improved SIMT Efficiency, In the 18th International Symposium on Code Generation and Optimization.  February 2020.  (Sana's project page.)

Oreste Villa, Mark Stephenson, David Nellans, Stephen W Keckler, NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs, In the 52nd Annual International Symposium on Microarchitecture (MICRO-52).  October 2019. (Project page.)

Neal C Crago, Mark Stephenson, Stephen W Keckler, Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs, In the ACM Transactions on Architecture and Code Optimization (TACO).  Volume 15, October 2018.

Dani Voitsechov, Arslan Zulfiqar, Mark Stephenson, Mark Gebhart, Stephen W Keckler, Software-Directed Techniques for Improved GPU Register File Utilization, In the ACM Transactions on Architecture and Code Optimization (TACO).  Volume 15, September 2018 (pdf).

Siva Kumar Sastry Hari, Timothy Tsai, Mark Stephenson, Steve Keckler, and Joel Emer, SASSIFI: An Architecture-Level Fault Injection Tool for GPU Application Resilience Evaluation, In the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2017.

Gwangsun Kim, Jiyun Jeong, John Kim, Mark Stephenson, Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs, In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2016.

Tianhao Zheng, David Nellans, Arslan Zulfiqar, Mark Stephenson, Stephen W Keckler, Towards High Performance Paged Memory for GPUs, In Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA).  Barcelona, Spain.  March 2016.

Mark Stephenson, Siva Kumar Sastry Hari, Yunsup Lee, Eiman Ebrahimi, Daniel R. Johnson, David Nellans, Mike O'Connor, and Stephen W. Keckler, Flexible Software Profiling of GPU Architectures, In the 42nd International Symposium on Computer Architecture (ISCA), Portland, June 2015 (pdf, slides, project page).

Siva Kumar Sastry Hari, Timothy Tsai, Mark Stephenson, Steve Keckler, and Joel Emer, SASSIFI: Evaluating Resilience of GPU Applications, In the 11th Workshop on Silicon Errors in Logic - System Effects (SELSE), Austin, Texas, 2015.

Neha Agarwal, David Nellans, Mark Stephenson, Mike O’Connor, Stephen W. Keckler, Page Placement Strategies for GPUs within Heterogeneous Memory Systems, In Proceedings of International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2015. 

Yunsup Lee, Vinod Grover, Ronny Krashinsky, Mark Stephenson, Stephen W. Keckler, Krste Asanović. Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures. In the 47th International Symposium on Microarchitecture (MICRO-47), Cambridge, UK, December 2014.

IBM

Ram Rajamony, Mark Stephenson, Evan Speight.  The Power 775 Architecture at Scale.  In the International Conference on Supercomputing (ICS).  Eugene, Oregon.  June 2013 (pdf).

Mark Stephenson, Ram Rangan, Emmanuel Yashchin, and Eric Van Hensbergen.  Statistically Regulating Program Behavior via Mainstream Computing.  In the International Symposium on Code Generation and Optimization (CGO). Toronto, Ontario. April 2010 (pdf, slides).

Mark Stephenson, Lixin Zhang, and Ram Rangan. Lightweight Predication Support for Out of Order Processors. In Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA). Raleigh, North Carolina. February 2009 (pdf).

Vipin Sachdeva, Evan Speight, Mark Stephenson, and L. Chen. Characterizing and Improving the Performance of Bioinformatics Workloads on the POWER5 Architecture. In International Symposium on Workload Characterization (IISWC). Boston, Massachusetts. September 2007.

MIT

Mark Stephenson, Saman Amarasinghe. Predicting Unroll Factors Using Supervised Classification. In Proceedings of International Symposium on Code Generation and Optimization (CGO). San Jose, California. March 2005 (ppt, project page).

Diego Puppin, Mark Stephenson, Walter Lee, Saman Amarasinghe. Convergent Scheduling. In Journal of Instruction-Level Parallelism. Volume 6, September 2004 (pdf).

Diego Puppin, Mark Stephenson, Saman Amarasinghe, Una-May O'Reilly, Martin Martin. Adapting Convergent Scheduling Using Machine Learning. In Proceedings of the 16th International Workshop on Languages and Compilers for Parallel Computing, College Station, TX, October 2003 (pdf, ppt).

Mark Stephenson, Martin Martin, Una-May O'Reilly, and Saman Amarasinghe. Meta Optimization: Improving Compiler Heuristics with Machine Learning. In Proceedings of the SIGPLAN '03 Conference on Programming Language Design and Implementation (PLDI), San Diego, CA, June 2003 (pdf, ppt, project page).

Mark Stephenson, Una-May O'Reilly, Martin Martin, and Saman Amarasinghe. Genetic Programming Applied to Compiler Heuristic Optimization. In Proceedings of the 6th European Conference on Genetic Programming (EuroGP), Essex, UK, April 14, 2003 (pdf, ppt, project page).

Mark Stephenson, Jonathan Babb, and Saman Amarasinghe. Bitwidth Analysis with Application to Silicon Compilation.  In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Vancouver, British Columbia, June 2000 (pdf, ppt, project page).

Theses

Automating the Construction of Compiler Heuristics Using Machine Learning.  PhD thesis.  Massachusetts Institute of Technology.  May 2006 (pdf, project page, job talk).

Bitwise: Optimizing Bitwidths Using Data-Range Propagation.  Master's thesis.  Massachusetts Institute of Technology.  May 2000 (pdf, project page).