Publications

2023

M. Pellauer, J. Clemons, V. Balaji, N. Crago, A. Jaleel, D. Lee, M. O’Connor, A. Parashar, S. Treichler, P.-A. Tsai, S.W. Keckler, and J.S. Emer, "Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing," ACM Transactions on Computer Systems, December 2023.  [PDF]

2022

M.B. Sullivan, N. Saxena, M. O’Connor, D. Lee, P. Racunas, S. Hukerikar, T. Tsai, S. Hari, and S.W. Keckler, “Characterizing and Mitigating Soft Errors in GPU DRAM,” IEEE Micro "Top Picks" issue, July/August 2022. 

M. O’Connor, D. Lee, N. Chatterjee, M.B. Sullivan, and S.W. Keckler, “Saving PAM4 Bus Energy with SMOREs: Sparse Multi-level Opportunistic Restricted Encodings,” Proceedings of the 28th IEEE International Symposium on High Performance Computer Architecture (HPCA 2022), April 2022.  [PDF]

2021

M.B. Sullivan, N. Saxena, M. O’Connor, D. Lee, P. Racunas, S. Hukerikar, T. Tsai, S. Hari, and S.W. Keckler, “Characterizing and Mitigating Soft Errors in GPU DRAM,” 54th International Symposium on Microarchitecture (MICRO 2021), October 2021.  [PDF]

Selected for Top Picks (One of top 12 Architecture papers in 2021)

J. M. O'Connor, “Energy-Efficient, High-Bandwidth DRAM for Throughput Processors,” PhD Dissertation, Department of Electrical and Computer Engineering, The University of Texas at Austin, May 2021.  [PDF]

A. Mehrabi, D. Lee, N. Chatterjee, D. Sorin, B. Lee, and M. O’Connor, “Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures,” 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2021), March 2021.  [PDF] 

2020

E. Choukse, M.B. Sullivan, M. O’Connor, M. Erez, J. Pool, D. Nellans, and S.W. Keckler, “Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs,” 47th International Symposium on Computer Architecture (ISCA 2020), June 2020.  [PDF]

2019

D. Fujiki, N. Chatterjee, D. Lee, and M. O’Connor, “Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication,” Proceedings of the 2019 International Conference for High-Performance Computing, Networking, Storage, and Analysis (Supercomputing ’19), November 2019.  [PDF] 

S. Lym, D. Lee, N. Chatterjee, M. O'Connor, and M. Erez, “DeLTA: GPU Performance Model for Deep Learning with In-depth Memory System Data Traffic Analysis,” Proceedings of the 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2019), March 2019.  [PDF]

2018

S. Ghose, A. G. Yağlıkçı, R. Gupta, D. Lee, K. Kudrolli, W. Liu, H. Hassan, K. Chang, N. Chatterjee, A. Agrawal, M. O’Connor, and O. Mutlu, “What Your DRAM Power Models Aren’t Telling You: Lessons from a Detailed Experimental Study,” Proceedings of the ACM International Conference on Measurement and Analysis of Computer Systems (SIGMETRICS 2018), June 2018.  [PDF]

D. Lee, M. O'Connor, and N. Chatterjee, “Reducing Data Transfer Energy by Exploiting Similarity within a Data Transaction,” Proceedings of the 24th IEEE International Symposium on High Performance Computer Architecture (HPCA 2018), February 2018. [PDF]

Best Paper nominee 

M. Rhu, M. O'Connor, N. Chatterjee, J. Pool, Y. Kwon, and S.W. Keckler, "Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks," Proceedings of the 24th IEEE International Symposium on High Performance Computer Architecture (HPCA 2018), February 2018. [PDF]  [arXiv preprint]

2017

G. Kim, N. Chatterjee, M. O’Connor, and K. Hsieh, “Towards Standardized Near-Data Processing with Unrestricted Data Placement,” Proceedings of the 2017 International Conference for High-Performance Computing, Networking, Storage, and Analysis (Supercomputing ’17), November 2017. [PDF]

M. O’Connor, N. Chatterjee, D. Lee, J. Wilson, A. Agrawal, S. W. Keckler, and W. J. Dally, “Fine-Grained DRAM: Energy Efficient DRAM for Extreme Bandwidth Systems,” Proceedings of the 50th International Symposium on Microarchitecture (MICRO 2017), October 2017. [PDF]

K. Chang, A. G. Yağlıkçı, S. Ghose, A. Kashyap, H. Hassan, A. Agrawal, N. Chatterjee, D. Lee, M. O’Connor, and O. Mutlu, “Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms,” Proceedings of the ACM SIGMETRICS International Conference on Measurement and Analysis of Computer Systems (SIGMETRICS 2017), June 2017. [PDF]

N. Chatterjee, M. O'Connor, D. Lee, D. R. Johnson, M. Rhu, S.W. Keckler, and W. J. Dally, “Architecting an Energy-Efficient DRAM System for GPUs,” Proceedings of the 23rd IEEE International Symposium on High Performance Computer Architecture (HPCA 2017), February 2017. [PDF]

2016

A. Agrawal, M. O’Connor, E. Bolotin, N. Chatterjee, J. Emer, and S.W. Keckler, “CLARA: Circular Linked-List Auto- and Self-Refresh Architecture,” Proceedings of the 2016 International Symposium on Memory Systems (MEMSYS’16), October 2016. [PDF]

K. Hsieh, E. Ebrahimi, G. Kim, N. Chatterjee, M. O’Connor, N. Vijaykumar, O. Mutlu, and S. W. Keckler, “Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems,” Proceedings of the 43rd International Symposium on Computer Architecture (ISCA 2016), June 2016. [PDF]

2015

M. O'Connor and E.E. Swartzlander, Jr., "Exploiting Asymmetry in Booth-Encoded Multipliers for Reduced Energy Multiplication," Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, November 2015. [PDF]

T. H. Hetherington, M. O’Connor, and T. M. Aamodt, “MemcachedGPU: Scaling-up Scale-out Key-value Stores,” Proceedings of the 2015 ACM Symposium on Cloud Computing (SoCC'15), August 2015. [PDF]

E. Bolotin, D. Nellans, O. Villa, M. O'Connor, A. Ramirez, and S. W. Keckler, "Designing Efficient Heterogeneous Memory Architectures," IEEE Micro special issue on Heterogeneous Computing, July/August 2015. [PDF]

G. Pekhimenko, E. Bolotin, M. O'Connor, O. Mutlu, T. C. Mowry, and S. W. Keckler, "Toggle-Aware Compression for GPUs," Computer Architecture Letters, Jan-June 2015. [PDF] 

T. G. Rogers, D. R. Johnson, M. O’Connor, and S. W. Keckler, “A Variable Warp Size Architecture,” Proceedings of the 42nd International Symposium on Computer Architecture (ISCA 2015), June 2015. [PDF]

M. Stephenson, S. Hari, Y. Lee, E. Ebrahimi, D. R. Johnson, D. Nellans, M. O’Connor, and S. W. Keckler, “Flexible Software Profiling of GPU Architectures,” Proceedings of the 42nd International Symposium on Computer Architecture (ISCA 2015), June 2015. [PDF]

N. Agarwal, D. Nellans, M. Stephenson, M. O’Connor, and S. W. Keckler, “Page Placement Strategies for GPUs within Heterogeneous Memory Systems,” Proceedings of the 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2015), March 2015. [PDF] 

N. Agarwal, D. Nellans, M. O’Connor, S. W. Keckler, and T. Wenisch, “Unlocking Bandwidth for GPUs in CC-NUMA Systems,” Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015), February 2015. [PDF]

D. Li, M. Rhu, D. R. Johnson, M. O’Connor, M. Erez, D. Burger, D. Fussell, and S. W. Keckler, “Priority-based Cache Allocation in Throughput Processors,” Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015), February 2015. [PDF]

2014

T. G. Rogers, M. O’Connor, and T. M. Aamodt, “Learning Your Limit: Managing Massively Multithreaded Caches Through Scheduling,” Communications of the ACM, December 2014. [PDF]

O. Villa, D. R. Johnson, M. O’Connor, E. Bolotin, D. Nellans, J. Luitjens, N. Sakharnykh, P. Wang, P. Micikevicius, A. Scudiero, S. W. Keckler, and W. J. Dally, “Scaling the Power Wall: A Path to Exascale,” Proceedings of the 2014 International Conference for High-Performance Computing, Networking, Storage, and Analysis (Supercomputing ’14), November 2014. [PDF]

N. Chatterjee, M. O’Connor, N. Jayasena, G. H. Loh, and R. Balasubramonian, “Managing DRAM Latency Divergence in Irregular GPGPU Applications,” Proceedings of the 2014 International Conference for High-Performance Computing, Networking, Storage, and Analysis (Supercomputing ’14), November 2014. [PDF]

I. Singh, A. Shriraman, W. Fung, M. O’Connor, and T. M. Aamodt, “Cache Coherence for GPU Architectures,” IEEE Micro “Top Picks” issue, May/June 2014. [PDF]

J. Sim, G. Loh, V. Sridharan, and M. O’Connor, “A Configurable and Strong RAS Solution for Die-stacked DRAM Caches,” IEEE Micro “Top Picks” issue, May/June 2014. [PDF]

A. El Tantawy, J.W. Ma, M. O’Connor, and T. M. Aamodt, “A Scalable Multi-Path Microarchitecture for Efficient GPU Control Flow,” Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture (HPCA 2014), February 2014. [PDF]

2013

T. G. Rogers, M. O’Connor, and T. M. Aamodt, “Divergence-Aware Warp Scheduling,” Proceedings of the 46th International Symposium on Microarchitecture (MICRO 2013), December 2013.  [PDF]

T. G. Rogers, M. O’Connor, and T. M. Aamodt, “Cache-Conscious Thread Scheduling for Massively Multithreaded Processors,” IEEE Micro “Top Picks” issue, May/June 2013. [PDF]

L. Guckert, M. O'Connor, S. Ravindranath, Z. Zhao, and V. J. Reddi, “A Case for Persistent Caching of Compiled JavaScript Code in Mobile Web Browsers,” 6th Workshop on Architectural and Microarchitectural Support for Binary Translation (AMAS-BT 2013), June 2013. [PDF]

J. Sim, G. Loh, V. Sridharan, and M. O’Connor, “Resilient Die-Stacked DRAM Caches,” Proceedings of the 40th International Symposium on Computer Architecture (ISCA 2013), June 2013. [PDF]

Selected for Top Picks (One of top 12 Architecture papers in 2013)

K. Chang, G. Loh, M. Thottethodi, Y. Eckert, M. O’Connor, L. Subramanian, and O. Mutlu, “Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism,” Carnegie Mellon University - SAFARI Technical Report No. 2013-001, April 2013. [PDF]

H. Jooybar, W. Fung, M. O’Connor, J. Devietti, and T. M. Aamodt, “GPUDet: A Deterministic GPU Architecture,” Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013), March 2013. [PDF]

I. Singh, A. Shriraman, W. Fung, M. O’Connor, and T. M. Aamodt, “Cache Coherence for GPU Architectures,” Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture (HPCA 2013), February 2013. [PDF]

Selected for Top Picks (One of top 12 Architecture papers in 2013)

2012

T. G. Rogers, M. O’Connor, and T. M. Aamodt, “Cache-Conscious Wavefront Scheduling,” Proceedings of the 45th International Symposium on Microarchitecture (MICRO 2012), December 2012. [PDF]

Best Paper runner-up 

Selected for Top Picks (One of top 11 Architecture papers in 2012)

Selected for Research Highlights article in Communications of the ACM

J. Sim, G. Loh, H. Kim, M. O’Connor, and M. Thottethodi, “A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch,” Proceedings of the 45th International Symposium on Microarchitecture (MICRO 2012), December 2012. [PDF]

T. H. Hetherington, T. G. Rogers, L. Hsu, M. O’Connor, and T. M. Aamodt, “Characterizing and Evaluating a Key-Value Store Application on Heterogeneous CPU-GPU Systems,” Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2012), April 2012. [PDF]

G. H. Loh, N. Jayasena, J. Chung, S. K. Reinhardt, J. M. O’Connor, and K. McGrath, “Challenges in Heterogeneous Die-Stacked and Off-Chip Memory Systems,” 3rd Workshop on SoCs, Heterogeneous Architectures and Workloads (SHAW-3), February 2012. [PDF]

Earlier

M. O’Connor and C. Gomez, “The iFlow Address Processor,” IEEE Micro "Hot Chips" issue, March/April 2001. [PDF]

S. Hangal and M. O’Connor, “Performance Analysis and Validation of the picoJava Processor,” IEEE Micro special issue on Processor Modeling and Validation, May/June 1999. [PDF]

H. McGhan and M. O’Connor, “picoJava: A Direct Execution Engine for Java Bytecode,” IEEE Computer, October 1998. [PDF]

J. M. O’Connor and M. Tremblay, “picoJava-I: The Java Virtual Machine in Hardware,” IEEE Micro "Hot Chips" issue, March/April 1997. [PDF]

M. Tremblay, J. M. O’Connor, V. Narayanan, and L. He, “VIS Speeds New Media Processing,” IEEE Micro special issue on Media Processing, August 1996. [PDF]

M. Tremblay and J. M. O’Connor, “UltraSPARC-I: A Four-Issue Processor Supporting Multimedia,” IEEE Micro "Hot Chips" issue, March 1996. [PDF]

M. O’Connor, “Extending instructions for multimedia,” Electronic Engineering Times, Issue 874, November 13, 1995. [PDF]

G. Maturana, L. J. Ball, J. Gee, A. Iyer, and J. M. O’Connor, “Incas: A cycle accurate model of UltraSPARC,” Proceedings of the 1995 International Conference on Computer Design (ICCD), October 1995. [PDF]

S. Saiyed, J. M. O’Connor, and M. Franklin, “POWER2 CPU-Intensive Workload Performance,” PowerPC and POWER2: Technical Aspects of the New IBM RISC System/6000, September 1993. [PDF]

J.M. O’Connor, “Evaluating the Performance of Instruction Set Architectures for Superscalar Processors,” MS Thesis, Department of Electrical and Computer Engineering, The University of Texas at Austin, August 1993. [PDF] 

