Publications

Please respect all copyrights.

Where available, I have provided direct links to the official copies of the papers ("link") as well as personal local copies of the papers ("pdf"). When possible, please download from the official digital libraries, since this often generates additional revenue for the respective computer architecture special interest groups.

2024


Alan Smith, Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Samuel Naffziger, Mike Mantor, Mark Fowler, Nathan Kalyanasundharam, Vamsi Alla, Nicholas Malaya, Joseph L. Greathouse, Eric Chapman, Raja Swaminathan

Realizing the AMD Exascale Heterogeneous Processor Vision (Industry Track) (pdf)

In the International Symposium on Computer Architecture (ISCA), Buenos Aires, AR, June 2024


Alan Smith, Gabriel H. Loh, Hugh McIntyre, John J. Wuu, Raja Swaminathan, Ramon Mangaser, Samuel Naffziger, Tyrone Huang, Wonjun Jung

AMD Instinct MI300X Accelerator: Packaging and Architecture Co-Optimization

In the International Symposium on VLSI Technology and Circuits (VLSI), Honolulu, HI, June 2024


2023


Gabriel H. Loh

RETROSPECTIVE: 3D-Stacked Memory Architectures for Multi-Core Processors (link)

In the ISCA@50 25-year Retrospective 1996-2020, 2023, Orlando, FL


Stuart Schechter, Karin Strauss, Gabriel H. Loh, Doug Burger

RETROSPECTIVE: Use ECP, not ECC, for Hard Failures in Resistive Memories (link)

In the ISCA@50 25-year Retrospective 1996-2020, 2023, Orlando, FL


Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Vignesh Adhinarayanan, Shaizeen Aga, Derrick Aguren, Varun Agrawal, Ashwin M. Aji, John Alsop, Paul Bauman, Bradford M. Beckmann, Majed Valad Beigi, Sergey Blagodurov, Travis Boraten, Michael Boyer, William Brantley, Noel Chalmers, Shaoming Chen, Kevin Cheng, Michael L. Chu, David Cownie, Nicholas Curtis, Joris del Pino, Nam Duong, Alexandru Dutu, Yasuko Eckert, Christopher Erb, Chip Freitag, Joseph L. Greathouse, Sudhanva Gurumurthi, Anthony Gutierrez, Khaled Hamidouche, Sachin Hossamani, Wei Huang, Mahzabeen Islam, Nuwan Jayasena, John Kalamatianos, Onur Kayiran, Jagadish Kotra, Alan Lee, Daniel Lowell, Niti Madan, Abhinandan Majumdar, Nicholas Malaya, Srilatha Manne, Susumu Mashimo, Damon McDougall, Elliott Mednick, Michael Mishkin, Mark Nutter, Indrani Paul, Matthew Poremba, Brandon Potter, Kishore Punniyamurthy, Sooraj Puthoor, Steven E. Raasch, Karthik Rao, Greg Rodgers, Marko Scrbak, Mohammad Seyedzadeh, John Slice, Vilas Sridharan, Rene van Oostrum, Eric van Tassell, Abhinav Vishnu, Samuel Wasmundt, Mark Wilkening, Noah Wolfe, Mark Wyse, Adithya Yalavarti, Dmitri Yudanov

A Research Retrospective on AMD’s Exascale Computing Journey (Industry Track) (link, pdf)

In the International Symposium on Computer Architecture (ISCA), Orlando, FL, June 2023


Raja Swaminathan, Michael J. Schulte, Brett Wilkerson, Gabriel H. Loh, Alan Smith, Norman James

AMD Instinct™ MI250X Accelerator Enabled by Elevated Fanout Bridge Advanced Packaging Architecture

In the International Symposium on VLSI Technology and Circuits (VLSI), Kyoto, Japan, June 2023


Gabriel H. Loh, Raja Swaminathan

The Next Era for Chiplet Innovation

In the Design, Automation, and Test in Europe Conference (DATE), Antwerp, Belgium, April, 2023


2022


(No publications this year... and that's OK!)


2021


Jagadish B. Kotra, Michael Lebeane, Mahmut Kandemir, Gabriel H. Loh

Increasing GPU Translation Reach by Leveraging Under-Utilized On-Chip Resources

In the International Symposium on Microarchitecture (MICRO), Virtual/Athens, Greece, October, 2021


Samuel Naffziger, Noah Beck, Thomas Burd, Kevin Lepak, Gabriel H. Loh, Mahesh Subramony, Sean White

Pioneering Chiplet Technology and Design for the AMD EPYC™ and Ryzen™ Processor Families (Industry Track)  (link, pdf)

In the International Symposium on Computer Architecture (ISCA), Virtual/Valencia, Spain, June 2021


Mark Papermaster, Stephen Kosonocky, Gabriel H. Loh, Samuel Naffziger

A New Era of Tailored Computing (plenary/short paper)

In the 2021 Symposia on VLSI Circuits and Technology (VLSI), Virtual/Kyoto, Japan, June 2021


Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog

Analyzing and Leveraging Decoupled L1 Caches in GPUs

In the International Symposium on High-Performance Computer Architecture (HPCA), Virtual/Seoul, South Korea, February 2021


Gabriel H. Loh, Sam Naffziger, Kevin Lepak

Understanding Chiplets Today to Anticipate Future Integration Opportunities and Limits (Special Session)

In the Design, Automation, and Test in Europe Conference (DATE), Virtual/Grenoble, France, February 2021


2020


Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog

Analyzing and Leveraging Shared L1 Caches in GPUs (link, pdf)

In the International Conference on Parallel Architectures and Compilation Techniques (PACT), Virtual, October 2020


Jieming Yin, Subhash Sethumurugan, Yasuko Eckert, Chintan Patel, Alan Smith, Eric Morton, Mark Oskin, Natalie Enright Jerger, Gabriel H. Loh

Experiences with ML-Driven Design: A NoC Case Study (Industrial Track) (link, pdf)

In the International Symposium on High-Performance Computer Architecture (HPCA), San Diego, CA, February 2020


2019


Dylan Stow, Itir Akgun, Wenqin Huangfu, Yuan Xie, Xueqi Li, Gabriel H. Loh

Efficient System Architecture in the Era of Monolithic 3D: Dynamic Inter-tier Interconnect and Processing-in-Memory (link, pdf)

In the ACM Design Automation Conference (DAC), Las Vegas, NV, June 2019


2018


Amin Farmahini-Farahani, Sudhanva Gurumurthi, Gabriel H. Loh, Mike Ignatowski

Challenges of High-Capacity DRAM Stacks and Potential Directions (link, pdf)

In the Workshop on Memory Centric High Performance Computing (MCHPC), Dallas, TX, November 2018 (Held in conjunction with SC'18)


Joseph L. Greathouse, Gabriel H. Loh

Machine Learning for Performance and Power Modeling of Heterogeneous Systems (link, pdf)

In the 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Diego, CA, November 2018


Jieming Yin, Zhifeng Lin, Onur Kayiran, Matthew Poremba, Muhammad Shoaib Bin Altaf, Natalie Enright Jerger, Gabriel H. Loh

Modular Routing Design for Chiplet-based Systems (link, pdf)

In the International Symposium on Computer Architecture (ISCA), Los Angeles, CA, June 2018


Jan Vesely, Arkaprava Basu, Abhishek Bhattacharjee, Gabriel H. Loh, Mark Oskin, Steven K. Reinhardt

Generic System Calls for GPUs (link, pdf)

In the International Symposium on Computer Architecture (ISCA), Los Angeles, CA, June 2018


Seunghee Shin, Guilherme Cox, Mark Oskin, Gabriel H. Loh, Yan Solihin, Abhishek Bhattacharjee, Arkaprava Basu

Scheduling Page Table Walks for Irregular GPU Applications (link, pdf)

In the International Symposium on Computer Architecture (ISCA), Los Angeles, CA, June 2018


Jieming Yin, Yasuko Eckert, Shuai Che, Mark Oskin, Gabriel H. Loh

Toward More Efficient NoC Arbitration: A Deep Reinforcement Learning Approach (link, pdf)

In the International Workshop on AI-assisted Design for Architecture (AIDArc), Los Angeles, CA, June 2018 (Held in conjunction with ISCA'18)


Hyojong Kim, Ramyad Hadidi, Lifeng Nai, Hyesoon Kim, Nuwan Jayasena, Yasuko Eckert, Onur Kayiran, Gabriel H. Loh

CODA: Enabling Co-location of Computation and Data for Multiple GPU Systems (link, pdf)

In the ACM Transactions on Architecture and Code Optimization (TACO), Vol. 15, No. 3, April, 2018.


2017


Abhinav Agarwal, Gabriel H. Loh, James Tuck

Leveraging Near Data Processing for High-performance Checkpoint/Restart (link, pdf)

In the International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC), Denver, CO, November 2017


Dylan Stow, Yuan Xie, Taniya Siddiqua, Gabriel H. Loh

Cost-effective Design of Scalable High Performance Systems Using Active and Passive Interposers (link, pdf)

In the International Conference on Computer Aided Design (ICCAD), Irvine, CA, November 2017


Amro Awad, Arkaprava Basu, Sergey Blagodurov, Yan Solihin, Gabriel H. Loh

Avoiding TLB Shootdowns with Self-invalidating TLB Entries (link, pdf)

In the International Conference on Parallel Architectures and Compilation Techniques (PACT), Portland, OR, September 2017


Matthew Poremba, Itir Akgun, Jieming Yin, Onur Kayiran, Yuan Xie, Gabriel H. Loh

There and Back Again: Optimizing the Interconnect in Networks of Memory Cubes (link, pdf)

In the International Symposium on Computer Architecture (ISCA), Toronto, Canada, June, 2017


Thiruvengadam Vijayaraghavan, Yasuko Eckert, Gabriel H. Loh, Michael Schulte, Mike Ignatowski, Indrani Paul, Brad Beckmann, Steven K. Reinhardt, William Brantley, Joseph Greathouse, Onur Kayiran, Matthew Poremba, Wei Huang, Arun Karunanithi, Greg Sadowski, Vilas Sridharan, Steven Raasch, Mitesh Meswani

Design and Analysis of an APU for Exascale Computing (Industrial Track) (link, pdf)

In the International Symposium on High-Performance Computer Architecture (HPCA), Austin, TX, February, 2017


Andreas Prodromou, Mitesh Meswani, Nuwan Jayasena, Gabriel H. Loh, Dean Tullsen

MemPod: A Clustered Architecture for Efficient and Scalable Migration in Flat Address Space Multi-Level Memories (link, pdf)

In the International Symposium on High-Performance Computer Architecture (HPCA), Austin, TX, February, 2017


2016


Jia Zhan, Onur Kayiran, Gabriel H. Loh, Chita R. Das, Yuan Xie

OSCAR: Orchestrating STT-RAM Cache Traffic in Heterogeneous Architectures (link, pdf)

In the International Symposium on Microarchitecture (MICRO), Taipei, Taiwan, October 2016


Zhe Wang, Daniel Jimenez, Tao Zhang, Gabriel H. Loh, Yuan Xie

Building a Low Latency, Highly Associative DRAM Cache with the Buffered Way Predictor (link, pdf)

In the 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Los Angeles, CA, October 2016


Onur Kayiran, Adwait Jog, Ashutosh Pattnaik, Rachata Ausavarungnirun, Xulong Tang, Mahmut T. Kandemir, Gabriel H. Loh, Onur Mutlu, Chita R. Das

μC-States: Fine-grain GPU Datapath Management (link, pdf)

In the International Conference on Parallel Architectures and Compilation Techniques (PACT), Haifa, Israel, September 2016


Ajaykumar Kannan, Natalie Enright Jerger, Gabriel H. Loh

Exploiting Interposer Technologies to Disintegrate and Reintegrate Multicore Processors (link, pdf)

In IEEE Micro, special issue on Top Picks in computer architecture conferences (TOPPICKS), 2016 (An earlier version appeared below in MICRO 2015.)


Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin Kai-Wei Chang, Greg Nazario, Reetuparna Das, Gabriel H. Loh, Onur Mutlu

A Case for Hierarchical Rings with Deflection Routing: An Energy-efficient On-chip Communication Substrate (link, pdf)

In Parallel Computing, vol. 54: pp. 29-45, May 2016


Jan Vesely, Arkaprava Basu, Mark Oskin, Gabriel H. Loh, Abhishek Bhattacharjee

Observations and Opportunities in Architecting Shared Virtual Memory for Heterogeneous Systems (link, pdf)

In the International Symposium on Performance Analysis of Systems and Software (ISPASS), Uppsala, Sweden, April 2016


Jieming Yin, Onur Kayiran, Matthew Poremba, Natalie Enright Jerger, Gabriel H. Loh

Efficient Synthetic Traffic Models for Large, Complex SoCs (link, pdf)

In the International Symposium on High Performance Computer Architecture (HPCA), Barcelona, Spain, February 2016


2015


Ajaykumar Kannan, Natalie Enright Jerger, Gabriel H. Loh

Enabling Interposer-based Disintegration of Multi-core Processors (link, pdf)

In the International Symposium on Microarchitecture (MICRO), Honolulu, HI, December 2015 (A version of this paper appears above in TOPPICKS for 2015.)


Binh Pham, Jan Vesely, Gabriel H. Loh, Abhishek Bhattacharjee

Large Pages and Lightweight Memory Management in Virtualized Systems: Can You Have it Both Ways? (link, pdf)

In the International Symposium on Microarchitecture (MICRO), Honolulu, HI, December 2015 (Best Paper Nominee)


Mark Oskin, Gabriel H. Loh

A Software-managed Approach to Die-Stacked DRAM (link, pdf)

In the International Conference on Parallel Architectures and Compilation Techniques (PACT), San Francisco, CA, October 2015


Rachata Ausavarungnirun, Onur Kayiran, Saugata Ghose, Gabriel H. Loh, Chita Das, Mahmut Kandemir, Onur Mutlu

Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance (link, pdf)

In the International Conference on Parallel Architectures and Compilation Techniques (PACT), San Francisco, CA, October 2015


Gabriel H. Loh, Natalie Enright Jerger, Yasuko Eckert, Ajaykumar Kannan

Interconnect-Memory Challenges for Multi-chip, Silicon Interposer Systems (link, pdf)

In the International Symposium on Memory Systems (MEMSYS), Washington, DC, October 2015


ChunYi Sun, Edgar A. Leon, Gabriel H. Loh, David Roberts, Kirk Cameron, Dimitrios S. Nikolopoulos, Bronis R. de Supinski

HpMC: An Energy-aware Management System of Multi-level Memory Architectures (link, pdf)

In the International Symposium on Memory Systems (MEMSYS), Washington, DC, October 2015


Michael J. Schulte, Mike Ignatowski, Gabriel H. Loh, Bradford M. Beckmann, William C. Brantley, Sudhanva Gurumurthi, Nuwan Jayasena, Indrani Paul, Steven K. Reinhardt, Gregory Rodgers

Achieving Exascale Capabilities through Heterogeneous Computing (link, pdf)

In IEEE Micro, vol. 35(4), July-August, 2015


Mitesh Meswani, Sergey Blagodurov, David Roberts, John Slice, Mike Ignatowski, Gabriel H. Loh

Heterogeneous Memory Architectures: A HW/SW Approach for Mixing Die-stacked and Off-package Memories (link, pdf)

In the International Symposium on High Performance Computer Architecture (HPCA), San Francisco, CA, February 2015 (Best Paper Nominee)


Dae Hyun Kim, Krit Athikulwongse, Michael B. Healy, Mohammad M. Hossain, Moongon Jung, Ilya Khorosh, Gokul Kumar, Young-Joon Lee, Dean L. Lewis, Tzu-Wei Lin, Chang Liu, Shreepad Panth, Mohit Pathak, Minzhen Ren, Guanhao Shen, Taigon Song, Dong Hyuk Woo, Xin Zhao, Joungho Kim, Ho Choi, Gabriel H. Loh, Hsien-Hsin S. Lee, Sung Kyu Lim

Design and Analysis of 3D-MAPS (3D Massively Parallel Processor with Stacked Memory) (link, pdf)

In the IEEE Transactions on Computers, January 2015


2014


Natalie Enright Jerger, Ajaykumar Kannan, Zimo Li, Gabriel H. Loh

NoC Architectures for Silicon Interposer Systems (link, pdf)

In the International Symposium on Microarchitecture (MICRO), Cambridge, UK, December 2014 (Selected as an honorable mention in TOPPICKS for 2014.)


Onur Kayiran, Nachiappan Chidambaram Nachiappan, Adwait Jog, Rachata Ausavarungnirun, Mahmut T. Kandemir, Gabriel H. Loh, Onur Mutlu, Chita R. Das

Managing Concurrency in Heterogeneous Architectures (link, pdf)

In the International Symposium on Microarchitecture (MICRO), Cambridge, UK, December 2014


Djordje Jevdjic, Gabriel H. Loh, Cansu Kaynak, Babak Falsafi

Unison Cache: A Scalable and Effective Die-Stacked DRAM Cache (link, pdf)

In the International Symposium on Microarchitecture (MICRO), Cambridge, UK, December 2014


Yasuko Eckert, Nuwan Jayasena, Gabriel H. Loh

Thermal Feasibility of Die-Stacked Processing in Memory (pdf)

In the 2nd Workshop on Near-Data Processing (WoNDP), Cambridge, UK, December 2014 (Held in conjunction with MICRO'14)


Niladrish Chatterjee, Mike O'Connor, Gabriel H. Loh, Nuwan Jayasena, Rajeev Balasubramonian

Managing DRAM Latency Divergence in Irregular GPGPU Applications (link, pdf)

In the International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC), New Orleans, LA, November 2014


Mitesh Meswani, Gabriel H. Loh, Sergey Blagodurov, David Roberts, John Slice, Mike Ignatowski

Toward Efficient Programmer-managed Two-level Memory Hierarchies in Exascale Computers (link, pdf)

In the International Workshop on Hardware-Software Co-Design for High Performance Computing (Co-HPC), New Orleans, LA, November 2014 (Held in conjunction with SC'14.) 


Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin Kai-Wei Chang, Greg Nazario, Reetuparna Das, Gabriel H. Loh, Onur Mutlu

Design and Evaluation of Hierarchical Rings with Deflection Routing (link, pdf)

In the 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Paris, France, October 2014


Hyeran Jeon, Gabriel H. Loh, Murali Annavaram

Efficient RAS Support for Die-stacked DRAM (link)

In the International Test Conference (ITC), Seattle, WA, October 2014


Yingying Tian, Samira Khan, Daniel Jimenez, Gabriel H. Loh

Last-Level Cache Deduplication (link, pdf)

In the International Conference on Supercomputing (ICS), 2014


Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, Mike O'Connor

A Configurable and Strong RAS Solution for Die-stacked DRAM Caches (link)

In the IEEE Micro magazine, special issue on Top Picks in computer architecture conferences (TOPPICKS), 2014  (An earlier version appeared below in ISCA 2013.)


Binh Pham, Abhishek Bhattacharjee, Yasuko Eckert, Gabriel H. Loh

Increasing TLB Reach by Exploiting Clustering in Page Translations (link, pdf)

In the 20th International Symposium on High-Performance Computer Architecture (HPCA), Orlando, FL, February 2014


2013


Gabriel H. Loh, Nuwan Jayasena, Mark Oskin, Mark Nutter, David Roberts, Mitesh Meswani, Dong Ping Zhang, Mike Ignatowski

A Processing in Memory Taxonomy and a Case for Studying Fixed-function PIM (pdf)

In the Workshop on Near-Data Processing (WoNDP), Davis, CA, December 2013 (Held in conjunction with MICRO-46.) 


Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, Mike O'Connor

Resilient Die-stacked DRAM Caches (link, pdf)

In the 40th International Symposium on Computer Architecture (ISCA), Tel-Aviv, Israel, June 2013 (A version of this paper appears above in TOPPICKS for 2013.)


Hyeran Jeon, Mark Wilkening, Vilas Sridharan, Sudhanva Gurumurthi, Gabriel H. Loh

Architectural Vulnerability Modeling and Analysis of Integrated Graphics Processors (pdf)

In the Workshop on Silicon Errors in Logic - System Effects (SELSE), Stanford, CA, March 2013


2012


Moin Qureshi, Gabriel H. Loh

Fundamental Latency Trade-offs in Architecting DRAM Caches (link, pdf)

In the 45th International Symposium on Microarchitecture (MICRO), Vancouver, BC, December 2012


Jaewoong Sim, Gabriel H. Loh, Hyesoon Kim, Mike O'Connor, Mithuna Thottethodi

A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch (link, pdf)

In the 45th International Symposium on Microarchitecture (MICRO), Vancouver, BC, December 2012


Jishen Zhao, Guangyu Sun, Yuan Xie, Gabriel H. Loh

Energy-Efficient GPU Design with Reconfigurable In-Package Graphics Memory (link, pdf)

In the International Symposium on Low-Power Electronics and Design (ISLPED), Redondo Beach, CA, July 2012


Rachata Ausavarungnirun, Kevin Chang, Lavanya Subramanian, Gabriel H. Loh, Onur Mutlu

Staged Memory Scheduling: Achieving High Performance and Scalability in Heterogeneous Systems (link, pdf)

In the 39th International Symposium on Computer Architecture (ISCA), Portland, OR, June 2012


Gabriel H. Loh, Mark D. Hill

Supporting Very Large DRAM Caches with Compound Access Scheduling and MissMaps (link, pdf, addendum)

In the IEEE Micro magazine, special issue on Top Picks in computer architecture conferences (TOPPICKS), 2012  (An earlier version appeared below in MICRO 2011.)


Gabriel H. Loh, Nuwan Jayasena, Kevin McGrath, Mike O'Connor, Steve K. Reinhardt, Jaewoong Chung

Challenges in Heterogeneous Die-Stacked and Off-Chip Memory Systems (pdf)

In the 3rd Workshop on SoCs, Heterogeneous Architectures and Workloads (SHAW), February, 2012, New Orleans, LA, USA  (Held in conjunction with HPCA-18.) 


Dae Hyun Kim, Krit Athikulwongse, Michael B. Healy, Mohammad M. Hossain, Moongon Jung, Ilya Khorosh, Gokul Kumar, Young-Joon Lee, Dean L. Lewis, Tzu-Wei Lin, Chang Liu, Shreepad Panth, Mohit Pathak, Minzhen Ren, Guanhao Shen, Taigon Song, Dong Hyuk Woo, Xin Zhao, Joungho Kim, Ho Choi, Gabriel H. Loh, Hsien-Hsin S. Lee, Sung Kyu Lim

3D-MAPS: 3D Massively Parallel Processor with Stacked Memory (link, pdf)

In the International Solid-State Circuits Conference (ISSCC), February, 2012, San Francisco, CA, USA


2011


Gabriel H. Loh, Mark D. Hill

Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches (link, pdf, slides, addendum)

In the 44th International Symposium on Microarchitecture (MICRO), December, 2011, Porto Alegre, Brazil  (A version of this paper appears above in TOPPICKS for 2011.)


Gabriel H. Loh

A Register-file Approach for Row Buffer Caches in Die-stacked DRAMs (link, pdf)

In the 44th International Symposium on Microarchitecture (MICRO), December, 2011, Porto Alegre, Brazil


Andrew Hay, Karin Strauss, Timothy Sherwood, Gabriel H. Loh, Doug Burger

Preventing PCM Banks from Seizing Too Much Power (link, pdf)

In the 44th International Symposium on Microarchitecture (MICRO), December, 2011, Porto Alegre, Brazil


Yuejian Xie, Gabriel H. Loh

Thread-Aware Dynamic Shared Cache Compression in Multicore Processors (link)

In the International Conference on Computer Design (ICCD), October, 2011, Amherst, MA, USA


2010


Yuejian Xie, Gabriel H. Loh

Managing Shared Caches with Low-Overhead Mechanisms for Embedded Multi-Cores

In Transactions on HiPEAC, vol. 5(1), 2010


Gabriel H. Loh, Yuan Xie

3D Microprocessor: Are We There Yet? (link, pdf)

In IEEE Micro, vol. 30(3), pp. 60-64, May, 2010


Michael B. Healy, Krit Athikulwongse, Rohan Goel, Mohammad M. Hossain, Dae Hyun Kim, Young-Joon Lee, Dean L. Lewis, Tzu-Wei Lin, Chang Liu, Moongon Jung, Brian Ouellette, Mohit Pathak, Hemant Sane, Guanhao Shen, Dong Hyuk Woo, Xin Zhao, Gabriel H. Loh, Hsien-Hsin S. Lee, Sung Kyu Lim

Design and Analysis of 3D-MAPS: A Many-Core 3D Processor with Stacked Memory (link, pdf)

In the proceedings of the IEEE Custom Integrated Circuits Conference (CICC), September, 2010, San Jose, CA, USA


Dean Lewis, Michael B. Healy, Mohammad M. Hossain, Tzu-Wei Lin, Mohit Pathak, Hemant Sane, Sung Kyu Lim, Gabriel H. Loh, Hsien-Hsin S. Lee

Design and Test of 3D-MAPS, a 3D Die-Stack Many-Core Processor (pdf)

In the first IEEE International Workshop on Testing Three-Dimensional Stacked Integrated Circuits, November, 2010, Austin, TX, USA


Stuart Schechter, Karin Strauss, Gabriel H. Loh, Doug Burger

Use ECP, not ECC, for Hard Failures in Resistive Memories (link, pdf) (Selected for inclusion in ISCA@50 25-year Retrospective 1996-2020)

In the International Symposium on Computer Architecture (ISCA), June 19-23, 2010, Saint-Malo, France


Serkan Ozdemir, Pan Yan, Abhishek Das, Gabriel H. Loh, Gokhan Memik, Alok Choudhary

Quantifying and Coping with Parametric Variations in 3D-Stacked Microarchitectures (link, pdf)

In the ACM Design Automation Conference (DAC), June 13-18, 2010, Anaheim, CA, USA


Yuejian Xie, Gabriel H. Loh

Scalable Shared Cache Management by Containing Thrashing Workloads (link, pdf)

In the International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC), pp. 262-276, January 25-27, 2010, Pisa, Italy


2009


Gabriel H. Loh

Extending the Effectiveness of 3D-Stacked DRAM Caches with an Adaptive Multi-Queue Policy (link, pdf)

In the International Symposium on Microarchitecture (MICRO), pp. 201-212, December 12-16, 2009, New York, NY, USA


Kiran Puttaswamy, Gabriel H. Loh

3D-Integrated SRAM Components for High-Performance Microprocessors (link)

In the IEEE Transactions on Computers (TC), vol. 58(10), pp. 1369-1381, October, 2009


Yuejian Xie, Gabriel H. Loh

PIPP: Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches (link, pdf)

In the International Symposium on Computer Architecture (ISCA), pp. 174-183, June 20-24, 2009, Austin, TX, USA


Gabriel H. Loh, Samantika Subramaniam, Yuejian Xie

Zesto: A Cycle-Level Simulator for Highly Detailed Microarchitecture Exploration (link, pdf)

In the International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 53-64, April 27, 2009, Boston, MA, USA


Samantika Subramaniam, Anne C. Bracy, Hong Wang, Gabriel H. Loh

Criticality-Based Optimizations for Efficient Load Processing (link, pdf)

In the 15th International Symposium on High-Performance Computer Architecture (HPCA), pp. 419-430, February 18, 2009, Raleigh, NC, USA


Michael B. Healy, Hsien-Hsin S. Lee, Gabriel H. Loh, Sung Kyu Lim

Thermal Optimization in Multi-Granularity Multi-Core Floorplanning (link, pdf)

In the 14th Asia and South Pacific Design Automation Conference (ASPDAC), pp. 43-48, January 19, 2009, Yokohama, Japan


Samantika Subramaniam, Gabriel H. Loh

Design and Optimization of the Store Vectors Memory Dependence Predictor (link, pdf)

In ACM Transactions on Architecture and Code Optimization (TACO)


Gabriel H. Loh

3D Microprocessor Design, book chapter (link)

In "Three Dimensional Integrated Circuits: EDA, Design and Microarchitecture", Jason Cong, Sachin Sapatnekar, Yuan Xie; Springer, 2009


2008


Mauricio Breternitz Jr., Gabriel H. Loh, Bryan Black, Jeffrey Rupley, Peter G. Sassone, Wesley Attrot, Youfeng Wu

A Segmented Bloom Filter Algorithm for Efficient Predictors (link, pdf)

In the 20th IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp. 123-130, October 30, 2008, Campo Grande, Brazil


Gabriel H. Loh

3D-Stacked Memory Architectures for Multi-Core Processors (link, pdf) (Awarded the ISCA Influential Paper Award at ISCA 2023.) (Selected for inclusion in ISCA@50 25-year Retrospective 1996-2020)

In the 35th ACM International Symposium on Computer Architecture (ISCA), pp. 453-464, June 21-25, 2008, Beijing, China


Rahul Garde, Samantika Subramaniam, Gabriel H. Loh

Deconstructing the Inefficacy of Global Cache Replacement Policies (pdf)

In the 7th Workshop on Duplicating, Deconstructing, and Debunking (WDDD), June 22, 2008, Beijing, China (Held in conjunction with ISCA-35.)


Yuejian Xie, Gabriel H. Loh

Dynamic Classification of Program Memory Behaviors in CMPs (pdf)

In the 2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI), June 22, 2008, Beijing, China (Held in conjunction with ISCA-35.)


Jonathan D. Kron, Brooks Prumo, Gabriel H. Loh

Double-DIP: Augmenting DIP with Adaptive Promotion Policies to Manage Shared L2 Caches (pdf)

In the 2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI), June 22, 2008, Beijing, China (Held in conjunction with ISCA-35.)


Gabriel H. Loh

The Cost of Uncore in Throughput-Oriented Many-Core Processors (pdf)

In the Workshop on Architectures and Languages for Throughput Applications (ALTA), June 22, 2008, Beijing, China (Held in conjunction with ISCA-35.)


Gabriel H. Loh

A Modular 3D Processor for Flexible Product Design and Technology Migration (link, pdf)

In the ACM International Conference on Computing Frontiers (CF), pp. 159-170, May 5-7, 2008, Ischia, Italy


Gabriel H. Loh, Daniel A. Jimenez

Modulo Path History for the Reduction of Pipeline Overheads in Path-Based Neural Branch Predictors (link, pdf)

In the Springer International Journal of Parallel Programming (IJPP), vol. 36(2), pp. 267-286, April, 2008


Samantika Subramaniam, Milos Prvulovic, Gabriel H. Loh

PEEP: Exploiting Predictability of Memory Dependences in SMT Processors (link, pdf)

In the 14th International Symposium on High-Performance Computer Architecture (HPCA), pp. 137-148, February 16-20, 2008, Salt Lake City, UT, USA


2007


Peter G. Sassone, D. Scott Wills, Gabriel H. Loh

Static Strands: Safely Exposing Dependence Chains for Increasing Embedded Power Efficiency (link, pdf)

In ACM Transactions on Embedded Computing Systems (TECS), vol. 6(4), September, 2007


Peter G. Sassone, Jeff Rupley, Edward Brekelbaum, Gabriel H. Loh, Bryan Black

Matrix Scheduler Reloaded (link, pdf)

In the 34th International Symposium on Computer Architecture (ISCA), pp. 335-346, June 9-13, 2007, San Diego, CA, USA


Kiran Puttaswamy, Gabriel H. Loh

Scalability of 3D-Integrated Arithmetic Units in High-Performance Microprocessors (link, pdf)

In the ACM Design Automation Conference (DAC), pp. 622-625, June 4-8, 2007, San Diego, CA, USA


Gabriel H. Loh, Yuan Xie, Bryan Black

Processor Design in Three-Dimensional Die-Stacking Technologies (link, pdf)

In IEEE Micro, vol. 27(3), pp. 31-48, May-June, 2007


Kiran Puttaswamy, Gabriel H. Loh

Thermal Herding: Microarchitecture Techniques for Controlling HotSpots in High-Performance 3D-Integrated Processors (link, pdf)

In the 13th International Symposium on High-Performance Computer Architecture (HPCA), pp. 193-204, February 13, 2007, Phoenix, AZ, USA


Michael Healy, Mario Vittes, Mongkol Ekpanyapong, Chinnakrishnan Ballapuram, Sung Kyu Lim, Hsien-Hsin S. Lee, Gabriel H. Loh

Multi-Objective Microarchitectural Floorplanning for 2D and 3D ICs (link, pdf)

In IEEE Transactions on Computer Aided Design (TCAD), vol. 26(1), pp. 38-52, January, 2007


2006


Bryan Black, Murali M. Annavaram, Edward Brekelbaum, John DeVale, Lei Jiang, Gabriel H. Loh, Don McCauley, Pat Morrow, Donald W. Nelson, Daniel Pantuso, Paul Reed, Jeff Rupley, Sadas Shankar, John Paul Shen, Clair Webb

Die Stacking (3D) Microarchitecture (link, pdf)

In the 39th International Symposium on Microarchitecture (MICRO), pp. 469-479, December 9-13, 2006, Orlando, FL, USA


Samantika Subramaniam, Gabriel H. Loh

Fire-and-Forget: Load/Store Scheduling with No Store Queue at All (link, pdf)

In the 39th International Symposium on Microarchitecture (MICRO), pp. 273-284, December 9-13, 2006, Orlando, FL, USA  (Best student presentation award)


Ranjith Subramanian, Yannis Smaragdakis, Gabriel H. Loh

Adaptive Caches: Effective Shaping of Cache Behavior to Workloads (link, pdf)

In the 39th International Symposium on Microarchitecture (MICRO), pp. 385-396, December 9-13, 2006, Orlando, FL, USA


Chinnakrishnan Ballapuram, Kiran Puttaswamy, Gabriel H. Loh, Hsien-Hsin S. Lee

Entropy-based Low Power Data TLB Design (link, pdf)

In the ACM/IEEE Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), pp. 304-311, October 23-25, 2006, Seoul, South Korea


Daniel A. Jimenez, Gabriel H. Loh

Controlling the Power and Area of Neural Branch Predictors for Practical Implementation in High-Performance Processors (link, pdf)

In the 18th IEEE International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp. 55-62, October 18, 2006, Ouro Preto, Brazil


Kiran Puttaswamy, Gabriel H. Loh

Thermal Analysis of a 3D Die-Stacked High-Performance Microprocessor (link, pdf)

In the ACM/IEEE Great Lakes Symposium on VLSI (GLSVLSI), pp. 19-24, May 1, 2006, Philadelphia, PA, USA


Kiran Puttaswamy, Gabriel H. Loh

Dynamic Instruction Schedulers in a 3-Dimensional Integration Technology (link, pdf)

In the ACM/IEEE Great Lakes Symposium on VLSI (GLSVLSI), pp. 153-158, May 1, 2006, Philadelphia, PA, USA


Kiran Puttaswamy, Gabriel H. Loh

The Impact of 3-Dimensional Integration on the Design of Arithmetic Units (link, pdf)

In the IEEE International Symposium on Circuits and Systems (ISCAS), pp. 4951-4954, May 24, 2006, Kos, Greece (Selected as a Google Scholar "Classic Paper" for 2016 in Computer Hardware Design)


Yuan Xie, Gabriel H. Loh, Bryan Black, Kerry Bernstein

Design Space Exploration for 3D Architectures (link, pdf)

In ACM Journal of Emerging Technologies in Computing Systems (JETC), vol. 2(2), pp. 65-103, April, 2006


Gabriel H. Loh

Revisiting the Performance Impact of Branch Predictor Latencies (link, pdf)

In the International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 59-69, March 20, 2006, Austin, TX, USA


Kiran Puttaswamy, Gabriel H. Loh

Implementing Register Files for High-Performance Microprocessors in a Die-Stacked (3D) Technology (link, pdf)

In the IEEE International Symposium on VLSI (ISVLSI), pp. 384-389, March 3, 2006, Karlsruhe, Germany


Michael Healy, Mario Vittes, Mongkol Ekpanyapong, Chinnakrishnan Ballapuram, Sung Kyu Lim, Hsien-Hsin S. Lee, Gabriel H. Loh

Microarchitectural Floorplanning Under Performance and Temperature Tradeoff (link, pdf)

In the Conference for Design, Automation and Test in Europe (DATE), pp. 1288-1293, March 9, 2006, Munich, Germany


Samantika Subramaniam, Gabriel H. Loh

Store Vectors for Scalable Memory Dependence Prediction and Scheduling (link, pdf)

In the 12th International Symposium on High-Performance Computer Architecture (HPCA), pp. 64-75, February 13, 2006, Austin, TX, USA  (Best student presentation award)


2005


Kiran Puttaswamy, Gabriel H. Loh

Implementing Caches in a 3D Technology for High Performance Processors (link, pdf)

In the International Conference on Computer Design (ICCD), pp. 525-532, October 5, 2005, San Jose, CA, USA


Gabriel H. Loh

A Simple Divide-and-Conquer Approach for Neural-Class Branch Prediction (link, pdf)

In the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 243-254, September 20, 2005, St. Louis, MO, USA


Peter G. Sassone, D. Scott Wills, Gabriel H. Loh

Static Strands: Safely Collapsing Dependence Chains for Increasing Embedded Power Efficiency (link, pdf)

In the Conference on Languages, Compilers and Tools for Embedded Systems (LCTES), pp. 127-136, June 16, 2005, Chicago, IL, USA


Gabriel H. Loh, Daniel A. Jimenez

Reducing the Power and Complexity of Path-Based Neural Branch Prediction (pdf)

In the 5th Workshop on Complexity Effective Design (WCED), pp. 1-8, June 5, 2005, Madison, WI, USA  (Held in conjunction with ISCA-32.)


Gabriel H. Loh

Simulation Differences Between Academia and Industry: A Branch Prediction Case Study (link, pdf)

In the International Symposium on Performance Analysis of Software and Systems (ISPASS), pp. 21-31, March 20, 2005, Austin, TX, USA


Gabriel H. Loh

Deconstructing the Frankenpredictor for Implementable Branch Predictors (pdf)

In Journal of Instruction Level Parallelism (JILP), vol. 7, pp. 1-10, April, 2005


Gabriel H. Loh

Advanced Instruction Flow Techniques, book chapter (link)

In "Modern Processor Design: Fundamentals of Superscalar Processors", John Paul Shen and Mikko H. Lipasti, McGraw Hill, 2005


2004


Gabriel H. Loh

The Frankenpredictor: Stitching Together Nasty Bits of Other Predictors (link, pdf)

In the 1st Championship Branch Prediction Contest (CBP1), pp. 1-4, Dec 6, 2004, Portland, OR, USA  (Held in conjunction with MICRO-37.)


2003


Gabriel H. Loh

Width-Partitioned Load Value Predictors (pdf)

In Journal of Instruction Level Parallelism (JILP), vol. 5, pp. 1-23, November, 2003


Gabriel H. Loh

Width Prediction for Reducing Value Predictor Size and Power (pdf)

In the 1st Value-Prediction Workshop (VPW1), pp. 86-93, June 7, 2003, San Diego, CA, USA  (Held in conjunction with ISCA-30.) (Also appears in an extended journal version.)


Gabriel H. Loh, Dana S. Henry, Arvind Krishnamurthy

Exploiting Bias in the Hysteresis Bit of 2-bit Saturating Counters in Branch Predictors (pdf)

In Journal of Instruction Level Parallelism (JILP), vol. 5, pp. 1-32, June, 2003


2002


Gabriel H. Loh

Exploiting Data-Width Locality to Increase Superscalar Execution Bandwidth (link, pdf)

In the 35th International Symposium on Microarchitecture (MICRO), pp. 395-405, November 18-22, 2002, Istanbul, Turkey


Gabriel H. Loh, Dana S. Henry

Predicting Conditional Branches With Fusion-Based Hybrid Predictors (link, pdf)

In the 11th Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 165-176, September 22-25, 2002, Charlottesville, VA, USA


Gabriel H. Loh, Dana S. Henry

Applying Machine Learning for Ensemble Branch Predictors (link, pdf)

In the 15th Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEAAIE), pp. 264-274, June 17-20, 2002, Cairns, Australia. Springer LNCS-2358


Gabriel H. Loh, Rahul Sami, Daniel H. Friendly

Memory Bypassing: Not Worth the Effort (pdf)

In the Workshop on Duplicating, Deconstructing, and Debunking (WDDD), pp. 71-80, May 26, 2002, Anchorage, AK, USA  (Held in conjunction with ISCA-29.)


Dana S. Henry, Gabriel H. Loh, Rahul Sami

Speculative Clustered Caches for Clustered Processors (link, pdf)

In the 4th International Symposium on High Performance Computing (ISHPC), pp. 281-290, May 15-17, 2002, Kansai Science City, Japan. Springer LNCS-2327


Gabriel H. Loh

Microarchitecture for Billion-Transistor VLSI Superscalar Processors (pdf)

PhD Thesis, Yale University, Dept. of Computer Science, New Haven, CT, USA, 255 pp., May 2002


Bradley C. Kuszmaul, Dana S. Henry, Gabriel H. Loh

A Comparison of Asymptotically Scalable Superscalar Processors (link, pdf)

In Theory of Computing Systems, vol. 35(2), pp. 129-150, April 5, 2002, Springer-Verlag


2001


Gabriel H. Loh

A Time-Stamping Algorithm for Efficient Performance Estimation of Superscalar Processors (link, pdf)

In the Joint International Conference on Measurement & Modeling of Computer Systems (SIGMETRICS), pp. 72-81, June 16-20, 2001, Cambridge, MA, USA


2000


Dana S. Henry, Bradley C. Kuszmaul, Gabriel H. Loh, Rahul Sami

Circuits for Wide-Window Superscalar Processors (link, pdf)

In the 27th International Symposium on Computer Architecture (ISCA), pp. 236-247, June 10-14, 2000, Vancouver, Canada


1999


Bradley C. Kuszmaul, Dana S. Henry, Gabriel H. Loh

A Comparison of Scalable Superscalar Processors (link, pdf)

In the 11th Symposium on Parallel Algorithms and Architectures (SPAA), pp. 126-137, June 27-30, 1999, Saint-Malo, France  (Also appears in a journal version.)