Publications
Note: The documents are provided hereby for timely dissemination of works on a non-commercial basis. These works are copyrighted by the authors or other copyright holders. Please honor the copyright policies.
(H-Index and i-10 Index: 8; i-50 Index: 3; Citations: 310+; Citations in Patents from Industry Leaders: 6; Source: Google Scholar, Semantic Scholar, etc.)
Select Publications
[ASPLOS] Explainable-DSE: An Agile and Explainable Exploration of Efficient Hardware/Software Codesigns of Deep Learning Accelerators Using Bottleneck Analysis. Shail Dave, Tony Nowatzki, Aviral Shrivastava, in ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). [Paper] [Slides] [Poster] [Teaser] [News Article]
Won Silver Medal at ACM Student Research Competition (host: ACM SIGBED)
Work featured on front page of ASU news, ACM tech news, Communications of the ACM news, local media, blogs of various companies, etc.
[Proceedings of the IEEE] Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights. Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li, in Proceedings of the IEEE, volume 109, issue 10, 2021. [Paper] [DOI] [Poster] [summary tweet-thread]
Journal impact factor: 14.91; flagship venue in engineering for more than 110 years; highly studied and referenced article by industry and academic researchers and engineers in machine learning and computing systems and hardware design communities.
Key topics: Sources of sparsity and irregular tensor shapes and acceleration opportunities; implications of irregular or structured sparsity on acceleration; accelerator-aware DNN model pruning; circuit/architecture/mapping/model level techniques; impact of varying sparsity and tensor shapes of different DNN operators on data reuse and storage efficiency; understanding achievable accelerations for recent DNNs; techniques for data extraction and load balancing of effectual computations; sparsity-aware dataflows; leveraging value similarity and approximation in temporal and spatial data of computer vision and speech processing applications; trends and directions for accelerator/model codesigns for DNNs, etc.
[DAC] RAMP: Resource-Aware Mapping for CGRAs. Shail Dave, Mahesh Balasubramanian, Aviral Shrivastava, in Proceedings of the 55th Annual Design Automation Conference (DAC), 2018 [Paper] [Slides] [Poster]
In Top 5% Highly Cited Papers, as per Google Scholar, among 1000+ Published Papers at DAC during 2017-2023.
tldr: Mapping optimizations for accelerating loops of general-purpose computing with software pipelining.
[CODES+ISSS] dMazeRunner: Executing Perfectly Nested Loops on Dataflow Accelerators. Shail Dave, Youngbin Kim, Sasikanth Avancha, Kyoungwoo Lee, Aviral Shrivastava, in ACM Transactions on Embedded Computing Systems (TECS), Vol. 18, No. 5s, 2019 [Special Issue on ESWEEK 2019 - ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)]. [Paper] [Slides with Demo] [Slides] [Poster] [Code]
2nd Highest Cited CODES+ISSS 2019 Paper (Reference: Google Scholar); Highest Downloaded ESWEEK 2019 Paper and ACM TECS Paper for Years 2017-2023 among 600+ published papers in ACM TECS. (Reference: ACM Digital Library)
Key topics: Defining comprehensive hardware/software design space for loop nests; Accelerator cost model for evaluating execution metrics for variations in hardware architecture, dataflows, model layers; Search-space reduction techniques for getting efficient mappings in a few seconds; Generic algorithms for obtaining all unique data reuse scenarios for loop-orderings.
[TECHCON] Automating the Architectural Execution Modeling and Characterization of Domain-Specific Architectures. Shail Dave, Aviral Shrivastava. In Semiconductor Research Corporation TECHCON 2023 [Paper]
[ADVISORY REPORT] Chapter: Sustainable Computing Architectures. In Report for National Science Foundation (NSF) Workshop on Sustainable Computing. 2023.
[LATTE @ ASPLOS] Design Space Description Language for Automated and Comprehensive Exploration of Next-Gen Hardware Accelerators. Shail Dave and Aviral Shrivastava, in Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE), co-located with ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022. [Paper] [Talk] [Workshop]
Led to New Project with Semiconductor Research Corporation's AI Hardware Program
Key topics: Comprehensive, Reusable, Explainable, and Agile Design Exploration of Architectures.
[Demonstrations @ DATE, DAC] CCF: CGRA Compilation and Simulation Framework. Shail Dave, Aviral Shrivastava, in University Booth Demonstration at the 21st International Conference on Design Automation and Test in Europe (DATE), 2018 [Paper] [Demo] [Infrastructure]
Tens of downloads; used in CGRA research at various leading academic and industrial groups.
Other Publications (In Chronological Order)
[DATE] Learning-Oriented Reliability Improvement of Computing Systems From Transistor to Application Level. Behnaz Ranjbar, Florian Klemme, Paul R. Genssler, Hussam Amrouch, Jinhyo Jung, Shail Dave, Hwisoo So, Kyoungwoo Lee, Aviral Shrivastava, Ji-Yung Lin, Pieter Weckx, Subrat Mishra, Francky Catthoor, Dwaipayan Biswas, Akash Kumar, in Proceedings of the 26th International Conference on Design Automation and Test in Europe (DATE), 2023 [Paper]
(Invited Special Session. Authors are listed as per topical work and affiliation-wise.)
[VTS] Towards an Agile Design Methodology for Efficient, Secure, and Reliable ML Systems. Shail Dave, Alberto Marchisio, Muhammad Abdullah Hanif, Amira Guesmi, Aviral Shrivastava, Ihsen Alouani, Muhammad Shafique, in Proceedings of the 40th IEEE VLSI Test Symposium (VTS), 2022 (Invited Special Session). [Paper] [Slides]
[ICASSP] dMazeRunner: Optimizing Convolutions and GEMMs on Dataflow Accelerators. Shail Dave, Aviral Shrivastava, Youngbin Kim, Sasikanth Avancha, Kyoungwoo Lee, in Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020 [Paper] [Slides] [Code] [Talk]
[TACO] SPX64: A Scratchpad Memory for General-Purpose Microprocessors. Abhishek Singh, Shail Dave, PanteA Zardoshti, Robert Brotzman, Chao Zhang, Xiaochen Guo, Aviral Shrivastava, Gang Tan, Michael Spear, in ACM Transactions on Architecture and Code Optimization (TACO), Vol. 18, No. 1, 2021. [Paper] [Full Talk] [Talk @ HiPEAC '21]
tldr: Software-managed cache; Securing execution against side-channel attacks; Accelerating persistent transactions for non-volatile memory.
[DATE] URECA: A Compiler Solution to Manage Unified Register File for CGRAs. Shail Dave, Mahesh Balasubramanian, Aviral Shrivastava, in Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE), 2018 [Paper] [Slides]
[DATE] LASER: A Hardware/Software Approach to Accelerate Complicated Loops on CGRAs. Mahesh Balasubramanian, Shail Dave, Aviral Shrivastava, Reiley Jeyapaul, in Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE), 2018 [Paper] [Slides]
[INROADS] Derivation of Transfer Function Model based on Miniaturized Cryocooler Behavior. Jiten Bhatt, Shail Dave, Manish M. Mehta, and Nitin Upadhyay. INROADS 5, no. 1s (2016): 336-340.
US Patents and Disclosures
Method and Apparatus for Dynamic and Efficient Hardware/Software Codesigns of Deep Learning Accelerators. US Provisional Patent 2022.
Explainable Computing Systems Designs and Explainable Optimizations for Efficiency and Productivity. US Provisional Patent 2022.
Hybrid and Efficient Approach to Accelerate Complicated Loops on Coarse-Grained Reconfigurable Arrays (CGRA) Accelerators. Mahesh Balasubramanian, Shail Dave, Aviral Shrivastava (ASU), Reiley Jeyapaul (ARM). US 2020/0133672, Published 2020.
Under Submission (single-blind review)
Cyclebite: Extracting Task Graphs From Unstructured Compute-Programs. Benjamin Willis, Aviral Shrivastava, Joshua Mack, Shail Dave, Chaitali Chakrabarti, John Brunhaver, in IEEE Transactions on Computers (TC), 2023.
Related Flagship Conferences:
[Computer Design Automation and Embedded Systems] DAC, ESWEEK (CODES+ISSS, CASES), DATE
[Computer Architecture and Computing Systems] ASPLOS, ISCA, MLSys
[Industrial Semiconductor Chip Design Forums] HotChips, TECHCON
Acceptance rates at these top annual conferences are roughly 20%-25%.