SHORT BIO AND RESEARCH SUMMARY:
UPDATE: I am on the academic job market looking for a tenure-track assistant professor position. If your department is hiring, please feel free to reach out to me at talatin@umich.edu.
Download CV (Last updated: November 2024)
Research Statement Teaching Statement
I am an Assistant Research Scientist in the Computer Science and Engineering department at the University of Michigan. I lead a group of talented students working on computer architecture, compilers, and systems software, with a focus on co-designing architectures and systems for AI and data analytics.
Education:
PhD in Computer Science and Engineering, 2022; University of Michigan, Ann Arbor, MI, USA
MSc in Electrical Engineering (with thesis), 2018; Technion - Israel Institute of Technology, Haifa, Israel
BE in Electrical and Electronics Engineering, 2016; Birla Institute of Technology and Science (BITS) Pilani, Goa, India
LATEST NEWS:
[January 2025] Yichao's paper on GPU-accelerated data analytics is accepted at VLDB 2025! Congratulations Yichao!
[November 2024] Two papers accepted at HPCA 2025. This includes Haojie's paper on protocol-hardware co-design for oblivious memory and Ali's paper on multi-dimensional ISA design for in-cache computing. Congratulations Haojie and Ali!
[September 2024] Yunjie's paper in collaboration with C. Seshadhri's group at UC Santa Cruz on developing accurate and scalable algorithms for pattern matching in graphs is accepted at ICDM 2024. Congratulations Yunjie!
[August 2024] Our group secured gift funding from AMD. Thanks AMD for supporting our research!
[August 2024] Two of our position papers have been invited for presentation at the US Department of Energy (DoE) workshop on Energy Efficient Computing for Science. Mosharaf Chowdhury and I will present them.
[July 2024] Yuchen's paper in collaboration with Georgia Tech and Intel Labs on characterizing and building a cost model for LLM fine-tuning is accepted at IISWC 2024. Congratulations Yuchen!
[May 2024] Along with my collaborators from Barcelona Supercomputing Center, I will be organizing a tutorial at ISCA 2024 on expedited development of novel RISC-V instructions through an emulation-simulation framework. Please check out the details on our tutorial webpage.
[April 2024] Julian's paper on accelerating genome sequencing workloads using CPU's vector data path is accepted at ISCA 2024. Congratulations Julian!
[January 2024] Our group secured funding from the Semiconductor Research Corporation (SRC). Thanks SRC for supporting our research!
[October 2023] Yuhan's paper on understanding the performance-accuracy trade-off for different graph sparsification algorithms is accepted at VLDB 2024. Congratulations Yuhan!
[September 2023] Yichao's paper on high-performance GPU code generation for mining temporal motifs is accepted at VLDB 2024. Congratulations Yichao!
[August 2023] Ali's paper on benchmarking and understanding the performance of vector processing on mobile CPUs is accepted at IISWC 2023, and nominated for the best paper award! Congratulations Ali!
MENTEES:
I am fortunate to work with this extremely talented group of students every day!
PhD Students:
Yuchen Xia
Undergraduate/master's Students:
Wynn Kaza (UM CSE SUGS student)
Arjun Laxman (UM CSE undergraduate student)
Divyam Sharma (UM SI master's student)
Alumni:
Advait Iyer (first employment: Tesla)
Haojie Ye (first employment: NVIDIA)
Yuhan Chen (first employment: Meta)
Heewoo Kim (first employment: CU Boulder)
Kuan-Yu Chen (first employment: Tenstorrent)
Gaurav Shah (summer 2023 visitor from IIT Gandhinagar)
Joe Su (summer 2023 visitor from Penn State University)
PUBLICATIONS:
In case this page is not up-to-date, I believe that Google Scholar is reasonably accurate.
Conference Publications
Y. Yuan, A. Iyer, L. Ma, and N. Talati, "Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics," at 51st International Conference on Very Large Data Bases VLDB 2025.
H. Ye, Y. Xia, Y. Chen, K-Y. Chen, Y. Yuan, S. Deng, B. Kasikci, T. Mudge, and N. Talati, "Palermo: Improving the Performance of Oblivious Memory using Protocol-Hardware Co-Design," at the 31st IEEE International Symposium on High-Performance Computer Architecture HPCA 2025. [PDF] [Code]
A. Khadem, D. Fujiki, H. Chen, Y. Gu, N. Talati, S. Mahlke, and R. Das, "Multi-Dimensional Vector ISA Extension for Mobile In-Cache Computing," at the 31st IEEE International Symposium on High-Performance Computer Architecture HPCA 2025. [PDF] [Code]
Y. Pan, O. Bhalerao, C. Seshadhri, and N. Talati, "Accurate and Fast Estimation of Temporal Motifs using Path Sampling," at International Conference on Data Mining ICDM 2024. [PDF] [Code]
Y. Xia, J. Kim, Y. Chen, H. Ye, S. Kundu, C. Hao, and N. Talati, "Understanding The Performance and Estimating The Cost Of LLM Fine-Tuning," at IEEE International Symposium on Workload Characterization IISWC 2024. [PDF] [Code]
J. Pavon, I. Valdivieso, C. Morales, C. Hernandez, M. Aslan, J. Lindegger, Y. Yuan, R. Bagué, M. Alser, O. Mutlu, S. Marco-Sola, O. Ergin, N. Talati, M. Valero, O. Unsal, A. Cristal, "QUETZAL: Vector Acceleration Framework for Modern Genome Sequence Analysis Algorithms," at 2024 International Symposium on Computer Architecture ISCA 2024. [PDF]
Y. Chen, H. Ye, S. Vedula, A. Bronstein, R. Dreslinski, T. Mudge, and N. Talati, "Demystifying Graph Sparsification Algorithms in Graph Properties Preservation," at International Conference on Very Large Data Bases VLDB 2024. [PDF] [Code] [Supplementary Material]
Y. Yuan, H. Ye, S. Vedula, W. Kaza, and N. Talati, "Everest: GPU-Accelerated System for Mining Temporal Motifs," at International Conference on Very Large Data Bases VLDB 2024. [PDF] [Code]
A. Khadem, D. Fujiki, N. Talati, S. Mahlke, and R. Das, "Vector Processing for Mobile Devices: Benchmark and Analysis," at IEEE International Symposium on Workload Characterization IISWC 2023. (Best Paper Honorable Mention) [PDF] [Code]
H. Kim, H. Ye, T. Mudge, R. Dreslinski, and N. Talati, "RecPIM: A PIM-Enabled DRAM-RRAM Hybrid Memory System For Recommendation Models," at ACM/IEEE International Symposium on Low Power Electronics and Design ISLPED 2023. [PDF]
H. Ye, S. Vedula, Y. Chen, Y. Yang, A. Bronstein, R. Dreslinski, T. Mudge, and N. Talati, "GRACE: A Scalable Graph-Based Approach To Accelerating Recommendation Model Inference," at 28th Conference on Architectural Support for Programming Languages and Operating Systems ASPLOS 2023. [PDF] [Code] [Slides] [Lightning Talk]
Y. Chen, A. Khadem, X. He, N. Talati, T. A. Khan, T. Mudge, "PEDAL: A Power Efficient GCN Accelerator with Multiple DAtafLows," at Design, Automation, and Test in Europe DATE 2023. (Best Paper Honorable Mention) [PDF]
N. Talati, H. Ye, S. Vedula, K-Y Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. Bronstein, T. Mudge, and R. Dreslinski, "Mint: An Accelerator For Mining Temporal Motifs," at the 55th IEEE/ACM International Symposium on Microarchitecture MICRO 2022. [PDF] [Slides]
L. Belayneh, H. Ye, K-Y Chen, D. Blaauw, T. Mudge, R. Dreslinski, N. Talati, "Locality-aware Optimizations for Improving Remote Memory Latency in Multi-GPU Systems," at the 31st International Conference on Parallel Architectures and Compilation Techniques PACT 2022. [PDF]
N. Talati, H. Ye, Y. Yang, L. Belayneh, K-Y Chen, D. Blaauw, T. Mudge, R. Dreslinski, "NDMiner: Accelerating Graph Pattern Mining Using Near Data Processing," at 2022 International Symposium on Computer Architecture ISCA 2022. [PDF] [Slides]
N. Talati, D. Jin, H. Ye, A. Brahmakshatriya, S. Amarasinghe, T. Mudge, D. Koutra, R. Dreslinski, "A Deep Dive Into Understanding The Random Walk-Based Temporal Graph Learning," at 2021 IEEE International Symposium on Workload Characterization IISWC 2021. [PDF] [Code] [Full Presentation] [Slides]
N. Talati, K. May, A. Behroozi, Y. Yang, K. Kaszyk, C. Vasiladiotis, T. Verma, L. Li, B. Nguyen, J. Sun, J. Magnus Morton, A. Ahmadi, T. Austin, M. O'Boyle, S. Mahlke, T. Mudge, R. Dreslinski, "Prodigy: Improving the Memory Latency of Data-Indirect Irregular Workloads Using Hardware-Software Co-Design," at the 27th IEEE International Symposium on High-Performance Computer Architecture HPCA 2021. (Best Paper Award) [PDF] [Full Presentation] [Short Presentation] [Slides] [UMich announcement]
Y. Yang, H. Ye, Y. Chen, X. Liu, N. Talati, X. He, T. Mudge, R. Dreslinski, "CoPTA: Contiguous pattern speculating TLB architecture," at International Conference on Embedded Computer Systems: Architecture, Modeling and Simulation SAMOS 2020. [PDF]
N. Talati, A. Haj Ali, R. Ben Hur, N. Wald, R. Ronen, P. E. Gaillardon, and S. Kvatinsky, "Practical Challenges in Delivering the Promises of Real Processing-in-Memory Machines," at Design, Automation, and Test in Europe DATE 2018. [PDF]
R. Ben Hur, N. Wald, N. Talati, and S. Kvatinsky, "SIMPLE MAGIC: Synthesis and In-memory Mapping of Logic Execution for Memristor Aided loGIC," at International Conference on Computer Aided Design ICCAD 2017. [PDF]
J. Reuben, R. Ben Hur, N. Wald, N. Talati, A. Haj Ali, P.E. Gaillardon, and S. Kvatinsky, "Memristive Logic: A Framework for Evaluation and Comparison," at International Symposium on Power and Timing Modeling, Optimization, and Simulations PATMOS 2017. [PDF]
Journal Publications
H. Kim, A. Amarnath, J. Bagherzadeh, N. Talati, R. Dreslinski, "A Survey Describing Beyond Si Transistors, and Exploring Their Implications for Future Processors," ACM Journal on Emerging Technologies in Computing Systems (JETC), vol. 17, no. 3, pp. 1-44, June 2021. [PDF]
N. Talati, H. Ha, B. Perach, R. Ronen, S. Kvatinsky, "CONCEPT: A Column-Oriented Memory Controller for Efficient Memory and PIM Operations in RRAM," IEEE Micro, vol. 39, no. 1, pp. 33-43, Jan-Feb 2019. [PDF]
N. Talati, S. Gupta, P. Mane, and S. Kvatinsky, "Logic design within memristive memories using Memristor Aided loGIC (MAGIC)," IEEE Transactions on Nanotechnology, vol. 15, no. 4, pp. 635-650, July 2016. [PDF] (manuscript), [PDF] (supplementary material)
P. Mane, N. Talati, A. Riswadkar, R. Raghu, and C. Ramesha, "Reconfiguration on nanocrossbar using material implication," Sadhana - Academy Proceedings in Engineering Science, vol. 42, no. 1, pp. 33-44, Jan. 2017.
P. Mane, N. Talati, A. Riswadkar, R. Raghu, and C. Ramesha, "Stateful-NOR based reconfigurable architecture for logic implementation," Microelectronics Journal, vol. 46, no. 6, pp. 551-562, June 2015. [PDF]
Book Chapters
N. Talati, R. Ben Hur, N. Wald, A. Haj Ali, J. Reuben, and S. Kvatinsky, "mMPU – a Real Processing–in–Memory Architecture to Combat the von Neumann Bottleneck," in Advanced Applications of Emerging NVM Devices, Springer Series in Advanced Microelectronics, 2017. [PDF]
J. Reuben, N. Talati, N. Wald, R. Ben Hur, A. Haj Ali, P.E. Gaillardon, and S. Kvatinsky, "A Taxonomy and Evaluation Framework for Memristive Logic," in Handbook of Memristor Networks, Springer, 2017. [PDF]
Workshop Publications
(Position paper) N. Talati, B. Kasikci, W. Zhu, and N. Sung Kim, "A Deep Co-Design Approach: Systems, Architectures, and Device Cross-Cutting Research for Energy Efficiency," DoE ASCR Energy Efficient Computing for Science Workshop 2024. [PDF]
(Position paper) J-W. Chung, N. Talati, and M. Chowdhury, "Toward Cross-Layer Energy Optimizations in AI Systems," DoE ASCR Energy Efficient Computing for Science Workshop 2024. [PDF]
N. Talati, Z. Wang, and S. Kvatinsky, "Rate-Compatible and High-Throughput Architecture Designs for Encoding LDPC Codes," International Symposium on Circuits and Systems ISCAS 2017. [PDF]
S. Kvatinsky, R. Ben Hur, N. Talati, and N. Wald, "mMPU: memristive Memory Processing Unit," MEMRISYS 2017.
R. Ben Hur, N. Talati, and S. Kvatinsky, "Algorithmic Considerations in Memristive Memory Processing Units (MPU)," Proceedings of the International Cellular Nanoscale Networks and their Applications, August 2016. [PDF]
R. Ben Hur, N. Talati, N. Wald, and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," The First International Workshop on In-Memory and In-Storage Computing with Emerging Technologies (in conjunction with the 25th International Conference on Parallel Architectures and Compilation Techniques (PACT)), 2016.
P. Mane, N. Talati, A. Riswadkar, B. Jasani, and C. Ramesha, "Implementation of NOR logic based on material implication on CMOL FPGA architecture," 28th International Conference on VLSI Design (VLSID), 2015, pp. 523-528, Jan 2015. [PDF] (Travel grant award)
P. Mane, N. Talati, A. Riswadkar, R. Raghu and C.K. Ramesha, "Implicating logic functions with memristors," 11th International Conference on SoC Design (ISOCC), 2014, pp. 232-233, Nov 2014. [PDF] (Best poster award + BITSAA travel grant award)
Technical Reports
R. Ben Hur, N. Wald, N. Talati, and S. Kvatinsky, "Latency Optimized Mapping of Logic Functions for Memristor Aided Logic (MAGIC)," CCIT Technical Report #908, December 2016. [PDF]
INDUSTRY EXPERIENCE:
Meta Platforms Inc., Menlo Park, CA, USA
Software Engineering PhD Intern from May to August 2022
I worked with Meta's AI Infrastructure group on detection, monitoring, and optimization of the data ingestion pipeline of Meta's fleetwide AI training workflows.
Advanced Micro Devices (AMD) Research, Austin, TX, USA
Co-op Engineer from May to November 2020
I worked with Dr. Ganesh Dasika on the performance analysis and optimization of large-scale data analytics workloads on AMD platforms.
AWARDS:
Best paper award at HPCA 2021 (Paper title: Prodigy: Improving the Memory Latency of Data-Indirect Irregular Workloads Using Hardware-Software Co-Design).
Graduate Student Research Assistant (GSRA) position at CSE, University of Michigan.
Full graduate scholarship for Master's degree at EE, Technion.
Travel grant award to visit and present a poster at VLSID 2015.
Best poster award in the 'Nanoelectronic Devices and Circuits' track at ISOCC 2014.
TEACHING/MENTORING EXPERIENCE:
Graduate Student Instructor (GSI) for EECS 370 (Intro to computer organization) at CSE, University of Michigan in the Fall 2021 semester.
Mentored a research project for high-school students as part of the SciTech Summer Camp Program in 2017.
Mentored a research project of two undergraduate students at EE, Technion in the Winter 2017 semester.
Teaching assistant for Microelectronics Circuits at BITS Pilani, Goa in the Spring 2016 semester.
SCIENTIFIC PEER REVIEW ACTIVITY:
Technical Program Committee Member: PACT 2022
External Review Committee Member: MICRO 2022, ISCA 2022
Journal Referee (as a primary reviewer): IEEE Transactions on Very Large Scale Integration (VLSI) Systems (TVLSI), ACM Transactions on Architecture and Code Optimization (TACO), IEEE Transactions on Emerging Topics in Computing (TETC), IEEE Transactions on Nanotechnology (TNANO), Microelectronics Journal (Elsevier)
Conference External Review (assisted primary reviewers): HPCA 2018, DATE 2018, CASES 2017, ISCAS 2017, ISCAS 2016, CNNA 2016
COMMUNITY SERVICE:
I have had two excellent opportunities to work at the Blind People's Association (BPA) in Ahmedabad, India, during the winter and summer of my freshman year!
Thanks to Ms. Kinnari Desai for giving me this chance to serve people with disabilities at BPA.
I was involved in organizing important government resolutions (GRs) for people with disabilities during winter, and designing a software package for organizing events at the institution during summer.
Also, thanks to my friends Vraj, Vikram, and Mit, who joined and helped me in this noble cause.
CONTACT:
Address:
1029 DOW (next to the CSE building)
University of Michigan,
Ann Arbor, MI - 48109
E-mail:
nishil dot talatiwork at gmail dot com or
talatin at umich dot edu
Please do NOT spam or distribute my email address to a third party without my consent.