System Support for Big, Heterogeneous Memory Platforms
The emergence of new technologies (such as the CXL interconnect and persistent memory) is poised to enable big memory systems. How to make the best use of big memory systems in a scalable and cost-effective way is an open question. This research studies OS, runtime, programming-model, and application support for emerging big memory systems.
Research Outcome:
[HPCA'24] Jie Ren, Dong Xu, Shuangyan Yang, Jiacheng Zhao, Zhicheng Li, Christian Navasca, Chenxi Wang, Harry Xu, and Dong Li. "Enabling Large Dynamic Neural Network Training with Learning-based Memory Management". In 30th International Symposium on High-Performance Computer Architecture, 2024.
[ASPLOS'23] Shuangyan Yang, Minjia Zhang, Wenqian Dong, and Dong Li. Betty: Enabling Large-Scale GNN Training with Batch-Level Graph Partitioning. In 28th Architectural Support for Programming Languages and Operating Systems
[PPoPP'23] Zhen Xie, Jie Liu, Jiajia Li, and Dong Li. "Merchandiser: Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications with Load-Balance Awareness". In 28th Principles and Practice of Parallel Programming
[ICS'21] Zhen Xie, Wenqian Dong, Jie Liu, Ivy Peng, Yanbao Ma and Dong Li. "MD-HM: Memoization-based Molecular Dynamics Simulations on Big Memory System". In 35th International Conference on Supercomputing
[ICS'21] Jie Ren, Jiaolin Luo, Ivy Peng, Kai Wu and Dong Li. "Optimizing Large-Scale Plasma Simulations on Persistent Memory-based Heterogeneous Memory with Effective Data Placement Across Memory Hierarchy". In 35th International Conference on Supercomputing
[ICS'21] Jiawen Liu, Dong Li, Roberto Gioiosa and Jiajia Li. "Athena: High-Performance Sparse Tensor Contraction Sequence on Heterogeneous Memory". In 35th International Conference on Supercomputing.
[FAST'21] Kai Wu, Jie Ren, Ivy Peng and Dong Li. "ArchTM: Architecture-Aware, High Performance Transaction for Persistent Memory". In 19th USENIX Conference on File and Storage Technologies
[ASPLOS'21] Bang Di, Jiawen Liu, Hao Chen and Dong Li. "Fast, Flexible and Comprehensive Bug Detection for Persistent Memory Programs". In 26th Architectural Support for Programming Languages and Operating Systems (distinguished artifact award)
[PPoPP'21] Jiawen Liu, Jie Ren, Roberto Gioiosa, Dong Li and Jiajia Li. "Sparta: Efficient Sparse Tensor Contraction on Heterogeneous Memory Systems". In 26th Principles and Practice of Parallel Programming
[HPCA'21] Jie Ren, Jiaolin Luo, Kai Wu, Minjia Zhang, Hyeran Jeon and Dong Li. "Sentinel: Efficient Tensor Migration and Allocation on Heterogeneous Memory Systems for Deep Learning". In 27th IEEE International Symposium on High-Performance Computer Architecture.
[NeurIPS'20] Jie Ren, Minjia Zhang and Dong Li. "HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory". In 34th Conference on Neural Information Processing Systems.
[PACT'20] Kai Wu, Ivy Peng, Jie Ren and Dong Li. "Ribbon: High Performance Cache Line Flushing for Persistent Memory". In 29th International Conference on Parallel Architectures and Compilation Techniques.
[Cluster'20] Jie Ren, Kai Wu and Dong Li. "Exploring Non-Volatility of Non-Volatile Memory for High Performance Computing Under Failures". In IEEE International Conference on Cluster Computing. (Link to tech report) (Link to the NVC tool)
[SC'20] Jiaolin Luo, Luanzheng Guo, Jie Ren, Kai Wu and Dong Li. Enabling Faster NGS Analysis on Optane-based Heterogeneous Memory. Poster in 32nd ACM/IEEE International Conference for High Performance Computing, Performance Measurement, Modeling and Tools.
[IPDPS'20] Ivy Peng, Kai Wu, Jie Ren, Dong Li and Maya Gokhale. Demystifying the Performance of HPC Scientific Applications on NVM-based Memory Systems. In 34th IEEE International Parallel and Distributed Processing Symposium.
[SC'18] Kai Wu, Jie Ren, and Dong Li. Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task Parallel Programs. In 30th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (acceptance rate: %).
[SC'17] Kai Wu, Yingchao Huang, and Dong Li. Unimem: Runtime Data Management on Non-Volatile Memory-based Heterogeneous Main Memory. In 29th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (acceptance rate: 18.7%).
[Cluster'17] Shuo Yang, Kai Wu, Yifan Qiao, Dong Li, and Jidong Zhai. Algorithm-Directed Crash Consistence in Non-Volatile Memory for HPC. In IEEE International Conference on Cluster Computing (acceptance rate: 21.8%).
[NAS'17] Wei Liu, Kai Wu, Jialin Liu, Feng Chen, and Dong Li. Performance Evaluation and Modeling of HPC I/O on Non-Volatile Memory. In 12th International Conference on Networking, Architecture, and Storage.
[HPDC'16] Panruo Wu, Dong Li, Zizhong Chen, Jeffrey S. Vetter, and Sparsh Mittal. Algorithm-Directed Data Placement in Explicitly Managed Non-Volatile Memory. In 25th ACM International Symposium on High Performance Parallel and Distributed Computing (acceptance rate: 16%).
[MICRO'14] Guoyang Chen, Bo Wu, Dong Li, and Xipeng Shen. PORPLE: An Extensible Optimizer for Portable Data Placement. In 47th Annual IEEE/ACM International Symposium on Microarchitecture (acceptance rate: 19%).
[PACT'13] Bin Wang, Bo Wu, Dong Li, Xipeng Shen, Weikuan Yu, Yizheng Jiao, and Jeffrey S. Vetter. Exploring Hybrid Memory for GPU Energy Efficiency through Software-Hardware Co-Design. In 22nd ACM/IEEE International Conference on Parallel Architectures and Compilation Techniques (acceptance rate: 17%)
[IPDPS'12] Li, D., Vetter, J., Marin, G., McCurdy, C., Cira, C., Liu, Z., and Yu, W. Identifying Opportunities for Byte-Addressable Non-Volatile Memory in Extreme-Scale Scientific Applications. In 26th IEEE International Parallel and Distributed Processing Symposium.
This research is supported by: