[MICRO'24] PointCIM: A Computing-in-Memory Architecture for Accelerating Deep Point Cloud Analytics
[HPCA'23] Tensor Movement Orchestration in Multi-GPU Training Systems
[ASP-DAC'22] PUMP: Profiling-free Unified Memory Prefetcher for Large DNN Model Support
[NVMSA'22] Efficient and Atomic-Durable Persistent Memory through In-PM Hybrid Logging
[HPCA'22] Efficient Bad Block Management with Cluster Similarity
[ICIP'19] IoTBench: A Benchmark Suite for Intelligent Internet of Things Edge Devices
[DAC'18] Active Forwarding: Eliminate IOMMU Address Translation for Accelerator-Rich Architectures
[ASP-DAC'17] Enabling Fast Preemption via Dual-Kernel Support on GPUs
[DAC'16] Latency Sensitivity-Based Cache Partitioning for Heterogeneous Multi-core Architecture
[ISLPED'15] Fine-grained Write Scheduling for PCM Performance Improvement under Write Power Budget
[ISLPED'09] PPT: Joint Performance/Power/Thermal Management of DRAM Memory for Multi-Core Systems
[ICCAD'09] Thermal Modeling for 3D-ICs with Integrated Microchannel Cooling
[VLSI-DAT'09] Content-Aware Energy Prediction for Video Streaming in Mobile Devices
[IEEE Micro'24] CIMNet: Joint Search for Neural Network and Computing-in-Memory Architecture
[CAL'17] Improving GPGPU Performance via Cache Locality Aware Thread Block Scheduling
[TC'16] Improving Read Performance of NAND Flash SSDs by Exploiting Error Locality
[TECS'15] System-Level Performance and Power Optimization for MPSoC - A Memory-Access Aware Approach
[TCAD'09] A Progressive-ILP-Based Routing Algorithm for the Synthesis of Cross-Referencing Biochips
[TODAES'09] Leakage-Aware Task Scheduling for Partially Dynamically Reconfigurable FPGAs
[TODAES'09] T-trees: A Tree-Based Representation for Temporal and Three-Dimensional Floorplanning