Awais Khan (Ph.D.)
HPC Systems Scientist
Oak Ridge National Lab,
Oak Ridge, TN, USA
Email: khana@ornl.gov (Google Scholar)
Contact: +1865-368-1557
Research Interests
Data Management and File System Indexing Services, High-Performance Computing, Cloud Computing, I/O Middleware, Key-Value Stores, Data Deduplication, and Distributed and Parallel Storage Systems
Education
Ph.D. Computer Science and Engineering (Sept, 2015 - Feb, 2021)
Dissertation: Metadata Search and Discovery Services for Scientific Applications on HPC Storage Architectures
Advisor: Prof. Youngjae Kim
Sogang University, Seoul, South Korea
BS Bioinformatics (Major in Computer Sciences) (2007 - 2011)
Mohammad Ali Jinnah University, Islamabad, Pakistan
Work Experience
HPC Systems Scientist at Oak Ridge National Lab, TN, USA (Aug, 2023 - Present)
Research Area: Exascale Storage Systems, System Architecture
Senior Systems Performance Engineer at Micron, Austin, TX (Oct, 2022 - Aug, 2023)
Research Area: HPC-AI intersection workloads, CXL-DDR system balance analysis, microarchitecture balance analysis with HBM and CXL, Scientific benchmarks evaluation on emerging memory interconnects
Postdoctoral Research Associate at Oak Ridge National Lab, TN, USA (Nov, 2021 - Oct, 2022)
Research Area: Building optimized caching framework for large-scale AI/ML applications, Checkpointing large-scale application data on high-performance storage systems
Postdoctoral Research Associate at Sogang University, Seoul, South Korea (Mar, 2021 - Oct, 2021)
Research Area: Working on optimizing big data streaming engines, enhancing AI/ML applications on Deduplication-enabled Storages
Graduate Research Assistant at Sogang University, Seoul, South Korea (2015 - Feb, 2021)
Research Area: Working on Distributed and Parallel File Systems, Big Data Storage and Computing, Cluster-wide Data Deduplication
Research Intern at Oak Ridge National Laboratory (ORNL), TN, USA (May, 2019 - Sept, 2019)
Research Area: NVM-based Future HPC storage frameworks, Scientific Discovery Services, and System Balance Trend Analysis of Top500 Supercomputers
Senior Software Engineer at Digital Research Laboratories (DRL), Islamabad, Pakistan (2011 - 2015)
Job Description: Analysis, Design, Implementation and Deployment of Enterprise Resource Planning (ERP) Software Modules
Skills and Expertise
Proficient in Linux, distributed systems, file systems, database systems, and scientific middleware I/O libraries
Programming Languages: C, C++, Java SE & EE, Python
File & Storage Systems: Linux VFS, FUSE
Non-Volatile Memory: Intel’s PMDK Library
Parallel & Distributed File Systems: Ceph, GlusterFS, Lustre, and HDFS
Databases & Key-value Stores: PostgreSQL, SQLite, Cassandra, LevelDB, RocksDB, PMEM-KV
Parallel Programming: MPI, pthread, OpenMP
Parallel I/O Library: HDF5, NetCDF
Benchmarks: IOR, FIO, MDTest
Tools: gcc, gdb, cscope, ctags, autotools, git, svn, Eclipse, JDeveloper, iReport, Visual Studio, LaTeX, gnuplot, OmniGraffle
Research and Development Experience
An Indexable and Searchable Memory Object Management System
Designing and developing a persistent memory object management system to accelerate scientific applications with emerging persistent memory technology, enabling users and applications to create, tag, index, share and query the memory objects.
Indexed objects can be searched by multiple attributes in a scalable fashion using efficient persistent index data structures. Several scientific tools are also provided to convert legacy scientific data formats such as HDF5 and NetCDF directly to memory objects.
Intel’s PMDK library, HDF5, NetCDF, FITS, PMEM-KV, Persistent index data structures
An Integrated Indexing and Search Service for Distributed File Systems
Designed and developed a scalable data management service framework aimed at scientific datasets for the Ceph distributed file system without violating the shared-nothing design property of Ceph.
The developed tool indexes millions of files, extracting automated and user-defined metadata inline or offline, capturing additional descriptive metadata, and storing it in distributed database shards spanning hundreds of nodes.
Ceph file system, Linux FUSE, SQLite
SciSpace: Scientific Collaboration Workspace for Geo-distributed HPC Data Centers
To improve information and resource sharing for joint simulation and analysis between HPC data centers, we designed and developed SciSpace, a Scientific Collaboration Workspace for collaborative data centers.
SciSpace offers a global view of information shared from multiple geo-distributed HPC data centers under a single workspace and is equipped with multi-mode data indexing and a query-like discovery service to optimize data sharing and information retrieval.
FUSE, Linux VFS, Lustre PFS, Scientific Data formats such as HDF5, NetCDF and Climate Data Operator tools
Cluster-wide Deduplication in Distributed Storage Systems
Designed and developed a scalable and robust cluster-wide deduplication framework and integrated it into Ceph to improve storage space efficiency across clusters with thousands of nodes. Specifically, a distributed deduplication metadata sharding approach is used to store and index the redundant data fingerprints.
Ceph, RADOS, SQLite, LevelDB, CRUSH algorithm, Communication Sub-System (Async Messenger)
Publications
Awais Khan, Jack Lange, Nick Hagerty, Edwin F. Posada, John Holmen, James White, Austin Harris, Veronica Melesse Vergara, Christopher Zimmer, Scott Atchley, An Evaluation of the Effect of Network Cost Optimization for Leadership Class Supercomputers, In Proceedings of the International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC), Atlanta, GA, 17-22 Nov 2024. (Best Paper Nomination)
Awais Khan, Christopher Zimmer, Scott Atchley, Ross Miller, Sarp Oral, Feiyi Wang, Accelerating Application Bulk Synchronous Writes in HPC Environments, In Proceedings of the Seventh International Workshop on Systems and Network Telemetry and Analytics (SNTA) held in conjunction with ACM HPDC, Pisa, Italy, 3-7 June, 2024.
Yoochan Kim, Kihyun Kim, Yonghyeon Cho, Jinwoo Kim, Awais Khan, Ki-Dong Kang, Baik-Song An, Myung-Hoon Cha, Hong-Yeon Kim, Youngjae Kim, DeepVM: Integrating Spot and On-Demand VMs for Cost-Efficient Deep Learning Clusters in the Cloud, In Proceedings of the IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (2024), Philadelphia, May, 2024.
Yeohyeon Park, Junhyeok Park, Junghwan Park, Awais Khan, Kyungpyo Kim, Sung-Soon Park, Youngjae Kim, OctoKV: An Agile Network-based Key-Value Storage System with Robust Load Orchestration, MDPI Electronics, 2024.
Yeohyeon Park, Junhyeok Park, Awais Khan, Jungwhan Park, Woosuk Chung, Youngjae Kim, OctoKV: An Agile Network-based Key-Value Storage System with Robust Load Orchestration, In Proceedings of the IEEE Int'l Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) (2023), Stony Brook, NY, USA, October 16-18, 2023.
Donghyun Min, Kihyun Kim, Chaewon Moon, Awais Khan, Seungjin Lee, Changhwan Yun, Woosuk Chung, Youngjae Kim, A Multi-Tenant Key-Value SSD with Secondary Index for Search Query Processing and Analysis, ACM Transactions on Embedded Computing Systems (TECS).
Safdar Jamil, Abdul Salam, Awais Khan, Bernd Burgstaller, Sung-Soon Park, Youngjae Kim, Scalable NUMA-aware Persistent B+-tree for Non-Volatile Memory Devices, Cluster Computing: The Journal of Networks, Software Tools and Applications, 2022.
Safdar Jamil, Awais Khan, Kihyun Kim, Jae-Kook Lee, Dosik Ahn, Taeyoung Hong, Sarp Oral, Youngjae Kim, DENKV: Addressing Design Trade-offs of Key-value Stores for Scientific Applications, In Proceedings of the 7th International Parallel Data Systems Workshop (PDSW) held in conjunction with SC22, Dallas, TX, November 2022.
Awais Khan, Arnab K. Paul, Christopher Zimmer, Sarp Oral, Sajal Dash, Scott Atchley, Feiyi Wang, HVAC: Removing I/O bottleneck for Large-scale Deep Learning Applications, In Proceedings of the IEEE Cluster 2022, Heidelberg, Germany, 6-9 September, 2022.
Hyungjoon Kwon, Yonghyeon Cho, Awais Khan, Yeohyeon Park, Youngjae Kim, DeNOVA: Deduplication Extended NOVA File System, In Proceedings of the 36th IEEE Int'l Parallel and Distributed Processing Symposium (IPDPS) (2022), Lyon, France, May 30-June 3, 2022.
Invited Session Presentation, IEEE NVMSA 2022 (Virtual)
Safdar Jamil, Awais Khan, Youngjae Kim, Exploring Data Deduplication in LSM Tree-based Key-Value Stores, Work-In-Progress (WiP), USENIX Conference on File and Storage Technologies (FAST) (2022), San Jose, CA, February 2022.
Safdar Jamil, Awais Khan, Bernd Burgstaller, Youngjae Kim, Towards Scalable Manycore-aware Persistent B+-Tree for Efficient Indexing in Cloud Environment, 9th International Workshop on Autonomic Management of High performance Grid and Cloud Computing (AMGCC), Washington D.C., September 27, 2021. [pdf]
Prince Hamandawana, Awais Khan, Jongik Kim, Tae-Sun Chung, Accelerating AI/ML Applications with Hierarchical Caching on Deduplication Storage Clusters, IEEE Transactions on Big Data, 2021. [pdf]
Awais Khan, Hyogi Sim, Sudharshan S. Vazhkudai, Youngjae Kim, MOSIQS: Persistent Memory Object Storage with Indexing and Querying for Scientific Computing, IEEE Access, 2021. [pdf]
Awais Khan, Hyogi Sim, Sudharshan S. Vazhkudai, Ali R. Butt, Youngjae Kim, An Analysis of System Balance and Architectural Trends Based on Top500 Supercomputers, In Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia), January 20-22, 2021. [pdf][talk]
Awais Khan, Prince Hamandawana, Youngjae Kim, A Content Fingerprint-based Cluster-wide Inline Deduplication for Shared-nothing Storage Systems, IEEE Access, 2020. [pdf]
Heerock Lee, Chang-Gyu Lee, Awais Khan, Hyeongu Kang, Jinseok Ma, Song-Woo Suk, Myeonghoon Oh, Youngjae Kim, Scalable Container-based Software Platform for Fabric-Attached Memory Pool, in Korea Software Conference (KSC), 2020. [pdf]
Awais Khan, Hyogi Sim, Sudharshan S. Vazhkudai, Jinsuk Ma, Myeong-Hoon Oh, Youngjae Kim, Persistent Memory Object Storage and Indexing for Scientific Computing, In Proceedings of the Workshop of Memory Centric High Performance Computing (MCHPC) held in conjunction with SC'20. [pdf][talk]
Hyogi Sim, Awais Khan, Sudharshan S. Vazhkudai, An Analysis of System Balance and Architectural Trends Based on Top500 Supercomputers, published as Technical Report, ORNL, TN, USA, 2020.
Joongeon Park, Safdar Jamil, Awais Khan, Matt Sangkeun Lee, Youngjae Kim, ScaleML: Machine Learning based Heap Memory Object Scaling Prediction, In Proceedings of the 9th IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA), Busan, South Korea, 2020. [pdf]
Hyogi Sim, Awais Khan, Sudharshan S. Vazhkudai, Seung-Hwan Lim, Ali R. Butt, Youngjae Kim, An Integrated Indexing and Search Service for Distributed File Systems, IEEE Transactions on Parallel and Distributed Systems. (SCI, IF 4.18) [pdf]
Prince Hamandawana, Awais Khan, Changgyu Lee, Sungyong Park, Youngjae Kim, Crocus: Enabling Computing Resource Orchestration for Inline Cluster-wide Deduplication on Scalable Storage Systems, IEEE Transactions on Parallel and Distributed Systems, 2020. (SCI, IF 4.18) [pdf]
Awais Khan, Taeuk Kim, Hyunki Byun, Youngjae Kim, SciSpace: A Scientific Collaboration Workspace for Geo-Distributed HPC Data Centers, Future Generation Computer Systems, 2019. (SCI, IF 6.1) [pdf]
Awais Khan, Muhammad Attique, Youngjae Kim, iStore: Towards the Optimization of Federation File Systems, IEEE Access, 2019. (SCIE, IF 4.098) [pdf]
Awais Khan, Changgyu Lee, Prince Hamandawana, Sungyong Park, Youngjae Kim, A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems, In Proceedings of the 2018 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), Milwaukee, WI, USA, September 25-28, 2018. [pdf][talk]
Awais Khan, Muhammad Attique, Youngjae Kim, Sungyong Park, Byungchul Tak, EdgeStore: A Single Namespace and Resource-aware Federation File System for Edge Servers, In Proceedings of the 2018 IEEE International Conference of Edge Computing (EDGE), San Francisco, CA, USA, July 2-7, 2018. [pdf][talk]
Bodon Jeong, Awais Khan, Sungyong Park, Async-LCAM: A Lock Contention Aware Messenger for Ceph Distributed Storage System, Cluster Computing: The Journal of Networks, Software Tools and Applications, 2018. [pdf]
Taeuk Kim, Awais Khan, Youngjae Kim, Preethika Kasu, Scott Atchley, NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure, Journal of Scientific Programming, 2018. [pdf]
Jangwoong Kim, Youngjae Kim, Awais Khan, Sungyong Park, Understanding the Performance of Storage Class Memory Filesystems in the NUMA Architecture, Cluster Computing: The Journal of Networks, Software Tools and Applications, 2018. [pdf]
Awais Khan, Changgyu Lee, Sungyong Park, Youngjae Kim, Tagged Consistency and Garbage Identification in Deduplication-enabled Storage Systems, (WiP) USENIX Conference on File and Storage Technologies (FAST), Oakland, CA, USA, February 2018. [pdf][talk]
Prince Hamandawana, Awais Khan, Changgyu Lee, Sungyong Park, Youngjae Kim, Accelerating the Data Deduplication Performance with GPU in Hybrid Storage Systems, (WiP) PDSW-DISCS'17 (held in conjunction with SC'17), Denver, CO, USA, November 13, 2017. [pdf]
Taeuk Kim, Awais Khan, Youngjae Kim, Sungyong Park, Scott Atchley, NUMA-Aware Thread and Resource Scheduling for Terabit Data Movement, (WiP) PDSW-DISCS'17 (held in conjunction with SC'17), Denver, CO, USA, November 13, 2017. [pdf]
Jangwoong Kim, Jaehoon Kim, Awais Khan, Youngjae Kim, Sungyong Park, ZonFS: A Storage Class Memory File System with Memory Zone Partitioning on Linux, 5th International Workshop on Autonomic Management of High performance Grid and Cloud Computing (AMGCC), Tucson, AZ, September, 2017. [pdf]
Muhammad Attique, Awais Khan, Tae-Sun Chung, eSPAK: Top-k Spatial Keyword Query Processing in Directed Road Networks, In Proceedings of the EDBT/ICDT Workshop'17, Venice, Italy, March 21-24, 2017. [pdf]
Awais Khan, Muhammad Attique, Tae-Sun Chung, Youngjae Kim, Time Optimization Modeling for Big Data Placement and Analysis for Geo-Distributed Data Centers, In Proceedings of IEEE International Conference on Cluster Computing, Taipei, Taiwan, September 2016. (short paper) [pdf][talk]
Professional Activities
Staff Chair: NVMSA'20, AMGCC'21 (responsible for managing the program, assisting session chairs, and running virtual Zoom sessions)
Session Chair: IEEE Cluster'21, AMGCC'21
Reviewer: IEEE Transactions on Computers, IEEE Transactions on Emerging Topics in Computing, IEEE Systems Journal, IEEE TPDS (Review Board Member), ACM Transactions on Knowledge Discovery from Data (TKDD), IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Cluster Computing Journal (Guest Editor'22), IEEE Access, IEEE Cluster'21 (External Reviewer), MDPI BDCC, ETRI Journal
Invited Lectures: Two Invited Graduate Course Lectures at Sogang University, Seoul, South Korea
Awards and Honors
Global Fellowship, Sogang University, 2017 - Dec, 2019.
USENIX ATC and HotStorage Student Grant, 2020.
Special Research Scholarship, Sogang University, Mar, 2018.
Full Tuition Assistantship, Ajou University, 2015-2016.