Open-Source Software

FedML-AI (FedML)

An open-source software framework for federated learning. I contributed to the job scheduling module.

Parallel Convolutional Neural Network Software Framework (pcnn)

This program is a data-parallel Convolutional Neural Network training framework written in C/C++. Data parallelism is implemented using MPI, and each process further parallelizes its workload with OpenMP to use all local compute resources. Kernel functions such as matrix multiplication are implemented using Intel MKL, a highly optimized math library. The framework supports many common deep learning features, including mini-batch SGD, Adam, momentum, L2 regularization (weight decay), and batch normalization. It also supports local SGD with periodic model averaging. The Git repository will be made public once all the relevant papers are published.
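To illustrate local SGD with periodic model averaging, here is a minimal single-process sketch in Python. It is not taken from pcnn: the workers are simulated in a loop, the model is a single weight fitted to y = 3x, and the names `local_sgd`, `shards`, and `avg_period` are hypothetical. In an MPI program, the averaging step would typically be an allreduce over the model parameters.

```python
import random


def local_sgd(shards, num_steps, avg_period, lr=0.05, seed=0):
    """Fit y = w * x with local SGD on simulated workers.

    Each worker updates its own copy of w using samples from its own shard.
    Every `avg_period` steps, all copies are replaced by their mean.
    """
    rng = random.Random(seed)
    weights = [0.0 for _ in shards]  # one model replica per simulated worker
    for step in range(1, num_steps + 1):
        for rank, shard in enumerate(shards):
            x, y = rng.choice(shard)  # mini-batch of size 1
            grad = 2.0 * (weights[rank] * x - y) * x  # d/dw of (w*x - y)^2
            weights[rank] -= lr * grad
        if step % avg_period == 0:
            # Periodic model averaging (stands in for an MPI allreduce).
            mean_w = sum(weights) / len(weights)
            weights = [mean_w] * len(weights)
    return weights


# Synthetic, noise-free data for y = 3x, split across 4 simulated workers.
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 41)]
shards = [data[r::4] for r in range(4)]
final = local_sgd(shards, num_steps=400, avg_period=8)
```

Because the replicas synchronize only every `avg_period` steps, communication cost drops by roughly that factor compared with averaging gradients at every step, at the price of the replicas drifting apart between synchronizations.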

Parallel HDF5 Dataset Concatenation Program (ph5concat)

Under the SciDAC RAPIDS project, I developed parallel software for HDF5 dataset concatenation. This C++ software is publicly available. Given many input HDF5 files, the program concatenates each individual HDF5 object (dataset) across the input files and writes the results into a single output file. The program is parallelized along the file dimension: each process is assigned a subset of the input files and reads each dataset locally. Then all the processes collectively write the datasets one after another. A case study paper regarding this program is currently under review.
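The file-dimension parallelization can be sketched as follows. This is a simplified, single-process Python illustration rather than the ph5concat code: the contiguous block assignment is an assumption (the description above does not say how files are distributed), the "files" are in-memory lists, and `assign_files`, `concat_dataset`, and `read_dataset` are hypothetical names. No HDF5 or MPI calls are made.

```python
def assign_files(files, nprocs):
    """Split `files` into `nprocs` contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(len(files), nprocs)
    blocks, start = [], 0
    for rank in range(nprocs):
        count = base + (1 if rank < extra else 0)  # first `extra` ranks get one more
        blocks.append(files[start:start + count])
        start += count
    return blocks


def concat_dataset(blocks, read_dataset):
    """Concatenate one dataset across all files.

    Each simulated process reads the dataset from its own files; the
    per-process pieces are then appended in rank order, which stands in
    for the collective write.
    """
    output = []
    for block in blocks:  # in a real MPI program these run concurrently
        local = []
        for name in block:
            local.extend(read_dataset(name))
        output.extend(local)
    return output


# Hypothetical in-memory stand-ins for five input files holding one dataset.
fake_files = {"a.h5": [1, 2], "b.h5": [3], "c.h5": [4, 5], "d.h5": [6], "e.h5": [7, 8]}
names = sorted(fake_files)
blocks = assign_files(names, nprocs=2)
merged = concat_dataset(blocks, lambda name: fake_files[name])
```

With contiguous blocks, writing the per-process pieces in rank order preserves the original file order in the output, so the result matches a sequential concatenation.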

TensorFlow 2-based CosmoFlow (tf2-cosmoflow)

Under the SciDAC RAPIDS project, I developed a deep learning solution for cosmological parameter regression problems. This program is a software template for data-parallel training of a 3-D CNN on dark matter distribution data. The code is based on TensorFlow 2.x, and training is parallelized using Horovod.