Md Khairul Islam, Zeyu Xia, Ryan Goudjil, Jialu Wang, Arya Farahi, Judy Fox
University of Virginia, Virginia, USA; University of Texas at Austin, Austin, Texas, USA
Reconstructing the early Universe from the evolved present-day Universe is a challenging and computationally demanding problem in modern astrophysics. We devise a novel generative framework, Cosmo3DFlow, designed to address dimensionality and sparsity, the two bottlenecks inherent in current methods for cosmological inference. By integrating the 3D Discrete Wavelet Transform (DWT) with flow matching, we effectively represent high-dimensional cosmological structures. Using large-scale cosmological N-body simulations at 128^3 resolution, we achieve up to 50× faster sampling than diffusion models, combining a 10× reduction in integration steps with a lower per-step computational cost from wavelet compression.
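As a toy illustration of the wavelet side of such a pipeline (a sketch, not the authors' implementation, which presumably applies a multi-level 3D DWT to full 128^3 fields), a single-level 3D Haar transform can be written in a few lines; the `haar_dwt_3d` helper and the small 8^3 field below are hypothetical stand-ins:

```python
import numpy as np

def haar_dwt_3d(x):
    """One level of a 3D Haar DWT: split a cubic field into 8 subbands,
    each at half the resolution along every axis."""
    for axis in range(3):
        even = np.take(x, range(0, x.shape[axis], 2), axis=axis)
        odd = np.take(x, range(1, x.shape[axis], 2), axis=axis)
        lo = (even + odd) / np.sqrt(2)   # approximation (low-pass) coefficients
        hi = (even - odd) / np.sqrt(2)   # detail (high-pass) coefficients
        x = np.concatenate([lo, hi], axis=axis)
    return x

field = np.random.rand(8, 8, 8)   # toy stand-in for a 128^3 density field
coeffs = haar_dwt_3d(field)       # same element count, reorganized into 8 subbands
```

Because the Haar transform is orthonormal, it preserves the field's total energy while concentrating most of it in the low-pass subband, which is what makes discarding or compressing the detail subbands cheap.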
Md Khairul Islam, Judy Fox
University of Virginia, USA
We present OmniSpectra, a native-resolution foundation model for astronomy spectra. Unlike traditional models, which are limited to fixed-length inputs or configurations, OmniSpectra handles spectra of any length at their original size, without resampling or interpolation. OmniSpectra demonstrates excellent zero-shot generalization compared to methods tailored for specific tasks. This transfer learning capability makes the model state-of-the-art across various astronomy tasks, including source classification, redshift estimation, and property prediction for stars and galaxies.
Mills Staylor, Amirreza Dolatpour Fathkouhi, Md Khairul Islam, Kaleigh O’Hara, Ryan Ghiles Goudjil, Geoffrey Fox, Judy Fox
University of Virginia, USA
Current deep-learning models on large-scale astronomical image data demand substantial computational resources. We introduce the Cloud-based Astronomy Inference (CAI) framework to address these challenges: a scalable solution that integrates pre-trained foundation models with serverless cloud infrastructure. Using a foundation model for redshift prediction as a case study, our extensive experiments cover user devices, High-Performance Computing (HPC) servers, and the cloud. CAI's significant scalability improvement on large data sizes provides an accessible and effective tool for the astronomy community.
Md Khairul Islam, Ayush Karmacharya, Timothy Sue, Judy Fox
University of Virginia, USA
Existing interpretation methods are limited by focusing mostly on classification tasks, evaluating with custom baseline models, using simple synthetic datasets, and requiring the training of another model. We introduce a novel interpretation method, Windowed Temporal Saliency Rescaling (WinTSR), that addresses these limitations. WinTSR captures temporal dependencies among past time steps and efficiently rescales the feature importance. We benchmark WinTSR against 10 recent interpretation techniques with 5 state-of-the-art models of different architectures, including a foundation model, using 3 real-world datasets for classification and regression. WinTSR significantly outranks the other local interpretation methods in overall performance. Finally, we provide a novel open-source framework to interpret the latest time series transformers and foundation models.
Md Khairul Islam, Ayush Karmacharya, Timothy Sue, Judy Fox
University of Virginia, USA
We use state-of-the-art time series models, including pre-trained LLMs (with GPT-2 as the backbone), transformers, and other models, to demonstrate their ability to outperform traditional approaches with minimal ("few-shot") or no ("zero-shot") fine-tuning. Our benchmark study on eight financial time series tasks shows the potential of using LLMs for scarce financial datasets.
Md Khairul Islam, Tyler Valentine, Timothy Joowon Sue, Ayush Karmacharya, Luke Neil, Benham, Zhengguang Wang, Kingsley Kim, Judy Fox
University of Virginia, USA
We interpreted six of the latest transformer-based time series models with eight recent local interpretation methods and showed how to efficiently benchmark the interpretation performance of those methods. The primary dataset consists of daily COVID-19 infection cases collected from 3,142 US counties over three years, totaling around 3.5 million sample instances. We then showed the generalizability of our approach using two other popular time series datasets.
Md Khairul Islam, Judy Fox
University of Virginia, USA
Interpreting time series forecasting models faces unique challenges compared to image and text data. These challenges arise from the temporal dependencies between time steps and the evolving importance of input features over time. My thesis focuses on addressing these challenges by aiming for more precise explanations of feature interactions, uncovering spatiotemporal patterns, and demonstrating the practical applicability of these interpretability techniques using real-world datasets and state-of-the-art deep learning models.
Md Khairul Islam, Di Zhu, Yingzheng Liu, Andrej Erkelen, Nick Daniello, Aparna Marathe, and Judy Fox
University of Virginia, USA
We forecast US county-level COVID-19 infections using the Temporal Fusion Transformer (TFT). We focus on prediction with heterogeneous time-series deep learning models while interpreting the complex spatiotemporal features learned from the data. We collected around 2.5 years of socioeconomic and health features for 3,142 US counties. Our results show that the TFT model outperforms the other baseline models on all evaluation metrics. We then interpreted the temporal and spatial patterns learned by the TFT model using its multi-head self-attention weights.
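A minimal sketch of attention-based temporal interpretation, under the simplified assumption that per-head attention rows are softmax-normalized and then averaged into a single importance profile (TFT's interpretable multi-head attention additionally shares values across heads, which this toy omits; all shapes and values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
heads, steps = 4, 12                       # toy: 4 attention heads over 12 past days
logits = rng.normal(size=(heads, steps))   # stand-in for learned attention logits
weights = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax per head

# Average over heads to obtain one temporal-importance profile that still sums to 1.
importance = weights.mean(axis=0)
most_attended_step = importance.argmax()   # the past day the model attends to most
```

Averaged profiles like this are what make it possible to read off which past days (temporal patterns) dominate a forecast.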
Md. Khairul Islam 1, Andrew Wang 1, Tianhao Wang 1, Yangfeng Ji 1, Judy Fox 1, Jieyu Zhao 2
1 University of Virginia, 2 University of Maryland, College Park
In this work, we show the impact of differential privacy (DP) on bias in language models (LMs). Differentially private training can increase model bias against protected groups with respect to AUC-based bias metrics. DP makes it more difficult for the model to differentiate between the positive and negative examples from the protected groups than from the other groups in the rest of the population.
Md. Khairul Islam 1, Andrew Wang 1, Tianhao Wang 1, Yangfeng Ji 1, Jieyu Zhao 2
1 University of Virginia, 2 University of Maryland, College Park
In this work, we show through empirical analysis the impact of differential privacy (DP) on bias in LLMs. We find that differentially private training can increase model bias against protected groups with respect to AUC-based bias metrics. DP makes it more difficult for the model to differentiate between the positive and negative examples from the protected groups than from the other groups in the rest of the population.
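An AUC-based bias metric of this kind can be illustrated with a hypothetical sketch: compute the AUC separately for the protected group and for the rest of the population, then take the gap. The `auc` helper and all scores below are made up for illustration:

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive outranks a random
    negative example (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model scores, split by a protected attribute.
pos_protected, neg_protected = [0.7, 0.6], [0.5, 0.65]
pos_rest, neg_rest = [0.9, 0.8], [0.2, 0.3]

# A positive gap means the model separates positives from negatives
# better for the rest of the population than for the protected group.
gap = auc(pos_rest, neg_rest) - auc(pos_protected, neg_protected)
```

In this toy example the model ranks the rest of the population perfectly (AUC 1.0) but misranks one protected-group pair (AUC 0.75), giving a gap of 0.25.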
Arup Kumar Sarker 1, Md Khairul Islam 1, Yuan Tian 2, Geoffrey Fox 1
1 University of Virginia, 2 University of California, Los Angeles, USA
This paper examines the vulnerabilities of the TrustZone extension of ARM Cortex-M processors and develops a threat model to carry out attacks against it. After performing a variety of attacks from different angles, we find that TrustZone is susceptible to buffer overflow attacks that can compromise the security of other trusted apps. Finally, a trust model is proposed to address these vulnerabilities.
Khairul Islam 1, Toufique Ahmed 2, Rifat Shahriyar 1, Anindya Iqbal 1, and Gias Uddin 3
1 Bangladesh University of Engineering and Technology, 2 University of California, Davis, and 3 University of Calgary
In this work, we present PredCR, a LightGBM-classifier-based tool that can predict whether a code change will be merged or abandoned as soon as the code change request is submitted. We mined 146,612 code changes from the code reviews of three large and popular open-source software projects. Under longitudinal ten-fold cross-validation, our tool achieves an 85% AUC score on average and improves over the state of the art by 14-23% (relative).
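One common reading of longitudinal ten-fold cross-validation, train on all chronologically earlier folds and test on the next one, can be sketched as follows; `longitudinal_folds` is a hypothetical helper written for illustration, not the paper's evaluation code:

```python
def longitudinal_folds(samples, k=10):
    """Split chronologically ordered samples into k equal folds; in round i,
    train on folds 0..i-1 and test on fold i, so the test data is always
    newer than the training data."""
    size = len(samples) // k
    folds = [samples[i * size:(i + 1) * size] for i in range(k)]
    for i in range(1, k):
        train = [s for fold in folds[:i] for s in fold]
        test = folds[i]
        yield train, test

changes = list(range(100))   # stand-in for 100 time-ordered code changes
rounds = list(longitudinal_folds(changes, k=10))
```

This chronological split avoids the data leakage that ordinary shuffled cross-validation would introduce for time-ordered code review data.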
Md. Khairul Islam 1, Prithula Hridi 1, Md. Shohrab Hossain 1, Husnu S. Narman 2
1 Bangladesh University of Engineering and Technology, 2 Marshall University
In this paper, we work on UNSW-NB15, a benchmark network anomaly detection dataset. We use the LightGBM machine learning classifier to perform binary classification on this dataset and achieve state-of-the-art performance. Using ten-fold cross-validation on the train, test, and combined datasets, our model achieves F1 scores of 97.21%, 98.33%, and 96.21%, respectively. In addition, the model fitted only on the train data achieves a 92.96% F1 score on the separate test data.
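For reference, the reported F1 scores combine precision and recall as their harmonic mean; a minimal sketch with made-up confusion-matrix counts:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision (tp / (tp + fp)) and
    recall (tp / (tp + fn))."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# e.g. 90 true positives, 5 false positives, 10 false negatives
score = f1_score(90, 5, 10)   # about 0.923, i.e. ~92.3%
```

Unlike accuracy, F1 ignores true negatives, which is why it is the usual choice for imbalanced anomaly detection data such as UNSW-NB15.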