Research Projects
The current projects focus on developing novel machine-learning algorithms to enhance the quality of data in both static and streaming environments. The assumption that data is static is often violated in real-world applications, where data frequently arrives as a stream. Data streams arise in many contexts: sensor measurements, machine monitoring, network traffic flows, automated medical diagnosis, and remote sensing.
Outlier/Anomaly Detection
Many applications, such as intrusion detection and fraud detection, require deciding whether a new observation belongs to the same distribution as existing observations (an inlier) or should be considered different (an outlier). Current frameworks in this area face challenges in real-time outlier detection and graph outlier detection. This project puts forth a principled approach for three tasks: (i) a detailed study of outlier detection for real-time and graph data that other researchers in this area can build on, (ii) a software package for outlier detection in static and streaming environments, and (iii) a specialized tool for outlier detection in graph data. Possible applications of outlier detection include fraud detection, network intrusion detection, healthcare anomaly monitoring, predictive maintenance in manufacturing, financial risk analysis, environmental monitoring, and data quality assurance.
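To make the inlier/outlier decision concrete for a stream, here is a minimal sketch in Python. It is not the project's method or software package: it assumes a simple running z-score rule (Welford's online mean and variance), and the class name, threshold, and warm-up length are hypothetical choices made only for illustration.

```python
import math


class StreamingZScoreDetector:
    """Illustrative only: flags a value as an outlier when it lies more than
    `threshold` running standard deviations from the running mean."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # hypothetical cutoff, in standard deviations
        self.warmup = warmup        # observations to absorb before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations (Welford)

    def score(self, x):
        # Distance of x from the running mean, in running standard deviations.
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def update(self, x):
        # Welford's one-pass update of the mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def process(self, x):
        # Decide first, then learn: flagged values are not absorbed into the
        # statistics, so a single spike does not inflate the variance.
        is_outlier = self.n >= self.warmup and self.score(x) > self.threshold
        if not is_outlier:
            self.update(x)
        return is_outlier


stream = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 10.1, 9.9, 10.0, 25.0, 10.2]
detector = StreamingZScoreDetector()
flagged = [i for i, x in enumerate(stream) if detector.process(x)]
print(flagged)
```

Each observation is scored once and the detector keeps only three numbers of state, which is the constraint that separates the streaming setting from the static one. A practical detector would also need to handle drift in the underlying distribution, which this sketch ignores.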
The rapid growth of generative AI has increased the danger posed by deepfakes, which can mislead audiences, spread misinformation, and erode trust in digital content. This project, Behind the Curtain: Spotting Deepfakes, seeks to develop a comprehensive detection system that identifies AI-generated content across text, image, and video formats.
Past Projects
Feature Selection
In real-world applications, it is not practical to wait until all features have been generated before feature selection begins. Therefore, many interesting and challenging research questions arise for streaming data: (1) How can relevant features be selected when the feature space is unknown? (2) How should the feature set be updated as new features become available over time? (3) How can feature relevance be assessed without label information? We are developing methods for streaming feature selection in two settings: (i) supervised and (ii) unsupervised. In addition, we are designing novel algorithms for static feature selection.
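As a rough illustration of the supervised streaming setting, the sketch below processes features one at a time as they arrive. It is not the algorithm developed in this project: it assumes a simple rule that keeps a feature when it is sufficiently correlated with the labels and not nearly duplicated by a feature already kept. The class name and both thresholds are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


class StreamingFeatureSelector:
    """Illustrative only: decides on each feature as it arrives, without
    knowing which features will arrive later."""

    def __init__(self, labels, min_relevance=0.5, max_redundancy=0.9):
        self.labels = labels
        self.min_relevance = min_relevance    # hypothetical relevance cutoff
        self.max_redundancy = max_redundancy  # hypothetical redundancy cutoff
        self.selected = {}                    # feature name -> column of values

    def offer(self, name, column):
        # Relevance check: discard features weakly correlated with the labels.
        if abs(pearson(column, self.labels)) < self.min_relevance:
            return False
        # Redundancy check: discard near-duplicates of already selected features.
        for kept in self.selected.values():
            if abs(pearson(column, kept)) >= self.max_redundancy:
                return False
        self.selected[name] = column
        return True


labels = [0, 0, 0, 1, 1, 1]
selector = StreamingFeatureSelector(labels)
decisions = {
    "f1": selector.offer("f1", [0.1, 0.2, 0.1, 0.9, 0.8, 1.0]),  # tracks the labels
    "f2": selector.offer("f2", [0.2, 0.4, 0.2, 1.8, 1.6, 2.0]),  # f1 scaled by 2
    "f3": selector.offer("f3", [1, 0, 1, 0, 1, 0]),              # weakly related
}
print(decisions)
```

The relevance check depends on the labels, so this sketch covers only the supervised setting; the unsupervised setting raised in question (3) would need a label-free relevance criterion in its place.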
Feature Extraction
With the rapid growth of video data, feature extraction has been a key technology for intelligent video processing in various application domains such as surveillance, automotive systems, and robotics. This project focuses on developing novel algorithms for feature extraction in live-streaming videos. In addition, we are focusing on feature extraction from text data.
Noise Detection in Supervised Learning
The key ingredient for any machine learning model is data. With the continuing development of new technologies, the volume of collected data is increasing in almost all fields of human endeavor. Lack of data is no longer the problem; the lack of effective and efficient methods to prepare, learn from, and act on massive data has become the crucial problem. In this project, we investigate "Learning from Noisy Data", which addresses the problem of learning in the presence of noise. We developed three novel techniques that handle noisy data to improve the predictive performance of machine learning models.