This project introduces a novel learning paradigm that captures token relations through progressive summarization of features. From high-dimensional features, token relations are computed on a multi-scale low-frequency approximation of those features. This approach efficiently represents contexts from fine-grained local to coarse global within each network layer. Furthermore, computing self-attention on the low-frequency approximation of the features significantly reduces computational complexity, effectively addressing the challenges posed by high-dimensional data in vision transformers. The work has been provisionally accepted for publication at the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI).
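The complexity saving can be illustrated with a minimal sketch. Here average pooling stands in for the low-frequency approximation (an assumption; the actual summarization operator is defined in the paper), and attention is computed against the pooled keys/values, shrinking the score matrix:

```python
import numpy as np

def low_freq_approx(x, ratio):
    """Average-pool token features by `ratio` -- a simple stand-in for the
    low-frequency approximation of features (assumed pooling operator)."""
    n, d = x.shape
    n_out = n // ratio
    return x[:n_out * ratio].reshape(n_out, ratio, d).mean(axis=1)

def self_attention(q, k, v):
    """Plain scaled dot-product attention."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 32))             # 64 tokens, 32-dim features

# Attention over the pooled (low-frequency) keys/values: the score matrix
# shrinks from 64x64 to 64x16, a 4x reduction per side.
kv = low_freq_approx(tokens, ratio=4)
out = self_attention(tokens, kv, kv)
print(out.shape)                                   # (64, 32)
```

With a pooling ratio r, the attention score matrix shrinks from N×N to N×(N/r), which is the source of the reduced cost on high-dimensional inputs.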
The project is being conducted in collaboration with Microsoft AI, CARVEL Lab at UCF, and Augmented Design Lab at UCSC. For details, please visit here.
Early and accurate detection of cervical lymph nodes is essential for the optimal management and staging of patients with head and neck malignancies. This project aims to develop a non-invasive deep learning (DL) algorithm for detecting and automatically segmenting cervical lymph nodes in 25,119 CT slices from 221 contrast-enhanced neck CT scans of patients without head and neck cancer.
The project focuses on the most challenging task: segmenting small lymph nodes. A segmentation framework successful at this task could serve as an essential building block for future algorithms aiming to evaluate small objects such as lymph nodes in different body parts, including small lymph nodes that look normal to the naked eye but harbor early nodal metastases.
The project has been conducted in collaboration with the Augmented Intelligence and Precision Health Laboratory at McGill University, where head and neck CT data from 221 patients were acquired. Institutional review board approval was obtained for this retrospective study.
For details of the project, you can visit here.
The project aims to build the components necessary for a real-time trojan detection system. The task was initiated on SEM images of 28nm node technology and will be extended to other nodes (e.g., 14nm, 12nm, and 7nm). The project focuses on detecting anomalies inserted by third-party foundries. The real-time system comprises a cell extraction component, a cell identification module, and a decision analysis module. The task involves extracting logical cell images from SEM images and using these to generate diverse synthetic cell images for various illumination conditions. Using these real and synthetic images, the cell recognition and malicious change identification unit is trained and prepared for deployment in the real-time system.
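The synthetic-image step above can be sketched as follows. This is a minimal assumption of how illumination variants might be rendered, using a simple gain/gamma/bias model on a grayscale crop; the project's actual generation pipeline is not specified here:

```python
import numpy as np

def vary_illumination(cell, gain=1.0, gamma=1.0, bias=0.0):
    """Render a synthetic variant of a grayscale cell image under a
    different illumination (assumed gain/gamma/bias model, not the
    project's actual generator)."""
    img = np.clip(cell.astype(np.float64) / 255.0, 0.0, 1.0)
    out = np.clip(gain * img ** gamma + bias, 0.0, 1.0)
    return (out * 255.0).astype(np.uint8)

rng = np.random.default_rng(1)
cell = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)  # stand-in SEM crop

# A small grid of synthetic variants for training the recognition unit.
variants = [vary_illumination(cell, gain=g, gamma=gm)
            for g in (0.8, 1.0, 1.2) for gm in (0.7, 1.0, 1.4)]
print(len(variants), variants[0].shape)
```

Sweeping gain and gamma gives the recognition unit exposure to brightness and contrast conditions beyond those in the captured SEM images.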
For details about the project, please visit here.
Human Activity Recognition on Border-Secured Data
The project focuses on predicting human activity in videos captured in border areas. A clip can contain single or multiple instances of one or several types of actions. The system comprises an activity localization unit and a recognition unit. From input clips, localization tubes are predicted; compressed representations of those tubes are then obtained and classified by the recognition unit. The system is trained in an end-to-end manner.
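The tube extraction step can be illustrated with a small sketch. Here a "tube" is simplified to one bounding box per frame (an assumption for illustration; the system predicts the tubes rather than taking them as input):

```python
import numpy as np

def crop_tube(clip, boxes):
    """Extract a spatio-temporal 'tube' from a clip given one box per frame.
    `clip` is (T, H, W, C); `boxes` is a list of (y0, x0, y1, x1) --
    a simplified stand-in for a predicted localization tube."""
    return [clip[t, y0:y1, x0:x1] for t, (y0, x0, y1, x1) in enumerate(boxes)]

rng = np.random.default_rng(2)
clip = rng.standard_normal((8, 64, 64, 3))          # 8 frames, 64x64 RGB

# A tube that drifts rightward across frames, as a moving actor might.
boxes = [(16, 8 + 2 * t, 48, 40 + 2 * t) for t in range(8)]
tube = crop_tube(clip, boxes)
print(len(tube), tube[0].shape)                     # 8 (32, 32, 3)
```

Each per-frame crop would then be compressed into a fixed-size representation before classification by the recognition unit.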
The project was accomplished during my work as a Research Assistant at the Center for Research in Computer Vision (CRCV) at the University of Central Florida. The project was funded by Elbit Systems of America. For details, please visit here.
The goal of the project is to build the necessary components for a real-time trojan detection system. The project is currently being conducted on SEM images of 28nm node technology and focuses on detecting anomalies inserted by third-party foundries. The task involves extracting logical cell images from SEM images and using these to generate diversified synthetic cell images for different illumination conditions. Using these real and synthetic images, the cell recognition unit is trained and prepared for deployment in the real-time system.
The project is ongoing. For details of the project, please visit here.
The goal of the project is to distill knowledge learned by a large network trained on multi-modal data into a compact network trained on a single data modality. The aim is to exploit the strengths of multiple data modalities while maintaining a compact network at the inference stage, reducing the extra memory cost of deploying the network in computationally demanding real-time systems such as autonomous vehicles. The project currently experiments with two data modalities: RGB and near-infrared (NIR) images. The task involves training a large teacher network on both RGB and NIR images and then distilling the knowledge into a smaller student network in an adversarial fashion. The student network takes a single modality as input but, through adversarial learning, learns to produce teacher-like features.
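The adversarial distillation objective can be sketched at the loss level. This is a minimal assumption of one common formulation (feature matching plus a term that rewards the student for fooling a discriminator); the project's actual architecture and losses are not specified here:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator(feat, w, b):
    """Tiny linear discriminator: probability that `feat` came from the
    teacher (hypothetical stand-in for the adversarial module)."""
    return sigmoid(feat @ w + b)

def student_loss(student_feat, teacher_feat, w, b, lam=0.1):
    """Feature-matching loss plus an adversarial term that rewards the
    student for making the discriminator predict 'teacher'."""
    match = np.mean((student_feat - teacher_feat) ** 2)
    p_teacher = discriminator(student_feat, w, b)
    adv = -np.mean(np.log(p_teacher + 1e-8))        # student wants p -> 1
    return match + lam * adv

rng = np.random.default_rng(3)
d = 16
teacher_feat = rng.standard_normal((4, d))           # from the RGB+NIR teacher
student_feat = teacher_feat + 0.1 * rng.standard_normal((4, d))  # RGB-only student
w, b = rng.standard_normal(d), 0.0

loss = student_loss(student_feat, teacher_feat, w, b)
print(loss)
```

In training, this student objective would alternate with discriminator updates, so the student's single-modality features are pushed toward the teacher's multi-modal feature distribution rather than just its pointwise values.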
The project is ongoing. For details of the project, please visit here.