Abstract: This study presents an ensemble transformer framework for detecting depression-related emotions and classifying their severity in social media text. It addresses the need for scalable and trustworthy AI solutions in mental health by integrating four transformer models. The DepTformer-XAI-SV model uses a weighted soft-voting mechanism based on validation macro-F1 scores to improve accuracy and incorporates LIME to highlight key linguistic features associated with depression. The framework is evaluated on two benchmark datasets: DepressionEmo, with eight emotion classes, and the Merged Depression Severity Detection (MDSD) dataset, with four severity levels, both sourced from social media. To address class imbalance, we use class-weighted cross-entropy, stratified k-fold splits, and minority-aware sampling. Results show that the model surpasses individual transformer models and traditional methods, achieving macro-F1 scores of 80.44% on DepressionEmo and 79.88% on MDSD and significantly improving minority-class detection. Lastly, a web application has been developed for interactive and interpretable inference.
Publication History: Received July 30, 2025; Revised November 18, 2025; Accepted December 30, 2025; Published online January 4, 2026
Publisher: iScience (2026)
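As a rough illustration of the weighted soft-voting mechanism described in the abstract above, the sketch below derives voting weights from validation macro-F1 scores and averages per-model class probabilities; the model names and scores are placeholders, not the paper's actual components or results.

```python
import numpy as np

# Hypothetical validation macro-F1 scores for four transformer models
# (placeholder values, not the paper's reported numbers).
val_macro_f1 = {"bert": 0.76, "roberta": 0.78, "distilbert": 0.74, "xlnet": 0.75}

# Normalize the scores into soft-voting weights.
total = sum(val_macro_f1.values())
weights = {name: score / total for name, score in val_macro_f1.items()}

def weighted_soft_vote(probas: dict[str, np.ndarray]) -> np.ndarray:
    """Combine per-model class probabilities (shape [n_samples, n_classes])
    into a single prediction using F1-derived weights."""
    ensemble = sum(weights[name] * p for name, p in probas.items())
    return ensemble.argmax(axis=1)

# Toy usage: 2 samples, 8 emotion classes, random probabilities per model.
rng = np.random.default_rng(0)
probas = {name: rng.dirichlet(np.ones(8), size=2) for name in val_macro_f1}
print(weighted_soft_vote(probas))
```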
Abstract: Early and accurate segmentation of oral cancer is essential for timely diagnosis and treatment. Traditional methods such as visual inspection and biopsy are often subjective and costly, which can hinder early detection. To improve segmentation accuracy for both binary and multiclass tasks, we propose a transformer-based ensemble model that combines the Vision Transformer (ViT), Data-efficient Image Transformer (DeiT), Swin Transformer, and BEiT. The ensemble utilizes self-attention mechanisms for better feature extraction and spatial representation. Our study employs two datasets: the MOD dataset (463 images of oral diseases) and a histopathological dataset (1,224 images of oral squamous cell carcinoma and normal epithelium). We applied extensive preprocessing and augmentation techniques, such as grayscale conversion, binary thresholding, and Contrast Limited Adaptive Histogram Equalization (CLAHE), to enhance image quality and model generalization. The performance evaluation showed that our ensemble model outperformed the individual architectures, achieving an Intersection over Union (IoU) of 0.9601 and a Dice Coefficient of 0.9598 for binary segmentation, and an IoU of 0.9587 and a Dice Coefficient of 0.9575 for multiclass segmentation. A comparative analysis with state-of-the-art models confirmed the effectiveness of our approach. These results demonstrate the potential of transformer-based ensemble learning for oral cancer diagnosis, presenting a scalable tool for clinical applications. Future work will focus on expanding dataset diversity, optimizing computational efficiency, and integrating real-time inference for improved usability in healthcare.
Published in: 2025 International Conference on Electrical, Computer and Communication Engineering (ECCE)
Date of Conference: 13-15 February 2025
Date Added to IEEE Xplore: 29 May 2025
Publisher: IEEE
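A minimal sketch of the preprocessing and evaluation steps named in the abstract above (grayscale conversion, CLAHE, binary thresholding, and IoU/Dice computation), assuming OpenCV and NumPy; the image path, CLAHE parameters, and Otsu thresholding are illustrative choices, not details taken from the paper.

```python
import cv2
import numpy as np

# Illustrative preprocessing: grayscale -> CLAHE -> binary threshold.
# "oral_sample.png" is a placeholder path, not an image from the paper's datasets.
image = cv2.imread("oral_sample.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray)
# Otsu's method is used here as one reasonable way to binarize.
_, binary = cv2.threshold(enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

def iou_and_dice(pred: np.ndarray, target: np.ndarray) -> tuple[float, float]:
    """Compute Intersection over Union and Dice for binary masks (0/1 arrays)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / union if union else 1.0
    denom = pred.sum() + target.sum()
    dice = 2 * inter / denom if denom else 1.0
    return float(iou), float(dice)
```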
Researching the design automation pipeline for hardware design using agentic AI.
Researching NVIDIA GPUs to explore better schedulability of different tasks across Streaming Multiprocessors (SMs).
Working on a hybrid model to perform Named Entity Recognition (NER) tailored to the Bengali language for medical data.
Integrating traditional NLP techniques with deep learning models for enhanced accuracy in detecting and classifying entities in Bengali text (a brief illustrative sketch follows these items).
Exploring state-of-the-art techniques to improve language processing tasks in the Bengali language, addressing challenges in low-resource languages.
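As a rough illustration of the hybrid idea in the items above, the sketch below pairs a transformer token classifier with a rule-based gazetteer; the model path, gazetteer entries, and label set are hypothetical placeholders, not assets from the project.

```python
from transformers import pipeline

# Hypothetical fine-tuned checkpoint -- replace with an actual Bengali
# medical-NER model; none is named in the project description above.
MODEL_NAME = "path/to/bengali-medical-ner"

# Deep-learning component: transformer token classifier.
ner_model = pipeline("token-classification", model=MODEL_NAME,
                     aggregation_strategy="simple")

# Traditional NLP component: a small gazetteer of known medical terms
# (illustrative entries: "paracetamol" and "diabetes" in Bengali).
GAZETTEER = {"প্যারাসিটামল": "DRUG", "ডায়াবেটিস": "DISEASE"}

def hybrid_ner(text: str) -> list[dict]:
    """Merge transformer predictions with rule-based gazetteer matches."""
    entities = [
        {"word": e["word"], "label": e["entity_group"], "source": "model"}
        for e in ner_model(text)
    ]
    for term, label in GAZETTEER.items():
        if term in text:
            entities.append({"word": term, "label": label, "source": "gazetteer"})
    return entities
```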
Proposing a system that analyzes Android application permissions using Large Language Models (LLMs) to enhance user privacy and security.
Working on integrating LLMs to intelligently assess the permissions required by apps and provide personalized alerts to users throughout app usage (a short sketch of this idea follows).
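One way the permission-assessment step could be prototyped is sketched below; the OpenAI chat API and the model name are assumptions made purely for illustration, and any LLM backend could be substituted.

```python
from openai import OpenAI

# One possible backend (an assumption, not the project's chosen stack);
# requires an OPENAI_API_KEY in the environment.
client = OpenAI()

def assess_permissions(app_name: str, permissions: list[str]) -> str:
    """Ask an LLM to flag privacy-sensitive permissions and explain the risk to the user."""
    prompt = (
        f"The Android app '{app_name}' requests these permissions:\n"
        + "\n".join(f"- {p}" for p in permissions)
        + "\nFor each permission, say whether it is privacy-sensitive and why, "
          "then give the user a short, plain-language alert."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example usage with standard Android permission strings.
print(assess_permissions("ExampleApp", [
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
]))
```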
Thesis supervisor: Faisal Bin Ashraf (faisalbashraf@gmail.com)
Abstract: Yoga is one of the best activities that can be done from home to preserve our physical condition during the present pandemic. Yoga, however, is all about performing the 82 Yoga Asanas correctly over the course of six classes. Regrettably, not everyone has the knowledge or ability to perform yoga accurately. To do yoga poses correctly, one has to find a yoga instructor, but that can be very hard and expensive across all possible situations and circumstances. Using deep learning (DL), image classification, and various machine learning approaches, we attempted in our thesis to build a system, or model, that operates as a yoga self-instructor, classifying different yoga poses to distinguish the accurate pose. It assists the user in performing yoga correctly by recognizing errors in their Yoga Asanas. In a nutshell, this section covers several pose estimation, keypoint detection, and pose classification techniques. Moreover, we applied ensemble modeling as a booster to improve pose prediction accuracy as much as possible.
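The keypoint-then-classify pipeline summarized in the abstract can be illustrated with a minimal sketch; the thesis does not commit to a specific keypoint library, so MediaPipe Pose is an assumed stand-in here and the image path is a placeholder.

```python
import cv2
import mediapipe as mp
import numpy as np

# MediaPipe Pose as one possible keypoint detector (an assumption for illustration).
mp_pose = mp.solutions.pose

def extract_keypoints(image_path: str) -> np.ndarray | None:
    """Return a flat array of (x, y, z) body landmarks, or None if no person is found."""
    image = cv2.imread(image_path)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    with mp_pose.Pose(static_image_mode=True) as pose:
        results = pose.process(rgb)
    if results.pose_landmarks is None:
        return None
    return np.array(
        [[lm.x, lm.y, lm.z] for lm in results.pose_landmarks.landmark]
    ).flatten()

# The resulting keypoint vectors can then feed any pose classifier
# (e.g. an ensemble of simple classifiers trained on labeled asana images).
features = extract_keypoints("asana_sample.jpg")  # placeholder image path
```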