Portfolio & Projects

Pioneering Practical Solutions

My project portfolio reflects a focused effort to bridge theoretical foundations with real-world applications in artificial intelligence, with a primary emphasis on speech emotion recognition, brain–computer interfaces (BCI), and AI-driven mental health systems. My work lies at the intersection of machine learning, signal processing, and human-centered intelligent systems, aiming to develop technologies that can interpret and respond to human cognitive and emotional states.

The projects presented here span academic research, applied system development, and interdisciplinary experimentation. Core contributions include developing deep learning models for speech emotion recognition, designing EEG-based BCI systems for cognitive training, and exploring data-driven approaches for behavioral and psychological analysis. In parallel, I have worked on time-series forecasting, IoT-based data systems, and real-time analytics, strengthening my ability to handle complex, multimodal datasets.

Each project follows a rigorous research-oriented methodology, including data preprocessing, feature extraction (e.g., MFCC, spectrograms, physiological signals), model development, optimization, and evaluation. This reflects both strong engineering discipline and a commitment to reproducible, scalable research practices.

Collectively, this portfolio demonstrates my ability to integrate AI techniques with real-world constraints, with a growing focus on neurotechnology, affective computing, and intelligent healthcare applications. These projects form the foundation of my progression toward doctoral-level research in artificial intelligence, brain–computer interfaces, and human-centered intelligent systems.

Core Research Projects

Project 1

Ongoing Research Project (EEG & BCI)

EEG-Based Cognitive State Analysis and Dataset Development for Attention and Behavioural Studies

Overview

This ongoing research focuses on developing a data-driven framework for analyzing cognitive and attentional states using electroencephalography (EEG) signals. The project aims to contribute toward scalable brain–computer interface (BCI) systems and the creation of high-quality datasets for behavioural and mental health research, particularly in the context of attention-related conditions such as ADHD.

Research Objectives

Develop methodologies for capturing and analyzing EEG signals for cognitive state assessment
Design and curate a structured EEG dataset for attention and behavioural studies
Explore feature extraction and representation learning techniques for neural signals
Investigate machine learning and deep learning models for classification of cognitive states

Methodology

EEG signal acquisition and preprocessing (noise filtering, artifact removal)
Feature extraction from time-domain and frequency-domain representations (e.g., power spectral density, band analysis)
Exploration of advanced representations for neural data (e.g., time-frequency transformations)
Model development using machine learning and deep learning approaches for classification and pattern recognition
Iterative experimental design for improving data quality and model performance
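
To illustrate the frequency-domain feature extraction listed above, here is a minimal sketch of band-power computation on a synthetic signal. It is not the project's actual pipeline: the band edges, the 128 Hz sampling rate, and the function names are assumptions made for this example, and a plain DFT from the standard library stands in for the NumPy/SciPy routines that would normally be used.

```python
import cmath
import math

# Conventional EEG bands in Hz; exact edges vary between studies (assumed here).
BANDS = {"theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def power_spectrum(signal, fs):
    """Return (frequencies, power) for the one-sided DFT of a real signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def band_powers(signal, fs, bands=BANDS):
    """Sum spectral power inside each named frequency band."""
    freqs, power = power_spectrum(signal, fs)
    return {name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
            for name, (lo, hi) in bands.items()}

# Synthetic 2-second signal: strong 10 Hz (alpha) plus weak 20 Hz (beta) component.
fs = 128
t = [i / fs for i in range(fs * 2)]
signal = [math.sin(2 * math.pi * 10 * ti) + 0.2 * math.sin(2 * math.pi * 20 * ti)
          for ti in t]
powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
```

On this synthetic input the alpha band carries most of the power, as expected. Real EEG would first need the filtering and artifact removal described in the first methodology step.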

Technologies & Tools

Python (NumPy, Pandas, SciPy)
Signal Processing Techniques
Machine Learning / Deep Learning Frameworks
EEG Data Processing Pipelines

Expected Contributions

A structured EEG dataset for cognitive and behavioural analysis
Insights into neural patterns associated with attention and cognitive states
Foundations for developing intelligent BCI-based systems for mental health applications

Research Significance

This project aligns with ongoing research in affective computing and neurotechnology, contributing toward the development of intelligent systems capable of interpreting human cognitive and emotional states. It provides a strong foundation for future work in brain–computer interfaces, mental health diagnostics, and human-centered AI.

Project 2

MSc Group Project (Research & Development)

Speech Emotion Recognition Using Deep Learning (Published - Springer LNCS)

Overview

This project focuses on the development of a deep learning-based Speech Emotion Recognition (SER) system capable of identifying human emotions from audio signals, independent of semantic content. The work was later published in Springer Lecture Notes in Computer Science (LNCS), highlighting its research contribution to affective computing and human–computer interaction.

Research Objectives

Develop a robust model for emotion classification from speech signals
Analyze the effectiveness of different audio features for emotion recognition
Evaluate deep learning architectures for improving classification performance
Contribute toward real-world applications such as mental health monitoring and human–machine interaction

Methodology

Dataset: RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song)
Data preprocessing and augmentation (noise injection, pitch shifting, time stretching)
Feature Extraction: Mel Frequency Cepstral Coefficients (MFCC), Mel Spectrogram, Chroma Features, Zero Crossing Rate (ZCR), Root Mean Square Energy (RMS)
Model Development: Convolutional Neural Network (CNN) architecture with multiple convolutional and pooling layers, dropout for regularization, and hyperparameter tuning
Evaluation using accuracy, precision, F1-score, and confusion matrix
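
Two of the simpler features named above, Zero Crossing Rate and RMS energy, can be sketched without any audio library. The snippet below is an illustration only: the frame length, hop size, and 440 Hz test tone are assumptions, and the published system relied on Librosa rather than these hand-written helpers.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames (drops a trailing partial frame)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def rms_energy(frame):
    """Root mean square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

# One second of a 440 Hz tone at amplitude 0.5, sampled at 8 kHz (toy input).
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * i / sr) for i in range(sr)]
frames = frame_signal(tone, frame_len=400, hop=200)
features = [(zero_crossing_rate(f), rms_energy(f)) for f in frames]
```

Each frame yields one (ZCR, RMS) pair. In a full pipeline these would be stacked alongside MFCC, Mel spectrogram, and chroma features before being passed to the CNN.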

Technologies & Tools

Python (Librosa, NumPy, Pandas)
TensorFlow / Keras
Signal Processing & Audio Feature Engineering

Key Results

Achieved ~72% classification accuracy on selected emotion classes
Demonstrated effectiveness of CNN-based architectures for audio-based emotion recognition
Identified limitations related to dataset size and generalization

Publication

Published in Springer Lecture Notes in Computer Science (LNCS), 2023

Research Contribution

This work contributes to the field of affective computing by demonstrating a scalable deep learning approach for emotion recognition from speech. It highlights the importance of feature engineering and model optimization in handling complex audio signals.

Future Directions

Integration of multimodal data (e.g., EEG + speech) for improved emotion recognition
Expansion to larger and cross-cultural datasets
Application in mental health monitoring and adaptive human–computer interaction systems

Project 3

BS Final Year Project (Research & Development)

A Brain–Computer Interface (BCI)-Based Attention Training System for ADHD

This project focused on designing and developing an EEG-based cognitive training system aimed at supporting individuals with Attention Deficit Hyperactivity Disorder (ADHD). The system integrates brain–computer interface (BCI) technology with interactive gaming to monitor and enhance user attention in real time.

Project Structure

Phase 1: System Development

Development of three interactive attention-training games tailored to different age groups, incorporating EEG-based feedback mechanisms.

Phase 2: Experimental Evaluation

Session-based experimental analysis to evaluate user engagement and attention levels using EEG signals.

Participants:

Three age groups (approximately 10–40 years)

Objectives

Design a BCI-enabled interactive system for attention training
Analyze EEG-based cognitive responses during gameplay
Evaluate engagement and attention dynamics across different age groups

Technologies Used

Unity 3D, Visual Studio Code
Emotiv Insight EEG Headset & SDK
MATLAB (signal analysis)
C#, Blender

Methodology

EEG signal acquisition during gameplay sessions
Signal preprocessing and band-power feature extraction
Session-level analysis of attention and engagement patterns
Comparative analysis across age groups

Results

Successfully developed a functional EEG-based BCI training prototype
Observed measurable variations in attention-related engagement across sessions
Demonstrated feasibility of integrating EEG signals into interactive cognitive training systems

Category: Research & Prototype Development

Status: Extended into ongoing research work

Applied AI & Data Science Projects

Project 4

AI-Powered Technical Support Agent (R&D, Scalable Deployment)

Developed an advanced Retrieval-Augmented Generation (RAG)-based AI agent for technical support automation. The system ingests historical support tickets, builds a hybrid knowledge base (dense vector + sparse BM25 retrieval), and answers new queries using state-of-the-art LLMs. Designed for high scalability, multilingual support, and seamless integration with web and chat platforms.

Objectives

Automate technical support ticket triage and response generation
Enable continuous knowledge base updates as new tickets arrive
Improve ticket categorization and escalation workflows
Ensure compatibility with multiple languages and large-scale datasets

Methodology

Developed and maintained using VS Code and Cursor for efficient, modern coding workflows
Data ingestion and parsing from raw support transcripts
Hybrid retrieval: dense embeddings (sentence-transformers) + sparse BM25
LLM-based answer generation (Gemini 3.0, Claude Opus 4.6)
Escalation to human agents for unresolved queries
Dockerised backend for cloud deployment; FastAPI for web API
Integration with chatbots (e.g., Lark bot via OpenClaw)
CLI and web-based frontend for testing and demonstration
Advanced analytics for support operations and user feedback: users can rate AI-generated answers, and this feedback is used to rank and improve the reference tickets drawn on in future responses.
Human-in-the-loop workflows for continuous learning: newly resolved tickets, whether solved by the AI or escalated to human agents, are automatically processed and added to the knowledge base, so the system keeps adapting to new issues.
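
One common way to combine a dense ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below is a hedged illustration of that idea, not a description of how this system actually merges its results: the fusion method, the constant k=60, and the ticket IDs are all assumptions for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in; scores add up.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical top-3 results from each retriever for the same query.
dense_hits = ["T-102", "T-087", "T-310"]
bm25_hits = ["T-087", "T-455", "T-102"]
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
```

Tickets that both retrievers rank highly rise to the top, while tickets found by only one retriever are kept but ranked lower. RRF needs only ranks, so the dense similarity scores and BM25 scores never have to be put on a common scale.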

Technologies & Tools

Python (sentence-transformers, ChromaDB, rank-bm25, OpenAI SDK)
FastAPI, Docker, YAML for deployment
Gemini 3.0, Claude Opus 4.6 for LLM tasks
Multilingual NLP pipelines
Visual Studio Code (VS Code) and Cursor (AI coding assistant)

Key Results

Achieved robust ticket categorization, including for previously uncategorized data
Scalable to millions of tickets with efficient retrieval and storage
Multilingual support for global applicability
Automated escalation for unresolved or novel issues
Customizable skills layer for adaptation to other support domains
Continuous self-improvement through user feedback and new ticket ingestion

Research Contribution

This project demonstrates a production-ready, scalable AI support agent architecture, addressing common limitations in ticketing systems (e.g., poor categorisation, lack of scalability, and language barriers). The modular skills layer enables rapid adaptation to new domains and continuous improvement as new data arrives.

Future Directions

Integration with additional LLMs and retrieval strategies
Expansion to other domains (e.g., customer service, IT helpdesk)

Project 5

MSc Final Dissertation (Research Project)

Smart Meter Data Analysis and Energy Consumption Forecasting

This dissertation was conducted as part of the SAFI (Statistical Analysis for Industry) research initiative, a funded industry-focused project, under the supervision of senior researchers from the AI Research (AIRE) Group at the University of Bradford. The project contributes to ongoing research in data-driven energy analytics, smart grids, and large-scale industrial data science applications.

The work addresses real-world challenges in consumer energy behavior analysis and load forecasting, leveraging large-scale smart meter data to support intelligent energy management and decision-making systems.

Research Objectives

Model and analyze large-scale smart meter data for consumer behavior understanding
Segment users based on energy consumption patterns
Improve forecasting accuracy using hybrid clustering + time-series approaches
Support scalable, data-driven solutions for smart energy systems

Methodology

Data Sources: Smart meter data, weather variables, and UK bank holiday indicators
Clustering: K-Means with Euclidean distance; number of clusters chosen with the elbow method on the within-cluster sum of squares (WCSS)
Forecasting: SARIMAX applied on clustered data
Feature Engineering: Normalization, scaling, and dimensionality reduction
Experimental Design: Five experimental configurations; hybrid clustering–forecasting approach yielded best performance
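
The elbow method used for the clustering step can be sketched as follows. This is a toy illustration under stated assumptions: the nine two-dimensional "profiles" are made up, the deterministic initialisation is a simplification, and the dissertation's own implementation and data are not reproduced here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wcss(points, k, iters=20):
    """Run Lloyd's algorithm with a deterministic init and return the WCSS."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]  # evenly spaced seeds
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[c]  # keep the old centroid if a cluster empties
            for c, members in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

# Made-up, already normalised consumption profiles forming three clear groups.
profiles = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
            (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]
wcss_by_k = {k: kmeans_wcss(profiles, k) for k in range(1, 6)}
```

Plotting wcss_by_k against k shows a sharp drop up to k=3 and almost no gain afterwards, which is the "elbow" that suggests three clusters for this toy data. A separate forecasting model would then be fitted per cluster.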

Key Findings

Consumer segmentation significantly improved forecasting performance
External factors (weather, occupancy, holidays) strongly influence energy usage
Normalization was essential for handling heterogeneous consumption scales
Cluster-based modeling enhanced both interpretability and predictive accuracy

Research Significance

This project demonstrates the integration of unsupervised learning and time-series modeling in a real-world, large-scale industrial context. Conducted within a funded research environment, it reflects experience working with:

high-dimensional, real-world datasets
industry-relevant problem settings
scalable analytical methodologies

The work aligns with ongoing research in:

smart grid analytics
sustainable energy systems
applied machine learning for societal impact

Impact

Supports energy-efficient decision-making and demand optimization
Enables consumer-level behavioral insights
Contributes to scalable forecasting solutions for smart grid systems

Future Directions

Deep learning models (LSTM, CNN) for sequential forecasting
Cross-domain data integration (e.g., socio-economic signals)
Advanced cluster validation and interpretability techniques

Note: Conducted under the AI Research (AIRE) Group, led by senior researchers with extensive funded research experience across EPSRC, BBSRC, and industry collaborations.

Project 6

Air Pollution Data Analysis, Forecasting & Real-Time Streaming System

Overview

This project involves end-to-end data analysis, time-series forecasting, and real-time data streaming using environmental pollution datasets. It combines data science, machine learning, IoT communication, and stream processing to analyze and monitor air quality data.

Task 1: Exploratory Data Analysis (EDA)

Description:

Performed data analysis on two real-world air pollution datasets collected in Aarhus, Denmark.

Methods:

Data cleaning (missing/null value detection)
Statistical analysis using describe()
Correlation analysis (heatmaps)
Data visualization: pair plots, histograms, density plots, box plots, scatter matrix

Tools

Python (Google Colab)
Pandas, Matplotlib, Seaborn

Outcome

Identified relationships between pollutants and gained insights into data distribution and patterns.

Task 2: Time Series Forecasting (ARIMA)

Description:

Applied an ARIMA model to forecast particulate matter levels over time.

Methods:

Data split: 60% training, 40% testing
Time-series transformation (daily frequency)
Stationarity testing using ADF test
Model parameter selection using ACF & PACF
Model evaluation using residual diagnostics and the Box-Ljung test
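
The split and the residual check above can be sketched in a few lines. The project itself used R; the Python below is only an illustration of the same ideas, with hypothetical function names and a deliberately autocorrelated toy residual series.

```python
def ordered_split(series, train_percent=60):
    """Chronological split: the first part trains, the rest tests (no shuffling)."""
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum over k=1..h of r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )

# Toy residuals that alternate in sign, i.e. clearly not white noise.
residuals = [1.0, -1.0] * 20
train, test = ordered_split(residuals)
q = ljung_box_q(residuals, max_lag=1)
```

A large Q relative to the chi-square critical value (3.841 at the 5% level for one lag, ignoring any adjustment for fitted parameters) indicates leftover autocorrelation, meaning the model has not captured all the structure. For a well-fitted ARIMA model, Q on its residuals should stay below that threshold.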

Tools

RStudio
ARIMA Modeling

Key Result

Best model: ARIMA(1,0,1)
Successfully forecasted pollution levels with reliable accuracy

Task 3: IoT Data Streaming using MQTT

Description:

Developed a real-time data communication system using the MQTT protocol.

Implementation:

Created two publishers for streaming pollution data
Developed a subscriber to receive and display real-time data
Configured Mosquitto MQTT broker
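
A publisher in such a setup might look like the sketch below. The topic name, sensor ID, and payload fields are hypothetical, and the code is an assumption about the general shape of a Paho-based publisher rather than the project's own script. The paho-mqtt import is deferred so the payload helper can be run without the library or a broker.

```python
import json

def build_payload(sensor_id, timestamp, readings):
    """Serialise one observation as a JSON string with stable key order."""
    return json.dumps({"sensor": sensor_id, "ts": timestamp, **readings},
                      sort_keys=True)

def publish_reading(topic, payload, host="localhost", port=1883):
    """Send one message to the broker.

    Requires the paho-mqtt package and a running Mosquitto broker.
    """
    import paho.mqtt.publish as publish  # lazy import: only needed when publishing
    publish.single(topic, payload=payload, qos=1, hostname=host, port=port)

# Made-up reading for illustration.
payload = build_payload("station-01", "2020-01-01T00:05:00",
                        {"pm10": 41.0, "no2": 58.0})
# publish_reading("pollution/station-01", payload)  # needs a broker on localhost:1883
```

A subscriber would connect to the same broker, subscribe to a pattern such as pollution/# (again an assumed topic layout), and decode each message with json.loads.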

Tools

Python (PyCharm)
Mosquitto MQTT Broker
Paho-MQTT Library

Outcome

Successfully simulated a real-time IoT-based environmental monitoring system.

Task 4: Real-Time Stream Processing (Apache Flink)

Description:

Implemented Complex Event Processing (CEP) for real-time pollution monitoring.

Methods:

Processed streaming data using Apache Flink
Defined thresholds using mean and standard deviation
Generated alerts and warnings based on patterns
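
The thresholding logic can be illustrated outside Flink with plain Python. The project implemented this in Java with Flink CEP; the sketch below only mimics the idea, and the multipliers (1 and 2 standard deviations), the run length of three, and the sample readings are assumptions for the example.

```python
import statistics

def detect_events(stream, mean, std, warn_k=1.0, alert_k=2.0, run_length=3):
    """Yield (index, label) events for a sequence of readings.

    WARNING: a reading above mean + warn_k * std.
    ALERT:   a reading above mean + alert_k * std, or run_length consecutive
             readings above the warning threshold (a simple event pattern).
    """
    warn = mean + warn_k * std
    alert = mean + alert_k * std
    run = 0
    for i, value in enumerate(stream):
        run = run + 1 if value > warn else 0
        if value > alert or run >= run_length:
            yield i, "ALERT"
        elif value > warn:
            yield i, "WARNING"

# Thresholds come from historical readings; events come from the live stream.
history = [40.0, 42.0, 38.0, 41.0, 39.0, 40.0]
mean = statistics.mean(history)
std = statistics.pstdev(history)
events = list(detect_events([40.5, 41.8, 41.9, 42.0, 39.0, 45.0], mean, std))
```

The third consecutive elevated reading is escalated to an alert even though no single value crosses the alert threshold, which is the kind of sequence pattern a CEP engine is designed to express.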

Technologies

Apache Flink
Java & XML
Apache NetBeans

Outcome

Built a real-time alert system capable of detecting abnormal pollution levels using streaming analytics.

Final Results

This project demonstrates a complete data pipeline, including:

Data analysis and visualization
Predictive modeling (time-series forecasting)
Real-time IoT data streaming
Stream processing and event detection

It highlights the integration of data science, machine learning, and distributed systems for smart environmental monitoring applications.

Project 7

Classification of Diabetic Retinopathy Using Machine Learning (Debrecen Dataset)

This project investigated the application of supervised machine learning techniques for automated medical diagnosis, focusing on the classification of diabetic retinopathy (DR) using structured features derived from retinal imaging data. The work emphasizes the role of machine learning in early disease detection and clinical decision support systems, aligning with broader research in biomedical signal and data analysis.

Objectives

Develop a classification framework for detecting diabetic retinopathy from structured clinical features
Evaluate and compare multiple supervised learning algorithms on medical diagnostic data
Analyze the impact of feature optimization on classification performance and computational efficiency

Methods & Tools

Languages & Tools: Python, Google Colab, PyCharm

Techniques

Supervised Machine Learning (Binary Classification)
Feature analysis and statistical validation
Hyperparameter tuning and cross-validation
Model evaluation using accuracy, precision, recall, and F1-score
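
For reference, the four evaluation metrics follow directly from the confusion-matrix counts. The sketch below uses made-up labels (1 = DR, 0 = Non-DR) and is a generic illustration, not the project's evaluation script.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels for eight patients.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

In a screening context recall matters most, since a missed DR case (false negative) is usually costlier than a false alarm.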

Models Implemented: Support Vector Machine (SVM), Random Forest, AdaBoost, K-Nearest Neighbors (KNN), Gaussian Naïve Bayes, Gaussian Process Classifier (GPC), Decision Tree

Dataset

Debrecen Diabetic Retinopathy dataset (UCI Repository)
1150 instances with features representing microaneurysms and exudates extracted from retinal images
Binary classification: DR vs. Non-DR

Outcomes

Achieved highest performance with Gaussian Process Classifier (~77%) and SVM (~76.5%)
Demonstrated that feature optimization improves model performance and stability
Identified key limitations in medical datasets, including labeling consistency and dataset size
Highlighted feasibility of lightweight ML models for clinical screening systems

Research Significance

This work demonstrates the applicability of classical machine learning methods in medical data classification tasks, particularly in scenarios with limited computational resources. The project also provides foundational insights relevant to biomedical signal processing and pattern recognition, which are critical in domains such as EEG analysis and brain–computer interfaces (BCI).

Category: Applied Machine Learning in Healthcare

Focus Area: Medical Data Analysis | Pattern Recognition | Computational Diagnostics

Project 8

Student Performance Prediction Using Feature Selection and Classification Techniques

This project focuses on applying machine learning and feature selection techniques to predict student academic performance using educational data mining. The study explores how different demographic, behavioral, and academic attributes influence learning outcomes, while addressing challenges such as high-dimensional data, feature correlation, and model generalisation.

Objectives

Predict student academic performance using machine learning classification models
Identify key contributing features influencing student outcomes
Evaluate the impact of feature selection on model accuracy and interpretability

Methods & Tools

Languages & Tools: Python, data visualisation libraries

Techniques

Data preprocessing (cleaning, handling missing values, feature selection)
Exploratory Data Analysis (EDA) and statistical visualization
Supervised Machine Learning for classification
Feature correlation analysis and dimensionality reduction
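
A simple form of correlation-based feature ranking is sketched below. The feature names and values are invented for illustration, and Pearson correlation is an assumed choice; the project's actual selection procedure may differ.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_features(columns, target):
    """Sort feature names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Invented data: a binary test-preparation flag and an unrelated numeric column.
columns = {
    "test_prep": [0, 1, 0, 1, 1, 0],
    "unrelated": [3, 1, 4, 1, 5, 9],
}
scores = [55, 70, 60, 75, 80, 58]
ranked = rank_features(columns, scores)
```

Features near the bottom of the ranking are candidates for removal, though a low linear correlation does not rule out a non-linear relationship that a model such as Random Forest could still exploit.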

Models Used: Logistic Regression, Random Forest

Evaluation Metrics: Accuracy, Precision, Confusion Matrix, Comparative Model Performance

Dataset / Scope

Student Performance Dataset (demographics, parental background, test preparation, academic scores)
Mixed-type structured data (categorical and numerical features)
Real-world educational data with noise, imbalance, and feature dependencies

Outcomes

Identified strong correlations between parental education, test preparation, and academic performance
Demonstrated that feature selection significantly improves model performance and interpretability
Observed that Random Forest outperformed Logistic Regression in predictive accuracy on smaller datasets
Highlighted challenges of high-dimensional feature spaces and data imbalance in real-world datasets

Research Significance

This work builds foundational expertise in feature selection, classification, and pattern discovery in complex datasets, which are critical in domains such as neural signal processing. Similar challenges arise in EEG data analysis, where identifying relevant features from high-dimensional, noisy signals is essential for accurate prediction of cognitive states and brain–computer interface (BCI) applications. The experience gained in optimising feature spaces and evaluating model performance directly supports research in computational neuroscience and neuroinformatics.

Category: Machine Learning & Data Mining

Focus Area: Feature Selection | Classification | Predictive Modeling

Data Analysis, Modeling & Research-Oriented Coursework

Project 9

Critical Analysis of Text Classification Algorithms for Large-Scale Document Categorization

This project presents a comprehensive study and critical analysis of machine learning approaches for text classification, focusing on the challenges of processing large-scale, high-dimensional textual data. The work explores the full text classification pipeline, including preprocessing, feature extraction, model selection, and evaluation, with applications spanning domains such as healthcare, legal systems, and social media analytics.

Objectives

Analyze key machine learning algorithms for text classification tasks
Evaluate strengths and limitations of different classification approaches on textual data
Investigate the role of preprocessing, feature extraction, and dimensionality reduction in improving model performance

Methods & Tools

Languages & Tools: Python (conceptual/implementation level), NLP frameworks

Techniques

Text preprocessing (stopword removal, noise reduction, tokenization)
Feature extraction and dimensionality reduction
Supervised Machine Learning for text classification
Comparative evaluation using precision, accuracy, F1-score, and confusion matrix
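
The preprocessing and feature extraction steps can be sketched with a small TF-IDF example. TF-IDF is an assumed choice of representation here, and the stopword list and three sentences are invented for illustration rather than taken from the BBC News data.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; real pipelines use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "for", "on"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens only, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t])
                        for t, c in counts.items()})
    return vectors

docs = [
    "The market rallied on strong bank earnings.",
    "The striker scored in the final match.",
    "Bank shares and the market fell.",
]
vectors = tfidf(docs)
```

Terms that appear in only one document (such as "rallied") receive higher weights than terms shared across documents (such as "market"), which is what makes the representation useful for separating news categories.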

Models Analysed: Logistic Regression, Random Forest, K-Nearest Neighbors (KNN)

Dataset / Scope

BBC News text dataset for multi-class text classification
High-dimensional textual data with unstructured features
Consideration of real-world text sources (social media, healthcare records, legal documents)

Outcomes

Demonstrated that model performance is highly dependent on feature representation and preprocessing quality
Identified trade-offs between model interpretability (Logistic Regression) and complexity/performance (Random Forest, KNN)
Highlighted computational challenges such as high dimensionality, memory requirements, and scalability
Provided a comparative framework for selecting appropriate models based on dataset characteristics

Research Significance

This work strengthens understanding of pattern recognition in high-dimensional data spaces, which is a fundamental challenge shared across domains such as natural language processing and neural signal analysis. The insights gained from feature extraction, dimensionality reduction, and classification are directly transferable to EEG signal processing, cognitive state decoding, and brain–computer interface (BCI) systems, where complex temporal and high-dimensional data must be efficiently modeled and interpreted.

Category: Machine Learning & Natural Language Processing

Focus Area: Pattern Recognition | High-Dimensional Data | Classification Systems

Project 10

Solar Flare Data Analysis and Visualization

This project focuses on the visual analysis and trend exploration of solar flare observations using a real-world dataset containing flare event records spanning multiple decades. Solar flares are intense bursts of radiation from the Sun that vary in frequency and intensity, and understanding their temporal patterns contributes to space weather research and long-term solar activity analysis.

Objectives

Perform data cleaning and preprocessing of raw solar flare records
Visualize yearly flare occurrences across different flare classes
Analyze correlations among major flare types to understand their temporal relationships
Investigate distinct patterns in flare frequency and class distribution

Methods & Tools

Languages & Tools: Python, data preprocessing libraries, visualisation libraries

Techniques

Data preprocessing to standardize date and flare classification fields
Line graphs, bar charts, stacked area plots, and heat maps for trend visualization
Correlation analysis between flare types over time
Exploratory data analysis to reveal long-term patterns in solar activity
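
The standardisation step can be illustrated with a short regex-based cleaner. The raw record layout, the accepted date formats, and the sample rows below are assumptions made for this sketch; the real dataset's formats are not reproduced here.

```python
import re
from collections import Counter
from datetime import datetime

# Class letter followed by a magnitude, tolerant of case and stray spaces.
FLARE_CLASS = re.compile(r"\b([ABCMX])\s*(\d+(?:\.\d+)?)", re.IGNORECASE)
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")  # assumed input variants

def parse_date(raw):
    """Try each known format; return None if none of them match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
    return None

def clean_record(raw_date, raw_class):
    """Return (year, class letter), or None when either field is unusable."""
    date = parse_date(raw_date)
    match = FLARE_CLASS.search(raw_class)
    if date is None or match is None:
        return None
    return date.year, match.group(1).upper()

# Invented rows showing inconsistent formats and missing values.
raw = [("1981-05-13", "X1.5"), ("13/05/1981", " m 2.0 "), ("19820601", "c3"),
       ("n/a", "C1.0"), ("1982-07-09", "??")]
cleaned = [r for r in (clean_record(d, c) for d, c in raw) if r is not None]
yearly_counts = Counter(cleaned)
```

The resulting counter maps each (year, class) pair to a count, which is the shape needed for yearly bar charts, stacked area plots, and a class-by-class correlation heat map.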

Dataset / Scope

Solar flare observational dataset (flare events from 1981–2017)
Includes flare occurrences classified into standard flare types (A, B, C, M, X)
Dataset required preprocessing due to inconsistent formats and missing values
Flare event records like these are maintained in databases that integrate observations across instruments and missions for scientific analysis of space weather data trends.

Outcomes

C-class flares were the most frequently occurring flare type across the dataset, followed by B- and M-class flares, while X- and A-class events were comparatively rare.
Visualizations demonstrated how flare frequencies vary year to year, revealing patterns consistent with known solar activity cycles.
Correlation analysis using heat maps and stacked charts showed statistical relationships between C-class flares and both M- and X-class events.
Data preprocessing techniques such as regex-based cleanup and date/time standardization were essential for reliable analysis.

Research Significance

Solar flare datasets like this one reflect long-term observational records used by the heliophysics community to study solar activity and its impact on space weather conditions. Scientific resources and catalogs maintained by organisations such as NASA and international data centers provide structured flare event archives for research and modeling.

This project demonstrates skills in handling real observational data, cleaning noisy datasets, and producing domain-relevant visual insights, which are valuable in research areas involving signals and time-series data — a key methodological overlap with challenges in EEG signal analysis and longitudinal biomedical signal studies.

Category: Data Visualization & Analytics

Focus Area: Time-Series Patterns | Signal Trends | Environmental Data

Systems, IoT & Interdisciplinary Work

Project 11

Big Data Challenges in Smart Grid and Energy Systems

This project explored the role of big data technologies in modern smart grid systems, focusing on the challenges and opportunities associated with analyzing large-scale energy datasets from renewable sources such as solar power plants. The work emphasizes scalable data processing, real-time analytics, and the integration of data-driven approaches for improving energy efficiency and grid reliability.

Objectives

Analyze challenges associated with large-scale data in smart grid and renewable energy systems
Investigate big data solutions for handling high-volume, high-velocity energy datasets
Explore data-driven approaches for improving energy forecasting, monitoring, and optimization

Methods & Tools

Languages & Tools: Python, Apache Flink, SQL, Big Data frameworks

Techniques

Distributed data processing and stream analytics
Time-series analysis of energy consumption and generation data
Data pipeline design for large-scale energy systems
Literature-driven analytical study of smart grid architectures

Dataset / Scope

Large-scale smart grid and renewable energy datasets (solar and green energy systems)
Focus on high-volume, high-velocity, and heterogeneous data sources
Consideration of real-time data streams from IoT-enabled energy infrastructures

Outcomes

Identified key challenges in big data for energy systems, including data volume, velocity, variety, and scalability
Analyzed limitations of traditional data processing methods in handling smart grid data
Proposed scalable architectures and big data solutions for efficient energy analytics
Highlighted importance of real-time processing for grid stability and energy optimization

Research Significance

This work provides insights into the integration of big data analytics with cyber-physical energy systems, highlighting parallels with other large-scale data domains such as biomedical signal processing. The project strengthens foundational knowledge in handling complex, high-dimensional data streams, which is directly relevant to research areas like EEG signal processing, real-time neural data analysis, and brain–computer interface systems.

Category: Big Data Systems & Energy Analytics

Focus Area: Distributed Systems | Time-Series Data | Real-Time Analytics

Project 12

Automotive Data Security and IoT

Overview

This project presents a comprehensive review of data security challenges in IoT-enabled smart automotive systems. With the rise of connected and autonomous vehicles, the study explores critical privacy and security risks such as remote hijacking, malware attacks, and data breaches. It evaluates both traditional and emerging security solutions, with a particular focus on blockchain-based architectures.

Key Objectives

Analyze major security challenges in automotive IoT systems
Review conventional and distributed security frameworks
Evaluate blockchain as a potential solution for secure data management
Identify future research directions in automotive cybersecurity

Methodology

Literature review of existing IoT and automotive security systems
Comparative analysis of centralized vs. distributed security approaches
Conceptual evaluation of blockchain-based architectures
Qualitative assessment of system resilience against cyber threats

Technologies & Concepts

Internet of Things (IoT)
Cyber-Physical Systems (CPS)
Blockchain Technology
Vehicle-to-Vehicle (V2V) & Vehicle-to-Infrastructure (V2I) Communication
Distributed Security Systems

Key Findings

Traditional centralized systems lack scalability, privacy, and user control
IoT-enabled vehicles are highly vulnerable to malware, hacking, and cloud-based threats
Blockchain offers enhanced security through decentralization, transparency, and data integrity
Integration of AI with blockchain can further strengthen automotive security frameworks

Outcome & Future Work

The study highlights blockchain as a promising solution for automotive data security while acknowledging challenges such as implementation complexity and lack of standards. Future work includes integrating machine learning and deep learning techniques to develop intelligent, adaptive security systems for connected vehicles.

Additional Academic & Big Data Science Projects

Overview

In addition to the core projects listed above, I have completed several academic and independent projects covering machine learning, deep learning, natural language processing, and real-time data systems. These projects further strengthened my skills in handling diverse datasets and applying scalable analytical techniques.

Selected Work Includes

Speech Emotion Recognition Using Deep Learning — Published (Springer, LNCS)
Multimodal Emotion Recognition (Audio, Text, Behavioral Data)
Time-Series Forecasting using ARIMA (MAPE-based evaluation)
MQTT-based Publisher–Subscriber Systems (Mosquitto)
Real-Time Stream Processing using Apache Flink (CEP)
Sentiment Analysis using NLP (Disaster Tweets and Product Reviews)
House Price Prediction using Regression Models
Fraud Detection in Financial Transactions (Imbalanced Data Handling)
