11th International Conference on Signal, Image Processing and Embedded Systems
(SIGEM 2025)
November 15~16, 2025, Zurich, Switzerland
Hybrid Semantic Search for Legal Document Retrieval in the Swiss Parliament: The ParlementAIre Approach
Ornella Vaccarelli1, Emmanuel de Salis2, Eden Brenot1, Henrique Marques Reis2, Jacqueline Kucera3, Philippe Meyer3, Aphrodite Albanis3, Hatem Ghorbel2, and Jean Hennebert1, 1Institute of AI and Complex Systems (iCoSys), School of Engineering and Architecture of Fribourg (HEIA-FR), HES-SO University of Applied Sciences and Arts Western Switzerland, Fribourg, Switzerland, 2Haute Ecole Arc (HE Arc), HES-SO University of Applied Sciences and Arts Western Switzerland, Neuchâtel, Switzerland, 3Parliamentary Library, Research & Data, Parliamentary Services, Bern, Switzerland
ABSTRACT
We present a domain-adapted hybrid AI retrieval system for legal and parliamentary document search, developed within the ParlementAIre project in collaboration with the Swiss Parliamentary Library. The system integrates BM25-based sparse retrieval with dense neural embeddings in a multilingual, open-source pipeline. It is designed to address the linguistic, structural, and terminological challenges of large legal corpora such as Fedlex and Curia Vista, which differ in language, format, and semantic density. Evaluation on French-language parliamentary queries demonstrates that the hybrid model consistently outperforms both sparse and dense baselines, achieving statistically significant improvements in top-8 retrieval accuracy across heterogeneous document types. Deployed and tested within the Swiss Parliamentary Library, the system also improves time-to-discovery and recall of semantically relevant materials that are frequently missed by conventional keyword-based approaches. These results highlight the potential of AI technologies to enhance parliamentary and institutional processes while adhering to core requirements for sovereignty, transparency, and democratic control.
Keywords
Hybrid Retrieval Models, Document Retrieval, Parliamentary AI, Dense Embeddings, Semantic Search, Lexical Semantics, Semantic Processing, Legal NLP, AI for Governance, Information Retrieval, Information Extraction, Open-source NLP, Institutional NLP.
Elsa: A Style-aligned Dataset for Emotionally Intelligent Language Generation
Vishal Gandhi and Sagar Gandhi, Joyspace AI, WA, USA
ABSTRACT
Advancements in emotion-aware language processing increasingly shape vital NLP applications ranging from conversational AI and affective computing to computational psychology and creative content generation. Existing emotion datasets either lack emotional granularity or fail to capture necessary stylistic diversity, limiting the advancement of effective emotion-conditioned text generation systems. Seeking to bridge this crucial gap between granularity and style diversity, this paper introduces a novel systematically constructed dataset named ELSA (Emotion and Language Style Alignment Dataset) leveraging fine-grained emotion taxonomies adapted from existing sources (dair-ai/emotion dataset and GoEmotions taxonomy). This dataset comprises multiple emotionally nuanced variations of original sentences regenerated across distinct contextual styles (conversational, formal, poetic, and narrative) using advanced Large Language Models (LLMs). Rigorous computational evaluation using metrics such as perplexity, embedding variance, readability, lexical diversity, and semantic coherence measures validates the dataset’s emotional authenticity, linguistic fluency, and textual diversity. Comprehensive metric analyses affirm its potential to support deeper explorations into emotion-conditioned style-adaptive text generation. By enabling precision-tuned emotionally nuanced language modeling, our dataset creates fertile ground for research on fine-grained emotional control, prompt-driven explanation, interpretability, and style-adaptive expressive language generation with LLMs.
Keywords
emotion-aware language modeling, fine-grained emotion recognition, stylistic variation, emotion-conditioned text generation, large language models (LLMs), text augmentation, emotion and style transfer, affective text generation, emotion-centric NLP, multistyle text synthesis, Natural Language Generation (NLG).
How Do Sentiment and Toxicity Vary Across Youtube, Reddit, and X Comments Regarding Kylian Mbappé and Fake News?
Olzhasbek Zhakenov, Northwestern University in Qatar, Kazakhstan
ABSTRACT
Social media has fundamentally altered the landscape of sports fandom, creating dynamic platforms for both fervent support and intense criticism. However, the amplification of negativity and misinformation on these platforms poses a significant threat to public perception and the well-being of athletes. This study analyzes social media commentary surrounding Kylian Mbappé and associated sports narratives, leveraging data from YouTube, Reddit, and X. Utilizing Communalytic and Voyant, this research processes over 30,000 comments to assess sentiment, toxicity, and thematic trends. A key component of this analysis is the differentiation between real and fake news contexts, which reveals notable variations in the tone and language of online discussions. The findings of this study underscore the dualistic nature of social media, highlighting its capacity to serve as a conduit for both admiration and harmful discourse. This research provides valuable insights into the complex digital ecosystem of modern sports.
Efficient Hybrid Prompt-pruning for Open-source LLM Based Machine Translation
Zaowad R. Abdullah, Manal Iftikhar, Md. Tariqul Islam Rifat Shahriyar, Bangladesh
ABSTRACT
We propose a hybrid retrieval strategy for open-source LLM-based machine translation that filters out irrelevant top-k candidates before constructing the final translation prompt, thereby reducing input token count while maintaining or improving translation quality. Throughout this work, we demonstrate that fixed top-k retrieval in translation specific LLMs is suboptimal, often incorporating redundant or irrelevant examples into the translation prompt. Our method combines dense embedding model relevance scores and normalized sparse BM25 scores to yield a hybrid score which is later used to filter out irrelevant examples that fall below an empirically derived threshold. Unlike prior domain adaptation methods such as kNN-MT [2], LLM-based translation avoids dense token-level lookups. Rather, it incorporates source-translation pairs semantically/lexically similar to the translation query into the prompt and achieves a significant level of domain adaptation. While being simpler and significantly faster than kNN-MT, the quality of LLM-based MT depends highly on the context provided. Fixed retrieval configurations (e.g., top-5 or top-10), commonly adopted from general NLP tasks, often include irrelevant or redundant examples. While reranker models are usually employed to reorder retrieved examples, they still rely on a fixed top-k setup, leading to the inclusion of superfluous examples. Our experiments demonstrate a simple yet effective method that dynamically filters out suboptimal examples, retaining only the most relevant context for each translation query. Experiments across seven domains and three language pairs (DE→EN, AR→EN, ZH→EN) show that our method preserves translation performance while significantly reducing prompt size. We also compare our setup with the popular reranker model Cohere Rerank 3.5 [3] to establish the credibility of our work. 
Furthermore, evaluations on the PeerQA benchmark demonstrate substantial gains in zero-shot segment-level retrieval, validating the hybrid pruning method. Our findings highlight the impact of selective example retrieval for optimally domain-adapted multilingual machine translation.
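The hybrid filtering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `hybrid_prune`, the min-max normalisation of BM25, the mixing weight `alpha`, and the threshold value are all assumptions, since the abstract only states that dense relevance scores and normalised BM25 scores are combined and that examples below an empirically derived threshold are dropped.

```python
import numpy as np

def hybrid_prune(dense_scores, bm25_scores, alpha=0.5, threshold=0.6):
    """Keep only candidates whose hybrid score reaches the threshold.

    alpha and threshold are hypothetical values chosen for illustration.
    Returns (kept indices sorted by descending hybrid score, all hybrid scores).
    """
    dense = np.asarray(dense_scores, dtype=float)
    bm25 = np.asarray(bm25_scores, dtype=float)
    # Min-max normalise BM25 into [0, 1]; use zeros if every score is equal.
    span = bm25.max() - bm25.min()
    bm25_norm = (bm25 - bm25.min()) / span if span > 0 else np.zeros_like(bm25)
    # Weighted combination of dense and normalised sparse relevance.
    hybrid = alpha * dense + (1.0 - alpha) * bm25_norm
    keep = np.flatnonzero(hybrid >= threshold)
    # Order the survivors from most to least relevant.
    order = keep[np.argsort(-hybrid[keep])]
    return order.tolist(), hybrid

kept, scores = hybrid_prune([0.9, 0.4, 0.7, 0.2], [12.0, 3.0, 8.0, 0.0])
```

Unlike a fixed top-k cut, the number of retained examples here varies per query, which is the property the abstract credits for the reduced prompt size.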
Keywords
Machine Translation, LLM, RAG (Retrieval Augmented Generation), Information Retrieval, n-shot Prompting, Prompt-Pruning, Domain Adaptation.
Chorify Is An Intelligent Desktop Application To Teach Dance And Correct Motion Using Pose Estimation And Vibration Band
Chenxi Huang1, Rodrigo Onate2, 1Crean Lutheran High School, 12500 Sand Canyon Ave, Irvine, CA 92618, 2California State Polytechnic University, Pomona, CA 91768
ABSTRACT
This project aims to make dance learning more accessible for deaf individuals through a program called RhythmSense. Many people who are deaf struggle to follow rhythm or music during dance. My solution combines AI-based pose detection, sound wave visualization, and a vibration feedback band. The system uses MediaPipe to track movements, compares them to a reference video, and provides instant visual and physical feedback. During testing, I focused on improving accuracy, reducing vibration delay, and ensuring that the program worked in different lighting and motion conditions. The results showed that the app could identify mistakes and match rhythm effectively. Overall, RhythmSense helps dancers not only see their errors but also feel the beat through vibration. This technology creates a more inclusive way for everyone, including deaf users, to experience and enjoy dance.
A Hybrid Dual-Specialist Framework for Expressive Singing Head Synthesis Via Identity-Aware Fusion
Khawaja Murad ul Hassan, Qlu.ai, San Francisco, USA
ABSTRACT
The synthesis of photo-realistic and emotionally expressive singing head animations from audio presents a complex multimodal challenge. Monolithic generative models often struggle to simultaneously master precise lip articulation and emotionally resonant head motions, frequently resulting in geometric inconsistencies and identity degradation. This paper proposes a hybrid framework that decomposes the task between two specialist models: MakeItTalk for high-fidelity facial landmarks and StyleHEAT for natural head poses and expressions. Our core contribution is an Identity-Aware Landmark Fitting technique that geometrically aligns specialist outputs without additional training. By extracting person-specific 3D facial structure from the source image and fitting linguistic landmarks to this identity-constrained shape, we generate expression coefficients that preserve facial geometry. Evaluated on the RAVDESS singing dataset, our approach demonstrates quantifiable improvements in lip-sync accuracy (1.2% reduction in landmark distance) and overall visual quality (3.0% FID improvement, 3.9% LPIPS improvement) compared to StyleHEAT, while maintaining comparable identity preservation. Comparisons with MetaPortrait further validate our framework’s effectiveness for zero-shot singing head synthesis.
Keywords
Talking head synthesis, facial animation, generative models, 3D morphable model, identity preservation, computer vision
A Reliable Fire Safety System For The Average Home Owner Using Machine Learning And A Mobile Application
Yuxuan Li1, Garret Washburn2, 1United World College South East Asia, 1207 Dover Road, Singapore 139654, 2California State Polytechnic University, Pomona, CA 91768
ABSTRACT
House fires in Singapore are quite common due to the widespread lack of consumer fire prevention and safety systems installed in residential areas [3]. This paper proposes a solution, the Scorch Vision system, that uses a machine learning model and a Raspberry Pi with a camera to detect fires in camera view and alert users through the accompanying mobile application. The whole system consists of a Raspberry Pi hardware configuration with an on-board custom-trained machine learning model using PyTorch, a mobile application written in Dart using the Flutter framework, and a back-end server created with the Flask framework that communicates directly with a Firebase database [2]. Throughout development there were a few major challenges that required troubleshooting, all of which had to do with working with the individual technologies, as there are a lot of moving parts within the system. To ensure the system works as expected, two different experiments were performed to find the accuracy and reliability of the machine learning model's classifications as well as the average updating time from the hardware to the database [4]. Both experiment results, included in this paper, were quite positive in displaying the reliability of the Scorch Vision system. Overall, the Scorch Vision system is a promising new way for homeowners to take charge of keeping their home safe from fires in a fashion that is much cheaper than other solutions on the market. Additionally, the system is entirely open source and free to download and use, enabling privacy with complete ownership over the system and no middlemen.
Keywords
Fire Detection, Machine Learning, Artificial Intelligence, Home Safety, Mobile Application
Development of Yolofin: An Advanced Yolo–LSTM Based Architecture for Financial Trading
Markos Markides1 and Arodh Lal Karn2, 1University of Cyprus, Nicosia, Cyprus, 2Xi’an Jiaotong–Liverpool University, Suzhou, China
ABSTRACT
This paper presents YOLOFin, a deep learning framework for financial market regime prediction that integrates image-based feature extraction with temporal modeling. Traditional OHLCV (open, high, low, close, volume) data are converted into multiple visual representations, including candlestick charts, indicator heatmaps, Gramian Angular Field (GAF) images, and inter-asset divergence plots. Features are extracted with YOLOv8 and subsequently modeled over time using a Long Short-Term Memory (LSTM) network, with event-driven labels generated via the triple-barrier method. Experiments on four years of Bitcoin data show that YOLOFin achieves a precision of 35%, outperforming the 33% random baseline in a Buy/Do Nothing classification setting. These results demonstrate the effectiveness of combining computer vision with financial forecasting and highlight the value of visual time-series representations for capturing patterns in noisy and volatile markets.
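For readers unfamiliar with the triple-barrier method mentioned in this abstract, the sketch below shows the general idea: an event is labelled by whichever of three barriers is touched first, an upper profit-taking level, a lower stop-loss level, or a vertical time limit. The function name, the fixed percentage barriers, and the label encoding are assumptions for illustration; the abstract does not give the paper's barrier widths, horizon, or how labels map onto its Buy/Do Nothing setting.

```python
def triple_barrier_label(prices, start, horizon, up=0.02, down=0.02):
    """Label the event that opens at index `start`.

    Returns  1 if the upper barrier is touched first,
            -1 if the lower barrier is touched first,
             0 if neither is touched before the time barrier.
    `up`, `down` and `horizon` are hypothetical parameters.
    """
    entry = prices[start]
    upper = entry * (1.0 + up)      # profit-taking barrier
    lower = entry * (1.0 - down)    # stop-loss barrier
    end = min(start + horizon, len(prices) - 1)  # vertical (time) barrier
    for t in range(start + 1, end + 1):
        if prices[t] >= upper:
            return 1
        if prices[t] <= lower:
            return -1
    return 0

label = triple_barrier_label([100.0, 101.0, 103.0, 99.0, 97.0], start=0, horizon=4)
```

In practice the barrier widths are often scaled by a volatility estimate rather than fixed, but that refinement is omitted here.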
Keywords
Cryptocurrency, Deep Learning, YOLOv8, LSTM, Financial Markets.
Cooperative Marl with Structured Rewards and Explainable Agents for Hyperparameter Tuning
Vedat Dogan, Steven Prestwich, and Barry O’Sullivan, University College Cork, Ireland
ABSTRACT
Hyperparameter optimization (HPO) remains a central challenge in the literature, particularly when dealing with heterogeneous search spaces and competing optimization objectives. In this work, we present an extended cooperative multi-agent reinforcement learning framework, termed MARL-DA, which builds upon dynamic algorithm configuration (DAC) and multi-agent dynamic algorithm configuration (MA-DAC) by introducing a discrete-continuous agent decomposition with scalarized multi-objective reward components. MARL-DA enables specialized agents, continuous (DDPG) and discrete (DQN), to optimize distinct types of hyperparameters in parallel. The reward function is designed as a scalarized combination of multiple objectives: predictive performance, training efficiency, model complexity, and generalization robustness. We conduct extensive empirical evaluations across six datasets spanning both classification and regression tasks. Results show that MARL-DA consistently outperforms traditional HPO techniques and MARL baselines, while offering interpretable agent behaviors and stable convergence. Explainability tools are integrated to provide insight into agent decisions, coordination dynamics, and reward attribution. This work demonstrates that minor yet structured modifications to cooperative MARL can yield substantial gains in optimization performance and explainability for HPO tasks.
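A scalarized multi-objective reward of the kind this abstract describes is commonly written as a weighted sum of per-objective terms. The sketch below assumes that form; the weights, the key names, and the convention that every metric is already scaled to [0, 1] with higher meaning better (so model complexity is expressed as "simplicity") are illustrative assumptions, not values taken from the paper.

```python
def scalarized_reward(metrics, weights):
    """Weighted-sum scalarization of several objectives.

    Both arguments are dicts with identical keys. Metrics are assumed to be
    pre-scaled to [0, 1], higher is better.
    """
    if set(metrics) != set(weights):
        raise ValueError("metrics and weights must share the same keys")
    return sum(weights[name] * metrics[name] for name in metrics)

# Hypothetical weights for the four objectives named in the abstract.
weights = {"performance": 0.5, "efficiency": 0.2, "simplicity": 0.15, "robustness": 0.15}
metrics = {"performance": 0.9, "efficiency": 0.6, "simplicity": 0.8, "robustness": 0.7}
reward = scalarized_reward(metrics, weights)
```

Because both the continuous and the discrete agent would receive the same scalar, a single shared reward of this shape is one simple way to keep cooperative agents aligned.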
Keywords
Hyperparameter tuning, multi-agent reinforcement learning, scalarized multi-objective reward, explainability.
Hyperparameter Sensitivity Analysis of Reinforcement Learning in Autonomous Driving Environments
Marihan Shehata, Mohammed Moness, and Ahmed M. Mostafa, Minia University, Minia, Egypt
ABSTRACT
Hyperparameter tuning plays a critical role in reinforcement learning (RL), particularly in safety-critical domains such as autonomous driving. In this work, we conduct a large-scale empirical analysis of hyperparameter sensitivity for two of the most widely used RL algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), using the CommonRoad-RL framework and the highD dataset. Functional analysis of variance (FANOVA) is employed to quantify main and interaction effects. Results show that performance variation in both algorithms is dominated by hyperparameter interactions, accounting for over 90% in PPO and nearly 88% in SAC, contrasting prior findings in simpler RL benchmarks. PPO is most sensitive to value learning and gradient stability, whereas SAC is driven by replay and training parameters. These findings highlight the need for interaction-aware tuning strategies to ensure robust RL deployment in complex driving tasks.
Keywords
Autonomous driving, Proximal Policy Optimization (PPO), Soft Actor-Critic (SAC), Hyperparameter optimization, Hyperparameter sensitivity
How to Make Explainable AI (Xai) More Explainable
Jean-Marie Le Ray, Independent researcher, Rome, Italy
ABSTRACT
Large language models (LLMs) produce fluent answers, yet most “explanations” remain narrative, post-hoc, and hard to verify. We introduce Explainability-by-Design (EbD), a framework that requires systems to ship executable explanation artifacts alongside natural-language outputs: (1) an explicit plan with pre/post-conditions, (2) a replayable trail of retrieval and decision steps, and (3) fragment-level, versioned evidence signed by cryptographic hashes. EbD revives and operationalizes three historical architectures—SHRDLU-2025 (situated reasoning + state logs), Memex-2025 (replayable research trails), and Xanadu-2025 (fine-grained, versioned transclusion)—in the same spirit as the Pucci-by-AI reimplementation [arXiv:2509.02506], which re-animates a century-old rule-and-trace paradigm for machine translation. Across evidence-backed question answering—and informed by the Pucci-by-AI module for Romance languages—EbD targets three dimensions neglected by prose-only XAI: (i) source fidelity, (ii) replayability, (iii) plan accuracy. This paper is a conceptual and architectural proposal: we do not release a running system, but provide a design blueprint and verifiable-by-construction contracts for implementation and evaluation. Although model-agnostic, EbD is particularly well-suited to SLMs (small language models): their tighter version surfaces and toolchains make seed/plan locking feasible; lower compute footprints allow complete trail capture and deterministic replays; domain-specialized vocabularies improve fragment-level addressability and citation coverage; and on-prem/edge deployment simplifies compliance, provenance control, and evidentiary hashing. The result reframes explanations as first-class, testable products rather than persuasive text, enabling audit, compliance, and pedagogy in high-stakes domains.
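To make the idea of a replayable trail with hashed, fragment-level evidence more concrete, here is a minimal sketch using a SHA-256 hash chain. It is only one possible reading of the abstract, which explicitly provides a design blueprint rather than a running system: the record fields, the function names `append_step` and `verify_trail`, and the chaining scheme are all assumptions, and the sketch uses plain hashes without any digital signature.

```python
import hashlib
import json

def fragment_digest(text):
    """SHA-256 digest of one evidence fragment."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def _entry_hash(body):
    # Canonical JSON so the same record always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

def append_step(trail, action, fragment):
    """Append a retrieval/decision step that is chained to the previous one."""
    body = {
        "step": len(trail),
        "action": action,
        "fragment_hash": fragment_digest(fragment),
        "prev_hash": trail[-1]["entry_hash"] if trail else "",
    }
    trail.append({**body, "entry_hash": _entry_hash(body)})
    return trail

def verify_trail(trail, fragments):
    """Replay the trail and check every fragment and every link."""
    if len(trail) != len(fragments):
        return False
    prev = ""
    for entry, fragment in zip(trail, fragments):
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if (entry["prev_hash"] != prev
                or entry["fragment_hash"] != fragment_digest(fragment)
                or entry["entry_hash"] != _entry_hash(body)):
            return False
        prev = entry["entry_hash"]
    return True

trail = []
append_step(trail, "retrieve", "Article 12, paragraph 3, version 2")
append_step(trail, "cite", "Section 4.1, sentence 2, version 7")
```

Changing any cited fragment, or reordering steps, makes `verify_trail` return `False`, which is the sense in which such an explanation artifact is testable rather than merely narrative.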
Keywords
Explainability‑by‑Design; XAI; reproducibility; provenance; RAG; fragment‑level citation; Memex; SHRDLU; Xanadu; machine translation; Pucci-by-AI; evaluation metrics.
Design and Development of Verlanguage: A Mobile Application for Real-time Body Language and Emotion Recognition using Google ML Kit and Firebase
Brandon Michael Lim1, Jonathan Thamrun2, 1USA, 2California State Polytechnic University, Pomona, CA 91768
ABSTRACT
Blind people have a difficult time understanding body language or non-verbal emotions [1]. Kids may struggle learning body language, especially those with special needs. My app, verLanguage, addresses these problems. Using code, percentages, and cameras, it calculates body language patterns to create an easy-to-use environment. Android Studio is the backbone of the app and houses all of the fine details and code. Google ML Kit is a mobile SDK that uses machine learning and provides the body language detection used to detect someone's emotion [2]. Firebase makes it easy to integrate the app across different platforms and stores user sign-in information. Development problems were difficult and were resolved by trial and error to find which code and design worked best. To test the app, I wrote 20 input and expected-output pairs consisting of human faces, pictures, etc., to check whether the expected results matched what happened when the app actually ran. The app is quick, free, and efficient, utilizing built-in technology to understand emotions without needing extra equipment or hassle; it all works with just one press of a button.
Keywords
Emotion Recognition, Machine Learning, Accessibility, Mobile Application.
An Intelligent Mobile System for Personalized Golf Swing Analysis using Computer Vision and Artificial Intelligence
Vivienne Zhai1, Andrew Parkn2, 1USA, 2California State Polytechnic University, Pomona, CA 91768
ABSTRACT
Golf instruction remains economically inaccessible for most recreational players, with professional coaching costing $50-$200 per hour. This research presents Twelfth Tee, an intelligent mobile application leveraging computer vision and artificial intelligence to democratize access to professional-grade golf swing analysis. The system employs a three-tier architecture combining Flutter-based mobile video capture, Python backend pose estimation processing, and OpenAI GPT-4 integration for personalized feedback generation [1]. The methodology utilizes deep learning-based human pose estimation to extract 33 body landmarks from smartphone-recorded golf swing videos, calculating biomechanical metrics including joint angles, swing phases, and movement patterns. These quantitative measurements inform GPT-4 prompts that generate contextual coaching advice tailored to specific techniques and detected issues. Experimental validation demonstrated pose estimation accuracy within 4.2-6.8° mean absolute error across camera perspectives, with side-view recordings providing optimal results. AI-generated feedback achieved 4.08/5.0 expert quality ratings compared to 4.42/5.0 for human PGA professionals, with particular strength in comprehensiveness but weaker prioritization [2]. By combining objective biomechanical analysis with natural language coaching at a fraction of traditional instruction costs, Twelfth Tee represents a significant advancement in accessible sports training technology.
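One of the biomechanical metrics named in this abstract, a joint angle, can be computed from three pose landmarks with basic vector geometry. The sketch below assumes 2D (x, y) landmark coordinates and a hypothetical helper name `joint_angle`; the abstract does not state which landmarks or coordinate convention the Twelfth Tee backend actually uses.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at vertex b, formed by landmarks a-b-c.

    Each landmark is an (x, y) pair, e.g. shoulder, elbow, wrist for the
    elbow angle. Raises ValueError if a or c coincides with b.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("landmarks must be distinct from the vertex")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp against floating-point drift before taking the arccosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

elbow = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
```

Angles computed this way per frame could then be summarised (for example, the minimum elbow angle during the backswing) before being placed into a text prompt.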
Keywords
Golf Swing Analysis, Pose Estimation, AI Coaching, Sports Technology.
Biochemical Neural Network (BNN) for Adaptive Intelligence: Architecture, Optimization, and Inherent Xai
Marco Armoni1 and Imma Orilio2, 1Visiting Professor in Cybersecurity & Computational Neuroscience Research Associate in Artificial Intelligence and Neuroscience Projects, 2CEO Talence SrL
ABSTRACT
The Biochemical Neural Network (BNN) is an advanced AI model that overcomes the limitations of traditional ANNs (static loss functions, black box opacity). It translates neurochemical dynamics into engineering principles for self-regulated learning, focusing on optimization, explainability, and robustness. The BNN uses Pyramidal Neurons regulated by Inhibitory Interneurons (for E/I balance) and Neuromodulators (like Dopamine and Adrenaline) as dynamic state variables. Learning is driven by the search for an internal Dynamic Biochemical Equilibrium, converging toward "cognitive quietude" for superior Energy Efficiency. The model features Inherent Explainable AI (XAI): every decision is a measurable, traceable consequence of a specific biochemical state (e.g., Dopamine for success, Adrenaline for risk), offering causal transparency. Robustness is achieved via three control axes: Intrinsic Reinforcement Learning (Dopamine), E/I Stabilization (GABA/Glycine), and Stress Management. This framework is a frontier contribution to Sustainable and Explainable Artificial Intelligence.
Keywords
Computational neuroscience, Dynamics of complex systems, XAI, Machine Learning, Reinforcement Learning.
Assessment of Groundwater Recharge Potential using a Stochastic Analytic Hierarchy Process (SAHP)
Youssouf Koussoube1, Issan Ki1, 2, Noura Dahri3, 1Laboratoire Géosciences et Environnement, Unité de formation et de recherche en sciences de la vie et de la terre, Université Joseph KI-ZERBO, Ouagadougou P.O. Box 7021, Burkina Faso, 2Unité de formation et de recherche en sciences Appliquées et technologies, Université Daniel OUEZZIN-COULIBALY, Dédougou, BP 139, Burkina Faso, 3University of Sfax, BP 1171, 3000 Sfax, Tunisia
ABSTRACT
Mapping groundwater recharge potential is essential for sustainable water resource management, particularly in regions facing increasing hydrological stress. In this context, multi-criteria decision-making (MCDM) techniques that combine remote sensing and GIS have become valuable tools for spatial analysis and environmental assessment. Among these techniques, the Analytic Hierarchy Process (AHP), developed by Saaty (1980), is widely used due to its structured approach to integrating multiple factors influencing recharge. However, the subjectivity involved in assigning criterion weights and defining class boundaries remains a major limitation, often leading to uncertainty in the resulting maps. To address this issue, the present study incorporates a Monte Carlo–based probabilistic approach into the AHP framework to quantify and reduce the influence of subjective judgments. This integration allows a more robust and objective evaluation of groundwater recharge potential by accounting for variability in expert opinions. Applied to the Niger Basin, the method demonstrates improved reliability in delineating areas with favourable recharge conditions, validated through piezometric data from observation wells. Overall, the AHP–Monte Carlo hybrid model enhances the credibility of recharge potential mapping by providing a more transparent and data-driven assessment of spatial uncertainty. These areas are located in the north and south of the basin, in the localities of Gorom Gorom, Dori, Bogandé, Gayérie and Diapaga.
Keywords
Remote sensing and GIS; AHP method; Monte Carlo method; Recharge potential; Niger basin.
Assessment of Groundwater Recharge Potential using a Stochastic Analytic Hierarchy Process (SAHP)
Marco Armoni1 and Imma Orilio2, 1Visiting Professor in Cybersecurity & Computational Neuroscience Research Associate in Artificial Intelligence and Neuroscience Projects, 2CEO Talence SrL
ABSTRACT
Liver cirrhosis, the end stage of chronic liver disease, is characterized by progressive fibrosis and architectural remodeling. This study introduces a novel multi-zonal computational model based on Ordinary Differential Equations (ODEs) to simulate the disease's dynamic progression. The model mathematically integrates key biological components, including chronic inflammation, Hepatic Stellate Cell (HSC) activation, and fibrogenesis, specifically across the distinct zones of the hepatic lobule (Zones 1, 2, and 3). A critical mechanism incorporated is a positive feedback loop that reproduces the chronic, self-reinforcing nature of fibrosis. The primary objective was to evaluate the potential efficacy and cost-effectiveness of targeted anti-fibrotic therapies. Simulations demonstrated that localized intervention, particularly within Zone 3 (centrilobular), significantly decelerates the progression of scarring, offering a high therapeutic ratio for preserving overall liver function.
Keywords
Causal Inference and Causal Learning, Dynamics of complex systems, Time Series Forecasting, Machine Learning Model & Application, Optimization for ML
Compact CNN: Multi-Objective Optimization for Architecture Search
Wassim Kharrat1, Khadija Bousselmi2, and Ichrack Amdouni1, 1University of Manouba, Tunisia, 2University of Savoie Mont Blanc, France
ABSTRACT
Convolutional Neural Network (CNN) architectures have achieved remarkable success in various image analysis tasks. However, designing these architectures manually remains both labor-intensive and computationally expensive. Neural Architecture Search (NAS) has emerged as a promising approach for automating and optimizing network design. Among NAS methods, gradient-based techniques stand out for their ability to reduce computational costs while maintaining competitive performance. Nevertheless, the architectures they produce can still be demanding in terms of model size and inference time. To address this challenge, we propose a comprehensive image-analysis pipeline that combines the PC-DARTS algorithm with post-training 16-bit quantization and structured pruning. Experimental results show that our pipeline achieves an accuracy of 99.10% and an Intersection over Union (IoU) of 72.03%, while reducing the model size by up to 54%, making it well-suited for deployment in resource-constrained environments.
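As a rough illustration of what post-training 16-bit quantization does to model size, the sketch below casts a float32 weight tensor to float16 with NumPy and reports the storage ratio and the worst-case rounding error. Treating "16-bit" as IEEE half precision is an assumption (the abstract does not say whether the format is float16 or a 16-bit integer scheme), and the random tensor merely stands in for a trained layer.

```python
import numpy as np

def quantize_fp16(weights):
    """Cast float32 weights to float16 and measure the rounding error.

    Returns (float16 array, maximum absolute error versus the float32 input).
    """
    w32 = np.asarray(weights, dtype=np.float32)
    w16 = w32.astype(np.float16)
    max_err = float(np.max(np.abs(w32 - w16.astype(np.float32))))
    return w16, max_err

# Stand-in for one trained layer; real weights would come from the searched network.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w16, err = quantize_fp16(w)
ratio = w16.nbytes / w.nbytes  # storage of the quantized tensor relative to float32
```

Casting alone halves the storage of each tensor it is applied to; the abstract's figure of up to 54% presumably reflects the additional contribution of structured pruning.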
Keywords
Neural Architecture Search, Convolutional Neural Networks, Compression, Quantization, Pruning, Segmentation.
Ecobin: A Solar-powered Self-cleaning and Deodorizing Trash Bin with Rainwater Collection using AI and Iot System
Isaac Liu1, Jonathan Sahagun2, 1USA, 2California State Polytechnic University, Pomona, CA 91768
Design and Implementation of Heritage Link: An AI Integrated Mobile Application for Heritage Education, Community Engagement, and Cultural Preservation
Mai Tung1 and Jason Moya2, 1Switzerland, 2California State Polytechnic University, USA
ABSTRACT
Heritage Link is an app that supports users' desire to have a platform for discussing, uploading, and learning more about ancient heritages [1]. Designed and built entirely using Flutter for UI development and Firebase for database management, Heritage Link offers a socially impactful solution to those interested in heritages [2][3]. As ancient heritages are not mainstream and are considered niche communities, this app will allow those who were hesitant to learn more about their local heritages, and other heritages throughout the entire world, by utilizing the AI scan feature on Heritage Link. The OpenAI API lets Heritage Link use artificial intelligence so that all users can upload images of any ancient heritage near them or found online and receive meaningful, informative responses [4]. As a method to give back to the ancient heritage community, there is a dedicated page for those interested in donating to heritage preservation organizations.
Keywords
Cultural Heritage, Mobile Application, Artificial Intelligence, Community Engagement.
Software Engineering of AI-IoT Braking Systems: Safety and Performance in Fleets
Suryakant Kaushik, Texas A&M University, USA
ABSTRACT
This paper presents an IoT-enabled AI framework for intelligent braking systems in commercial fleets, focusing on software engineering challenges of integration, reliability, and predictive decision-making. Traditional braking systems are reactive and inefficient, leading to high downtime, energy loss, and safety risks. The proposed architecture leverages distributed IoT sensor networks, edge-cloud data fusion, and machine learning algorithms to enable predictive braking, adaptive driver support, and condition-based maintenance. I conduct a comparative analysis between traditional and AI-enhanced braking systems, showing reductions of up to 75% in collision incidents and 30% in unplanned downtime. Furthermore, I synthesize results from case studies across logistics, passenger transport, and manufacturing to validate real-world applicability. The contribution of this work lies in formalizing a software-centric IoT framework, supported by performance analysis, to advance safe and sustainable fleet operations.
Keywords
AI, IoT, Commercial Fleet Management, Predictive Braking, Software Engineering.
A Mobile App for Tracking Psychological Mood Changes and Providing E-Therapy using Natural Language Processing and GPT-3
Zachary Zhang1 and Austin Amakye Ansah2, 1USA, 2California State Polytechnic University, Pomona, CA 91768
ABSTRACT
Creative professionals face fragmentation across multiple platforms for portfolio sharing, social networking, and freelance project management, leading to inefficient workflows and missed opportunities. DesignHub addresses this by providing a comprehensive mobile social networking platform unifying portfolio sharing, community engagement, and task management through Flutter cross-platform development and Firebase Backend-as-a-Service architecture [1]. The system implements three core components: Firebase Authentication for secure user management, real-time design portfolio sharing with social engagement features through Firestore synchronization, and comprehensive task management supporting both paid and volunteer opportunities. Key challenges included maintaining real-time data consistency, optimizing cross-platform performance, and ensuring authentication security, addressed through Firestore real-time listeners, cached network images, and granular security rules [2]. Experimental validation demonstrated sub-300ms synchronization latency on modern networks (142ms WiFi, 298ms 4G), confirming effective real-time data propagation for social interactions. The platform validates Flutter-Firebase methodologies for social collaboration applications, achieving 60-70% development time reduction while maintaining native-quality user experiences, offering creative professionals an integrated ecosystem for portfolio showcase and freelance collaboration.
Keywords
Cross-Platform Development, Firebase Integration, Social Networking, Portfolio Management
AI Driven Solutions in Cybersecurity and the Rise of Biometric Authentication
Mustapha Zeroual, Abderrazek Karim, Youssef Baddi, Faysal Bensalah, Chouaib Doukkali University, Morocco
ABSTRACT
The increasingly dangerous threat landscape we see today reflects just how important cybersecurity is, especially given recent notorious attacks on smartphones that steal user data and compromise privacy in our now-digital age. In this article, we examine the complex cybersecurity landscape and the role artificial intelligence (AI) plays in providing creative solutions to fight these crimes. The article begins with an introduction to cryptography and its vital role in securing communication and information over the network. Next come some basic ideas of identity and access management, such as authentication and authorization, along with details on the ways these security mechanisms can be deployed, emphasising that biometric security systems are increasingly used for authentication. This study illustrates the benefits and limitations of biometric authentication in order to expose practical situations where biometrics can operate. It also discusses database security by identifying approaches to protect confidential information and the impacts of data breaches on organizations. It argues for a holistic cybersecurity approach that combines new techniques for distributed systems and grid security. In addition, the article presents methods of information hiding and watermarking as tools for securing intellectual property in the digital spectrum. It illustrates the importance of intrusion detection systems (IDS) in discovering and addressing new threats. With an emphasis on mobile computing, the article covers additional network security issues surrounding secure digital communications. Trusted computing and its role in protecting the enterprise-controlled environment is then covered. Then, the transformative capacity of blockchain technology with respect to securing and protecting data is scrutinized, revealing its uses in cybersecurity.
This article is intended to give a high-level outline of modern trends and security practices, and of how AI is helping change security practices across the world against evolving cyber threats, which are becoming more complex with each passing day.
Keywords
Cybersecurity, Cryptography, Biometric Authentication, Artificial Intelligence (AI), Blockchain.
Big Data in Cybersecurity: Leveraging AI and SDN for Enhanced Threat Intelligence and Network Optimization
Abderrazek Karim, Mustapha Zeroual, Youssef Baddi, Faysal Bensalah, Chouaib Doukkali University, Morocco
ABSTRACT
As the landscape of modern cyber threats changes, there is a growing need to ease the integration of Big Data analytics into cybersecurity frameworks. This article is a preliminary study of the interaction of Big Data with artificial intelligence (AI) and Software Defined Networks (SDN) to improve network security and obtain better results from threat intelligence. We also consider some of the tools and components of the Big Data ecosystem, the key theory and algorithms, and the need for data visualization and mining approaches to detect threats. We also address how sensor networks and social networks contribute data sources for cybersecurity applications. We demonstrate how the performance characterization, evaluation, and optimization practices underpinning our results underscore the importance of real-time data stream management in addressing cyber risks, and suggest that evaluating systems without taking their timing profile into account may lead to suboptimal and biased conclusions. This means that there is a need for both heterogeneous data management and well-rounded analytics to provide information on what security measures should be taken. We also highlight Big Data analytics use cases in cybersecurity, focusing on applications with significant outcomes for business processes and threat detection using case studies. In conclusion, this article strongly recommends that cybersecurity professionals work hand in glove with technologists and data scientists in order to leverage the power of Big Data and AI to fortify their digital ecosystems.
Keywords
Big Data, Cybersecurity, Artificial Intelligence (AI), Software Defined Networks (SDN), Threat Intelligence.
Mycroft - Retrieval Augmented Generation for SDK Documentation
Diego Costa, Gabriel Matos, Gilson Russo, Leon Barroso, and Erick Bezerra, SIDIA Manaus - AM, Brazil
ABSTRACT
Information retrieval plays an important role in everyday tasks, especially when it comes to documentation. Retrieving information about private documentation used to build other software is very challenging due to its absence on the internet, meaning there is no information about it beyond its own documentation. Due to concerns about confidential data, using external proprietary systems is prohibited. Motivated by this, in this study, we present Mycroft, a retrieval system that leverages the Retrieval Augmented Generation technique to find a feasible approach that improves search and information retrieval requested by users about the documentation. To implement this system, a dataset of questions and answers about the documentation was generated for evaluation. The system was developed on-premise using open-source Large Language Models and evaluated using Natural Language Processing metrics and human evaluation to validate the generated answers. After evaluating the results, we concluded that the proposed retrieval system had reasonable performance in answering user queries and received good human evaluation, being considered useful.
Keywords
Retrieval Augmented Generation, Large Language Models, Software Documentation.
Design and Development of Neofocus: A Gamified Productivity Application Leveraging Gacha-inspired Reward Systems to Enhance Focus and Motivation
Lawrence Wen Lam1, Success Godday2, 1USA, 2California State Polytechnic University, Pomona, CA 91768
ABSTRACT
NeoFocus is an application that aims to combat distractions on the internet. It was developed as an experimental implementation based on the developer’s personal experiences. Although games, particularly gacha games, are often criticized for their addictive nature, NeoFocus uses that same mechanic with virtuous intentions. With the promise of undetermined rewards, this application intends to keep users hooked in a productive way. Users of this application will inevitably associate hard work with a dopamine reward, thereby fueling greater focus. Although the main appeal of the application is its gamification, it also encourages friendly competition and community interaction through its blog and challenge features [1]. These social features are intended to promote community interaction, which will increase the user’s consistency.
Keywords
Gamification, Productivity, Reward Systems, Digital Wellbeing.
An Explainable Generative Deep Learning Framework For Abnormal Behavior Prediction Using Pose Sequences
Zaineb Liouane1, Oumaima Liouane2, Hela Haj Mohamed3, 1MARS Research Unit, University of Sousse, Tunisia, 2Laboratory of Electronics and Microelectronics, EPF Engineering School, Paris-Cachan, France, 3MARS Research Unit, Department of Computing Science, Faculty of Sciences of Monastir (FSM), University of Monastir, Tunisia
ABSTRACT
Monitoring the health and safety of elderly individuals living independently continues to be a major concern in aging societies. In recent years, the global proportion of individuals aged 65 and over has increased significantly, exceeding 10% of the world population as of 2024. Timely identification of abnormal behaviors is crucial to ensuring rapid intervention and reducing potential risks. This paper proposes an explainable generative deep learning framework for abnormal behavior prediction using pose sequences in smart home environments. The framework combines three components: a generative model based on Generative Adversarial Networks (GANs) to learn the distribution of normal human pose dynamics, a predictive model that integrates Convolutional Neural Networks with bidirectional LSTMs (CNN-BiLSTM) and an attention mechanism to capture spatio-temporal features of movements, and an anomaly detection module using an autoencoder to reconstruct pose sequences and identify deviations. The suggested model provides interpretable explanations for detected anomalies, emphasizing the specific time frames or body joints that most strongly reflect abnormal behavior. Our approach is designed for unobtrusive monitoring of seniors using pose data (skeleton), preserving privacy while ensuring safety. Experimental results demonstrate that our framework can accurately detect unusual behaviors and provide human-interpretable insights, making it a promising solution for intelligent eldercare monitoring in smart environments.
Keywords
Human Activity Recognition, Smart Environments, Deep Learning, Abnormal Behavior Prediction.
Thermal Imaging-Based Defects Prediction in High-pressure Die Casting Using Hybrid Neural Networks And Fuzzy Cognitive Maps
T. Michno, R. Holom, S. Schmalzer, P. Meyer-Heye, G. Scampone, E. Riegler, M. Hartmann, U. Repansek, N. Košir, P. Sifrer, and K. Poczeta, Austrian Institute of Technology GmbH, Austria
ABSTRACT
Producing defect-free, lightweight, high-performance metal components with complex geometry is a highly challenging task. In this paper, we focus on High Pressure Die Casting (HPDC), proposing a hybrid AI model for non-destructive, in-line, and non-process-interrupting defect prediction using thermal images. A deep neural network model is used to extract features, which are then classified by a Fuzzy Cognitive Map (FCM). Experimental results show that the method improves prediction performance. The main contributions of this research include: (i) a novel hybrid model architecture for processing thermal images, (ii) a feature extractor for an FCM-based classifier, (iii) an extension of FCM via three clustering techniques to enhance classification accuracy, (iv) a modular design, allowing easy addition of other data sources and classes without retraining, (v) a thorough evaluation through model comparisons and an ablation study, and (vi) to the best of our knowledge, the first usage of FCM for this problem.
Emotional Algorithmics
Fernando Paravano1 and Gabriel Paravano2, 1Department of Computing, IPVyDU, Rawson, Chubut, 2PhD Philosophy, UNSJ, Capital, San Juan
ABSTRACT
The proposal is to represent living conditions algebraically, especially for people suffering from some type of emotional disorder, through a formula that expresses the level of resilience they could aspire to by applying logical actions, so that they can build their own emotional immune system.
Keywords
Emotion, algorithmic, immune, system.
Overview And Prospects Of Using Integer Surrogate Keys For Data Warehouse Performance Optimization
Sviatoslav Stumpf1 and Vladislav Povyshev2, 1ITMO University, Saint Petersburg, Russia, 2ITMO University, Saint Petersburg, Russia
ABSTRACT
The paper examines methods for optimizing data warehouse performance using integer-based datetime labels. It is shown that replacing standard DATE and TIMESTAMP types with 32- and 64-bit integer formats reduces storage requirements by 30–60% and speeds up query execution by 25–40%. The paper presents indexing, aggregation, compression, and batching algorithms demonstrating up to an eightfold increase in throughput. Practical examples from finance, telecommunications, IoT, and scientific research confirm the efficiency and versatility of the proposed approach.
Keywords
Integer labels, time series, optimization, performance, data warehouse, indexing, aggregation
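As an illustration of the integer-based datetime labels described in the abstract above, the following is a minimal sketch and not the authors' implementation. The abstract does not state which encodings are used; the YYYYMMDD layout for 32-bit date keys and the microseconds-since-epoch layout for 64-bit timestamp keys are assumptions, and all function names are hypothetical.

```python
from datetime import date, datetime, timezone

# Assumed reference point for the 64-bit timestamp encoding.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_int32(d: date) -> int:
    """Encode a calendar date as a YYYYMMDD integer (fits in 32 bits)."""
    return d.year * 10000 + d.month * 100 + d.day


def int32_to_date(key: int) -> date:
    """Decode a YYYYMMDD integer back into a calendar date."""
    year, rest = divmod(key, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


def timestamp_to_int64(ts: datetime) -> int:
    """Encode a timezone-aware timestamp as microseconds since the Unix epoch."""
    delta = ts - EPOCH
    # Integer arithmetic only, so no floating-point rounding is introduced.
    return (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds


# Integer keys keep chronological order, so a date-range filter becomes
# a plain integer comparison.
keys = [date_to_int32(date(2025, 11, d)) for d in (14, 15, 16, 17)]
in_range = [k for k in keys if 20251115 <= k <= 20251116]
```

Because both encodings increase with time, sorting and range predicates on the integer column give the same results as on the original date column, which is the property index-based range scans rely on.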
Beyond Traditional Retrieval Systems: Leveraging AI With Documents, Knowledge Graphs And Databases
Antony Seabra, Claudio Cavalcante, and Sergio Lifschitz , PUC-Rio - Pontifical Catholic University of Rio de Janeiro, Brazil
ABSTRACT
This study explores techniques for retrieving data from documents, knowledge graphs, and databases using Large Language Models (LLMs), specifically leveraging OpenAI’s GPT models as foundational frameworks for embeddings and conversational models in question-answering (QA) systems. Our research focuses on the utilization of Prompt Engineering, Retrieval-Augmented Generation (RAG), and Text-to-SQL techniques to effectively extract information from these diverse data sources without the need for model retraining. A key aspect of our study is the emphasis on explainability, demonstrating how these techniques can reveal the rationale behind retrieved information and enhance the understanding of results. We highlight the challenges encountered in specific use cases during our tests and present effective strategies and solutions to overcome them. Our findings demonstrate the potential of LLMs to surpass traditional search and retrieval systems, paving the way for more efficient and comprehensible information systems.
Keywords
Information Retrieval, AI, Explainability, Documents, Knowledge Graphs, Databases, Recommendation System