SAAD WAZIR
PhD Candidate | CVPR 2025 | AI Practitioner | Computer Vision | Medical Image Analysis | Multimodal Learning - Auto-ID Labs, KAIST in South Korea
I'm a PhD researcher at AI2 Lab, KAIST, South Korea, advised by Prof. Daeyoung Kim, and previously supervised by Prof. Muhammad Moazam Fraz. My research focuses on developing efficient and robust AI systems for computer vision, with a primary emphasis on medical image analysis, alongside broader interests in machine learning and multimodal learning.
I aim to design practical and scalable methods that improve performance, efficiency, and generalization across diverse real-world applications.
I actively collaborate across academia and industry, contributing to the evaluation of multimodal models in complex real-world scenarios. I also have experience in audio-visual multimodal learning, image super-resolution, and human activity and emotion recognition, along with university-level teaching and industry-driven AI deployment.
I focus on developing efficient and robust deep learning models for computer vision, with a primary emphasis on medical image segmentation and analysis. My research explores novel architectures, particularly advanced decoder designs, attention mechanisms, and multi-scale feature integration, to improve feature representation and boundary precision across diverse medical imaging modalities. I also work on related areas, including image super-resolution, multimodal learning, MMLLM evaluation and benchmarking, reliable audio-visual QA, human activity and emotion recognition, and real-world AI deployment, aiming to design scalable, high-performance systems that bridge theoretical advances and practical applications.
Explore my publications for more details on my research contributions. Publication List | Google Scholar
AI Researcher | AI2 Lab, KAIST, Daejeon, South Korea | (02/2023 - )
Contributed to a Waste Plastic Sorting – Industry Project (robone.co.kr/projects)
Developed and deployed optimized AI models for real-time automated waste sorting. Fine-tuned AI models for classification, detection, segmentation, and distance estimation to enhance sorting accuracy. Designed and implemented APIs to facilitate communication between robotic hardware and AI modules. Built a framework to integrate Intel RealSense and ZED X depth-sensing cameras for improved perception. Deployed AI models on edge devices using TensorRT.
Contributed to Medical Image Analysis Research (PhD Thesis)
Developed novel architectures for medical image segmentation, focusing on advanced decoder design, attention mechanisms, and multiscale feature fusion to improve feature representation and boundary precision across diverse imaging modalities; this work is published as ReN-UNet (MICAD 2024), HistoSeg++ (ICBBE 2025), and MCADS-Decoder (CVPR 2025).
Curated a dataset spanning histopathology, microscopy, retinal vessel analysis, colonoscopy, breast ultrasound, thyroid ultrasound, skin lesion, and 3D multi-organ and cardiac segmentation tasks; the dataset, named MedCAGD, is publicly available. Also developed an efficient segmentation framework that achieves state-of-the-art results on the MedCAGD dataset (under review).
Collaborative Research
Contributed to the DashBench project, which evaluates MMLLMs on challenging and corner-case traffic scenarios, along with research in audio-visual multimodal learning, image super-resolution, human activity and emotion recognition, and medical VLM evaluation.
__________
Research Assistant | Embedded Systems & Pervasive Computing (EPIC) Lab, Islamabad, Pakistan | (03/2018 - 06/2019)
Applied machine learning to sensor data for human activity recognition using inertial sensors. Developed multi-device cloud systems, built and configured hardware, and implemented custom software. Conducted research and analysis, and supported project management including budgeting and meetings.
__________
Full Stack Web Developer | Kalsym Systems (Pvt) Ltd, Islamabad, Pakistan | (08/2016 - 09/2018)
Developed web applications, designed feasible system models, and created custom software solutions. Analyzed existing IT systems and business models, and implemented, configured, and tested solutions.
Graduate Teaching Assistant (School of Computing) | KAIST, Daejeon, South Korea. | (09/2023 - )
Delivered lectures and supported instruction for a graduate-level course (CS632).
__________
Lecturer (Faculty of Computing) | Riphah International University, Islamabad, Pakistan. | (09/2022 - 01/2023)
Taught core computing courses (Object-Oriented Programming, Web Programming, Data Structures and Algorithms (DSA), and Artificial Intelligence). Ensured Outcome-Based Education. Prepared and analyzed the Course Learning Outcome report. Prepared assignments, quizzes, presentations, and exam reports. Developed question banks and conducted exams.
__________
Lecturer | HAPS&C, Islamabad, Pakistan. | (08/2017 - 07/2022)
Taught Computer Science and conducted internal examinations. Served as Discipline and Program Officer. Planned lessons aligned with curriculum objectives and FBISE standards. Developed question banks and designed university admission exam questions. Evaluated FBISE Computer Science papers and conducted FBISE examinations.
Computer Vision: Detection, Segmentation, Distance Estimation, Super Resolution, Self-Supervised Learning, Transfer Learning, Domain Adaptation, Medical Image Analysis, 3D MRI/CT Processing, Medical Image Reconstruction.
Multimodal Learning & Evaluation: Vision and Sensor Fusion, Reliable Audio-Visual QA, Parameter-Efficient Audio-Visual Alignment, MMLLM Evaluation, Medical VLM Evaluation for Clinical Image Understanding.
LLM Engineering: Model Inference Optimization, Fine-Tuning, Model Parallelization, LangChain, Agent Toolchains, Local Model Deployment (LLMster, Ollama, Tailscale), Multi-Agent Orchestration, MCP, Sandboxed Execution, Web Crawling, Agent Tracing.
Information Retrieval: Retrieval-Augmented Generation (RAG), Dense & Sparse Search (FAISS, Weaviate, Pinecone, Elasticsearch, BM25), Hybrid Search, Reranking (Cohere, Cross-Encoders), Document QA Pipelines.
Edge AI: Human Activity and Emotion Recognition, Smart Systems and Sensor Integration, Multi-Teacher Distillation, Quantization, TensorRT deployment on NVIDIA Jetson, Raspberry Pi, SiMa.ai, and LattePanda.
Software Engineering & MLOps: Python, Java, C++, PHP, JavaScript, TensorFlow, PyTorch, FastAPI, Node.js, Django, Flask, REST APIs, WebSockets, jQuery, Ajax, MongoDB, Hugging Face Transformers, Embedding Pipelines, AWS (Lambda, EC2, S3, OpenSearch), Azure, OpenStack, Linux Administration, Networking, Docker, Kubernetes, Terraform, CI/CD Pipelines, Edge Clusters.
Academic & Research: Scientific Writing, Grant & Proposal Writing, Teaching & Mentorship, Collaborative Research, Project Supervision.
Ph.D. (Candidate) in Computer Science | 02/2023 -
Korea Advanced Institute of Science & Technology (KAIST), South Korea
Thesis: Medical Image Analysis | Advisor: Prof. Dr. Daeyoung Kim | Expected Graduation: Feb 2027
Master of Science in Computer Science | 09/2018 - 05/2022
National University of Sciences and Technology (NUST), Islamabad, Pakistan
Thesis: Medical Image Segmentation (HistoSeg - ICPRS 2022) | Advisor: Prof. Dr. M Moazam Fraz
Bachelor of Science in Computer Science | 07/2012 - 06/2016
Preston University, Kohat, Islamabad, Pakistan
Major: Software Engineering
Conference Reviewer
CVPR, AAAI, ICML, MICCAI, ICLR, WACV, ECCV, HONET, ICOMIT, ISIBER
Journal Reviewer
IEEE Signal Processing Letters, IEEE Transactions on Computational Biology and Bioinformatics, IEEE Transactions on Circuits and Systems for Video Technology, IEEE Access, Array Journal
Presentations
Conference Talk (Nov 2025) | HistoSeg++: Delving deeper with attention and multiscale feature fusion for biomarker segmentation. 12th International Conference on Biomedical and Bioinformatics Engineering (ICBBE 2025), Tokyo, Japan. 🏆Best Presentation Award (Oral Session).
Invited Talk (Nov 2025) | Advances in Medical Image Segmentation, U-Net to Foundation Models. The Superior University, Pakistan.
KAIST Talk (Oct 2025) | KAIST Graduate School Seminar. Presented emerging paradigms in medical image analysis, highlighting future research opportunities and effective ways to approach research in the field. NUST, Islamabad, Pakistan.
Poster (Jun 2025) | Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention. IEEE/CVF 44th Computer Vision and Pattern Recognition Conference (CVPR 2025), Nashville, USA.
Invited Talk (Feb 2025) | Enhancing Biomarker Segmentation. Cardiff-Asan Medical Center Workshop (Collaboration of Cardiff University, Asan Medical Center (AMC), and Korea Advanced Institute of Science and Technology), Seoul, South Korea.
Conference Talk (Nov 2024) | Rethinking the Nested U-Net Approach: Enhancing Biomarker Segmentation with Attention Mechanisms and Multiscale Feature Fusion. 5th International Conference on Medical Imaging and Computer-Aided Diagnosis (MICAD 2024) - MICCAI Society Endorsed Event, Manchester, UK.
Invited Talk (Mar 2024) | KAIST Graduate School Seminar (IEEE Signal Processing Society Student Chapter and IEEE Young Professionals Karachi). Sir Syed University of Engineering and Technology, Pakistan.
Best Presentation Award (Oral Session) - 2025
12th International Conference on Biomedical and Bioinformatics Engineering (ICBBE 2025), Tokyo, Japan
For the paper: HistoSeg++: Delving Deeper with Attention and Multiscale Feature Fusion for Biomarker Segmentation
__________
Fully Funded PhD Scholarship - 2023
Korea Advanced Institute of Science & Technology (KAIST)
__________
NUST Financial Award for Publications - 2022
National University of Sciences & Technology
For the paper: HistoSeg: Quick Attention with Multi-Loss Function for Multi-Structure Segmentation in Digital Histology Images
Advisor/Mentor - Muhammad Javed (BS, Riphah International University, Islamabad, Pakistan)
Collaboration - Hamza Ali Imran (Marie Skłodowska-Curie doctoral researcher at Saarland University, Germany)
Collaboration - Rao Faizan (Doctoral researcher, Kyung Hee University, Global Campus, South Korea)
Collaboration - Dinh Phu Tran (Doctoral researcher, Korea Advanced Institute of Science & Technology (KAIST), Daejeon, South Korea)
Collaboration - Patrick Vibild (Doctoral researcher, Department of Energy, Aalborg University, Denmark)
Collaboration - Seongah Kim (MS Student, Korea Advanced Institute of Science & Technology (KAIST), Daejeon, South Korea)
Collaboration - Seungmin Yang (Doctoral researcher, Korea Advanced Institute of Science & Technology (KAIST), Daejeon, South Korea)
Collaboration - Dr. Ataul Aziz Ikram (Professor, National University of Computer & Emerging Sciences, Islamabad, Pakistan)