Research
Project #1:
Interpretable CNNs for Pneumonia Detection using Captum Library
Explainable AI (XAI) is a subfield of artificial intelligence (AI) focused on creating models and systems that can provide human-understandable explanations for their decisions and actions. XAI models are gaining importance due to the growing emphasis on human-centered design principles, which take into account the needs, expectations, and cognitive abilities of users. Unlike many deep learning networks, which are often “black boxes”, XAI models are designed to be transparent: they provide visibility into how they arrive at their predictions or decisions, allowing people to understand how and why the AI system makes them.
XAI models can provide valuable assistance to radiologists and other healthcare professionals in interpreting X-ray images in several ways. First, XAI models can provide explanations for their predictions, allowing radiologists to understand the reasoning behind the model's decision. For example, an XAI model could highlight areas of the X-ray image that most influenced the prediction, or provide textual or visual explanations describing the features or patterns that contributed to it. This helps radiologists verify model results, gain insight into the model's decision-making process, and build confidence in its reliability. Second, XAI models can help identify potential errors or biases in the X-ray image interpretation process. For example, if the model's explanation shows that a prediction was based on a small area of the image, the radiologist can double-check that region for artifacts or misinterpretations. This serves as a valuable error-checking mechanism and improves the overall accuracy of X-ray image interpretation. Finally, XAI models can enhance trust and transparency in AI-assisted interpretation of X-ray images: their interpretable explanations help radiologists explain and communicate the basis for their diagnoses, increasing confidence in the decision-making process. In this research, we apply explainable AI to X-ray images of the lungs in order to diagnose lung infections such as pneumonia and abnormal buildup of fluid in the lungs.
Overview
Project 1 focuses on implementing explainable AI (XAI) techniques using the Captum library to create interpretable convolutional neural networks (CNNs) for pneumonia detection from lung images. By integrating XAI methods into the CNN model architecture, the project aims to enhance transparency, trust, and understanding of the model's decisions, ultimately improving diagnostic accuracy and clinical utility.
Objectives
Implement Interpretable CNNs: Develop CNN models for pneumonia detection that are transparent and interpretable, enabling clinicians to understand how the model arrives at its predictions.
Integrate Captum Library: Utilize the Captum library, a PyTorch-based XAI toolkit, to implement various interpretability methods such as feature attribution and gradient-based techniques.
Enhance Diagnostic Accuracy: Improve the accuracy of pneumonia detection from lung images by leveraging interpretable CNN models and XAI techniques.
Facilitate Clinical Decision Making: Provide clinicians with insights into the model's decision-making process, enabling them to validate predictions and make informed decisions.
Methodology
Data Acquisition and Preprocessing:
Dataset Selection: Gather a dataset of labeled lung images, including images of pneumonia-affected lungs and healthy lungs.
Data Preprocessing: Normalize the images, perform augmentation to increase dataset variability, and split the data into training, validation, and test sets.
CNN Model Architecture:
Architecture Design: Design CNN architectures suitable for pneumonia detection tasks, focusing on transparency and interpretability.
Training: Train the CNN models using the preprocessed dataset, optimizing hyperparameters and monitoring performance metrics.
Integration of Captum Library:
Captum Installation: Install and configure the Captum library within the PyTorch environment.
Interpretability Methods: Implement Captum's interpretability methods, such as Integrated Gradients, Layer-wise Relevance Propagation (LRP), and DeepLIFT, to analyze the CNN model's behavior.
Model Interpretation and Evaluation:
Feature Attribution: Use Captum to attribute importance scores to input features, highlighting regions of lung images crucial for pneumonia detection.
Gradient-based Techniques: Apply gradient-based techniques to visualize gradients and saliency maps, aiding in understanding the model's decision boundaries.
Evaluation Metrics: Assess the interpretability and diagnostic performance of the CNN models using standard metrics such as accuracy, precision, recall, and F1 score.
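The standard metrics named above can be computed with scikit-learn; the label and prediction vectors here are hypothetical placeholders for a held-out test set (1 = pneumonia):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical test-set labels and model predictions (1 = pneumonia).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
# With these toy vectors, all four metrics come out to 0.75.
```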
Validation and Clinical Relevance:
Clinician Feedback: Validate the interpretability of the CNN models and XAI methods through feedback from clinicians and domain experts.
Clinical Decision Support: Integrate the interpretable CNN models into clinical workflows to provide decision support for pneumonia diagnosis.
Expected Outcomes
Interpretable CNN Models: CNN models for pneumonia detection that provide insights into the reasoning behind their predictions.
Transparent Decision Making: Enhanced transparency and understanding of the model's decision-making process through XAI techniques.
Improved Diagnostic Accuracy: Increased accuracy and reliability of pneumonia detection from lung images, leading to more timely and accurate diagnoses.
Clinically Validated Interpretations: Interpretations of CNN predictions validated by clinicians, facilitating trust and adoption in clinical practice.
Conclusion
Project 1 aims to bridge the gap between AI-driven disease detection and clinical practice by developing interpretable CNN models for pneumonia detection. By integrating the Captum library and XAI techniques into the model architecture, the project seeks to provide clinicians with transparent and trustworthy insights into the model's predictions, ultimately improving patient outcomes and advancing the field of medical imaging analysis.
Project #2:
Augmented Reality Molecular Viewer with Speech Recognition
Augmented reality (AR) has many potential applications in chemistry, from teaching and learning to research and development. One of the applications of AR is virtual lab simulations. AR can provide students with a realistic laboratory experience without the need for expensive equipment and hazardous chemicals. Students can manipulate virtual instruments and chemicals, perform experiments, and make observations in a safe and controlled environment. AR can also be used for collaboration and communication by providing a shared virtual workspace. Researchers can visualize and manipulate molecular structures together, share data and ideas, and communicate in real-time. AR can be used to visualize complex molecular structures and interactions in 3D, allowing researchers to understand the behavior of molecules better and design new drugs and materials.
Here we develop an AR mobile application for molecular viewing. AR mobile apps can offer a unique and accessible experience in molecular research that existing AR tools have not achieved. Such an AR mobile application does not require a high-end computer or an AR headset to run, making it accessible to more people. A speech recognition function can also be implemented for this application. Speech recognition makes an application more accessible to people with disabilities, including those with mobility problems, visual impairments, or learning disabilities. It also helps users complete tasks more quickly and efficiently, as they can dictate text instead of typing it. By allowing users to interact with an application using speech, speech recognition makes the user experience more intuitive and natural, with no need to tediously learn the app's navigation features.
Overview
The VR Molecular Viewer project aims to develop an immersive virtual reality (VR) application for visualizing molecular structures with augmented reality (AR) features. The application will incorporate speech recognition for navigation and a machine learning (ML) model for recognizing handwritten molecular formulas and generating files for 3D molecules.
Objectives
Immersive Molecular Visualization: Create a VR environment where users can visualize and interact with 3D molecular structures in a realistic and immersive manner.
Augmented Reality Features: Implement AR features to overlay virtual molecular elements onto the real-world environment, enhancing the user experience and facilitating interactive learning.
Speech Recognition: Integrate speech recognition capabilities to allow users to navigate through the VR environment and interact with molecular structures using voice commands.
Handwritten Recognition: Develop an ML model capable of recognizing handwritten molecular formulas input by users and converting them into files for generating 3D molecular structures.
User-Friendly Interface: Design a user-friendly interface that enables intuitive navigation, interaction, and manipulation of molecular structures within the VR environment.
Methodology
Development Environment Setup:
Software Selection: Choose appropriate development tools and platforms for VR development, such as Unity or Unreal Engine, and ML frameworks for handwritten recognition.
Hardware Configuration: Configure VR headsets and controllers for immersive user interaction.
Molecular Data Acquisition:
Dataset Collection: Gather a diverse dataset of molecular structures, including 3D models and corresponding handwritten formulas.
Data Preprocessing: Clean and preprocess the data for training the ML model and integrating molecular structures into the VR environment.
VR Environment Development:
Scene Design: Design interactive VR environments where users can explore and manipulate molecular structures.
AR Integration: Implement AR features to overlay virtual molecular elements onto the real-world environment using markers or object recognition.
Speech Recognition Integration:
Speech-to-Text Conversion: Implement speech recognition functionality to convert user voice commands into text input.
Navigation and Interaction: Develop voice command functionalities for navigating through the VR environment, selecting molecules, and accessing additional information.
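One way to implement the voice-command step is to map the transcript produced by the speech-to-text engine onto viewer actions. The command phrases, action names, and the `parse_command` helper below are all illustrative assumptions; the real app would wire this to its speech engine and scene API:

```python
# Hypothetical mapping from recognized phrases to viewer actions.
COMMANDS = {
    "rotate": "ROTATE_MOLECULE",
    "zoom in": "ZOOM_IN",
    "zoom out": "ZOOM_OUT",
    "select": "SELECT_MOLECULE",
    "show info": "SHOW_INFO",
}

def parse_command(transcript):
    """Return the action for the first command phrase found, else None."""
    text = transcript.lower().strip()
    # Match longest phrases first so "zoom in" wins over a bare "zoom".
    for phrase in sorted(COMMANDS, key=len, reverse=True):
        if phrase in text:
            return COMMANDS[phrase]
    return None

print(parse_command("please zoom in on the benzene ring"))  # ZOOM_IN
```

Substring matching keeps the commands forgiving of filler words ("please", "on the..."), which matters because speech transcripts rarely arrive as clean keywords.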
Handwritten Recognition Model Training:
Data Preparation: Prepare the handwritten molecular formulas dataset for training the ML model.
Model Selection: Choose an appropriate ML model architecture for handwritten recognition, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs).
Training and Evaluation: Train the ML model on the prepared dataset and evaluate its performance in recognizing handwritten molecular formulas.
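The CNN option above can be sketched as a minimal training loop. The 28x28 input size, the 10-class output, and the random tensors standing in for a dataset of handwritten formula symbols are all assumptions for illustration:

```python
import torch
import torch.nn as nn

# Small CNN for classifying individual handwritten symbols (assumed 10 classes).
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(32 * 7 * 7, 10))  # 28x28 input -> 7x7 after pools

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(8, 1, 28, 28)     # batch of symbol crops (stand-in data)
y = torch.randint(0, 10, (8,))   # stand-in class labels
for _ in range(3):               # a few illustrative training steps
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```

Recognized symbols would then be assembled into a formula string and passed to the 3D-structure generation step.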
Integration and Testing:
System Integration: Integrate the VR environment, AR features, speech recognition, and handwritten recognition components into a cohesive application.
Testing and Validation: Conduct extensive testing to ensure the functionality, accuracy, and user experience meet the project requirements and user expectations.
Expected Outcomes
Immersive VR Experience: A VR application that provides an immersive and interactive experience for visualizing molecular structures.
AR Integration: AR features that overlay virtual molecular elements onto the real-world environment, enhancing the learning experience.
Speech Recognition: Voice command functionalities for navigating through the VR environment and interacting with molecular structures using natural language.
Handwritten Recognition: An ML model capable of accurately recognizing handwritten molecular formulas and generating files for 3D molecular structures.
Conclusion
The VR Molecular Viewer project aims to combine VR technology, AR features, speech recognition, and machine learning to create an innovative tool for visualizing and interacting with molecular structures. By providing an immersive and intuitive interface, the application seeks to enhance learning and research in the field of chemistry and molecular biology, offering new possibilities for education, exploration, and discovery.
Other Projects We Are Working On:
Project 3:
Application of Layer-wise Relevance Propagation (LRP) to a Custom ML Model for Cancer and Pneumonia Detection from X-Ray Images
Overview
This project focuses on the development and application of Layer-wise Relevance Propagation (LRP) methods to a custom machine learning (ML) model designed for the detection of cancer and pneumonia from X-ray images. The goal is to enhance the interpretability and diagnostic accuracy of the ML model by leveraging LRP techniques to understand and visualize the decision-making process of the model.
Objectives
Develop a Custom ML Model: Create a robust ML model capable of accurately detecting cancer and pneumonia from X-ray images.
Apply LRP Techniques: Implement Layer-wise Relevance Propagation to interpret the model’s predictions and provide visual explanations for its decisions.
Evaluate Performance: Assess the performance and interpretability of the ML model using standard metrics and LRP visualizations.
Enhance Diagnostic Confidence: Improve the diagnostic confidence of medical practitioners by providing transparent and interpretable model predictions.
Methodology
Data Collection and Pre-processing:
Data Source: Gather a large dataset of labeled X-ray images for both cancer and pneumonia, along with healthy controls.
Pre-processing: Normalize the images, apply augmentation techniques to increase dataset variability, and split the data into training, validation, and test sets.
Model Development:
Architecture Design: Design a convolutional neural network (CNN) tailored for image classification tasks, focusing on detecting cancer and pneumonia.
Training: Train the CNN model using the training dataset, employing techniques such as early stopping, learning rate adjustments, and regularization to optimize performance.
Evaluation: Validate the model using the validation dataset, and fine-tune hyperparameters to achieve the best possible performance.
Layer-wise Relevance Propagation (LRP):
LRP Implementation: Implement LRP techniques to trace back the contributions of individual pixels in the X-ray images to the model’s final prediction.
Visualization: Generate heatmaps highlighting regions of the X-ray images that significantly contribute to the prediction of cancer or pneumonia.
Model Interpretation and Analysis:
Interpretability: Analyze the LRP-generated heatmaps to understand the model’s decision-making process and ensure that the model focuses on medically relevant features.
Performance Metrics:
Accuracy and F1 Score: Evaluate the model’s classification performance using accuracy, precision, recall, and F1 score.
AUC-ROC: Assess the model’s ability to distinguish between classes using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC).
Interpretability Metrics: Measure the interpretability of the model using qualitative assessments of the LRP heatmaps by medical experts.
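The AUC-ROC metric above takes predicted probabilities rather than hard labels; a sketch with hypothetical scores:

```python
from sklearn.metrics import roc_auc_score

# Hypothetical predicted probabilities for the positive (disease) class.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

auc = roc_auc_score(y_true, y_score)
# AUC-ROC is the probability that a randomly chosen positive case is scored
# higher than a randomly chosen negative one; here 8 of 9 such pairs are
# ordered correctly, so auc = 8/9.
print(f"AUC-ROC = {auc:.3f}")
```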
Expected Outcomes
High-Accuracy Detection: A high-performing ML model capable of accurately detecting cancer and pneumonia from X-ray images.
Interpretable Predictions: Enhanced interpretability of the model’s predictions through LRP-generated visual explanations.
Clinical Relevance: Increased trust and confidence among medical practitioners in the model’s predictions, facilitating its potential integration into clinical workflows.
Conclusion
By integrating Layer-wise Relevance Propagation with a custom ML model, this project aims to advance the field of medical image analysis by providing a reliable, accurate, and interpretable tool for the detection of cancer and pneumonia from X-ray images. The combination of high diagnostic accuracy and enhanced transparency is expected to significantly aid medical professionals in making informed decisions, ultimately improving patient outcomes.
Project 4:
Mobile Application for Early Detection of Pneumonia and Tuberculosis Using Advanced Image Analysis
Overview
This project aims to develop a mobile application designed to assist in the early detection of pneumonia and tuberculosis using advanced image analysis techniques. The application will leverage a dataset of chest X-ray images to identify four types of lung disease: bacterial pneumonia, viral pneumonia, tuberculosis, and COVID-19. The software development will utilize Xcode for iOS and Android Studio for Android.
Objectives
Develop a Cross-Platform Mobile Application: Create a user-friendly mobile application available on both iOS and Android that can analyze chest X-ray images.
Implement Advanced Image Analysis Techniques: Utilize state-of-the-art machine learning models to accurately detect and classify lung diseases from X-ray images.
Facilitate Early Detection: Provide an accessible tool for early detection of pneumonia and tuberculosis to improve patient outcomes.
Ensure High Accuracy and Reliability: Validate the application’s accuracy and reliability through extensive testing and expert reviews.
Methodology
Data Collection and Preprocessing:
Dataset: Utilize a comprehensive dataset of labeled chest X-ray images, including categories for bacterial pneumonia, viral pneumonia, tuberculosis, and COVID-19.
Preprocessing: Normalize the images, apply image augmentation techniques to enhance variability, and divide the data into training, validation, and test sets.
Model Development:
Architecture Selection: Choose a suitable convolutional neural network (CNN) architecture known for high performance in image classification tasks.
Training: Train the CNN model using the preprocessed dataset, employing techniques such as data augmentation, dropout, and learning rate scheduling to optimize performance.
Validation: Continuously validate the model with the validation dataset and adjust hyperparameters for optimal results.
Mobile Application Development:
Platform Development: Develop the application using Xcode for iOS and Android Studio for Android, ensuring a consistent and user-friendly interface across both platforms.
Integration with Model: Integrate the trained machine learning model into the mobile application, enabling real-time analysis of X-ray images.
User Interface: Design an intuitive user interface that allows users to easily upload X-ray images and receive diagnostic results.
Performance Evaluation:
Accuracy and Metrics: Evaluate the model’s performance using accuracy, precision, recall, F1 score, and AUC-ROC.
Testing: Conduct extensive testing of the mobile application to ensure it operates smoothly and provides accurate diagnostics.
Deployment and User Training:
Deployment: Deploy the mobile application on both the App Store and Google Play Store.
User Training: Provide comprehensive user guides and tutorials to help users understand how to use the application effectively.
Expected Outcomes
Accurate Disease Detection: A highly accurate mobile application capable of detecting bacterial pneumonia, viral pneumonia, tuberculosis, and COVID-19 from chest X-ray images.
User-Friendly Interface: An easy-to-use application that allows users to upload X-ray images and receive instant diagnostic results.
Improved Early Detection: Enhanced ability for early detection of pneumonia and tuberculosis, leading to better patient outcomes and more timely medical intervention.
Cross-Platform Availability: A mobile application accessible to users on both iOS and Android devices, ensuring broad usability.
Conclusion
By combining advanced image analysis techniques with mobile technology, this project aims to provide an innovative tool for the early detection of pneumonia and tuberculosis. The application will offer a reliable and accessible solution for medical professionals and patients, contributing to improved healthcare outcomes through timely diagnosis and intervention.
Project 5:
Convolutional Neural Network Model to Determine the Location of Microvasculature Structures (Blood Vessels) within Human Kidney Histology Slides
Overview
This project aims to develop a convolutional neural network (CNN) model to accurately identify and locate microvasculature structures, including blood vessels such as capillaries, arterioles, and venules, within human kidney histology slides. Leveraging a dataset of annotated histology slides, the project will utilize Anaconda and Python for software development to create a robust and precise image analysis tool.
Objectives
Develop a CNN Model: Create an advanced CNN model capable of detecting and localizing microvasculature structures in human kidney histology slides.
Achieve High Accuracy: Ensure the model achieves high accuracy in identifying various types of blood vessels through rigorous training and validation.
Facilitate Histopathological Analysis: Provide a valuable tool for researchers and clinicians to aid in the analysis of kidney histopathology.
Utilize Advanced Software Tools: Implement the solution using Anaconda and Python, taking advantage of their powerful data science libraries and tools.
Methodology
Data Collection and Preprocessing:
Dataset Acquisition: Gather a comprehensive dataset of human kidney histology slides, annotated with labels for capillaries, arterioles, and venules.
Image Preprocessing: Normalize the images, apply augmentation techniques to increase variability, and split the data into training, validation, and test sets.
Model Development:
CNN Architecture Design: Design a CNN architecture tailored for detecting small and intricate structures within histology images. Consider architectures known for fine-grained image analysis, such as U-Net or ResNet.
Training the Model: Train the CNN model using the preprocessed dataset, employing techniques such as data augmentation, dropout, and learning rate scheduling to optimize performance.
Validation and Tuning: Continuously validate the model with the validation dataset, fine-tuning hyperparameters to achieve the best possible accuracy and localization precision.
Implementation with Anaconda/Python:
Development Environment: Set up the development environment using Anaconda, ensuring all necessary libraries (e.g., TensorFlow, Keras, NumPy, OpenCV) are installed.
Model Implementation: Implement the CNN model in Python, utilizing libraries like TensorFlow and Keras for model building and training.
Visualization Tools: Use visualization tools such as Matplotlib and Seaborn to analyze model performance and visualize detected structures.
Performance Evaluation:
Accuracy Metrics: Evaluate the model’s performance using metrics such as accuracy, precision, recall, F1 score, and Intersection over Union (IoU) for localization tasks.
Testing: Conduct extensive testing on the test dataset to ensure the model generalizes well to unseen data.
Expert Review: Collaborate with pathologists to review the model’s output, ensuring the detected structures are clinically relevant and accurate.
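Of the metrics listed above, Intersection over Union is the one specific to localization; a minimal sketch on toy binary masks (the 4x4 masks are illustrative stand-ins for predicted and annotated vessel regions):

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over Union for two binary masks."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return inter / union if union else 1.0

# Toy 4x4 masks: predicted vs. ground-truth vessel pixels.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)
true = np.array([[1, 1, 1, 0],
                 [1, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)
print(iou(pred, true))  # 4 overlapping pixels / 6 union pixels = 0.666...
```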
Deployment and Documentation:
Model Deployment: Prepare the model for deployment in a clinical or research setting, ensuring it can be integrated into existing workflows.
Documentation: Provide comprehensive documentation, including model architecture, training procedures, usage instructions, and performance metrics.
Expected Outcomes
Accurate Detection: A CNN model capable of accurately detecting and localizing microvasculature structures in human kidney histology slides.
Enhanced Histopathological Analysis: A tool that aids researchers and clinicians in analyzing kidney histopathology, potentially leading to better understanding and diagnosis of kidney diseases.
Robust Model Implementation: An efficient and reliable model implemented using Anaconda and Python, complete with thorough documentation for ease of use and integration.
Conclusion
This project aims to advance the field of histopathological image analysis by developing a state-of-the-art CNN model for detecting microvasculature structures in human kidney histology slides. The combination of high accuracy, robust implementation, and clinical relevance is expected to provide a valuable tool for medical research and diagnostics, ultimately contributing to improved understanding and treatment of kidney diseases.
Project 6:
Super-resolution and Image Quality Enhancement Using a Generative Adversarial Network
Overview
This project aims to design a Generative Adversarial Network (GAN) architecture capable of not only enhancing image resolution but also improving overall image quality by addressing issues such as noise reduction and artifact removal. The enhanced images generated by the GAN will be utilized to improve the accuracy of a ResNet18 model in analyzing chest X-ray images. The software development will be carried out using Anaconda and Python, complementing the image analysis capabilities of Project 1.
Objectives
Develop a GAN Architecture: Design a GAN architecture optimized for enhancing chest X-ray images by addressing noise, artifacts, and resolution limitations.
Improve Image Quality: Generate high-quality chest X-ray images that are free from noise, artifacts, and other imperfections, leading to clearer and more informative images.
Enhance ResNet18 Model Accuracy: Utilize the generated images from the GAN to improve the accuracy of a ResNet18 model in analyzing chest X-rays for diagnostic purposes.
Software Implementation: Implement the GAN architecture and image analysis pipeline using Anaconda and Python, ensuring compatibility with existing tools and workflows.
Methodology
Data Collection and Preprocessing:
Dataset Acquisition: Gather a dataset of chest X-ray images with associated labels for various conditions.
Preprocessing: Normalize the images and preprocess them to remove noise and artifacts, ensuring consistency and quality across the dataset.
GAN Architecture Design:
Architecture Selection: Choose a suitable GAN architecture (e.g., DCGAN, StyleGAN) known for generating high-quality images with fine details.
Training: Train the GAN using the preprocessed chest X-ray images, focusing on enhancing resolution while reducing noise and artifacts.
Quality Control: Implement mechanisms to monitor and control image quality during training to ensure that generated images are clinically relevant and accurate.
Image Enhancement:
Noise Reduction: Incorporate techniques for noise reduction, such as denoising autoencoders or Gaussian filters, into the GAN architecture.
Artifact Removal: Implement strategies for artifact removal, such as inpainting or attention mechanisms, to improve image clarity and consistency.
ResNet18 Model Integration:
Image Augmentation: Augment the training dataset for the ResNet18 model with the enhanced images generated by the GAN to improve model generalization.
Fine-Tuning: Fine-tune the ResNet18 model using the augmented dataset to leverage the enhanced image quality for improved diagnostic accuracy.
Software Development with Anaconda/Python:
Environment Setup: Set up the development environment using Anaconda, ensuring all necessary libraries (e.g., TensorFlow, PyTorch, NumPy) are installed.
Implementation: Implement the GAN architecture, image enhancement techniques, and ResNet18 model integration in Python, leveraging existing libraries and frameworks.
Expected Outcomes
High-Quality Image Generation: A GAN architecture capable of generating high-quality chest X-ray images with enhanced resolution, reduced noise, and minimized artifacts.
Improved Diagnostic Accuracy: Enhanced accuracy of a ResNet18 model in analyzing chest X-ray images due to the utilization of the improved image quality generated by the GAN.
Seamless Integration: A seamlessly integrated pipeline for GAN-based image enhancement and deep learning-based analysis, facilitating accurate and efficient chest X-ray diagnostics.
Conclusion
By combining the capabilities of GAN-based image enhancement with deep learning-based image analysis, this project aims to improve the accuracy and reliability of chest X-ray diagnostics. The generation of high-quality images free from noise and artifacts is expected to enhance the performance of the ResNet18 model, ultimately leading to better patient outcomes and more effective healthcare interventions.