Projects
Feb 2024 - Apr 2024
This project uses a Raspberry Pi to build a lost-object locator and tracker. It combines a color tracking algorithm, implemented in the updated code in the 'Latest' folder, with real-time coordinate transmission to quickly detect and locate misplaced items. The system is modular: users can replace components such as the color tracking script to suit specific tracking requirements. Beyond demonstrating practical applications of the Raspberry Pi in everyday scenarios, the well-organized repository structure supports collaborative development and continuous adaptation to new tracking challenges.
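As an illustration of the kind of color tracking involved, here is a minimal NumPy sketch (not the repository's actual code, which lives in the 'Latest' folder) that thresholds an HSV frame against a color range and returns the target's centroid:

```python
import numpy as np

def track_color_centroid(frame_hsv, lower, upper):
    """Return the (row, col) centroid of pixels inside an HSV range, or None."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    # Boolean mask of pixels whose H, S, and V all fall inside the range.
    mask = np.all((frame_hsv >= lower) & (frame_hsv <= upper), axis=-1)
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # target color not present in this frame
    return (float(ys.mean()), float(xs.mean()))
```

In a real deployment the centroid would be mapped to world coordinates and transmitted each frame; that part depends on the hardware setup and is omitted here.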
Jun 2023 - Aug 2023
This project integrated the O3R 3D camera into an autonomous vehicle's navigation-vision stack and optimized and verified the system for obstacle detection in Automated Guided Vehicles (AGVs). The goal was to enhance the obstacle detection capabilities of hospital material-handling vehicles by replacing 2D lidar sensors with 3D cameras. The work involved understanding the software architectures of both the camera and the AGV, setting up the required software environment, collaborating with the camera manufacturer, diagnosing and troubleshooting software issues, improving algorithm performance, and testing and validating the integrated system. The project resulted in successful camera integration, enhanced vehicle capabilities, a more efficient software stack, standardized verification and validation protocols, and the adoption of Agile practices and version control systems for efficient project management.
Oct 2021 - Dec 2021
Publication:
Preliminary Report on Optical Coherence Tomography Angiography Biomarkers in Non Responders and Responders to Intravitreal Anti VEGF Injection for Diabetic Macular Oedema. Diagnostics, MDPI
Abstract
Purpose: To identify Optical Coherence Tomography Angiography (OCTA) biomarkers in patients treated for Diabetic Macular Edema (DME) with intravitreal anti-Vascular Endothelial Growth Factor (VEGF) injections, and to compare OCTA parameters between responders and non-responders.
Methods: A retrospective cohort study of 61 eyes with Diabetic Macular Edema (DME) that received at least one intravitreal anti-VEGF injection between July 2017 and October 2020. The subjects underwent a comprehensive eye examination followed by an OCTA examination before and after intravitreal anti-VEGF injection. Demographic data, visual acuity, and OCTA parameters were documented, and pre- and post-injection measurements were further analyzed.
Results: Of the 61 eyes that underwent intravitreal anti-VEGF injection for Diabetic Macular Edema, 30 were responders (Group 1) and 31 were non-responders (Group 2). The responders had a statistically significantly higher vessel density in the outer ring (p = 0.022), and higher perfusion density in the outer ring (p = 0.012) and full ring (p = 0.044) at the level of the Superficial Capillary Plexus (SCP). We also observed a lower vessel diameter index in the Deep Capillary Plexus (DCP) compared to non-responders (p < 0.00).
Conclusion: Evaluating the Superficial Capillary Plexus (SCP) in addition to the Deep Capillary Plexus (DCP) on OCTA can help better predict treatment response and support early management of Diabetic Macular Edema.
Jan 2021 - May 2021
IEEE Transactions on Image Processing
[MANUSCRIPT UNDER REVIEW]
Abstract
Optical Coherence Tomography (OCT) imaging is gaining prominence for its significant advantages over traditional methods in studying the cross-sections of tissue microstructures. OCT imaging offers in-situ and real-time tissue imaging without the complications associated with excisional biopsy. However, the noise and artifacts induced during the imaging process warrant multiple scans, causing time delays and rendering OCT scan-based medical diagnosis less effective. While minor denoising can still be achieved at a single-frame level, the reliability of reconstructed regions in a frame initially affected by artifacts, based on single-frame data alone, remains a question. As OCT imaging is volumetric (3D) in nature, we propose a Graph-based De-Noising and Artifact Removal network (GraDeNAR) that takes advantage of features from neighboring scan frames. It exploits the local and non-local relations in the neighborhood-aggregated latent features to effectively denoise and reconstruct regions affected by noise and artifacts. Qualitative and quantitative analysis of the network's performance on our rat-colon OCT dataset shows that the network outperforms existing state-of-the-art models. Additionally, the network's performance is quantitatively validated on other 3D medical and non-medical datasets, demonstrating its robustness in denoising and artifact removal tasks.
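While GraDeNAR itself is a learned graph network, the underlying intuition of borrowing information from neighboring scan frames can be sketched with a simple, non-learned per-pixel median over adjacent frames. This is an illustrative baseline only, not the proposed method:

```python
import numpy as np

def neighbor_median_denoise(volume, radius=1):
    """Denoise each frame of a 3D OCT volume (frames, H, W) by taking the
    per-pixel median over a window of neighboring frames."""
    n = volume.shape[0]
    out = np.empty_like(volume, dtype=float)
    for i in range(n):
        # Clamp the neighborhood window at the volume boundaries.
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out[i] = np.median(volume[lo:hi], axis=0)
    return out
```

Speckle noise that varies frame to frame is suppressed because it rarely appears at the same pixel in adjacent frames, which is the same redundancy the learned network exploits.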
Sep 2020 - Dec 2020
This project was developed from September to December 2020 under the supervision of Origin Health, Singapore, with the goal of achieving automatic segmentation of fetal ultrasound images using deep learning segmentation methods. Training the segmentation model required a dataset of training pairs: input ultrasound images and segmented labels. While the input ultrasound images were provided by medical centers, the segmented labels had to be generated manually. To reduce costs, we proposed building a GUI for manual image segmentation.

We developed two versions of the tool, using Tkinter and PyQt, which allow users to load images and perform multiclass manual segmentation using a different color for each class. The tool features two brush types (a solid brush and a lasso brush), two brush shapes (circular and square), and histogram adjustment of the image. Images can be zoomed, and measurements can be taken between two points for data collection purposes. Users can attach comments to images in the dataset to aid in removing poor-quality images. Other features include smoothing of segmented regions, removal of faulty islands of segments with customizable sizes, undo and redo options, and checkpointing of the previously segmented image so that a series of images can be segmented with breaks.

The tool also includes code to encrypt the data: images can be sent to the user as an encrypted NumPy file, so the user can open the tool and load the image without storing the images separately. The tool is not restricted to ultrasound images and can be used to generate segmented labels for any semantic segmentation project. It is available in .exe format in the folder.
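As a sketch of how such a tool applies a brush stroke to a multiclass label mask (illustrative only; the actual tool uses Tkinter/PyQt canvases, and this NumPy helper is an assumption, not its code):

```python
import numpy as np

def stamp_brush(label_mask, center, radius, class_id, shape="circular"):
    """Paint class_id into an integer label mask with a circular or square brush."""
    h, w = label_mask.shape
    cy, cx = center
    # Open grids broadcast to the full (h, w) shape without materializing it.
    ys, xs = np.ogrid[:h, :w]
    if shape == "circular":
        hit = (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
    else:  # square brush
        hit = (np.abs(ys - cy) <= radius) & (np.abs(xs - cx) <= radius)
    label_mask[hit] = class_id
    return label_mask
```

Dragging the mouse amounts to stamping this brush at each sampled cursor position; the per-class colors shown on screen are just a palette lookup into the same integer mask.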
Aug 2019 - Nov 2019
This project focuses on using Augmented Reality technology to render a 3D model over detected ArUco markers. By leveraging OpenCV and OpenGL, the application quickly detects the markers and positions a 3D object on top of them, creating an immersive AR experience.
One of the key achievements of this project is the significant optimization of the 3D model rendering time. By applying various techniques, the application has been able to reduce the rendering time by 50%, resulting in a smoother and more seamless AR experience.
To create the 3D object, I utilized the popular 3D modeling software Blender. With its powerful features, I was able to create a visually stunning object that can be seamlessly integrated into the AR experience.
Dec 2017 - Jan 2018
Utility Bill Payment System is an online platform developed using C++ that streamlines the process of paying monthly electricity bills for consumers of the Tamil Nadu Electricity Board (TNEB). The system is designed to provide a seamless user experience, enabling consumers to view their electricity bills online and make secure payments through an integrated e-wallet account.
The primary objective of this project is to simplify and modernize the bill payment process by offering an accessible, efficient, and reliable method for managing utility expenses. Users can easily maintain their e-wallet accounts, deposit funds, and track their transaction and payment histories, ensuring transparency and convenience in every interaction.
Developed with a focus on performance and scalability, this project leverages the strengths of C++ and C to deliver a robust solution tailored to the needs of the power distribution sector. It serves as a practical reference for building similar digital payment systems, combining essential functionalities such as bill viewing, secure transactions, and comprehensive account management into one cohesive system.
Overall, the Utility Bill Payment System exemplifies innovation in digital utility management, providing a valuable tool that enhances the customer experience while paving the way for future developments in online payment solutions for public utilities.
Oct 2023 - Oct 2023
Metaphoria is an innovative web application designed to transform the way readers discover and engage with literature. In an era overwhelmed by digital content, Metaphoria simplifies the process of finding stories that resonate with individual tastes by recommending narratives similar to users' favorites while introducing them to new literary gems. This application serves as a digital companion for avid readers, making the discovery of captivating content both effortless and personalized.
At its core, Metaphoria leverages advanced natural language processing and data analysis techniques to understand user input—whether a memorable line from a beloved story or a brief description of literary preferences. The intelligent recommendation engine then sifts through a vast database of literary works to suggest stories that share similar themes, writing styles, or content. Users can preview snippets of recommended stories, enabling them to make informed choices before diving into full narratives.
Built on a robust technology stack that includes Flask for backend development, the Metaphor Python library for dynamic content matching, and BeautifulSoup for effective web scraping, Metaphoria ensures a seamless and user-friendly experience. The application’s intuitive interface and streamlined navigation empower users to effortlessly explore and engage with a curated collection of literary works.
Overall, Metaphoria redefines the literary discovery process by combining cutting-edge technology with a passion for storytelling, ultimately offering readers a personalized and enriching journey through the world of literature.
Nov 2022 - May 2024
This project was sponsored by DARPA as part of a Point-of-Care Ultrasound (PoCUS) program. The team was one of the five selected in the US for this program. They collaborated with experts and clinicians from UPMC to prepare the dataset.
The main goal of the project was to build an explainable PoCUS AI stack driven by clinician heuristics. The team used techniques such as optical flow maps, pleural ROI selection, region masking, and difficulty scoring to address the challenges posed by limited training video clips.
The team achieved a significant milestone by developing a model that could diagnose pneumothorax in real-time with 88.9% accuracy. They utilized the Temporal Shift Module (TSM) method for 2D CNN video classification to achieve this result. To ensure practicality, the model was designed to be deployable on an iPad, considering the constraints of space and time.
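The channel-shifting step at the heart of the Temporal Shift Module is simple to sketch in NumPy. This is an illustrative re-implementation of the published TSM idea, not the team's deployment code:

```python
import numpy as np

def temporal_shift(x, fold_div=8):
    """Temporal Shift Module for video features x of shape (T, C, H, W):
    shift the first C//fold_div channels one step back in time, the next
    C//fold_div one step forward, and leave the rest in place (zero padding
    at the clip boundaries)."""
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                   # shift toward earlier time
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]   # shift toward later time
    out[:, 2 * fold:] = x[:, 2 * fold:]              # untouched channels
    return out
```

Because the shift itself costs no parameters and no FLOPs, a plain 2D CNN gains temporal modeling almost for free, which is what makes TSM attractive for on-device (e.g. iPad) video inference.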
In addition to accuracy, the team also focused on explainability. They leveraged techniques such as gradCAM maps, occlusion sensitivity, and Visual Activation Layers to provide insights into the model's decision-making process.
The team conducted tests to evaluate the impact of attention pooling and gradCAM map aggregation on pneumothorax detection across different pleural ROI. The results were promising, indicating the potential for further improvements in the model's performance.
The project has pending patent(s) and a publication, suggesting the significance of the team's contributions in the field of PoCUS AI.
Overall, this DARPA-sponsored project successfully developed an extensible AI model that incorporated clinician heuristics to diagnose pneumothorax accurately in real-time with limited training data. The project aimed to advance generalized learning for PoCUS AI and contributed to the field's understanding of explainable AI in medical applications.
Jul 2022 - Feb 2023
The project involved using unsupervised contrastive representation learning to detect particles in CryoET data. The team, led by the author, worked with 5 interns and obtained sponsorship from the NSF. They used 3D volume contrastive representation learning and applied a novel input-pair generation scheme, achieving an AUC-ROC of 71.6% and an F1 score of 0.672 in detecting particles in the SHREC 2021 CryoET dataset. The team used 3D electron microscopy and edge detection algorithms to identify frames containing particles, and applied augmentations and contrastive learning techniques to address similarity conflicts. The project demonstrated the potential of this approach to extract biological information from complex imaging data.
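A minimal NumPy sketch of the InfoNCE-style loss typically used in such contrastive representation learning (illustrative; the project's exact loss and input-pair generation scheme are not reproduced here):

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE / NT-Xent-style loss over two batches of embeddings where
    (z1[i], z2[i]) are positive pairs and all other rows act as negatives."""
    # L2-normalize so the dot product is a cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # positives on the diagonal
```

Minimizing this pulls augmented views of the same 3D sub-volume together while pushing apart views of different sub-volumes, which is the property the "similarity conflict" handling above has to protect.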
Aug 2021 - Dec 2021
Purpose
In curriculum learning, the idea is to train on easier samples first and gradually increase the difficulty, while in self-paced learning, a pacing function defines the speed to adapt the training progress. While both methods heavily rely on the ability to score the difficulty of data samples, an optimal scoring function is still under exploration.
Methodology
Distillation is a knowledge transfer approach where a teacher network guides a student network by feeding a sequence of random samples. We argue that guiding student networks with an efficient curriculum strategy can improve model generalization and robustness. For this purpose, we design an uncertainty-based paced curriculum learning in self-distillation for medical image segmentation. We fuse the prediction uncertainty and annotation boundary uncertainty to develop a novel paced-curriculum distillation (P-CD). We utilize the teacher model to obtain prediction uncertainty and spatially varying label smoothing with Gaussian kernel to generate segmentation boundary uncertainty from the annotation. We also investigate the robustness of our method by applying various types and severity of image perturbation and corruption.
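A rough sketch of the annotation-boundary-uncertainty idea: blurring one-hot labels with a small Gaussian kernel so labels soften only near class borders. This is an illustrative simplification of spatially varying label smoothing, not the paper's exact formulation:

```python
import numpy as np

def svls_soft_labels(label, num_classes, sigma=1.0):
    """Blur each one-hot class map with a 3x3 Gaussian kernel; interior pixels
    stay (nearly) one-hot while pixels near class borders get soft labels."""
    ax = np.array([-1.0, 0.0, 1.0])
    k1 = np.exp(-ax ** 2 / (2 * sigma ** 2))
    kernel = np.outer(k1, k1)
    kernel /= kernel.sum()
    onehot = np.stack([(label == c).astype(float) for c in range(num_classes)])
    padded = np.pad(onehot, ((0, 0), (1, 1), (1, 1)), mode="edge")
    soft = np.zeros_like(onehot)
    h, w = label.shape
    for dy in range(3):          # manual 3x3 convolution over each class map
        for dx in range(3):
            soft += kernel[dy, dx] * padded[:, dy:dy + h, dx:dx + w]
    return soft  # shape (num_classes, H, W), sums to 1 at every pixel
```

The student is then trained against these softened targets (fused with the teacher's prediction uncertainty in P-CD), so confident supervision is reserved for pixels away from ambiguous annotation boundaries.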
Results
The proposed technique was validated on two medical datasets, breast ultrasound image segmentation and robot-assisted surgical scene segmentation, and achieved significantly better segmentation performance and robustness.
Conclusion
P-CD improves performance and obtains better generalization and robustness under dataset shift. While curriculum learning requires extensive tuning of the pacing function's hyper-parameters, the level of performance improvement outweighs this limitation.
May 2021 - Aug 2021
An oral presentation was given in the Bioimaging and Biosignals track at IUPESM WC2022.
This is a research project focused on developing a novel approach to medical image super-resolution. Our proposed method uses a cross-scale neighbor-aggregative graph topology with temporal coherence regularization and Laplacian-constrained convolution kernels (posterior sharpening) to achieve state-of-the-art results.
In our manuscript, which is currently under preparation, we present our methodology in detail, along with the experimental results we obtained. Our approach achieved a peak signal-to-noise ratio (PSNR) of 31.19 and a structural similarity index (SSIM) of 92.33 for an 8x resolution boost, demonstrating its superior performance compared to existing methods.
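For reference, the PSNR figure quoted above follows the standard definition; this helper is generic, not project-specific code:

```python
import numpy as np

def psnr(reference, estimate, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: noiseless
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher is better; each +6 dB roughly halves the root-mean-square reconstruction error, so gains of even 1 dB over prior methods are meaningful at 8x upscaling.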
Sep 2020 - Dec 2020
I developed deep learning architectures for the purpose of image enhancement of ultrasound fetal scans. Through extensive research, I have proposed a generalized solution for enhancement and artifact removal in ultrasound fetal images using deep neural networks (DNN).
In my work, I have experimented with various image enhancement filters and architectures such as U-Net, autoencoders, GANs, and perceptual loss networks. As a result, I have achieved harmonization of ultrasound images with artifacts such as intensity inhomogeneity, contrast loss, speckle noise, shadowing, and reverberation.
Mar 2020 - Jul 2020
Publication:
“CADSketchNet - An Annotated Sketch dataset for 3D CAD Model Retrieval with Deep Neural Networks”, published in Computers & Graphics, Elsevier, 3D Object Retrieval’21 - Journal Track
Abstract
Ongoing advancements in the fields of 3D modelling and digital archiving have led to an outburst in the amount of data stored digitally. Consequently, several retrieval systems have been developed depending on the type of data stored in these databases. However, unlike text data or images, performing a search for 3D models is non-trivial. Among 3D models, retrieving 3D Engineering/CAD models or mechanical components is even more challenging due to the presence of holes, volumetric features, sharp edges, etc., which make CAD a domain unto itself. The research work presented in this paper aims at developing a dataset suitable for building a retrieval system for 3D CAD models based on deep learning. 3D CAD models from the available CAD databases are collected, and a dataset of computer-generated sketch data, termed ‘CADSketchNet’, has been prepared. Additionally, hand-drawn sketches of the components are also added to CADSketchNet. Using the sketch images from this dataset, the paper also aims at evaluating the performance of various retrieval systems or search engines for 3D CAD models that accept a sketch image as the input query. Many experimental models are constructed and tested on CADSketchNet. These experiments, the model architectures, and the choice of similarity metrics are reported along with the search results.
Aug 2019 - Dec 2019
I worked on a 10 Degree-of-Freedom (DoF) Bipedal Walking Robot, which is designed to serve as a platform for further research on the design and control of humanoid robots.
This project was initiated in 2017, with the primary objective of developing a stable, static walking bipedal robot capable of traversing a plane surface and tolerant to external disturbances. To achieve this goal, we have developed real-time simulations of the walking gaits of the robot, which serve as a testing platform for various gaits, mechanisms, and control algorithms.
The robot was designed in PTC Creo and simulated in MATLAB-SimMechanics. We implemented motion planning and control algorithms for developing the static walking gait, in which the robot is in static equilibrium throughout its motion. We developed mechanisms such as parallelogram linkages, belt and gear drives to actuate the 2 DoF joints at the ankle and hip.
The fabrication of the real-life prototype based on the simulation results is complete. Currently, we are developing the trajectories for the gait of the bipedal robot, as well as control strategies for a static walking gait.
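The static-equilibrium condition above reduces to checking that the ground projection of the centre of mass stays inside the support polygon at every instant of the gait. A minimal ray-casting sketch of that check (illustrative; the actual gait planning runs in MATLAB-SimMechanics):

```python
def com_is_stable(com_xy, support_polygon):
    """Static-stability check: is the ground projection of the centre of mass
    inside the support polygon (list of (x, y) vertices in order)?
    Uses the standard ray-casting point-in-polygon test."""
    x, y = com_xy
    inside = False
    n = len(support_polygon)
    for i in range(n):
        x1, y1 = support_polygon[i]
        x2, y2 = support_polygon[(i + 1) % n]
        # Count crossings of a horizontal ray cast from the point to +x.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

During single support the polygon shrinks to one foot's outline, which is why slow, statically stable gaits keep the CoM shifted over the stance foot before lifting the swing leg.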
Aug 2022 - Nov 2022
As part of the "Autonomous Delivery of Trauma Care in the Field" project under the Trauma Care in a Rucksack (TRACIR) initiative, I led the TRACIR Deform Model team to address the challenges of ultrasound scanning deformation and needle rolling during trauma care. Our team received sponsorship from the Department of Defense to develop a synthetic deformed-mesh generation pipeline to train a physics-informed conditional variational autoencoder (cVAE) and model 3D deformation point clouds.
Using a PointNet++ encoder and cycle consistency loss, we were able to train our cVAE with reduced supervision on a mesh dataset synthesized through a pipeline that utilized mechanics modeling via a FEA solver. Our goal was to combat the effect of deformation and rolling, improve false trajectory generation, and compensate for the subsiding impact of these factors in the TRACIR Robot's US-guided needle insertion.
We operated the TRACIR Robot on pigs to collect data, specifically focusing on ruptured vessels caused by injuries imaged using a Fukuda probe. We then segmented these images on blue gel phantoms and pig data to test our model, with future plans to test it on cadavers.
Our team created a custom dataset using the Python SOFA framework and an FEA solver, as there was no readily available dataset for 3D deformations. The pipeline learns from 3D point clouds of the object deformation, material properties, force, and its point of application to predict a deformed version of the object. We used the Chamfer Distance as the metric to quantify performance.
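The Chamfer Distance metric mentioned above can be sketched directly in NumPy (standard symmetric form; the project's exact variant may differ):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a (N, 3) and b (M, 3):
    mean nearest-neighbour squared distance, taken in both directions."""
    # Pairwise squared distances via broadcasting: shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

It is zero for identical clouds and needs no point correspondences, which makes it a natural training and evaluation metric for predicted versus ground-truth deformed meshes.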
Overall, our Tracir Deform Model successfully achieved a 4x precision boost in US-guided needle insertion, thanks to our pipeline's ability to model deformation and compensate for it during the surgical procedure.
Jan 2023 - Apr 2023
Course Project - 16824: Visual Learning and Representation - Spring 2023
The "ShuffVision: Exploring the Benefits of Shuffled Position Embeddings in a ResNet-Transformer Architecture for Image Classification" project was conducted on the ImageNet dataset and achieved superior results compared to other ResNet architectures. The study showed that transformers are sensitive to positional information in images, and the researchers designed a novel objective function that combines cross-entropy and KL-divergence losses. The objective function was tested on object detection tasks, where localisation and spatial organization are crucial; the study also noted that the technique may be less effective on classification tasks, which depend less on localisation or spatial organization. The team leveraged knowledge transfer from the ImageNet classification task to improve their model's performance on object detection. The objective function compares predictions from shuffled inputs with shuffled position embeddings against predictions from normal inputs with normal position embeddings, combining a cross-entropy term with a KL-divergence suppression loss; the two losses are summed with weighting coefficients to form the final optimization objective.
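A minimal NumPy sketch of such a weighted cross-entropy plus KL-divergence objective (illustrative only; the branch names, the coefficients alpha and beta, and the exact form of the suppression term are assumptions, not the project's code):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_loss(logits_normal, logits_shuffled, targets, alpha=1.0, beta=0.5):
    """Weighted sum of cross-entropy on the normal-position branch and KL
    divergence between the normal- and shuffled-position predictions."""
    p = softmax(logits_normal)
    q = softmax(logits_shuffled)
    n = logits_normal.shape[0]
    ce = -np.mean(np.log(p[np.arange(n), targets] + 1e-12))
    kl = np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1))
    return alpha * ce + beta * kl
```

The KL term vanishes when both branches agree, so it only penalizes predictions that change under position shuffling, which is the behavior the study set out to probe.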
Jan 2022 - Mar 2022
Publication: [Bachelor thesis]
Measurement of Retinal Blood Vessel Fractal Dimensions - 20th Dr EVM Scientific Session
Abstract
Background: Fractal dimension (Df) quantifies the branching pattern of blood vessels. The retinal Df is reported to be a sensitive indicator of early vascular changes in various ocular vascular diseases such as diabetic retinopathy. To the best of our knowledge, commercially available retinal imaging devices have no inbuilt method to measure Df.
Aim: This paper describes a new custom method that quantifies Df through a Graphical User Interface.
Method: Sixty-six consecutive clinically normal subjects who underwent retinal imaging by optical coherence tomography angiography (OCTA; Cirrus 5000, Carl Zeiss Meditec Inc., Dublin, CA) were enrolled in the study. The 6x6 mm en-face images of the superficial and deep retinal layers and 512x128 fundus images were used for the Df analysis. The Df was calculated using the newly developed automated algorithm.
Results: The median (IQR) age and axial length (AL) of the study subjects were 22.00 (21.00-34.00) years and 23.26 (22.94-23.84) mm. The median (IQR) Df for the retinal fundus, superficial, and deep retinal layers was 1.48 (1.47-1.49), 1.25 (1.24-1.28), and 1.27 (1.25-1.29) respectively.
Conclusion: The Df of the retinal vessels can be reliably measured from OCTA-generated images. Df was not influenced by either age or AL. There was a weak negative correlation between fundus Df and best-corrected visual acuity (r = -0.243, p = 0.050) and between superficial retina Df and AL (r = -0.284, p = 0.021).
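Df is conventionally estimated by box counting on a binarized vessel image; a minimal NumPy sketch of that estimator (illustrative, not the study's algorithm):

```python
import numpy as np

def box_counting_dimension(binary_img):
    """Estimate fractal dimension of a binarized vessel image by box counting:
    the slope of log(occupied box count) versus log(1 / box size)."""
    img = np.asarray(binary_img, dtype=bool)
    sizes, counts = [], []
    s = min(img.shape) // 2
    while s >= 1:
        # Crop to a multiple of s, then tile the image into s x s blocks.
        h = (img.shape[0] // s) * s
        w = (img.shape[1] // s) * s
        blocks = img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
        sizes.append(s)
        s //= 2
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```

A filled region yields Df close to 2 and a single vessel-like line close to 1; retinal vasculature typically falls in between, consistent with the ~1.25-1.48 values reported above.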
Aug 2021 - Sep 2021
Publication:
What is the role of magnification correction in the measurement of macular microvascular dimensions in emmetropic eyes? Ophthalmic Technologies XXXII, SPIE
Abstract
The foveal avascular zone (FAZ), as visualized by optical coherence tomography angiography (OCTA), has distinct parametric characteristics. These metrics can help us understand FAZ variations in various ophthalmic conditions such as diabetic retinopathy, retinopathy of prematurity, glaucoma, and pathological myopia. One of the several factors that influence the accuracy of these measures is the eye's axial length (AXL). Even though the OCTA is designed to image the retina with a standard AXL of 23.95 mm, there is considerable variation even in normal healthy eyes; for example, the average Indian AXL is 23.34 ± 1.12 mm, which would result in retinal image magnification changes. It has been reported that, if the FAZ area is not corrected for AXL, there can be up to a 51.0% deviation in the measured parameters. Bennett's correction (and its variations) is commonly employed to determine axial magnification. This study compares the effects of magnification in emmetropic Indian eyes with and without Bennett's correction. The FAZ dimensions were measured in healthy normal Indian subjects with a mean ± SD age of 27.38 ± 11.62 years, AXL of 23.40 ± 0.88 mm, and mean spherical equivalent of 0.08 ± 0.24 D, using a newly designed automated image processing approach. Our results indicate no need to correct for axial length variations over the 23.18 to 24.01 mm range in emmetropic eyes. This implies that any AXL longer than 24.01 mm or shorter than 23.18 mm may require axial magnification correction to precisely measure FAZ parameters.
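A sketch of the linear scaling implied by Bennett's formula, q = 0.01306 * (AXL - 1.82), taken relative to the device's assumed 23.95 mm AXL. The function names and the area-correction usage are illustrative assumptions, not the study's code:

```python
def bennett_scaling(axial_length_mm, device_axl_mm=23.95):
    """Linear magnification scaling factor for fundus/OCTA measurements
    using Bennett's formula q = 0.01306 * (AXL - 1.82)."""
    q_eye = 0.01306 * (axial_length_mm - 1.82)
    q_ref = 0.01306 * (device_axl_mm - 1.82)
    return q_eye / q_ref

def corrected_faz_area(measured_area_mm2, axial_length_mm):
    """Correct a measured FAZ area for the subject's axial length
    (linear factor squared, since area scales with length squared)."""
    k = bennett_scaling(axial_length_mm)
    return measured_area_mm2 * k ** 2
```

For AXLs within the 23.18-24.01 mm range the factor stays within a few percent of 1, which is consistent with the study's conclusion that correction is unnecessary for emmetropic eyes in that range.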
Feb 2021 - Jul 2021
Publication: FAZSeg: A New Software for Quantification of the Foveal Avascular Zone - Clinical Ophthalmology, Dove Medical Press
Introduction
Various ocular diseases and high myopia influence the dimensions of the Foveal Avascular Zone (FAZ), an anatomical reference point. Therefore, it is important to segment and quantify the FAZ's dimensions accurately. To the best of our knowledge, no automated tool or algorithm is available to segment the FAZ's deep retinal layer. This paper describes new open-access software with a Graphical User Interface (GUI) and compares its results with the ground truth (manual segmentation).
Methods
Ninety-three healthy subjects, comprising 30 emmetropic and 63 myopic subjects without any sight-threatening retinal conditions, were included in the study. The 6 mm x 6 mm Angioplex protocol was used, and all the images were aligned with the centre of the fovea. FAZ images of 420×420 pixels were used in this study. The FAZ image dimensions for the superficial and deep layers were quantified using the New Automated Software Method (NAM), and the NAM-based FAZ dimensions were validated against the ground truth.
Results
The age distribution for all 93 subjects was 28.02 ± 10.79 (range, 10.0-66.0) years. For normal subjects, the mean ± SD age was 32.13 ± 16.27 years; for myopic subjects, it was 26.06 ± 6.06 years. The NAM had an accuracy of 91.40%. Moreover, the NAM on the superficial-layer FAZ gave a Dice Similarity Coefficient (DSC) of 0.94 and a Structural Similarity Index Metric (SSIM) of 0.97, while the NAM on the deep-layer FAZ gave a DSC of 0.96 and an SSIM of 0.98.
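The DSC figures reported above follow the standard definition; a minimal NumPy sketch for binary masks (generic code, not the NAM software itself):

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / denom
```

DSC ranges from 0 (no overlap) to 1 (perfect overlap), so scores of 0.94-0.96 against manual ground truth indicate close agreement.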
Conclusion
A clinician-oriented GUI software package was designed and tested on FAZ images from the deep and superficial layers. The NAM outperformed the device's inbuilt algorithm when measuring the superficial layer. This open-source software package is in the public domain and can be downloaded online.
Dec 2019 - Mar 2020
This work was published under the title: "Deep Learning Based Muscle Intent Classification in Continuous Passive Motion Machine for Knee Osteoarthritis Rehabilitation" at the 2021 IEEE Madras Section Conference (MASCON): https://doi.org/10.1109/MASCON51689.2021.9563370
An extended version of this paper, titled: Muscle intent-based continuous passive motion machine in a gaming context using a lightweight CNN, is published in the International Journal of Intelligent Robotics and Applications, Springer.
Abstract
Knee osteoarthritis is one of the most common forms of arthritis among people above age 45. Physiotherapy and post-surgery rehabilitation are essential stages of treatment to regain control over the knees and strengthen the muscles around them, and are conducted under the guidance of therapists and physicians. Robotic therapeutic tools such as CPM machines cut down the massive expenditure of frequent consultations with physicians. However, the devices available in the market are passive: they do not dynamically adapt to a patient's needs, as they follow pre-set functions. In this paper, a novel approach is presented to control and actuate a CPM machine by integrating a deep learning based control strategy using CNNs. EMG and IMU sensors are interfaced with the patient's thigh muscles to classify the patient's intent into three states: forward, backward, and rest. For implementing the algorithms, a low-cost, eco-friendly alpha-prototype CPM machine was developed. A dataset was collected by performing experiments on three healthy subjects under different conditions. Experimental performance shows the feasibility of this home rehabilitation device and accurate intuitive motion predictions with the CNN.