Experience
SambaNova Systems
Machine Learning Engineer
Jun 2024 - Present
Palo Alto, California, United States
Machine Learning Infrastructure Team (Manager: Vaidehi Venkatesan) & Multimodality Team (Manager: Anand Sampat) - SambaNova Systems
Drive high-priority model bring-ups and urgent customer requests, along with ML pipeline automation and infrastructure efficiency initiatives. Collaborate on cross-layer projects to accelerate end-to-end development velocity on Reconfigurable Dataflow Units (RDUs), deploying LLM training & inference for developers, national laboratories & private enterprises within critical deadlines across Cloud & On-Prem.
Notable Work:
• Crafted an LLM-based synthetic dataset (charts & tables) generation pipeline & prompt-engineered specialized chat templates, enhancing LLaVA and Llama3.2 (11B & 90B) RDU pre-training & fine-tuning accuracy on datasheets, and enabled Agentic AI systems by integrating multimodal embedding-based RAG pipelines for document intelligence.
• Built MLOps automation pipelines for internal workflows—including app version migration, gated unit/integration testing & CI/CD processes—accelerating developer productivity & deployment efficiency.
• Unified cloud and on-prem infrastructure, enabling 2× faster model delivery, and developed a training engine for SambaNova Cloud with multi-tenant LoRA fine-tuning on BYOC (Bring Your Own Checkpoint) checkpoints, incorporating custom kernel design that achieved faster inference at one quarter of the compute.
• Orchestrated SambaNova's Bring Your Own Checkpoint automation framework, supporting conversion of 100+ open-source vision models to RDU-compatible checkpoints with greedy & multinomial sampling legalization smoke tests, reducing batched model delivery time in scaled production by 1.5x.
• Architected RDU-compatible, optimized modelling code & designed training & inference unit test automation infrastructure (L0, L0+, L1, L2) for model response consistency checks & benchmarking of LLM-as-a-judge metrics, TTFT & throughput, achieving 6x model bring-up velocity across SambaNova Cloud, SambaStudio on-prem (2 chip generations: SN30 & SN40L), Mixture of Experts, & SambaNova ModelZoo for Llama, Tulu, Whisper, Qwen, and DeepSeek.
• Contributed to and end-to-end tested SambaNova ModelZoo's developer experience, an open-source repository with RDU-compatible model code & applications, enhancing community engagement by 60% and building platform trust.
• Served as a mentor for the SambaNova x 42 Abu Dhabi – Agentic AI Hackathon, organized the AI Tinkerers – Palo Alto Meetup, and won the Cloud API Referral Contest, earning a bonus for driving API adoption among the largest number of developers.
JBT Corporation
Software Engineering Intern
Jun 2023 - Aug 2023
Chalfont, Pennsylvania, United States
R&D Vehicle Team - JBT Automated Systems (Manager: Vijay Chhabria) in collaboration with ifm, Oppent and Ohio State University
Project:
1. O3R Camera and Navigation-Vision Stack Integration, Optimization, Verification & Validation for Automated Guided Vehicle Obstacle Detection System Deployment
• Orchestrated integration of O3R camera with autonomous vehicle baseline navigation-vision C++ stack on edge VPU device
• Enhanced vehicle capabilities by enabling 3D point cloud probability map-based obstacle detection, elevating previous 2D lidar-based system
• Optimized vehicle software stack efficiency by achieving a 2x reduction in latency through identification and elimination of architectural inefficiencies, code flow bottlenecks, and logical errors in backend components
• Liaised with ifm team to diagnose and troubleshoot camera-related software issues
• Generated comprehensive client requirements and incorporated essential features into the .NET Framework vehicle configurer emulation software, ensuring compatibility with the integrated software stack
• Migrated software architecture documentation from Visio, leveraging industry-standard tools: Doxygen and Graphviz
• Standardized robust verification & validation protocols, including thorough code unit testing
• Employed Agile practices, utilizing Confluence and Jira, to ensure seamless collaboration and efficient project management throughout the development process
• Demonstrated proficiency in version control systems, such as Git, to effectively manage software development iterations and facilitate collaborative coding
• Leveraged CI/CD pipelines, Docker, and GitHub Actions to streamline the software development lifecycle, enabling continuous integration, testing, and deployment for enhanced productivity and quality assurance
Stealth Startup
Machine Learning Engineer
Apr 2023 - Sep 2023
Singapore, Singapore
• Provided technical consulting to a startup operating in stealth mode that specialized in developing and applying machine learning techniques.
• Collaborated closely with the startup's team to design and implement machine learning algorithms and models tailored to specific business challenges and objectives.
• Offered strategic guidance on data collection, preprocessing, feature engineering, and model selection to improve performance and accuracy.
• Participated in regular meetings and presentations to share insights and make data-driven recommendations for decision-making.
Carnegie Mellon University - School of Computer Science
Graduate Student Researcher
Sep 2022 - May 2024
Pittsburgh, Pennsylvania, United States
Biomedical Imaging Guidance Lab & Biorobotics Lab - The Robotics Institute
(Co-advised by Dr John Galeotti & Dr Howie Choset for my Master's Thesis)
Projects:
1. Heuristics-Guided AI-Powered Point-of-Care Ultrasound (PoCUS) Interpretability and Explainability:
• Selected by DARPA as one of five teams in the US for its sponsored PoCUS program.
• Collaborated with domain experts and UPMC clinicians on dataset preparation.
• Built an explainable PoCUS AI stack driven by clinician heuristics such as optical flow maps, pleural ROI selection, region masking, and difficulty scoring to tackle the challenges of limited training video clips.
• Designed two paradigms for incorporating heuristics and evaluated the significance of each: stacking heuristics as input channels, and using the CLIP model's contrastive network to align input frames and their heuristics in a common embedding space.
• Diagnosed pneumothorax in real-time with 88.9% accuracy, leveraging Temporal Shift Module (TSM) method for 2D CNN video classification.
• Leveraged Grad-CAM maps, occlusion sensitivity, and Visual Activation Layers for explainability.
• Ensured model deployability on an iPad, accounting for space and time constraints.
• Pending patent(s) and a publication.
2. Modelling Tissue Scanning Deformation and Needle Rolling for TRAuma Care In a Rucksack (TRACIR) Robot:
• Received sponsorship from the DoD.
• Architected a synthetic deformed-mesh generation pipeline to train a physics-informed PointNet++ cyclic conditional variational autoencoder that models 3D deformation point clouds.
• Attained a 4x precision boost in ultrasound-guided needle insertion by compensating for the modelled deformation, reducing the impact of scanning deformation and needle rolling in the TRACIR robot.
• Operated the surgical robot on pigs for data collection.
Carnegie Mellon University - School of Computer Science
Graduate Research Assistant
Jul 2022 - Feb 2023
Pittsburgh, Pennsylvania, United States
Xu Lab - Computational Biology Department (Advised by Dr Min Xu)
Project:
1. Contrastive Unsupervised Representation Learning for Cellular CryoET Particle Detection
• Obtained sponsorship from the NSF.
• Led 5 interns on the SHREC unsupervised CryoET particle detection task, using 3D volume contrastive representation learning with k-means clustering & cosine similarity heat maps and a novel input-pair generation scheme.
• Attained an AUC-ROC of 71.6% and an F1 score of 0.672.
University of Waterloo
Computer Vision Intern
Feb 2021 - May 2022
Waterloo, Ontario, Canada
Theoretical and Experimental Epistemology Lab - School of Optometry and Vision Science (Advised by Dr Vasudevan Lakshminarayanan) in association with Sankara Nethralaya Eye Hospital, Chennai, India (Advised by optometrist and researcher Dr Janarthanam Jothi Balaji)
Projects:
1. "FAZSeg: A New Software for Quantification of the Foveal Avascular Zone", published by Clinical Ophthalmology, Dove Medical Press [Lead Author]
• Developed open-source clinician-friendly app: 'FAZSeg' to estimate 15 OCT-A metrics, improving system accuracy by 2x in the prognosis of ophthalmic conditions.
• Achieved SOTA results: Superficial Layer (F1: 0.94, SSIM: 0.97) & Deep Layer (F1: 0.96, SSIM: 0.98).
• Organized double-blind clinical trials of FAZSeg on 93 normal subjects (30 emmetropes and 63 myopes).
2. "What is the Role of Magnification Correction in the Measurement of Macular Microvascular Dimensions in Emmetropic Eyes?", published by SPIE Photonics West'22, Ophthalmic Technologies XXXII [Lead Author]
• Researched the influence of the eye's axial length on the accuracy of OCT-A metrics.
• Identified the role of the Bennett correction factor in measuring macular microvascular dimensions in emmetropic eyes.
3. "Preliminary Report on Optical Coherence Tomography Angiography Biomarkers in Non Responders and Responders to Intravitreal Anti VEGF Injection for Diabetic Macular Oedema", published by Diagnostics, MDPI
• Investigated statistical significance of OCT-A biomarkers, predicting response to treatment of 96 DME patients.
4. "Measurement of Retinal Blood Vessel Fractal Dimensions", presented at 20th Dr EVM Scientific Session [Bachelor thesis]
• Quantified fractal dimension in retinal fundus & OCT en-face images and probed the influence of age & axial length
National University of Singapore
Undergraduate Research Assistant
Jan 2021 - Apr 2022
Singapore, Singapore
Medical Mechatronics Lab - Department of Biomedical Engineering - College of Design and Engineering (Advised by Dr Hongliang Ren) in partnership with Chinese University of Hong Kong, Shandong Qilu Hospital, and Imperial College London
Projects:
1. "Paced Curriculum Distillation with Prediction and Label Uncertainty for Image Segmentation", published by IJCARS
• Introduced a novel paced-curriculum distillation that integrates prediction and annotation boundary uncertainty to pace student model training, adapting image-guided robotic intervention segmentation tasks with a 2.5x latency reduction.
• Proved our method's robustness by studying the trained model's performance on inputs augmented with different types of corruptions and perturbations with varying severity.
• Validated the model's performance qualitatively and quantitatively on: MICCAI18 instrument segmentation challenge dataset (F1: 0.6644, IoU: 0.6237) and Breast Ultrasound dataset (F1: 0.789, IoU: 0.719).
2. "GraDeNAR: Graph-based DeNoising and Artifact Removal Network for Optical Coherence Tomograph", under review by IEEE Transactions on Image Processing [Lead Author]
• Designed a dual-net framework that enhances OCT scans by 3x at 9x speed, using a novel spatiotemporal graph architecture with Gaussian-constrained convolution kernels for denoising & artifact removal in medical images containing multiple artifacts (SOTA; PSNR: 32.47, SSIM: 94.89).
• Devised augmentation algorithms to imitate multiplicative speckle noise, hyperreflective, motion and shadow artifacts to transform clean OCT scans for training set preparation.
3. "CLEARNESS: Cross-scaLe tEmporAl gRaph NEtwork for Super-reSolutions", Oral Session - Bioimaging and Biosignals @ IUPESMWC'22
• Proposed a cross-scale neighbour-aggregative graph topology with temporal coherence regularization and Laplacian-constrained convolution kernels (posterior sharpening) for medical image super-resolution (SOTA; PSNR: 30.92, SSIM: 89.12 for 8x resolution boost).
Origin Health Pte. Ltd.
Deep Learning Engineer Intern
Aug 2020 - Dec 2020
Singapore, Singapore
Deep Learning and Data Team (Advised by Dr Sripad Krishna Devalla, Co-Founder & CTO), funded by Entrepreneur First.
Projects:
1. Harmonization of fetal ultrasound
• Experimented with and compared standard enhancement filters & architectures such as U-Net, Autoencoders, GANs and Perceptual Loss Networks for image harmonization in fetal screening.
2. Ultrasound Segmentation GUI
• Prototyped an in-house PyQt and Tkinter-based tissue delineation GUI.
Indian Institute of Technology, Madras
Undergraduate Research Intern
Mar 2020 - Jul 2020
Chennai, Tamil Nadu, India
Advanced Geometric Computing Lab (Advised by Dr Ramanathan Muthuganapathy)
Project:
1. "CADSketchNet - An Annotated Sketch dataset for 3D CAD Model Retrieval with Deep Neural Networks", published by Computers & Graphics, Elsevier, 3D Object Retrieval'21 - Journal Track
• Curated 'CADSketchNet' dataset of 801 hand-drawn (Dataset - A) & 58,696 computer-generated sketches (Dataset - B) from ESB & MCB datasets across 42 & 68 categories, respectively, ideal for developing search engines for 3D CAD models.
• Analysed numerous computer sketch generation methods for Dataset - B synthesis and performed quality checks on Dataset - A.
• Tested on five sketch-query-based 3D CAD Model Retrieval methodologies (Histogram of Oriented Gradients, Auto Encoders, Stacked Auto Encoder, 3D CNN, and Siamese Network) and benchmarked 'CADSketchNet' at 97.5% accuracy in 4e-5 sec retrieval time.
Robotics and Machine Intelligence
Vice President Research & Development
Aug 2019 - May 2022
Tiruchirappalli, Tamil Nadu, India
Technical Responsibilities (Projects):
1. Machine Assisted Rehabilitation of Knee Osteoarthritis [MARKO] (Advised by Dr Ezhilarasi Deenadayalan)
• Lead-authored my first publication: "Deep Learning Based Muscle Intent Classification in Continuous Passive Motion Machine for Knee Osteoarthritis Rehabilitation", published by IEEE MASCON'21.
• Fabricated a bio-inspired continuous passive motion machine for rehabilitation of patients with knee osteoarthritis, controlled and actuated by processed EMG and IMU signals from the thigh muscles (hamstrings and quadriceps), with intents classified by our proposed lightweight CNN.
• Tested the device on three healthy subjects, achieving a classification accuracy of 97.4% and a prediction rate of 1140 samples/sec.
• An extended version of this paper, titled "Muscle intent-based continuous passive motion machine in a gaming context using a lightweight CNN", was published in the International Journal of Intelligent Robotics and Applications, Springer.
2. Humanoid Robot System [HuRos]
• Constructed a 10 DoF life-size biped from scratch.
• Designed PCB schematics for the control unit using EasyEDA.
• Simulated static walking with linear inverted pendulum tracking on Simulink.
3. Augmented Reality (AR) Application with ArUco Markers
• Optimized AR 3D model rendering time over detected ArUco markers by 50%.
4. Mini projects
• Object detection, tracking and speed measurements in occluded environments.
• 2 DoF manipulator maze solver.
• Invisible cloak.
• Painting app replica.