Sanjay Das
sanjay@sanjaydas.com | 732-443-0838
Long Branch, NJ, USA | US Citizen
Blog | LinkedIn | GitHub | Stack Overflow
Dedicated, results-driven engineer and leader with over a decade of experience architecting and implementing robust data platforms and AI/ML solutions. Skilled in leading cross-functional teams to deliver high-quality results on deadline. Adept at applying current technologies for scalability, performance, and reliability.
Education: M.Sc. (Tech) Computer Science, First Class
Birla Institute of Technology and Science (BITS), Pilani, India
Technical Skills:
Programming Languages: Java, Scala, Python, C++
Big Data: Spark, Kafka, Hadoop, ClickHouse, Databricks, Dremio, Airflow
Cloud Platforms: AWS, Microservices, Kubernetes
Databases: Oracle, MySQL, Elasticsearch, Vector Databases
AI/ML: LLMs, RAG, Vector Embeddings, Fine-tuning, Hugging Face, LangGraph
Other: AdTech, FinTech, HealthTech
Work Experience:
Principal Engineer, TD Securities, New York, NY (May 2025 - Present)
Optimized GenAI app data pipelines for inference-time requirements, ensuring efficient data delivery and processing.
Evaluated MCP servers from multiple vendors through proof-of-concept testing to determine the best-fit solution.
Designed and implemented authentication and authorization solutions with Unity Catalog to secure access to sensitive data.
Developed versatile data delivery solutions, supporting multiple formats (Delta, Iceberg, Parquet) and channels (Databricks, Dremio, etc.) for diverse consumer needs.
Director / Staff Engineer - Data Engineering, AdMarketplace, New York, NY (October 2020 - November 2024)
GenAI Backend:
Evaluated and fine-tuned vector embedding models
Created data ingestion pipelines for an Apache Lucene vector database
Developed microservices for embedding generation
Implemented RAG pipelines with KNN-search capabilities
Designed, built, and maintained core tools and data products.
Improved and optimized data pipelines for scalability, performance, and reliability.
Led the migration from a legacy data warehouse to a modern data lake, warehouse, and data mart.
Developed proof-of-concept and reference implementations for components and pipelines.
Consultant, Cigna/CVS-Health, New York, NY (Remote) (October 2019 - September 2020)
Developed, deployed, and monitored a Big Data platform for internal/external access to Med/Rx Claims.
Handled high-volume data transactions and resolved throughput/latency issues.
Mentored junior engineers and conducted code reviews.
Principal Software Engineer, ETrade, New Jersey/New York (April 2017 - October 2019)
Engineered a Big Data platform for internal/external access to various data sets.
Managed near-live market data and ingested event streams from external vendors.
Developed Kafka Streams (KStream) modules in Java and mentored junior engineers.
Earlier Roles:
Consultant, New Jersey/New York (June 2014 - January 2017)
Designed and developed solutions for ingesting and processing massive batch data for Citibank.
Managed and maintained consumer-facing Android/iOS apps and responsive web front-end for esPronto.
Cofounder/CTO, New Jersey/India (April 2009 - January 2014)
Architected and developed an online stock trading and cloud-based algorithmic trading system that could spawn an Amazon EC2 container with private access to a market data cloud, a virtual exchange, and a suite of algorithms.
Software Engineer: Archeus Capital, New York, NY; Morgan Stanley, New York, NY; Wit SoundView (start-up broker), New York, NY; Goldman Sachs, Princeton, NJ