Major Focus: Data Engineering (Algorithms & Systems), Machine Learning, Computer Vision, NLP
Related Focus: Distributed Systems, Scalable System Design, Software Engineering (Backend), Databases (SQL, NoSQL), Graph Algorithms & Databases, Approximate Algorithms, Information Retrieval, AI Fairness
Mini Focus: DevOps, Cloud, Optimization, Visualization, Management (Organizational, Project, Leadership)
Dynamic, independent, self-motivated, and adaptable engineering professional with demonstrated skills in building high-quality Artificial Intelligence, Data Science, and Analytics products and solutions. Extensive industry experience in engineering scalable Machine and Deep Learning, Natural Language Processing, and Computer Vision-based products and solutions. Advanced exposure to software and cloud engineering. Expert in extracting deep insights from messy, complicated data and in building sophisticated data engineering pipelines that support end-to-end automation and self-serve analytics. Highly skilled in project management, communication, pair programming, and working with distributed cross-functional teams. A demonstrated track record of surpassing challenging expectations and delivering work of the highest professional quality makes me a competitive applicant for a role in engineering teams.
My research interests are in engineering advanced intelligent solutions (using Machine/Deep Learning, Natural Language Processing, Computer Vision, and Reinforcement Learning) geared towards large-scale adoption and usage. I am also proficient in Optimization Programming, Predictive and Prescriptive Analytics, and Security Engineering.
Mission statements:
Move things at light speed, even when organizations move at a snail's pace.
Work with totally abstract ideas and shape them into reality.
Drive crisp, clear, and candid communication at all times.
Fail, but fail fast. Never hesitate to accept failures and learn to correct shortcomings.
Build deep expertise in the industry, organization, and people.
AI Tools: TensorFlow, PyTorch, Keras, Caffe, CNTK, MXNet, LightGBM, XGBoost, Scikit-Image
Big Data & Analytics: Apache (HDFS, MapReduce, Beam, Spark, Kafka, Sqoop, Zeppelin, Solr, Pig, Flink, Storm, Kinesis, Zookeeper, Ranger, Atlas, Ambari), Elastic (Search, Kibana, Logstash), Pentaho, Snowflake
Cloud Platforms: Advanced (AWS, Azure, Databricks), Intermediate (Google)
DevOps Stack: Docker, Kubernetes, CircleCI, Spinnaker, Jenkins, GitLab CI, Git, Dependency Management (Maven, Gradle, Ant), Schedulers (Airflow, Azkaban, SSIS, Luigi)
Software Languages: Scala, Java, Python, C++, Go, MATLAB, Haskell, Perl, Ruby
Databases: Relational (Postgres, Redshift, Teradata, Hive, Kudu, Microsoft SQL Server, Oracle, IBM DB2, MySQL), Distributed (Presto, Druid), Document (MongoDB, CouchDB, Cosmos DB), Wide Column (HBase, Cassandra), Graph (Apache Giraph, Neo4j, RedisGraph), Key-Value (Redis)
Optimization Programming: CVXOPT, CVXPY, IBM CPLEX, Gurobi
General: OpenCV, CUDA, MPI, H5Py, Valgrind, LLVM Sanitizers, Google Dataflow, Scalding, Mockito, NumPy, SciPy, Scikit-Learn, Akka, PIL/Pillow, Mahout, Verilog, VHDL, GraphQL, SSIS, SSAS, SSRS, MDX, Spring Framework (MVC, Boot, Data, Mobile, Batch), Scrum (Agile), Kanban
Open Source Contributions: Apache Camel, Apache Mahout, Apache Dubbo, Apache PredictionIO, Databricks Delta Lake
Drove platform engineering innovations for a decentralized Intelligent Pricing Hierarchy platform with state-of-the-art high-frequency time series and recommendation models.
Supported production, both independently and in teams, for critical AI models deployed across the globe on hybrid cloud.
Architected and engineered several key modules in the in-house chaos automation delivery platform, capable of running 100,000 concurrent large-scale experiments with ultra-low latency.
Actively collaborated and drove meetings to formulate novel use cases and new data platform initiatives across geographies and business units.
Contributed to full-stack architecture and engineering of deep anomaly segmentation and detection algorithms. Led engineering and adoption of the associated data engineering pipelines and self-serve dashboarding solutions.
Led architecture and production of early threat modeling solutions for the in-house ingestion analytics platforms. Contributed to product experimentation and metrics development. Led architecture and production of intelligent inventory management solutions, and drove key experimentation efforts in engineering KPIs and metrics. Independently architected and supported pre-production of automated asset model collations and metrics.
Contributed to architecture and production of real-time deep NLP solutions, implementing a novel, production-grade VADER algorithm and Multi-Attention Network-based aspect-level sentiment modeling.
Actively collaborated in use-case formation and enrichment, data modeling, and data acquisition across geographies and business verticals. Strong proficiency in time-critical product engineering deliveries. Awarded a promotion and recommendations for driving consistent excellence in engineering.
Drove architecture development efforts across several units in hybrid (cloud, on-premises) data science and engineering, analytics (predictive, prescriptive), DevOps, and full-stack software engineering. Extensive production experience in owning four end-to-end critical DS engineering solutions as a full-stack Data Scientist.
Led architecture and engineering of a novel distributed NLP system for smart-home, and of a scheduling and delivery big-data platform supporting storage, analytics, and Data Science for media services. Actively contributed to software engineering, cloud architectures (including DevOps, Networking, Security), and novel voice-based user interfaces.
Successfully led the production of three full-stack intelligent solutions and in-production experiments. Assisted as product owner and subject-matter expert in stakeholder management and novel use-case modeling across teams, business verticals, and geographies.
Collaborated in new architecture development and characteristics studies for new products and in-development utilities. Led development of novel product metrics for commercial smart-home products.
Led architecture and engineering of an NLP-based summarization platform (abstractive, extractive) for text and audio data, and an instance and semantic segmentation-based auto-tagging utility for video content.
Led architecture and engineering of StreamFlux, a novel data platform and key driver of data engineering, data science utilities, and full-stack solutions: data-quality, data warehousing, disaster recovery, job scheduler, auto-feature.
Co-led production engineering of mission-critical solution deliveries and post-production engineering support for early identification and resolution of critical bugs and metrics. Engineered novel product metrics for new feature development and optimization with both open-source and commercial tools in the Apache, Snowflake, and Databricks ecosystems.
Actively collaborated with business teams to drive sales engineering across the Americas and Europe. Frequently recognized as a star contributor on the development team and awarded several additional responsibilities and promotions. Gained broad exposure to full-stack data science product engineering and management.
In collaboration with Prof. Hui Guan.
Novel research area at the intersection of Systems for Machine Learning and Reinforcement Learning.
Preparing to submit to prestigious ICML 2021.
In collaboration with Oracle Labs, Massachusetts.
Accepted to prestigious ACM WSDM 2021 (Acceptance Rate: <10%).
In collaboration with the Federal Bureau of Investigation (FBI) and UMass Cybersecurity Institute.
Advisor: Prof. Brian Levine
Advisor: Prof. Subhransu Maji (in COMPSCI 670)
Advisor: Prof. Erik Learned-Miller (in COMPSCI 682)
Coursework: Big Data Systems & Algorithms, Edge-Computing, Hardware Accelerators, Information Security, ML Systems
Research Collaborations: FBI (696E), Oracle Labs (696DS), Independent Research (682, 670, 590U, 692S, 692M, 697LS).
Concentration: Distributed Systems, Wireless Computing and Internet of Things
Relevant Coursework: Core Computer Sciences (Electives: Computer Networks, Software Engineering, HCI), Core Electronics (Electives: Embedded Systems, VLSI Design, Wireless Communication, Advanced Hardware Design).
Predictive Analytics & Statistical Modelling-based solutions with R, SAS, Tableau, Hive, Teradata
Spring Framework, Java EE, Play, Hibernate, AJAX, jQuery, JSP, IBM DB2, AngularJS, NodeJS, and IBM Bluemix
A/B Testing, SEO, SEM, Analytics [Web/Mobile], Ad Platform Management [Facebook Ad Manager, Google Analytics & Google AdWords]
LAMP and MEAN Stack, CodeIgniter, AJAX, JavaScript, Kotlin, Swift, and Java.