Subrata Mitra
I am a Senior Research Scientist at Adobe Research.
Email: <first_name>.<last_name>@adobe.com
Bio: I am a Senior Research Scientist at Adobe Research. I obtained my PhD in Computer Engineering from Purdue University, West Lafayette. Prior to Adobe, I worked at Intel and Synopsys on new product development and interned at Microsoft Research, AT&T Research and Lawrence Livermore National Labs. I have co-authored 35+ publications in top systems and ML venues and am a co-inventor on 25+ filed US patents.
I was recently featured in a spotlight blog from Adobe Research:
https://research.adobe.com/news/researcher-spotlight-subrata-mitra-uses-machine-learning-to-make-big-data-systems-more-efficient-and-reliable
Note to prospective PhD interns: I am always looking for PhD students who are interested in interning at Adobe Research, Bangalore, on "Systems for ML" and "Optimizing Big-Data processing using ML". If you are a passionate PhD student, interested in the above topics, and have co-authored at least one paper in a top Systems (SIGMOD/VLDB/NSDI/OSDI/ATC, etc.) or ML/AI/Vision (ICML/NeurIPS/AAAI/WWW/CVPR/ICCV, etc.) conference, please drop me an email.
Research Interests: Efficient ML, ML for Systems, Cloud Computing, Approximate Computing.
I explore how machine-learning techniques and different forms of approximation can be used to build large-scale, cost-efficient systems for big-data processing and machine learning.
NEWS
Paper accepted at PAKDD 2024 (Oral): "ScaleViz: Scaling Visualization Recommendation Models on Large Data" [Adobe Research]
Paper accepted at NSDI 2024: "Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models" [Adobe Research]
Paper accepted at SIGMOD 2024: "R2D2: Reducing Redundancy and Duplication in Data Lakes" [Adobe Research]
Paper accepted at ICML 2023: "Flash: Concept Drift Adaptation in Federated Learning" [Adobe Research + UMass]
Paper accepted at VLDB 2023: "SEIDEN: Revisiting Query Processing in Video Database Systems" [GaTech + Adobe Research]
Paper accepted at ICDCS 2023: "STASH: A comprehensive stall-centric characterization of public cloud VMs for distributed deep learning" [Penn State + Adobe Research]
Paper accepted at SIGMOD 2023 (Demo): "Fast Natural Language Based Data Exploration with Sample" [Adobe Research]
Paper accepted at AAAI 2023: "Reinforced Approximate Exploratory Data Analysis" [Adobe Research]
Paper accepted at NeurIPS 2022: "Root Cause Analysis of Failures in Microservices through Causal Discovery" [Adobe Research + Purdue University]
PC/reviewing:
2024: USENIX Annual Technical Conference (ATC), ACM Middleware
2023: USENIX Annual Technical Conference (ATC), AAAI
2022: USENIX Annual Technical Conference (ATC), ACM Middleware, AAAI, COMSNETS
2021: USENIX Annual Technical Conference (ATC), AAAI, ACM Symposium on Edge Computing, ACM Multimedia Asia
Publications (reverse chronological order):
A* conferences (according to CSRankings.org) are shown in green.
NSDI-2024 "Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models"
PAKDD 2024 (Oral): "ScaleViz: Scaling Visualization Recommendation Models on Large Data"
SIGMOD-2024 "R2D2: Reducing Redundancy and Duplication in Data Lakes"
VLDB-2023 "SEIDEN: Revisiting Query Processing in Video Database Systems"
ICDCS-2023 "Stash: A Comprehensive Stall-Centric Characterization of Public Cloud VMs for Distributed Deep Learning"
ICML-2023 "Flash: Concept Drift Adaptation in Federated Learning"
ACL (Findings)-2023 "Federated Domain Adaptation for Named Entity Recognition via Distilling with Heterogeneous Tag Sets"
[SIGMOD-2023] (Demo) "Fast Natural Language Based Data Exploration with Sample". In: SIGMOD 2023 Demo Track #ML, #Approximate-computing
[AAAI-2023] "Reinforced Approximate Exploratory Data Analysis". In: 37th AAAI Conference on Artificial Intelligence (AAAI) 2023 (Acceptance rate: 19%) #ML-for-systems, #Approximate-computing
[NeurIPS-2022] "Root Cause Analysis of Failures in Microservices through Causal Discovery" #ML-for-systems
[SIGMOD-2022] (Demo) "Efficient Insights Discovery through Conditional Generative Model based Query Approximation". In: SIGMOD 2022 Demo Track #ML-for-systems, #Approximate-computing
[ACL-2022] "Few-Shot Class-Incremental Learning for Named Entity Recognition". #ML
[AAAI-2022] "Conditional Generative Model based Predicate-Aware Query Approximation". In: 36th AAAI Conference on Artificial Intelligence (AAAI) 2022 (Acceptance rate: 15%) #ML-for-systems, #Approximate-computing
[UCC 2021] "Scheduling ML Training on Unreliable Spot Instances". In: 14th IEEE/ACM International Conference on Utility and Cloud Computing (DML-ICC Workshop), 2021 #Cloud-computing
[ACM SenSys 2021 + ACM TOSN] "ApproxNet: Content and Contention-Aware Video Object Classification System for Embedded Clients" In: ACM Transactions on Sensor Networks (TOSN), pp. 1-27, 2021 #Approximate-computing
[USENIX ATC 2021] "SONIC: Application-aware data passing for chained serverless applications" In: USENIX Annual Technical Conference (ATC), 2021 #Cloud-computing
[AAAI 2021] "Scheduling of Time-Varying Workloads using Reinforcement Learning" In: 35th AAAI Conference on Artificial Intelligence (AAAI) 2021 (Acceptance rate: 21%) #ML-for-systems
[WSDM 2021] "Data-Sharing Economy: Value-Addition from Data meets Privacy" (Demo Paper) In: Proceedings of 14th ACM Conference on Web Search and Data Mining (WSDM) 2021
[ACM SenSys 2020] "ApproxDet: content and contention-aware approximate object detection for mobiles" In: Proceedings of the 18th ACM Conference on Embedded Networked Sensor Systems (SenSys) 2020 (Acceptance rate: 20.7%) #Approximate-computing
[USENIX ATC 2020] "OPTIMUSCLOUD: Heterogeneous Configuration Optimization for Distributed Databases in the Cloud" In: USENIX Annual Technical Conference (ATC), 2020 (Acceptance rate: 18.7%) #Cloud-computing
[ACM/SIGOPS APSys 2019] "DeepPlace: Learning to Place Applications in Multi-Tenant Clusters" In: Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), 2019. [Paper] #ML-for-systems
[USENIX HotEdge 2019] "Edge-based Transcoding for Adaptive Live Video Streaming" In USENIX Workshop on Hot Topics in Edge Computing (HotEdge), 2019. [Paper]
[ USENIX ATC 2019 ] "SOPHIA: Online Reconfiguration of Clustered NoSQL Databases for Time-Varying Workload" In: USENIX Annual Technical Conference (ATC), 2019 (Acceptance rate: 19.9%) . [Paper]
[ Middleware 2018 ] "Pythia: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads" In: Middleware 18: The 2018 ACM/IFIP/USENIX International Middleware Conference (Middleware), pp. 1-14, Dec. 10-14, 2018, Rennes, France. ( Acceptance rate: 23.2% ) . [Paper] #Cloud-computing
[ USENIX ATC 2018 ] "VideoChef: Efficient Approximation for Streaming Video Processing Pipelines" In: USENIX Annual Technical Conference (ATC), 2018 ( Acceptance rate: 20.1% ). [Paper] #Approximate-computing
[ Middleware 2017 ] "Rafiki: a middleware for parameter tuning of NoSQL datastores for dynamic metagenomics workloads" In: Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference (Middleware), December 2017 ( Acceptance rate: 23.5% ) #Cloud-computing
[ CGO 2017 ] "Phase-Aware Optimization in Approximate Computing" (with M. K.Gupta, S. Misailovic, S. Bagchi) In: International Symposium on Code Generation and Optimization (CGO), February 2017 ( Acceptance rate: 22.8% ) #Approximate-computing
[ EuroSys 2016 ] "Partial-parallel-repair (PPR): a distributed technique for repairing erasure coded storage" (with AT&T Research) [ pdf ] In: European Conference on Computer Systems (EuroSys), April 2016 (Acceptance rate: 21.1%)
[ SRDS 2016 ] "Sirius: Neural network based probabilistic assertions for detecting silent data corruption in parallel programs" (with T. Thomas, A. J. Bhattad, S. Bagchi) In: International Symposium on Reliable Distributed Systems (SRDS), October 2016 ( Acceptance rate: 32% )
[ ISSRE 2016 ] “A Study of Failures in Community Clusters: The Case of Conte” (with S. Javagal, A. K. Maji, T. Gamblin, A. Moody, S. Harrell, S. Bagchi) In: International Symposium on Software Reliability Engineering (ISSRE), October 2016
[ DSN 2016 (Fast Abstract) ] "Cluster Workload Analytics Revisited" (with Purdue Research Computing and LLNL) In: International Conference on Dependable Systems and Networks (DSN), June 2016
[ PACT 2015 ] “Dealing with the Unknown: Resilience to Prediction Errors” (with G. Bronevetsky, S. Javagal, S. Bagchi) [ pdf ] In: International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2015, (Acceptance rate: 21.2%)
[ ICAC 2015 ] “ICE: An Integrated Configuration Engine for Interference Mitigation in Cloud Services” (with A. Maji, S. Bagchi) [ pdf ] In: International Conference on Autonomic Computing (ICAC), July 2015, (Acceptance rate: 20.3%) #Cloud-computing
[ WCNC 2015 ] “VIDalizer: An Energy Efficient Video Streamer” (with A. Raha, V. Raghunathan, S. Rao) [ pdf ] In: IEEE Wireless Communications and Networking Conference (WCNC), March 2015
[ Middleware 2014 ] "Mitigating Interference in Cloud Services by Middleware Reconfiguration" (with A. Maji, B. Zhou, S. Bagchi, A. Verma) [ pdf ] In: International Middleware Conference (Middleware), December 2014, (Acceptance rate: 18.8%) #Cloud-computing
[ PLDI 2014 ] "Accurate application progress analysis for large-scale parallel debugging" (with I. Laguna, D. H. Ahn, S. Bagchi, M. Schulz, T. Gamblin) [ pdf ] In: Programming Language Design and Implementation (PLDI), June 2014, (Acceptance rate: 18.1%)
[ SC 2013 ] "Scalable Parallel Debugging via Loop-Aware Progress Dependence Analysis" (with I. Laguna, D. H. Ahn, M. Schulz, T. Gamblin, S. Bagchi) [ pdf ] In: Supercomputing Conference (SC), November 2013
[ SRDS 2013 ] "Automatic Problem Localization via Multi-dimensional Metric Profiling" (with I. Laguna, F. Arshad, N. Theera-Ampornpunt, Z. Zhu, S. Bagchi, S. P. Midkiff, M. Kistler, A. Gheith) [ pdf ] In: International Symposium on Reliable Distributed Systems (SRDS), October 2013, (Acceptance rate: 32.8%)
News Features:
[2018] We successfully deployed a scalable, diverse and fair recommendation engine for Behance (an Adobe social network product for creators), which hosts several million creative projects and serves several million creative professionals and users. [Links to news coverage: link1, link2, link3]
[2016] Our research (EuroSys'16) on distributed storage with AT&T Research received the VURI award from AT&T and was featured in a Purdue spotlight [ Link ]
[2015] Purdue news features our research on cluster workload analytics! [ Link ]
[2014] Our work is highlighted by LLNL Science & Technology Review magazine with the title “Supercomputing Tools Speed Simulations”. [ Link ]
Patents (Not updated):
Parallel partial repair of storage
Integrated configuration engine for interference mitigation in cloud computing
Self-learning Scheduler for Application orchestration on shared compute cluster
Tenant-Side Detection, Classification, and Mitigation of Noisy-Neighbor-Induced Performance Degradation
Cooperative Platform for Generating, Securing, and Verifying Device Graphs and Contributions to Device Graphs
I am really fortunate to get to closely mentor these awesome interns, graduate students and research associates.
Nikhil Sheoran (Research Associate at Adobe) => Graduate student at UIUC => Databricks
Aashaka Shah (undergrad at IIT-Roorkee) => Adobe Research Intern => PhD student at UT Austin => Microsoft Research
Shanka Subhra Mondal (undergrad at IIT-Kharagpur) => Adobe Research Intern => PhD student at Princeton
Pradeep Dogga (undergrad at IIT-Kharagpur) => Adobe Research Intern => PhD student at UCLA
Ran Xu (PhD student at Purdue University, advised by Prof. Saurabh Bagchi) => Adobe Research Intern => NVIDIA
Ashraf Mahgoub (PhD student at Purdue University, advised by Prof. Saurabh Bagchi and Prof. Somali Chaterji)
Piyush Bagad (undergrad at IIT-Kanpur) => Adobe Research Intern => Wadhwani AI => University of Oxford
Sheng Yang (PhD student at University of Maryland, College Park, advised by Prof. Samir Khuller) => Adobe Research Intern
Ayush Chauhan (undergrad at IIT-Roorkee) => Adobe Research Intern => Research Associate, Adobe Research => Microsoft