Amitabha Roy

I am a founding engineer at Kumo. My primary area of interest is machine learning: I enjoy building machine learning systems to solve challenging problems at scale. At Kumo I am working on applying graph neural networks to modern data stacks.


amitabha dot roy at gmail dot com




Neural Image Captioning:

X-Stream (single machine graph processing):

Chaos (distributed graph processing):


Hybrid MPNN-PPR Graph Convolutional Network (filed)

Automatic Detection And Mitigation Of Denial Of Service Attacks

Hardware apparatuses and methods for memory performance monitoring

Community Service

PC member for EuroMLSys 2021.

PC member for USENIX 2020.

PC member for SYSTOR 2018.

PC member for EuroDW 2018.

ERC member for ASPLOS 2018.

PC member for WWW 2017, systems track.

Sponsorship co-chair for EuroSys 2017.

PC member for HPGP 2016.

PC member for EuroSys 2015.

Co-organized a Dagstuhl workshop on "Systems and Algorithms for Large Scale Graph Analytics".

Invited talk on the postdoctoral experience at EuroDW 2014.

I am a reviewer for Computing Reviews.

Publications [Google Scholar DBLP]

Siavash Samiei, Nasrin Baratalipour, Pranjul Yadav, Amitabha Roy, and Dake He, Addressing Stability in Classifier Explanations. BigData 2021

Subramanya R Dulloor, Amitabha Roy, Zheguang Zhao, Narayanan Sundaram, Nadathur Satish, Rajesh Sankaran, Jeff Jackson, Karsten Schwan.

Data Tiering in Heterogeneous Memory Systems. EuroSys 2016

Amitabha Roy, Laurent Bindschaedler, Jasmina Malicevic, Willy Zwaenepoel

Chaos: Scale-out Graph Processing from Secondary Storage. SOSP 2015

Jiaqing Du, Amitabha Roy, Calin Iorgulescu, Willy Zwaenepoel.

GentleRain: Cheap and Scalable Causal Consistency with Physical Clocks. SoCC 2014

Karthik Nilakant, Valentin Dalibard, Amitabha Roy, Eiko Yoneki.

PrefEdge: SSD Prefetcher for Large-Scale Graph Traversal. SYSTOR 2014

Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel.

X-Stream: Edge-centric Graph Processing using Streaming Partitions. SOSP 2013

Jiaqing Du, Sameh Elnikety, Amitabha Roy, Willy Zwaenepoel.

Orbe: Scalable Causal Consistency using Dependency Matrices and Physical Clocks. SoCC 2013

Amitabha Roy, Steven Hand, Tim Harris.

Weak Atomicity for the x86 Memory Consistency Model, Journal of Parallel and Distributed Computing, 72 (2012) [preprint]

Amitabha Roy, Steven Hand, Tim Harris.

Hybrid Binary Rewriting for Memory Access Instrumentation, VEE 2011

Amitabha Roy, Steven Hand, Tim Harris.

A Runtime System for Software Lock Elision, EuroSys 2009

Amitabha Roy, Stephan Zeisset, Charles J. Fleckenstein, John C. Huang.

Fast and Generalized Polynomial Time Memory Consistency Verification, CAV 2006

Amitabha Roy, K. Gopinath.

Improved Probabilistic Models for 802.11 Protocol Verification, CAV 2005

Short Papers/Early Ideas

Christopher Schmidt, Markus Dreseler, Berkin Akin, Amitabha Roy

A Case for Hardware Supported sub-Cacheline Accesses. DAMON 2018

Ehsan Totoni, Subramanya R. Dulloor, Amitabha Roy

A Case Against Tiny Tasks in Iterative Analytics. HotOS 2017

Jiaqing Du, Calin Iorgulescu, Amitabha Roy, Willy Zwaenepoel.

Closing the Performance Gap between Causal Consistency and Eventual Consistency, PaPEC 2014

Jasmina Malicevic, Amitabha Roy, Willy Zwaenepoel.

Scale-up Graph Processing in the Cloud: Challenges and Solutions, CloudDP 2014

Amitabha Roy, Timothy Jones.

ALLARM: Optimizing Sparse Directories for Thread-Local Data, DATE 2014

Eiko Yoneki, Amitabha Roy.

Scale-up Graph Processing: A Storage-centric View, GRADES 2013

Amitabha Roy, Steven Hand, Tim Harris.

Exploring the Limits of Disjoint Access Parallelism, HotPar 2009

Technical Reports

Amitabha Roy, Subramanya Dulloor. Cyclone: High Availability for Persistent Key Value Stores, arXiv 2017

Eiko Yoneki, Amitabha Roy, Derek Murray. Systems and Algorithms for Large-scale Graph Analytics. (Dagstuhl Seminar 14462).

Eiko Yoneki, Amitabha Roy. A Unified Graph Query Layer for Multiple Databases, UCAM-CL-TR 2012

Amitabha Roy. Memory Hierarchy Sensitive Graph Layout, arXiv 2012

Past Associations

[2019-2021] Software engineer at Google, Canada. I led a team focused on applying machine learning to abuse fighting in the Ads Ecosystem. In particular, I led the successful deployment of graph neural networks and explainable AI to the abuse fighting use-case.

[2017-2019] Software engineer at Google, US. I led a small team that built machine learning models to protect Google's ads ecosystem from malicious advertisers. Before that, I led a team that designed and built the first machine learning driven DDoS protection solution for Google Cloud.

[2015-2017] Research scientist at Intel Labs. My work at Intel Labs was in the area of non-volatile memory. I described a lot of the work and other interesting facets of developing software for non-volatile memory at QCon 2017.

[2012-2015] Post-doc at LABOS in EPFL, where I was primarily responsible for the X-Stream and Chaos graph processing systems and contributed to the Orbe geo-replicated key value storage system.

[2011-2012] Post-doc in the Computer Architecture Group at the University of Cambridge Computer Laboratory, where I worked with Timothy Jones on a technique to adapt directory controllers for thread-private data. At the same time, I worked with Eiko Yoneki in the Systems Research Group on ways to mitigate I/O stalls when processing large graphs, a collaboration that later became the Graphcam project.

[2011] Short stint at Acunu where I worked on analysing the performance of their filesystem.

[2007-2011] PhD student in the Networks and Operating Systems (NetOS) research group at the Computer Laboratory, University of Cambridge. I was supervised by Steven Hand and Tim Harris. My PhD thesis, titled "Software lock elision for x86 machine code", argues for separation of mechanism and policy in the context of software transactional memory. It presents the design and implementation of a system that can be used to automatically elide legacy locks in x86 machine code. Transactional memory therefore becomes an optional mechanism for synchronization, while legacy locks, or atomic blocks implemented using legacy locks, specify the synchronization policy. A talk I've given describes many of the basic ideas.

[2004-2007] Design Engineer at Intel where I worked for some time on memory consistency verification and later did performance modeling and analysis of CPUs. Among other things, I designed a parallel algorithm to verify memory consistency test results using graphs, which is widely used in post-silicon validation at Intel.

[2002-2004] Master's degree student in the Computer Science department at IISc Bangalore. While there, I worked in the Computer Architecture and Systems Laboratory under Prof. K. Gopinath on probabilistic timed automata for 802.11 networks.

[1998-2002] Undergraduate engineering student at Jadavpur University in the Department of Computer Science and Engineering.