Education

1. Integrated MS+PhD in Computer Science:

Sept 2011 - April 2017

GPA: 4.0/4.0

Coursework: Data Mining, Design and Analysis of Algorithms, Web Search, Mining, and Integration, Special Topics in Advanced Database Systems, Distributed Computing, Data Analysis and Modeling, Advanced Data Mining, Data Exploration, Advanced Combinatorial Models and Algorithms, Big Data with Apache Spark (Berkeley edX)

Projects:

  • Distributed Graph Mining: We used map/reduce to iteratively find concepts (substructures) of varying sizes in a single large graph, then analyzed which of these substructures give the best compression of the graph under the minimum description length principle. This resulted in two publications, at BDA 2013 and DaWaK 2015. An optimization over the existing algorithms, which reduces mining response time and storage cost by reducing duplicate generation, was published in TKDE 2018. A cost analysis of the same is under review for a special issue of the Journal of Information Sciences, 2018.
  • Graph Query System: We developed a graph search system over the DBLP authorship and citation graphs to find the top-k potential collaborators for a group of people. When no exact match existed, inexact matches satisfying a goodness metric were returned. Querying large distributed graphs using a parallel paradigm is an ongoing part of this project, and partitioning strategies for answering single queries as well as streams of queries are also underway. The system was developed as a class project. A cost analysis of partitioned graph querying has been accepted as a full paper at DaWaK 2018.
  • Classifying Wikipedia Articles: We developed a system to assign infoboxes to Wikipedia pages that lack one. We used Wikipedia features (words, categories, and entities) to create a 337-class setup. SVM classifiers were used to assign one of these 337 templates to each Wikipedia page without an infobox. The work was published at CIKM 2012.
  • Graph Catalog Generator: In this project, the notion of a graph catalog (similar to a DBMS catalog) was developed and used to discover query plans for faster query answering and evaluation. Experimental results show that query plans using the graph catalog substantially outperform traditional graph querying techniques. This work was published at DaWaK 2016.

Find a copy of my resume here

2. Bachelor's in Computer Science and Engineering

Jadavpur University, Kolkata

2005-2009

CGPA: 8.67 out of 10

Relevant courses: Data Structures, Design and Analysis of Algorithms, Operating Systems, Image Processing, Artificial Intelligence, Abstract Mathematics, Bio-Informatics

3. Higher Secondary Examination (West Bengal Board)

South Point High School

2003-2005

National Scholarship holder. Scored 93.3%.

4. Secondary or Madhyamik Examination (West Bengal Board)

South Point High School

2001-2003

National Scholarship holder. Scored 92.25%.