Medha Atre, Ph.D.
email: firstname.lastname at gmail.com (use my first and last name)
BitMat
BitMat is a system originally developed as part of my Ph.D. thesis. It indexes an RDF graph using compressed bit-vectors and processes SPARQL Basic Graph Pattern queries using a novel two-phase query processing algorithm. The algorithm gives tighter upper bounds on memory consumption than conventional join query processors. See the source code (Git and Sourceforge) of the project.
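As a rough illustration of the bit-vector indexing idea, and not the BitMat implementation itself, the sketch below dictionary-encodes RDF terms, keeps one uncompressed bit-vector of subjects per (predicate, object) pair, and answers a star-shaped pattern by first pruning candidates with bitwise AND and then decoding the survivors. The names `ToyBitIndex` and `star_query` and the sample triples are hypothetical, the prune-then-decode flow is a simplifying assumption, and compression and general Basic Graph Patterns are deliberately left out.

```python
from collections import defaultdict


class ToyBitIndex:
    """Toy, uncompressed bit-vector index over RDF triples (hypothetical sketch)."""

    def __init__(self, triples):
        self.ids = {}      # term -> integer ID (dictionary encoding)
        self.terms = []    # integer ID -> term
        # (predicate, object) -> Python int used as a bit-vector of subject IDs
        self.po = defaultdict(int)
        for s, p, o in triples:
            self.po[(p, o)] |= 1 << self._id(s)
            self._id(o)  # objects share the same ID space as subjects

    def _id(self, term):
        """Return the ID for a term, assigning the next free ID if it is new."""
        if term not in self.ids:
            self.ids[term] = len(self.terms)
            self.terms.append(term)
        return self.ids[term]

    def decode(self, bits):
        """Turn a bit-vector back into the list of terms whose bits are set."""
        return [self.terms[i] for i in range(bits.bit_length()) if (bits >> i) & 1]


def star_query(index, patterns):
    """Answer patterns of the form (?x, predicate, object) that share ?x.

    Step 1 prunes candidate subjects by ANDing bit-vectors;
    step 2 decodes only the surviving candidates.
    """
    candidates = None
    for p, o in patterns:
        bits = index.po.get((p, o), 0)
        candidates = bits if candidates is None else candidates & bits
    return index.decode(candidates or 0)


if __name__ == "__main__":
    idx = ToyBitIndex([
        ("alice", "type", "Person"),
        ("alice", "worksAt", "RPI"),
        ("bob", "type", "Person"),
        ("bob", "worksAt", "MIT"),
    ])
    # ?x type Person . ?x worksAt RPI
    print(star_query(idx, [("type", "Person"), ("worksAt", "RPI")]))  # ['alice']
```

Because pruning works on whole bit-vectors, the candidate set can only shrink before any result is materialized, which is the intuition behind bounding memory use ahead of result generation.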
Relevant publications:
Medha Atre: The Case of SPARQL UNION, FILTER and DISTINCT, TheWebConf (The World Wide Web conference) 2022, PDF (acceptance rate: 17.7%).
Medha Atre: Algorithms and Analysis for the SPARQL Constructs (PDF).
G. Singh*, D. Upadhyay*, Medha Atre: Efficient RDF Dictionaries with B+ trees, CoDS-COMAD 2018 (PDF).
Medha Atre: For the DISTINCT clause of SPARQL queries, WWW Posters Track, 2016 (PDF).
Medha Atre: Left Bit Right: For SPARQL Join Queries with OPTIONAL Patterns (Left-outer-joins), SIGMOD 2015 (PDF).
Medha Atre, V. Chaoji, M. J. Zaki, J. A. Hendler: Matrix "Bit"loaded: A Scalable Lightweight Join Query Processor for RDF Data, WWW 2010 (PDF) (Presentation) (BitMat source code).
G. T. Williams, J. Weaver, Medha Atre, J. A. Hendler: Scalable Reduction of Large Datasets to Interesting Subsets, Journal of Web Semantics (Special Issue: Science, Services and Agents on the World Wide Web), 2010 (winner of the 2009 Billion Triple Challenge, ISWC, October 2009) (Paper)
Medha Atre, J. A. Hendler: BitMat: A Main Memory Bit-matrix of RDF Triples, in SSWS workshop at ISWC 2009 (PDF).
Medha Atre, J. Srinivasan, J. A. Hendler: BitMat: A Main-memory Bit Matrix of RDF Triples for Conjunctive Triple Pattern Queries, ISWC Poster and Demo track, October 2008 (PDF) (first runner up among 85 poster/demos).
Graph Path Queries
With the advent of the web, graphs have become richer: edge labels now represent the type of relationship between the two nodes connected by an edge. Exploring paths in graphs is a well-studied problem for data such as XML, but for general-purpose graphs it is a hard problem because the number of possible paths is exponential. For example, the RDF graph of DBLP data, with 13 million edges and 5 million nodes, has more than 10^25 distinct paths. In our work, we have focused on a different type of path pattern and on constrained reachability queries.
Relevant publications:
Medha Atre, V. Chaoji, M. J. Zaki: BitPath -- Label Order Constrained Reachability Queries over Large Graphs, (PDF).
Open-source contributions:
Source Code for BitMat (Ph.D. thesis work)
Reverse engineering MSN Messenger's old file transfer protocol (Look up right at the end of the page)
Neon WebDAV client library updates (Search for my name)