2021-2022 PRojects


Project Title: Making Big Data Small: Sketching Graph Distances


Professor: Hung Le


Lab/Research Group: Theory Group

Members of the Theory Group use mathematical techniques to study problems throughout computer science. Our work is extremely diverse -- it includes graph and network algorithms, randomized approximation algorithms, streaming algorithms, combinatorial optimization, computational geometry, dynamic algorithms and complexity, model checking and static analysis, database theory, descriptive complexity, parallel algorithms, and computational complexity theory. Members of the Theory Group wear other hats as well and collaborate throughout the department and the world beyond. Every week we all get together for a research seminar, featuring either one of our own professors or Ph.D. students, or a visitor from another institution. You can find more about the group and the weekly seminars at https://groups.cs.umass.edu/theory/



Graphs are mathematical objects used to model entities, called vertices, and the relationships between entities called edges. Social networks are examples of a graph: vertices are people, and edges indicate friendships. In modern applications, graphs are massive: Facebook has billions of users, the Internet has billions of websites linked to each other. A principled way to deal with a massive graph is to remove edges/vertices of the graph while (approximately) preserving some important properties. The output obtained after removing edges/vertices of the input graph is called a graph sketch. The idea is that the analysis can be applied to the sketch, which is smaller and has the same property as the input graph, instead of the massive input graph.

In this project, we will look at algorithmic techniques to obtain a graph sketch while (approximately) preserving distances between every pair of vertices of the graph up to an error factor. There are often trade-offs between the error factor and the size of the sketch. A higher error factor implies a smaller sketch. Determining the best possible trade-offs is significant in both theory and practice. This project's subject is three classes of graph sketching problems: multiplicative spanners, additive spanners, and distance preservers. The projects have three stages:

  1. Survey literature on sketching graph distances and understand the design and analysis of basic algorithmic techniques.

  2. Identify common techniques successfully applied to these problems and barriers underlying open problems of the field. Ideally, the project should make progress on some special cases of open problems.

  3. Investigate the practical performance of known theoretical algorithms on real-world graphs.