Massive Graph Processing

About the project

Relationships between many discrete objects can be represented using a very flexible abstraction called Graph. Graphs have been adopted widely, and are basic building blocks of wide range of problems, from scientific computing to social networks, to World Wide Web. This trend has resulted in increased traction in Graph processing for solving real-world problems, both from academia and industry. For example, recent statistics show that there are 342 million active users on Twitter and World Wide Web graph contains more than 4.75 billion pages and 1 trillion URLs. Processing such large graphs is non-trivial due to the nature of graphs themselves.

Popular Graph algorithms such as Breadth-First Search (BFS), Single Source Shortest Path (SSSP), Betweenness Centrality (BC), PageRank, Connected Components (CC), Collaborative Filtering (CF), etc., are used in many of Graph Analytics.

Most of these algorithms have: Intensive Data Access, Irregular Computation, Poor locality, and High Data Dependency. These characteristics result in performance heavily relying on memory design. In this project we will consider working of such algorithms from architectural perspective, with specific focus on memory hierarchy and look for optimizations.


Objectives of the project

Below multi-dimensional exploration of Memory Optimizations considering Graph applications:

  • Improve memory bandwidth utilization.

  • Request aware memory controller optimizations for improved memory access time.

  • Tailored DRAM based main memory organization for Graph applications.

  • Use of emerging non-volatile memories for processing-in-memory.

Typical path of exploration

  • Infrastructure:

    • Work with System level simulators like Sniper/Gem5 along with specialized simulators like DRAMSim2, NVSim, etc.

    • Various Graph benchmarks and workloads from real-world equivalents like Graph500, SNAP, GAP, etc.,

  • In-depth analysis of these benchmarks' memory access patterns.

  • Including the state-of-the art optimizations in the infrastructure and analysis.