Previous year offerings: Fall 2021, Fall 2022, Fall 2023, Fall 2024; Course feedback
Complex data can be represented as a graph of relationships between objects. Such networks are a fundamental tool for modeling social, technological, and biological systems. This course focuses on the computational, algorithmic, and modeling challenges specific to the analysis of massive graphs. By studying the underlying graph structure and its features, students are introduced to machine learning techniques and data mining tools that are well suited to revealing insights into a variety of networks.
Why is Graph Mining important?
In the LLM era and beyond, graphs are emerging as a critical component of modern AI-driven systems. Many leading organizations are already investing heavily in this direction, for example:
Graph Foundation Models at Google
Geometric Deep Learning for Neural Artifacts at NVIDIA
How Do Large Language Models Understand Graph Patterns? by Microsoft
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations by MIT and the University of Virginia
And many more.
The global research community is also emphasizing the importance of graph-inspired research in domains such as NLP, GenAI, and LLMs by organizing events such as:
Learning on Graphs conference (LoG 2025)
Stanford graph workshops
NeurIPS workshop
These ecosystems offer a wide range of research and internship opportunities (PhD/Postdoc/UG Thesis) worldwide in academia and industry.
Lectures: Mon/Wed 11:00 AM - 12:00 PM and Tue 5:00 PM - 6:00 PM in room 6104.
Lab sessions: Sat 3:00 PM - 5:00 PM in room 6116.
Office contact hours: Saturday, 12:00 PM to 1:00 PM.
Course Handout: CS F426 - Graph Mining
WhatsApp group for students:
Please join the WhatsApp group by clicking the following link: https://chat.whatsapp.com/EqwrpC4I7AbEtwgcoPRUAw
Course Plan and material → Details of learning outcomes and topics to be covered.
Previous years material → Lecture notes and supplementary material from previous years.
Assignments and Labs → Weekly lab assignments and the group assignment.
Announcements → Announcements related to lectures, labs, etc.
Programming in Python: All lab assignments will be in Python (using NumPy and PyTorch).
Basic knowledge of probability and statistics, and of matrix/vector notation and operations.
Foundations of Machine Learning: Don’t worry if you’re new. A little prior knowledge of ML concepts will make the journey smoother, and you’ll pick things up quickly as the course progresses.
A short quiz to brush up on ML concepts: Try Quiz
A few resources to brush up on ML concepts are given below:
Supervised learning: NPTEL video, Notes by Google
Unsupervised learning: NPTEL video
Classification: Video
Clustering:
K-Means: Video
❓ Frequently Asked Questions
● Why is this a 4-credit course?
This course falls under the category of advanced machine learning. The combination of rigorous theory, regular weekly programming labs, and semester-long research-style projects significantly broadens the learning outcomes compared to those of a standard 3-credit elective. In addition, the course was intentionally designed as a 4-credit offering to make it suitable and accessible for postgraduate students.
● Is it harder than other 3-credit elective courses?
In my opinion, no. Most CS electives, including many 3-credit courses, have already adopted similar assessment components: a combination of theory, lab work, and a semester-long project. This is essential to ensure that students gain exposure not only to theoretical concepts but also to practical implementation and real-world problem-solving skills.
In that sense, this course follows a balanced evaluation structure, allowing students to leverage both their theoretical understanding and coding skills. Students who have a strong interest in either of these aspects generally find the course quite manageable and often perform very well. The difference here is not difficulty, but rather the breadth and depth of engagement with the subject.
● Do I need prior knowledge?
Basic Class 12 mathematics (linear equations, differentiation, probability) is sufficient. A basic understanding of machine learning concepts is also beneficial. Resource links covering fundamental machine learning concepts are provided in the Prerequisites tab above.
● Which programming language is used?
Python. Familiarity with arrays, matrices, and basic data structures is enough. Libraries such as NetworkX and SciPy will be introduced during the course.
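As a rough indication of the level of Python this answer has in mind, the sketch below stores a small graph as a dictionary-based adjacency list, converts it to an adjacency matrix, and runs a breadth-first search. It is an illustration only and is not taken from the course labs. The toy edge list and the function names (`build_adjacency`, `adjacency_matrix`, `bfs_distances`) are assumptions made for this example, and it uses only the standard library rather than NetworkX or SciPy.

```python
from collections import deque

# Hypothetical toy friendship network; the names are made up for illustration.
edges = [("asha", "bala"), ("asha", "chitra"), ("bala", "chitra"), ("chitra", "dev")]


def build_adjacency(edge_list):
    """Build an undirected adjacency list: node -> set of neighbours."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def adjacency_matrix(adjacency):
    """Return (sorted node list, 0/1 matrix as nested lists)."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, neighbours in adjacency.items():
        for v in neighbours:
            matrix[index[u]][index[v]] = 1
    return nodes, matrix


def bfs_distances(adjacency, source):
    """Hop distance from source to every reachable node (breadth-first search)."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in distances:
                distances[v] = distances[u] + 1
                queue.append(v)
    return distances


adjacency = build_adjacency(edges)
nodes, matrix = adjacency_matrix(adjacency)
# The degree of a node is the number of neighbours it has.
degrees = {node: len(neighbours) for node, neighbours in adjacency.items()}
```

If dictionaries, nested lists, and a simple loop like the one in `bfs_distances` look familiar, that is likely sufficient preparation on the programming side.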