Class Timing: Wednesday (13.30-14.30), Thursday (14.45-16.15), and Friday (16.30-17.00) at C305
Self-study Timing: Friday (17.00-18.00)
Mid Sem Examination: 23 September, 2024
Project Problem Formulation: 20 September, 2024
Project Presentation & Implementation: between 12 November, 2024 and 22 November, 2024 at F221
End Sem Examination: 19 November, 2024
3-0-0-4-4
Brief introduction to the field of Information Retrieval (IR).
Explanation of the importance of IR in various domains, including search engines, digital libraries, and data mining.
Week 1:
Lecture 1: Overview; Foundations of Information Retrieval; Source: CS54701 Information Retrieval by Clifton
Lecture 2: Indexing and Querying; Boolean Retrieval; Source: Chapters 1 & 2, Manning, Raghavan and Schütze
Self-Study 1: Web Crawling
Week 2:
Self-Study 2: Indexing and Searching using pySolr
Week 3:
Lecture 3: Text Encoding; Source: Chapter 2, Manning, Raghavan and Schütze
Lecture 4: Dictionaries and Tolerant Retrieval; Source: Chapter 3, Manning, Raghavan and Schütze
Self-Study 3: Boolean Retrieval using Python
Week 4:
Lecture 5: Index Construction; Source: Chapter 4, Manning, Raghavan and Schütze
Lecture 6: Index Compression; Source: Chapter 5, Manning, Raghavan and Schütze
Self-Study 4: Vector Space Retrieval using Python
Week 5:
Lecture 7: Scoring and Term Weighting; Vector Space Model; Source: Chapter 6, Manning, Raghavan and Schütze
Week 6:
Lecture 8: Computing Scores; Source: Chapter 7, Manning, Raghavan and Schütze
Lecture 9: Evaluation in Information Retrieval; Source: Chapter 8, Manning, Raghavan and Schütze
Week 7:
Midsem
Week 8:
Lecture 10: Relevance Feedback and Query Expansion; Source: Chapter 9, Manning, Raghavan and Schütze
Week 9:
Lecture 11: XML Retrieval; Source: Chapter 10, Manning, Raghavan and Schütze
Self-Study 5: Probabilistic Retrieval using Python
Week 10:
Lecture 12: Probabilistic Information Retrieval; Source: Chapter 11, Manning, Raghavan and Schütze
Week 11:
Lecture 13: Language Models; Source: Chapter 12, Manning, Raghavan and Schütze
Self-Study 6: Classification in Retrieval using Python
Week 12:
Lecture 14: Text Classification; Source: Chapter 13, Manning, Raghavan and Schütze
Week 13:
Lecture 15: Distributed Representations; Source: Chapter 14, Manning, Raghavan and Schütze
Week 14:
Lecture 16: Learning to Rank; Source: Chapter 15, Manning, Raghavan and Schütze
Project Presentation
Week 15:
Endsem
C. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
W. Bruce Croft, Donald Metzler, and Trevor Strohman. Search Engines: Information Retrieval in Practice. Addison-Wesley, 2008.
Stefan Buettcher, Charles L. A. Clarke, and Gordon V. Cormack. Information Retrieval: Implementing and Evaluating Search Engines. MIT Press, 2010.
Ricardo Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, 1999.
Project: Problem Formulation (10%), Midterm Exam (30%), Project: Presentation (10%), Project: Implementation (20%), Endterm Exam (30%)
The project aims to conduct research on contemporary topics in Information Retrieval (IR), with a strong preference for research publications.
Collaboration is encouraged, with groups limited to a maximum of four students. Groups are required to email their composition, project title, and an abstract before the midsem.
It is recommended that projects include a dedicated web page.
A comprehensive report detailing the project's objectives, methodology, findings, and conclusions must be submitted before the end of the semester. You can find the LaTeX template and detailed guidelines for report preparation here.
A presentation will be scheduled before the endsem, during which projects will be evaluated based on criteria including the quality of work, originality, clarity of presentation, and the content of the report. You can find the presentation template here.
Any form of plagiarism will be subject to strict penalties.