Unit I
Basic Concepts of IR, Data Retrieval & Information Retrieval, IR system block
diagram. Automatic Text Analysis, Luhn's ideas, Conflation Algorithm,
Indexing and Index Term Weighting, Probabilistic Indexing, Automatic
Classification. Measures of Association, Different Matching Coefficients,
Classification Methods, Cluster Hypothesis. Clustering Algorithms, Single Pass
Algorithm, Single Link Algorithm, Rocchio's Algorithm, and Dendrograms
(8 Hrs.)
Unit II
File Structures, Inverted file, Suffix trees & suffix arrays, Signature files, Ring
Structure, IR Models, Basic concepts, Boolean Model, Vector Model, and
Fuzzy Set Model. Search Strategies, Boolean search, serial search, and cluster-based
retrieval, Matching Function
(6 Hrs.)
Unit III
Performance Evaluation- Precision and recall, alternative measures, reference
collection (TREC Collection), Libraries & Bibliographical system- Online IR
system, OPACs, Digital libraries - Architecture issues, document models,
representation & access, Prototypes, projects & interfaces, standards
(6 Hrs.)
Unit IV
Taxonomy and Ontology: Creating domain specific ontology, Ontology life
cycle
Distributed and Parallel IR: Relationships between documents, Identify
appropriate networked collections, Multiple distributed collections
simultaneously, Parallel IR - MIMD Architectures, Distributed IR – Collection
Partitioning, Source Selection, Query Processing
(8 Hrs.)
Unit V
Multimedia IR models & languages- data modeling, Techniques to represent
audio and visual documents, query languages. Indexing & searching- generic
multimedia indexing approach, Query databases of multimedia documents,
Display the results of multimedia searches, one dimensional time series, two
dimensional color images, automatic feature extraction.
(8 Hrs.)
Unit VI
Searching the Web, Challenges, Characterizing the Web, Search Engines,
Browsing, Meta searchers, Web crawlers, robot exclusion, Web data mining,
Metacrawler, Collaborative filtering, Web agents (web shopping, bargain
finder, ...), Economic, ethical, legal and political issues.
(6 Hrs.)
Text Books :
1. R. Baeza-Yates & B. Ribeiro-Neto, "Modern Information Retrieval", Pearson Education, ISBN 81-297-0274-6
2. C. J. van Rijsbergen, "Information Retrieval", (www.dcs.gla.ac.uk)
3. I. Witten, A. Moffat, and T. Bell, “Managing Gigabytes”
4. D. Grossman and O. Frieder, “Information Retrieval: Algorithms and Heuristics”
Reference Books :
1. Mark Levene, “Introduction to Search Engines and Web Navigation”, John Wiley and Sons
Inc., ISBN 9780-170-52684-2.
2. V. S. Subrahmanian, Satish K. Tripathi, “Multimedia information System”, Kluwer
Academic Publisher
3. Chabane Djeraba, “Multimedia Mining: A Highway to Intelligent Multimedia Documents”,
Kluwer Academic Publisher, ISBN 1-4020-7247-3