Design, query, and evaluate information retrieval systems
Competency E represents the bridge between information and the user. Information retrieval systems manifest as user interfaces, the most visible of which are search engines. From library information retrieval systems to artificially intelligent natural language processing, information retrieval is both the history and the future of the information professions. Historically, information retrieval systems (IRS) were used by specialists; now they are used by everyone to accomplish most digital tasks: finding an e-mail sent from a specific company, locating an invoice, searching for an image or YouTube video, or searching for a book passage on the internet. Competency E is the most important aspect of bringing data to users because it moves information from storage to user.
A firm understanding of information retrieval is important to me because proficiency in the design, query, and evaluation of IR systems means being relevant and adaptable. Librarianship is about connecting users to information, so understanding the tools that bridge information from source to user is central. Understanding the underlying structure and query rules of information retrieval systems means I will understand changes to and advances in IR systems and be able to communicate those changes to users.
My industry-specific experience using IRSs centers on legal platforms such as Westlaw and Loislaw, which had poor natural language capability in their early incarnations. For example, I remember that a query like "Intellectual Freedom Legislation" would not yield a list of landmark legislation, because specialized or prior knowledge of the specific federal codes was needed. Natural language searches yielded false hits. We used the physical collection to identify cases and codes, then entered them into the IRS platform. These functions have since been updated. Today, natural language, keywords with Boolean operators, and citation tools are all available on legal research platforms. What is different from the early days of legal IRSs is that results are pulled from many merged data sources, and the search functions are malleable and powerful.
My other pre-MLIS experience with Competency E was designing a searchable database for a junior college's collection. I researched library cataloging software, but none met my budget requirements at the time, so I trained the Federal Work Study Students (FWSS) to enter the complete collection into an Excel spreadsheet. When finished, I loaded it onto the student workstations. At student orientations, I trained students to use the Find function to search for keywords. During subsequent semesters, the FWSSs and I added the table of contents for each non-code resource. It worked very well because we used keywords from the curriculum that were relevant to the user, topic, and professor. However, the system was tedious to maintain. Now free apps do this and more on cell phones: just scan the barcode! I recently had the pleasure of testing these free indexing and retrieval apps for an immersion school. The platforms I tested could accommodate the Romance languages, but not Mandarin and Farsi.
Course material from LIBR 202 and INFO 242 neatly tracked the three prongs of the competency. I evaluated an OPAC, including its indexing mechanisms, interface design, search functions, collection, and effectiveness. Both classes used a team assignment to teach database design, though the INFO 242 database management assignment was far more in-depth. Both databases required query writing. I read Introduction to Modern Information Retrieval (Chowdhury, 2010) and discussed the text with classmates.
LIBR 202, Information Retrieval System Design, introduced me to the file structure of IR systems. The course then covered inverted file and keyword indexing, followed by Boolean logic. From there, we branched out to full-text information retrieval and relevance judgment. We studied interface design and evaluation. The assignments provided practice applying these concepts to various information retrieval systems, including several online public access catalogs. The course required a team assignment: design a database that recorded attributes of artwork for an online charity auction. First, my team created descriptive records for the artwork: size, media, artist, subject, style, price range, and color scheme. We populated tables and tested the retrieval system with some simple queries. Next, we used SQL to write more complex queries that answered multistep questions like: of the paintings that are less than $1,000, how many are indexed as "large"? The query writing process benefited greatly from the theory training at the beginning of the class.
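A multistep query of that kind can be sketched with Python's built-in sqlite3 module. This is an illustrative reconstruction, not the team's original database: the table name, columns, and sample rows are invented to show the shape of the query.

```python
import sqlite3

# Hypothetical stand-in for the auction database described above;
# table and column names are illustrative, not the originals.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE artwork (
        id    INTEGER PRIMARY KEY,
        title TEXT,
        size  TEXT,   -- indexed as 'small', 'medium', or 'large'
        price REAL
    )
""")
cur.executemany(
    "INSERT INTO artwork (title, size, price) VALUES (?, ?, ?)",
    [("Sunset", "large", 850.00),
     ("Harbor", "small", 300.00),
     ("Meadow", "large", 1200.00)],
)

# Multistep question: of the paintings under $1,000,
# how many are indexed as "large"?
cur.execute(
    "SELECT COUNT(*) FROM artwork WHERE price < 1000 AND size = 'large'"
)
count = cur.fetchone()[0]
print(count)
```

Combining both conditions in a single WHERE clause answers the two-step question in one pass, which is the kind of translation from natural language question to Boolean predicate the assignment practiced.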
LIBR 202 included several assignments and many text readings on IR evaluation. For example, I evaluated the University of Wyoming and Colorado University OPACs. This process began by challenging the query function with natural language and Boolean phrases. I performed searches for a variety of content using all the tools available on the OPAC. The tests led to a critique of the design and interface. The last step was to contrast the schools’ approaches to their OPACs’ design, query and information retrieval functions. I remember being surprised by the University of Wyoming’s new, simplified interface. It was direct, modern and efficient. The coursework shows my understanding of entity-based information retrieval system design, query and evaluation.
Click on the link below to view a sample of the design and query preparation.
Coursework for INFO 242 centered on a group project: build a relational database for a hypothetical business, Happy Face Dolls. My team designed and implemented a relational database in the Oracle database management system. We learned about database management systems, database administration, and database querying with SQL. The class provided incremental individual experience practicing each concrete skill before the group project; I included a link to a SQL query practice exercise below. The group project's first step was to create a list of entities and their values. We then deliberated over the relationships among the entities, which resulted in an entity-relationship model, and designed a table showing those relationships in a hierarchy. Next, the entity relationships had to be normalized. Normalizing means making sure the entities are unique and specifying the nature of each relationship, e.g., one-to-one, one-to-many, many-to-one, or many-to-many (which requires a new entity). Finally, we learned SQL query writing concepts and put them into practice in Oracle. The coursework supports my declaration of competency in designing relational databases and writing queries. Moreover, it shows my eagerness to learn new skills and apply patience to unfamiliar problems.
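The normalization step above, resolving a many-to-many relationship into a new entity, can be sketched in SQL. The entities below (dolls and suppliers) are hypothetical stand-ins for the Happy Face Dolls entities, run here through Python's sqlite3 module rather than Oracle:

```python
import sqlite3

# Illustrative normalization sketch; entity names are invented,
# not the actual Happy Face Dolls schema.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE doll (
        doll_id INTEGER PRIMARY KEY,
        name    TEXT
    );
    CREATE TABLE supplier (
        supplier_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    -- A doll can have many suppliers and a supplier many dolls,
    -- so the many-to-many relationship becomes a new entity:
    CREATE TABLE doll_supplier (
        doll_id     INTEGER REFERENCES doll(doll_id),
        supplier_id INTEGER REFERENCES supplier(supplier_id),
        PRIMARY KEY (doll_id, supplier_id)
    );
""")
cur.executemany("INSERT INTO doll VALUES (?, ?)",
                [(1, "Sunny"), (2, "Smiley")])
cur.executemany("INSERT INTO supplier VALUES (?, ?)",
                [(10, "FabricCo"), (20, "ButtonWorks")])
cur.executemany("INSERT INTO doll_supplier VALUES (?, ?)",
                [(1, 10), (1, 20), (2, 10)])

# Query across the new junction entity: who supplies "Sunny"?
cur.execute("""
    SELECT supplier.name
    FROM doll
    JOIN doll_supplier USING (doll_id)
    JOIN supplier USING (supplier_id)
    WHERE doll.name = 'Sunny'
    ORDER BY supplier.name
""")
suppliers = [row[0] for row in cur.fetchall()]
print(suppliers)
```

The composite primary key on the junction table is what makes each doll–supplier pairing unique, which is the "making sure the entities are unique" part of normalization described above.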
Click on the link below to view a sample of the entity-relationship model, entity hierarchy, and query practice exercise described above.
The design, query, and evaluation coursework taught me about the complexity and specificity that underlie information retrieval systems. Having to rework relational structures, syntax, vocabulary, and queries repeatedly taught me to slow down, re-watch lectures, and reread the course texts. I think the training will be most useful in my information literacy classes, because I can teach students to form better queries through awareness of IRS functions such as full-text searching, Boolean or improved Boolean operators, and spelling variants. The query writing practice will improve my overall reference desk performance, especially speed, because I will recognize an IRS's structure immediately and adjust my queries accordingly. Although many library districts in my region employ teams of specialists for these tasks, I am comfortable providing interface, query, and design evaluation.
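The inverted file indexing and Boolean operators mentioned throughout this statement can be shown together in a few lines. This is a minimal teaching sketch with invented documents, the sort of example that could support an information literacy class on query formation:

```python
# Minimal inverted-index and Boolean retrieval sketch;
# the documents and query terms are invented examples.
from collections import defaultdict

docs = {
    1: "intellectual freedom legislation in libraries",
    2: "freedom of information act",
    3: "library legislation history",
}

# Build the inverted file: each term maps to the set of
# document IDs that contain it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

# Boolean AND: documents containing both terms (set intersection).
and_hits = index["freedom"] & index["legislation"]
# Boolean OR: documents containing either term (set union).
or_hits = index["freedom"] | index["legislation"]
print(sorted(and_hits), sorted(or_hits))
```

Seeing AND narrow a result set while OR broadens it is exactly the awareness of IRS behavior that helps students form better queries.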