In today's world, "googling" something has become a popular way of retrieving all types of information. If we need to find any sort of information, we are basically trained now to just "google it." While Google may be a helpful tool at times, it's important to understand the inner workings of information retrieval systems when we need to use more research-oriented retrieval systems. When I think of information retrieval systems, my mind automatically goes to databases. Since beginning my college career at the community college level, I have used scholarly databases extensively. Even in my current work, I use legal databases on an almost daily basis. I have also taught others about databases, both in my coursework and in my work experience. Of course, information retrieval systems don't only include databases, but also websites, search engines, catalogs, and even social media tools. This specific competency has us look at three elements of information retrieval systems: design, query, and evaluation.
As information professionals, we have to know the overall design of information retrieval systems (IRS) in order to teach, use, and evaluate them. Essentially, the main goal of any IRS is to aggregate (i.e., retrieve all desired items) and discriminate (i.e., retrieve only desired items) (Weedman, 2018). With that main goal in mind, the design process can begin. The first step of IRS design is generally referred to as "requirements analysis" (Weedman, 2018). In this step, two different types of information gathering and data analysis occur: first, data is gathered from stakeholders on what form the new IRS should take (Weedman, 2018). Stakeholders include anyone who has a "stake" in the new IRS, such as professionals, management, governance bodies, users, and anyone else who cares about the outcome (Weedman, 2018). The second type of information gathering involves investigating what users specifically need from the IRS and what they need it to do (Weedman, 2018). Existing industry standards should also be considered in this step by determining whether the IRS "must conform to industry standards" (Weedman, 2018). This first stage of the design process is all about gathering as much information as possible and analyzing it rigorously before moving on to the concept design phases.
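The aggregate/discriminate goal corresponds to what IR evaluation commonly measures as recall and precision. The short Python sketch below is my own illustration rather than part of the lecture material; the document IDs and relevance judgments are hypothetical.

# A minimal sketch (hypothetical data) showing how the aggregate/discriminate goal
# maps onto recall and precision for a single search.
relevant = {"doc1", "doc3", "doc5"}   # documents the user actually needs
retrieved = {"doc1", "doc2", "doc3"}  # documents the system returned

recall = len(relevant & retrieved) / len(relevant)      # aggregation: share of desired items retrieved
precision = len(relevant & retrieved) / len(retrieved)  # discrimination: share of retrieved items that were desired
print(f"recall={recall:.2f}, precision={precision:.2f}")  # recall=0.67, precision=0.67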
The second and third phases of the design process are the "concept design" step and the "functionality and interface design" step. Within the concept design step, the set of requirements and data gathered from the requirements analysis is used to begin to "sketch," or create some kind of model of, what the potential IRS will look like (Weedman, 2018). The sketch itself may be a 3-D model, a list of terms, a drawing, a verbal description, or a Word document describing fields and values (Weedman, 2018). The type of sketch used depends on the type of IRS being designed. Once a concept model is in place, focus then turns toward functionality and interface design. In this step, designers begin working on a prototype of what users will "see" on screen and "how" they will use it (Weedman, 2018). Some initial testing generally occurs in this step.
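To give a sense of what a sketch of fields and values might look like, here is a minimal, hypothetical example written as a Python dictionary; the field names and sample values are my own invention, not from the lecture.

# A hypothetical concept-design sketch: fields and sample values for one record
# in a small recipe database (field names and values are invented for illustration).
sample_record = {
    "title": "Classic Peanut Butter Cookies",
    "creator": "J. Smith",
    "date_added": "2018-03-12",
    "ingredients": ["peanut butter", "sugar", "egg"],  # repeatable field
    "course": "dessert",                               # value drawn from a controlled list
}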
The fourth, fifth, and sixth steps of the design process are "prototype implementation," "design implementation," and "system enhancement and evolution" (Weedman, 2018). With the prototype implementation, an initial working, but incomplete, model of the IRS may be unveiled and used for beta testing (i.e., allowing users to test the system under real-life conditions) (Weedman, 2018). Feedback from the prototype implementation will aid in further development of the IRS, leading to the fifth step, "design implementation," where the full system is implemented (Weedman, 2018). After implementation, the sixth and final step of the design process begins: system enhancement and evolution. In this ongoing step, the IRS goes through constant evaluation and enhancement (Weedman, 2018). Bugs are worked out, updates are created, and additional features and improvements are added in this step (Weedman, 2018). This last step is ongoing and constant, as no IRS design is entirely perfect.
When thinking about searching across different IRSs, it's important for an information professional to understand the search process and the differences between systems. I was fortunate enough to have taken Info-244 (Online Searching), which had us search numerous databases. In this class, we experimented with the different levels of searching, including basic (i.e., a single Google-style "big white box"), advanced (i.e., a series of rows with connectors and fields), and command line (i.e., using sets to build complex queries in which the searcher specifies the fields to use) (Steiner, n.d.). In Info-244, we used all three methods when completing assignments, as well as other methods that can be used when planning a search.
When considering search strategies, it's important for an information professional to know a variety of them in order to handle different types of search requests (Steiner, n.d.). One strategy that I used frequently without knowing it is called "pearl growing." With this strategy, a searcher uncovers one or two really good articles (i.e., pearls) and reviews the full records to gather descriptors, subject headings, author(s), and other terms, then performs further searches using the terms identified to find similar records (Steiner, n.d.). I used this technique quite frequently in my undergraduate career without knowing the theory or name behind it; pearl growing is intuitive to many searchers.
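To show the mechanics of pearl growing, the sketch below is my own illustration; the record fields and terms are hypothetical. It harvests controlled terms from one good record (the "pearl") and reuses them in a follow-up query.

# A hypothetical pearl-growing sketch: harvest terms from one good record (the "pearl")
# and reuse them to build a broader follow-up query.
pearl = {
    "title": "Urban bird populations and habitat loss",
    "descriptors": ["urban ecology", "habitat loss"],
    "subject_headings": ["Birds -- Habitat"],
}

harvested_terms = pearl["descriptors"] + pearl["subject_headings"]

# Combine the harvested terms with OR so any record sharing one of them is retrieved.
follow_up_query = " OR ".join(f'"{term}"' for term in harvested_terms)
print(follow_up_query)  # "urban ecology" OR "habitat loss" OR "Birds -- Habitat"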
Lastly, for many commercial databases (e.g., Westlaw, Lexis, ProQuest Dialog), searchers need some knowledge of search commands to use the command-driven interface. Some search commands to be familiar with include:
Boolean Operators - using AND, OR, NOT in a search query. The "OR" operator searches terms inclusively, retrieving any document containing at least one of the terms used (e.g., peanut butter OR jelly). The "OR" operator generally increases the number of records retrieved. The "AND" operator retrieves only documents in which all of the connected search terms occur (e.g., peanut butter AND jelly). The "AND" operator will typically restrict (narrow) the number of documents in your search results. The "NOT" operator excludes records that contain unwanted search terms (e.g., peanut butter NOT jelly). The "NOT" operator should be used carefully, since it can unintentionally eliminate useful records.
Wildcards/Truncation - this search command is particularly useful for terms with different endings or variable spellings. Symbols for wildcard searching include !, ?, and *. An example of a wildcard search would be "bird*" to retrieve documents containing bird, birding, birdman, birds, etc.
Field restrictions - these can be used to restrict a search to certain fields such as author, title, year, descriptor, images, URL/site, subject headings, etc. This search command can be helpful if looking for documents by a certain author or documents published during a particular time period (Tucker, 2018a). A brief sketch combining these commands appears below.
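The following sketch is my own illustration of how these commands interact; the document IDs are hypothetical, and the combined query at the end uses a generic, made-up command syntax rather than any particular vendor's.

# A minimal, hypothetical illustration of how OR broadens and AND narrows a result set.
peanut_butter_docs = {"d1", "d2", "d3"}  # documents matching "peanut butter"
jelly_docs = {"d2", "d4"}                # documents matching "jelly"

or_results = peanut_butter_docs | jelly_docs   # OR: union -> d1, d2, d3, d4 (broader)
and_results = peanut_butter_docs & jelly_docs  # AND: intersection -> d2 (narrower)
not_results = peanut_butter_docs - jelly_docs  # NOT: difference -> d1, d3 (use with care)

# A made-up command-style query combining a field restriction, truncation,
# and Boolean operators (not any specific vendor's syntax):
example_query = 'AU=(smith) AND TI=(bird*) NOT (penguin)'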
For the most part, when it comes to evaluating IR systems, I think we must look at them from two different perspectives: the perspective of the user and the perspective of the information professional. When evaluating from the user's perspective, we need to focus on the usability of the IR system, in which case it's important to address four general elements of an IR system:
Visibility - Does the page layout make it easy to find the information needed? Is the layout of the page easy on the user's eyes (accessibility issues should be addressed here)? Is it easy to tell what the user is supposed to do?
Mappings/Organization - Is it easy to tell what state the system is in and how to get back to a previous state? Can the user easily tell how to do another search? Can the user tell what their most recent action was?
Interaction - Do menus, search boxes, and other navigation tools maximize clarity and convenience for the user?
Searches - Is it easy to find information on how to search using the particular IR system? Is it easy to find information on what retrieval rules the IR system is using? Can the user find out what parts of the documents the system is searching? Are there ways to restrict a search to specific parts of the document(s)? (Tucker, 2018b)
When evaluating from the perspective of the information professional, we might look at more in-depth components such as the hierarchy, controlled vocabulary, menus, or other components of the IR system. Regarding controlled vocabularies, a good controlled vocabulary enables a searcher to retrieve all the information that's relevant and avoid all the information that isn't relevant to the user's information need. How can we determine whether a controlled vocabulary is good or bad? Well, there are three general aspects of a controlled vocabulary that can be evaluated:
Exhaustivity - Is the entirety of the "aboutness" of the document captured? How exhaustive the indexing is usually depends on the indexer's ability to recognize all the important aspects of the document and the editorial policy that may govern the number of descriptors that can be assigned (there may be limits).
Specificity - Does the searcher have to ask a question more broadly or more narrowly? Specificity is considered in relation to the purpose of the IR system, the needs of the users, and the quantity of documents about a particular subject.
Disambiguation - How clear is it which term should be used in a search? Terms like Indian (Native American or person from India), China (dish or country), and Mercury (planet or element) have more than one meaning. A good controlled vocabulary in an IR system should be able to disambiguate well or explain to users when one term would be used and when the other may be used (i.e., via a scope note) (Tucker, 2018b), as sketched below.
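As a small illustration of disambiguation, here is a hypothetical thesaurus-style entry (my own invention; the terms and notes are not from the lecture) in which a scope note tells the searcher which term to use:

# A hypothetical controlled-vocabulary entry illustrating disambiguation via a scope note.
vocabulary_entry = {
    "preferred_term": "Mercury (planet)",
    "scope_note": "Use for the planet only; for the chemical element, use 'Mercury (element)'.",
    "used_for": ["Mercury"],                      # ambiguous entry term pointing here
    "related_terms": ["Planets", "Solar system"],
}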
Evaluation of an IR system is so much more complex than what's described here. Rigorous evaluations of IR systems are needed in order to make interfaces that are user-friendly, reliable, and accessible.
This first piece of evidence demonstrates my ability to work in a group to design and implement a simple database. For this group project, we were tasked with creating a database by determining a type of collection to focus on, defining a target user, designing the overall database structure, and creating database content in WebDataPro with at least five records. For this alpha prototype portion of the project, I was mainly responsible for creating the data structure plan and assisting with the writing of the content rules. In addition to the data structure plan and rules, we also had to write a statement of purpose for our database describing its purpose and target user, provide a supplemental reference document for the testers, and provide a URL to the database in WebDataPro. Completing this project helped me further understand the design process of IRSs and the creation of data structure plans. Note: the database can be viewed in WebDataPro by following the link provided and entering the username and password provided in the document.
This second piece of evidence demonstrates my ability to query and evaluate multiple components of an IRS. For this assignment, my group and I were tasked with using and evaluating another group's database created in WebDataPro. Specifically for the evaluation, my group and I acted as indexers and, using the other group's statement of purpose and rules, created records in the database and ran searches. After careful use and evaluation, my group wrote an evaluation that included comments and recommendations for the other group's database. We had to consider whether the designers met the statement of purpose's objectives, whether the database was designed appropriately, and whether the rules were successful, among other observations. Lastly, each group member had to write an individual learning reflection on working in a group and what we learned about IR systems throughout the semester. Completing this portion of the project helped me further understand the evaluation aspects of IR systems.
This third piece of evidence demonstrates my ability to design, query, and evaluate multiple components of an IR system. For this group project, we were tasked with locating a real website, evaluating its usability based on site organization, and redesigning the site organization based on findings from the evaluation. My group and I decided to use a public library's website for our project. My individual contributions included taking part in the existing site map discussion, the redesign site map discussion, and the discussion on further testing and recommendations. I also contributed to the initial use and overall evaluation of the existing website, including its usability and the organization of its site map. Completing this project helped me further understand the three components of this competency (design, query, and evaluation) as they relate to IR systems, including how the reorganization of an IR system can contribute to its usability and how to evaluate the system before and after a redesign.
When I first began drafting this competency, I didn't realize how much time and re-researching I would spend on this one competency. I started writing it a while ago and have come back to it every so often. Now I've come to realize that there is so much more to the design, query, and evaluation of IR systems than what is mentioned here. When reflecting more on this competency, I see how the three concepts associated with it, design, query, and evaluation, are embedded in almost every aspect of what we do as information professionals. We are almost always designing something for libraries, we are always using something to complete our tasks, and we are constantly evaluating. As it relates to IR systems, designing, querying, and evaluating are tasks we must be proficient at as information professionals to ensure that our users can find the information they need.
Steiner, V. (n.d.). Dialog and the searcher's toolkit [Lecture slides]. San Jose State University School of Information.
Tucker, V. M. (2018a). Lecture 6: Search [PDF file]. San Jose State University School of Information.
Tucker, V. M. (2018b). Lecture 7: Evaluation [PDF file]. San Jose State University School of Information.
Weedman, J. (2018). Lecture 4: The design process [PDF file]. San Jose State University School of Information.