Author: Alan C. Dube
Date: 8/19/95
Introduction
A usability evaluation can be a valuable tool for analyzing and improving a software product. What may appear intuitive to an expert application developer is often incomprehensible or confusing to a novice end-user. Poorly designed software creates apprehension and avoidance, inhibits productivity, and generates technical support calls. Each dollar spent on usability tests, when conducted early in the software design process, can save a company $100 later on (LaPlante, 1992). By conducting usability evaluations on software applications, developers can focus on user needs and use the feedback generated by the evaluation process to improve software design.
This paper will report on the methodology used for a usability evaluation conducted on the Microsoft Windows File Manager application, outlining and discussing the steps executed in the evaluation process, the results obtained, and the conclusions drawn from those results. The paper will close with some recommendations for improving the File Manager application.
What is File Manager?
File Manager is an application included with the Microsoft Windows 3.1 operating environment. File Manager provides a graphical representation of files and directories to help one organize files and simplify file maintenance. In using File Manager, a Windows user no longer has to deal with the character-based "DOS Prompt" to accomplish file and directory management. File Manager allows the user to perform many tasks, including:
View the contents of directories and start applications.
Search for, print, move, copy, rename, and delete files and directories.
Change to other disk drives, including network drives.
Change the kind and amount of information displayed about the files.
An example of the File Manager application screen appears below (Microsoft, 1993, p. 31).
Methodology
Planning Stage
Identify the test goals and test method(s).
The first step in the usability evaluation was to identify the test goals and the methods used to achieve them. As Reed (1993) points out, not all features of a product can be evaluated, and not all decisions should be made solely upon user feedback; testing and evaluation must be focused on whether or not the system meets its goals. LaPlante (1992) emphasizes that usability testing and evaluation is a separate process from merely testing for bugs; test subjects are not trying to crash the system. The assumption is that the application works from a technical point of view, and that usability evaluations are conducted to see whether the system makes sense from the user's point of view.
Thus, the goal of this usability evaluation of File Manager was to measure the success of the application at performing the file management tasks for which it was designed, noting times of task completion, error rates, success rates, and recovery factors experienced by the test subjects. The methods used to reach this test goal were based in part on Tipton's (1993) guidelines:
Define the target audience and test subjects. Who will use the product and why? What is their level of education and experience?
Define a set of measurable tasks for the test subjects to complete. The tasks should be comprehensive and test the major components of the application.
Collect the data. Create a scenario to be followed by all subjects that will test how the defined audience uses the product. Collect the data from and about the test subjects in multiple ways: the "Think Aloud" method, videotape, and questionnaire.
Review the collected data and perform statistical analysis where appropriate. Code videotape segments and questionnaires to record paths of movement through the application, types of errors, levels of frustration, and other relevant information.
Identify the test subjects.
The next step was to solicit test participants. To identify the test subjects and record their reactions to File Manager, a questionnaire was created based on the Questionnaire for User Interaction Satisfaction 5.0-S, developed by Ben Shneiderman and refined by Dr. Kent Norman of the Human-Computer Interaction Laboratory at the University of Maryland (Shneiderman, 1992). The three test subjects who agreed to participate in this evaluation, hereafter referred to as subjects A, B, and C, completed part one of the document prior to the testing. Parts 2, 3, 4, 5, and 6 of the questionnaire were completed after testing.
A review of the data contained in part one of the questionnaire indicates a broad range of education and experience levels. All subjects had completed high school; two subjects had college degrees. Novice, intermediate, and advanced personal-computer (PC) skill levels were represented. All of the subjects had less than one year of experience with Windows, and none of the subjects were familiar with the File Manager application. Two of the participants were female. Two of the subjects were in the 25-35 year-old age group, and the remaining subject was in the 60+ year-old age group. None of the participants had any physical or mental handicaps that would prevent them from accomplishing the tasks presented.
Create workable tasks that test the product design.
With a varied selection of test participants identified, the next step in the process was to develop a set of workable tasks that tested the design of File Manager. Fifteen distinct tasks were chosen to test most of the major features of File Manager: viewing file and directory information, sorting and viewing the file lists, and moving, copying, deleting, renaming, and searching for files. The tasks were ordered and prioritized by their right-to-left position on the menu bar of the File Manager application, starting with the help system and ending with the "core" functions of File Manager, to give the subjects a structured map for completing the tasks and a thorough review of the application. Each task description began with a brief summary of what was to be performed, and outlined in detail the steps required to complete the task. All testing and task execution was designed to be done on an individual basis (i.e., no group tasks were to be completed).
Determine performance and subjective measurements.
The test was designed to be completed in 20 minutes, although no specific time limit was given to the participants upon administration. The evaluation was itself "tested," as suggested by Morley (1995), prior to delivery to ensure that it functioned as it should and contained no missing pieces or incomplete instructions, and to gauge how long it took to complete. The participants' performance during the test was recorded on videotape for later review. Times of task completion, error rates, and success rates were calculated. Other factors, such as error recovery and requests for assistance, were also noted.
Subjective measurements of File Manager were provided by the participants via the User Questionnaire, quantifying their overall reactions to File Manager, the screen layout, the use of terminology, and learning factors on a scale of 1 to 9 (with 1 being a negative response and 9 being a positive response). User comments on the File Manager application were also solicited by the questionnaire.
Create the scenario to conduct the test.
To conduct the evaluation properly, a "test lab" needed to be established. To be flexible to the test subjects' needs, and sensitive to their location and operating environment, it was decided to bring the lab to the participants, rather than the participants to the lab. This approach has a low implementation cost and avoids the creation of an unnatural or superficial test environment. Testing that includes everyday interruptions or distractions such as phone calls, visits from colleagues or relatives, or trips to the copy machine makes for more realistic results and the best overall perspective (Lindquist, 1993).
The workstation area that was used for testing consisted of a large desk and chair in a secluded home office. The space had plenty of natural lighting, was clean and comfortable, and had a minimum of noise and distractions. The File Manager evaluation itself was conducted on a notebook PC: a Toshiba T1910 with a 33 MHz 486SX processor, 8 MB of RAM, a track-ball mouse, and a monochrome LCD screen capable of displaying 64 shades of gray. The version of File Manager tested was that of Windows release 3.11.
Conducting the Test/Collecting Data
Procedures.
Now that the test goals were established, the participants identified, the task list drafted, performance measurements defined, and the test lab created, the next step was to conduct the actual evaluations. All of the evaluations were completed during the late morning of July 5, 1995. Due to the limited personnel and computing resources available, the test was structured to be administered to the subjects one at a time. Each participant was briefed on the test procedures before the evaluation began. The procedures as discussed were:
1. Fill out part one of the User Questionnaire.
2. Read through the Evaluation Description and Task Directions.
3. Complete the tasks in order from 1-15, with no time limit for each task or the overall evaluation.
4. Remember to "Think Aloud" during the evaluation.
5. Complete parts 2, 3, 4, 5, and 6 of the User Questionnaire once the evaluation was completed.
Additional instructions were given to the subjects prior to the evaluation. The participants were shown the differences between a notebook PC and a full-size keyboard, how to use the track-ball mouse, and how to adjust the LCD screen contrast. The subjects were informed, as suggested by Shneiderman (1992), that "it is not they who are being tested, but rather it is the software and user interface that are under study" (p. 479). The subjects were also told how they should "Think Aloud," as explained by Gomoll (1990):
It may be a bit awkward at first, but it's really very easy once you get used to it.
All you have to do is speak your thoughts as you work.
If you forget to think aloud, you'll be reminded to keep talking (p. 88).
The "Think Aloud" method was chosen because of its subjective documentation and the informal atmosphere it creates: prompting the users and listening for clues as to how the subjects are dealing with the application.
Finally, the participants were told that the administrator would not provide help or assistance during the evaluation in order to create a realistic situation and observe how they worked through any difficulties with the software.
Record the subjects' actions.
As previously mentioned, the subjects' evaluations were recorded on videotape. The recorder, a hand-held Panasonic VHS-C video camera, was mounted on a tripod for image stability. The tripod was situated behind the subjects during filming. The camera was set to record the day and time of the session for future reference. The subjects were cognizant of the recorder and were assured that it was there to capture only their screen activity and the "Think Aloud" dialogue and comments.
Videotape was used on site to record the subjects' actions during the evaluation because of its advantages: portable video is less threatening to users than a lab or artificial surrounding (Lindquist, 1993), all user-administrator dialogue and details of the evaluation can be captured and catalogued for future reference, and video recording permits the gathering of data without constant supervision during taping (Brun-Cottan & Wall, 1995). Reviewing videotapes is a tedious and time-consuming job, and videotape equipment can be costly (Shneiderman, 1992), but the ability to pinpoint the exact moment of a user's error or frustration with a system is worth the time and effort.
After the evaluations were completed, the videotapes were viewed and notes made on each subject's performance with regard to: time spent per task and on the evaluation overall, the number of errors made and how the user recovered from them, the number of successes the subject achieved, whether the user sought on-line help, and the number of times the user sought assistance. Time elements were clocked using a stopwatch and the time stamping present on the videotape itself. Any comments made by the participants in a "Think Aloud" fashion were also noted.
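As a purely illustrative aside, the per-task observations described above could be tabulated in a simple record structure such as the one sketched below in Python; this sketch is not part of the original study's materials, and all field names and the example values are hypothetical.

    # Illustrative sketch only: one way the per-task observations described above
    # could be tabulated. Field names are hypothetical, not taken from the study's
    # actual coding sheets.

    from dataclasses import dataclass

    @dataclass
    class TaskObservation:
        subject: str             # "A", "B", or "C"
        task: int                # task number, 1-15
        minutes: float           # time spent on the task (from the tape's time stamp)
        errors: int              # errors made during the task
        recovered: bool          # whether the subject recovered from any errors
        used_online_help: bool   # whether File Manager's help system was consulted
        assists: int             # requests for assistance from the administrator
        comments: str = ""       # "Think Aloud" remarks noted during review

    # Example record (values are illustrative, not actual study data):
    obs = TaskObservation(subject="A", task=11, minutes=4.0, errors=1,
                          recovered=True, used_online_help=False, assists=1,
                          comments="Unsure about the destination path for the move.")
    print(obs)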
Results
Comments
Overall, the vocal "Think Aloud" comments and written responses (from part 6 of the User Questionnaire) on File Manager were positive. Subject A was highly vocal during the evaluation, voicing both positive and negative comments and posing questions during the process. Subject A found the dragging and dropping capabilities within the product to be "user friendly," and liked the ability to change fonts, the sorting order, and the ability to use "shortcuts" to perform file manipulations. Subject A found the multiple or split directory views to be confusing at times, and found the "move" command to be in error if a file path (e.g., "C:\") was not given (the "move" command would rename the file in the current directory in this case). Subject A wrote that some of the task commands were similar and therefore confusing, but that the tool as a whole was very useful when compared to the DOS environment.
Subject B was somewhat less vocal than Subject A. Subject B voiced difficulty with the "Split Bar" in File Manager, but overall voiced positive comments upon task completion. Subject B wrote that File Manager was "informative" and again noted that the tasks were easier to perform here than their DOS-prompt counterparts.
Subject C was the least vocal of the participants and had to be prompted often to "Think Aloud." Subject C was the least experienced of the participants and voiced frustration not with File Manager itself, but with the Windows environment and direct-manipulation computing as a whole. Subject C did not leave any written comments on File Manager.
Time per Task
Time per task, and the overall time it took the subjects to complete the evaluation, was measured in minutes and varied according to the subjects' levels of PC experience. Table 1 illustrates individual and average times of task and evaluation completion.
Table 1
Individual and Average File Manager Task Completion Times
                     Subject time (in minutes)
Task           A          B          C       (Avg.)
  1          1.50       1.00       1.50       1.30
  2          1.50       1.00       3.00       1.83
  3          2.00       1.00       5.00       2.66
  4          1.50       1.00       3.00       1.83
  5          2.00       0.50       2.00       1.50
  6          0.75       0.50       3.00       1.41
  7          0.50       1.00       1.00       0.83
  8          0.50       1.00       1.00       0.83
  9          2.50       2.50       4.00       3.00
 10          1.50       1.00       1.00       1.16
 11          4.00       5.00       6.50       5.16
 12          1.00       0.75       1.00       0.91
 13          1.00       0.50       1.00       0.83
 14          1.00       0.50       1.00       0.83
 15          1.00       0.50       1.00       0.83
Totals      23.25      17.75      35.00      24.91
Subjects A and B completed the test near the target time of 20 minutes. Subject C was 15 minutes over that time, primarily due to the difficulties with the Windows environment previously mentioned and a lack of computing experience. The average time per task was 1.66 minutes. Tasks 9 and 11 deviated the most from this average for all participants, indicating an area of potential improvement for File Manager. With the exception of tasks 9 and 11, all of the subjects became more comfortable with File Manager as the evaluation progressed, as indicated by the shorter task completion times and the comments from the participants.
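For readers who want to check the simple arithmetic behind Table 1, the short Python sketch below (not part of the original evaluation materials) recomputes the per-task averages, per-subject totals, and the overall average time per task from the timings as transcribed above; small rounding differences from the published figures are possible.

    # Minimal sketch: recomputes the Table 1 summary statistics from the raw
    # per-task timings (in minutes) transcribed from the table.

    times = {  # task number -> (subject A, subject B, subject C)
        1: (1.50, 1.00, 1.50),   2: (1.50, 1.00, 3.00),   3: (2.00, 1.00, 5.00),
        4: (1.50, 1.00, 3.00),   5: (2.00, 0.50, 2.00),   6: (0.75, 0.50, 3.00),
        7: (0.50, 1.00, 1.00),   8: (0.50, 1.00, 1.00),   9: (2.50, 2.50, 4.00),
        10: (1.50, 1.00, 1.00), 11: (4.00, 5.00, 6.50),  12: (1.00, 0.75, 1.00),
        13: (1.00, 0.50, 1.00), 14: (1.00, 0.50, 1.00),  15: (1.00, 0.50, 1.00),
    }

    # Average completion time for each task across the three subjects.
    task_avg = {task: sum(t) / len(t) for task, t in times.items()}

    # Total evaluation time for each subject.
    subject_totals = [sum(t[i] for t in times.values()) for i in range(3)]

    # Overall average time per task (the 1.66-minute figure cited above).
    overall_avg = sum(task_avg.values()) / len(task_avg)

    for task in sorted(times):
        print(f"Task {task:2d}: average {task_avg[task]:.2f} min")
    print("Subject totals (A, B, C):", [round(t, 2) for t in subject_totals])
    print(f"Overall average per task: {overall_avg:.2f} min")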
Errors and Recovery
Relatively few errors were made by the subjects in the evaluation process. This could be attributed in part to the simplicity of the tasks and the detail of the instructions. Subject A made a total of three errors on tasks 5, 9, and 11. Subject B made two errors on tasks 9 and 11. Subject C had a total of four errors on tasks 2, 3, 6, and 11. Again, it appears that the actions involved with tasks 9 and 11 warrant further study.
All of the subjects recovered from their errors by either solving the problem themselves or eliciting cues from the test administrator. File Manager, by design, often prompted the subjects for confirmation before an irreversible action (file delete, copy, rename, or move). The actions that were reversible were graphical in nature, dealing primarily with directory views and content sorting.
Successes
Each subject was able to complete every task presented. Some of the tasks were more complicated and involved than others, as illustrated by the times of completion and the error rates. All of the tasks were reasonable and thorough for the purpose of evaluating File Manager, and the fact that all of the participants completed all of the tasks attests to File Manager's solid design.
Help and Assistance
None of the users sought out assistance, on their own initiative, from the File Manager help system during the evaluation. All of the subjects sought assistance from the test administrator for various tasks in the process. Subject A sought assistance on tasks 1, 3, 9, and 11. Subject B sought assistance on tasks 6, 7, and 11. Subject C sought more assistance than the others, on tasks 1, 2, 3, 4, 6, 9, and 11. Again, the primary reason for this was a lack of experience with PCs and the Windows environment.
Questionnaire Results
The evaluation subjects had a positive reaction (7.27 total average) to File Manager, based on the data contained in Table 2. Part 2 of the questionnaire dealt with the participants' overall reaction to File Manager: the average score was 6.40, indicating a positive response. Part 3 asked the subjects about File Manager's screen design: the average score here was 7.00, again indicating a favorable response. Part 4 dealt with the terminology and system information used by File Manager and yielded an average score of 7.83, a highly positive result. Part 5 dealt with learning factors for File Manager, and received a very favorable average score of 7.87.
Table 2
Individual and Average User Questionnaire Results
                      Subjects                     Averages
Sub-part        A          B          C       Sub-part     Part
2.1           7.40       6.40       5.40        6.40       6.40
3.1           4.00       9.00       7.00        6.66
3.2           4.00       9.00       8.00        7.00
3.3           6.00       8.00       8.00        7.33
3.4           7.00       6.00       8.00        7.00       7.00
4.1           4.00       8.00       9.00        7.00
4.2           6.50       9.00       9.00        8.16
4.3           8.00       9.00       9.00        8.66
4.4           8.00       7.00        --         7.50       7.83
5.1           7.00       9.00       7.00        7.66
5.2           6.00       9.00       7.00        7.33
5.3           7.00       8.00       9.00        8.00
5.4            --        8.00       9.00        8.50       7.87
Total Avg.    6.24       8.10       7.95                    7.27
Table 2 also illustrates the average score given by each subject. The highest average was recorded by subject B, the participant with the most computing experience. Surprisingly, subject C, who had the hardest time with the evaluation, recorded the second highest average.
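The averages reported above and in Table 2 can be recomputed with the following Python sketch (again, not part of the original study materials); missing responses are excluded from each average, and minor rounding differences from the published values are possible.

    # Minimal sketch: recomputes the questionnaire averages in Table 2. Missing
    # responses ("--") are stored as None and excluded from each average.

    scores = {  # sub-part -> (subject A, subject B, subject C), scale 1-9
        "2.1": (7.40, 6.40, 5.40),
        "3.1": (4.00, 9.00, 7.00), "3.2": (4.00, 9.00, 8.00),
        "3.3": (6.00, 8.00, 8.00), "3.4": (7.00, 6.00, 8.00),
        "4.1": (4.00, 8.00, 9.00), "4.2": (6.50, 9.00, 9.00),
        "4.3": (8.00, 9.00, 9.00), "4.4": (8.00, 7.00, None),
        "5.1": (7.00, 9.00, 7.00), "5.2": (6.00, 9.00, 7.00),
        "5.3": (7.00, 8.00, 9.00), "5.4": (None, 8.00, 9.00),
    }

    def mean(values):
        present = [v for v in values if v is not None]
        return sum(present) / len(present)

    # Average per sub-part (rows of Table 2).
    subpart_avg = {sp: mean(v) for sp, v in scores.items()}

    # Average per questionnaire part (2 = overall reaction, 3 = screen design,
    # 4 = terminology/system information, 5 = learning factors).
    parts = {p: mean([a for sp, a in subpart_avg.items() if sp.startswith(p)])
             for p in ("2", "3", "4", "5")}

    # Average per subject (columns of Table 2) and the grand average of the parts.
    subject_avg = [mean([v[i] for v in scores.values()]) for i in range(3)]
    total_avg = mean(list(parts.values()))

    print({p: round(a, 2) for p, a in parts.items()})   # part averages
    print([round(a, 2) for a in subject_avg])           # per-subject averages
    print(round(total_avg, 2))                          # the 7.27 total average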
Conclusions
The goal of this usability evaluation was to measure the success of File Manager at performing the file management tasks for which it was designed. By all measures, File Manager is an excellent and well-designed application for file management. File Manager conforms to the Microsoft application standards of consistency, clarity, feedback, and forgiveness (Microsoft, 1992). Of course, the conformity is intentional; as part of the Microsoft Windows operating environment, all Windows accessories and applications should exhibit the same "look and feel" to contribute to the success of the platform.
Direct comments from the participants were mostly favorable, and two of the three participants were able to complete the evaluation near the target time of 20 minutes. Error rates on the tasks for all of the participants were low, and in all cases the users were able to recover from errors. All of the test subjects rated File Manager favorably and were able to successfully accomplish the entire set of comprehensive tasks put before them with a minimum of help and assistance. The design aspects involved in tasks 9 and 11, which dealt with sorting a directory view and moving a file to a different directory respectively, did give the test subjects the greatest problems during the evaluation and can be improved upon.
A usability evaluation, such as the one accomplished here, is an informative and rewarding process. Indeed, many companies, such as Microsoft itself, have invested heavily in usability labs and have integrated usability testing into the early process of the software development cycle (Parker, 1994). However, usability testing is a fairly new tool and far from an exact science. Still, it is far better for companies and developers to engage in this exercise than not; surely their competitors will, and will reap the benefits of doing so.
Recommendations for Improving File Manager
The usability evaluation conducted here directly revealed some design flaws in File Manager and indirectly pointed out some areas where the application could be improved. The View...Sort by File Type and the View...By File Type menu options were too similar to each other in wording and functionality, thus contributing to the confusion experienced by the test subjects. A simple fix to this problem would be to change the View...By File Type menu option to View...Select Files.
Perhaps the biggest frustration the test subjects dealt with was moving a file to a different directory. When selecting a file by highlighting it, then selecting the menu option File...Move, the users were prompted to enter the destination file name and path. Entering the full directory path here is not intuitive, and neither is entering a different file name in order to rename the file upon moving it. A better solution would be to display a graphical dialogue box when the destination field receives a mouse-click focus, allowing the user to select the destination directory. If the user wanted to rename the file in the process, the user could choose to type in the complete path and the new file name in this field, or better yet, use the File...Rename option.
When the test subjects attempted to move a file using direct manipulation (drag and drop), they often found that the destination directory did not appear on the left panel of the File Manager screen before they had selected the file, and thus they could not complete the operation. Most of the participants got around this by first scrolling the left panel of the File Manager display to show the destination directory, and then selecting the file to be moved and completing the drag and drop operation. A good solution to alleviate this situation would be to allow the "drag handle" -- the pointer used to indicate that the file is selected for direct manipulation by the left mouse button (Microsoft, 1992) -- to "auto-scroll" the left display panel when the handle is positioned over the scroll bar for that panel. In this way, the user could select, scroll, and drop the file in one operation.
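As a rough, framework-agnostic illustration of the proposed behavior, the Python sketch below triggers scrolling whenever the drag position nears the panel's edge (a slight generalization of "over the scroll bar"); every name, unit, and threshold here is hypothetical and is not drawn from File Manager or the Windows API.

    # Hypothetical sketch of an "auto-scroll during drag" rule; names, units, and
    # thresholds are illustrative only.

    def autoscroll_step(drag_y, panel_top, panel_bottom, edge_margin=16, step=1):
        """Return how many lines the directory panel should scroll while a file is
        being dragged: negative to scroll up, positive to scroll down, and zero
        when the drag handle is inside the panel's main area."""
        if drag_y < panel_top + edge_margin:
            return -step   # handle near the top edge/scroll arrow: scroll up
        if drag_y > panel_bottom - edge_margin:
            return step    # handle near the bottom edge/scroll arrow: scroll down
        return 0

    # On each timer tick during a drag, the panel would apply this step, letting
    # the user select, scroll, and drop a file in one continuous operation.
    print(autoscroll_step(drag_y=492, panel_top=40, panel_bottom=500))  # -> 1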
Direct manipulation could also be used for deleting files by simply incorporating a Macintosh-like "Trash Can" icon into File Manager. Users could select, drag, and drop files to be deleted onto the icon, and then empty the trash when exiting File Manager or retrieve the files during the session by dragging them out of the trash area to a destination directory.
Livingston (1993) also points out that selecting multiple files for direct manipulation operations would be easier if File Manager allowed users to draw a "rubber band" over the desired file names, changing their color when selected, as in the Macintosh Finder metaphor. Strom (1995) notes that it would be helpful to view files quickly in File Manager simply by clicking on them, rather than launching them by association with a file editor or some other type of application.
All of the suggestions noted could make File Manager a better application to use. Some of the improvements may already be available from add-in or shareware products that modify the existing File Manager application. However, Microsoft did not get where it is today by letting others improve upon its own products. It is better for Microsoft's sake to improve upon them itself.
References
Brun-Cottan, F. & Wall, P. (1995). Using video to re-present the user. Communications of the ACM, Vol.38, No. 5, 61-70.
Gomoll, K. (1990). Some techniques for observing users. In S. J. Mountford & B. Laurel (Ed.), The art of human-computer interface design (pp. 85-90). Reading, MA: Addison Wesley.
LaPlante, A. (1992). Put to the test. Computerworld, Vol. 26, No. 30, 75-80.
Lindquist, C. (1993). User software testing takes to the road. Computerworld, Vol. 27, No. 13, p. 94.
Livingston, B. (1993). Undocumented File Manager feature speedily selects files. InfoWorld, Vol. 15, No. 10. p. 23.
Microsoft Windows user's guide. (1993). Redmond, WA: Microsoft Press.
Morley, E. (1995). The SilverPlatter Experience. CD-ROM Professional, Vol. 8, No. 3, 111-118.
Parker, R. (1994). Don't test for usability; design for it. InfoWorld, Vol. 16, No. 29, p. 58.
Reed, S. (1993). IS managers turn to in-house usability tests. InfoWorld, Vol. 15, No. 16, p. 72.
Strom, B. (1995). Adding functionality to File Manager. Accounting Technology, Vol. 11, No. 3, 53-55.
Shneiderman, B. (1992). Designing the user interface: Strategies for effective human-computer interaction (2nd ed.). Reading, MA: Addison Wesley.
The windows interface: An application design guide. (1992). Redmond, WA: Microsoft Press.
Tipton, M. (1993). Some general tips for testing usability. CD-ROM Professional, Vol. 6, No. 4, p. 124.