Skilled Computer Users - Experiment 1


Eric J. Fimbel - Research results and data sets

Citation: Fimbel, E.J. (2009) Skilled Computer Users. Available at http://skilledusers.googlepages.com/. Last retrieved : (mm/dd/yyyy)


Objective

Measure the execution time of basic operations (key press, pointing movements, button clicks) during repetitive tasks on graphical interfaces.

The data were used to tune the KPC model, i.e., a model that predicts user performance from the number of operations required to complete a computer task.

The data from one half of the participants were used to determine the parameters of the KPC model.

The data from the other half were used to verify the prediction accuracy of the model.


Where and when

LESIA Laboratory, Ecole de technologie supérieure, Montreal, September-December 2004.


Ethics statement

The experiment was conducted according to the ethical regulations of Ecole de technologie supérieure. Participants signed informed written consent prior to the test. 

consent form (doc)


Participants 

20 young healthy volunteers (average age 25 years, stdev 3.7 years; level of studies: college or higher; 4 females; all right-handed), with no motor, sensory or neurological impairments, and with normal or corrected vision and hearing.

All participants were native French speakers familiar with computer interfaces in English.

All participants were experienced computer users according to the following criteria: 

1) use of a personal computer more than 10 times per week during the past year.

2) level of familiarity with the Windows(R) operating system self-rated as above average on an analog scale ranging from unfamiliar to expert.


Material and environment

The experiment was conducted in a silent room.

Participants used a desktop computer equipped with a 2 GHz Pentium 4 processor, 512 MB of RAM, a 17-inch LCD screen, a standard keyboard and a three-button optical mouse. The operating system was Windows XP Professional(R).

The experimenter was present in the room, sitting diagonally behind the participant, outside the participant's visual field.


Test software

The software was written in Java. It contains the following modules.

i) 6 interfaces in Java Swing.

ii) A controller that checks the inputs and either goes to the next screen or displays an error message.

iii) An internal key logger that records the inputs and the interface events.

iv) A series of scripts (.bat files) that execute the 6 tasks in sequence. These scripts are launched by the participants themselves.

download test software (.zip). See instructions in Section How to run the test


Test session

Participants had to repeatedly execute 6 tasks with an interface developed especially for the test (see details below). The session was as follows.

Participants signed a consent form (doc), filled in a sociodemographic data form (doc) and were given verbal instructions.

Participants carefully read the printed instructions (zip) that gave the details of the tasks. As practice, they executed the sequence of 6 tasks twice while consulting the instructions.

Then, they executed the sequence of 6 tasks 40 times (always in the same order) for a total of 240 task executions (trials).

Participants could consult the instructions between trials and, if necessary, during a trial.


Pauses and duration

The test was self-paced. The participants could take a break and/or consult the instructions between trials. A one-minute pause was imposed every 10 repetitions. The total duration was between 2 and 3 hours.


Printed instructions

The instructions were presented on printed sheets. Whereas the interfaces were in English, the instructions were written in French. This is a common situation for Francophone students in Quebec.

The instructions were consulted openly during the practice. During the 40 repetitions, instructions could be consulted between trials and, if necessary, during a trial. Instructions were placed on the desk, in a closed folder, so that the experimenter could easily note when they were consulted.

printed instructions (zip)

consultations of instructions during the test (xls)


Tasks

The 6 tasks are summarized in Table 1 (see Printed instructions and Screenshots for additional information). They were designed to meet the following requirements.

i) use a variety of widgets so that the numbers of operations (key press, point, click) are roughly balanced and the execution times are collected in different contexts.

ii) be short so that they can be learned quickly and repeated many times during a session.

iii) allow alternative input strategies (mouse- and keyboard-based operations, e.g., click a button and/or press enter, copy-paste buffers, etc.).

iv) always require the same number of operations (by always using the same entry values or entry values with a constant length).


The 6 tasks have the following structure.

The first screen displays a START button. The execution starts when the button is pressed.

A sequence of screens is presented. The participant fills each screen with predetermined values.

The values are defined in the written instructions (most common case) or presented in tooltips (see Table 1). The tooltips are used to force the participant to look at the screen.

When the screen is completed, the participant presses the NEXT button. If the fields are correct, the next screen is displayed (for the last screen, the task terminates).

If the fields are incorrect, an error message is displayed. The wrong field is not specified, so the participant has to find the error.

This drastic method forces the participant to learn the task completely, instead of relying on 'helpful' error messages.


Table 1. Tasks. The steps are those listed in the written instructions (roughly, one step per field to fill and/or button to use). The tool tips define a variable content for a field, generally one digit. The execution time and number of operations are averages over participants 1-20 and 40 repetitions.

Task  Description                                          Steps (tool tips)  Execution time (s)  Number of operations
1     Fill an on line payment form.                        12 (1 tool tip)    35                  88
2     Cut and paste a section of text, modify the text.    8 (1 tool tip)     5                   54
3     Install an application on a computer.                11 (1 tool tip)    19                  49
4     Save a document, print it and exit the application.  8                  22                  58
5     Log on a database, search a book.                    6 (1 tool tip)     20                  45
6     Consult list of students working with a professor.   7                  19                  49

 

Screenshots

Task 1, Task 2, Task 3, Task 4, Task 5, Task 6






Data capture and processing


Log files

There was one log file per task execution (trial). The log files contained one event per line. The original events were low-level input operations (e.g., key press and key release), compound operations (key typed) and interface events (e.g., focus gained/lost, enter/leave a widget).


Filtering

The log files were filtered to keep only the following events.

key pressed, key released and key typed (a character generated by key pressed+released)

mouse moved (elementary displacement of mouse)

mouse button pressed and/or released
 

KPC sequences

The filtered log files were converted into sequences of basic operations K (key press), P (pointing movement) and C (mouse button click) according to the following rules.

A key typed produced a K. Compound keys (e.g., shift+A) produced a single K. Auto-repeated keys were converted into sequences of Ks.

A block of consecutive mouse moved events produced a P.

Additional filtering: pointing movements of less than 2 pixels were discarded in order to remove involuntary and/or useless mouse displacements.

A mouse pressed immediately followed by a mouse released produced a C. Double clicks (and multiple clicks in general) were converted into a single C. Drag-and-drops (button press - pointing movement - button release) were normalized as C-P-C.

Note. The first operation (click or key press on the start button) was removed from the sequence.

The result was a set of 4800 sequences (i.e., 6 tasks x 40 repetitions x 20 participants).
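The conversion rules above can be sketched in a few lines of Python. This is only an illustration, not the software used in the experiment: the event tuples (time in ms, kind, data), the event names, the use of net displacement to measure a pointing movement, and the 500 ms window used to merge multiple clicks (MULTI_CLICK_MS) are all assumptions made for the example.

```python
from math import hypot

MIN_POINTING_PIXELS = 2   # threshold stated in the text
MULTI_CLICK_MS = 500      # assumption: clicks closer than this form one multiple click


def net_displacement(block):
    """Net displacement in pixels of a block of mouse_moved events (assumed measure)."""
    dx = sum(data[0] for _, _, data in block)
    dy = sum(data[1] for _, _, data in block)
    return hypot(dx, dy)


def to_kpc(events):
    """Convert filtered events into a list of (operation, end time in ms)."""
    ops = []
    block = []          # pending consecutive mouse_moved events
    press_time = None

    def flush_block():
        # A block of consecutive moves yields one P unless it is too small.
        if block and net_displacement(block) >= MIN_POINTING_PIXELS:
            ops.append(("P", block[-1][0]))
        block.clear()

    for event in events:
        t, kind, _ = event
        if kind == "mouse_moved":
            block.append(event)
        elif kind == "key_typed":
            flush_block()
            ops.append(("K", t))
        elif kind == "mouse_pressed":
            flush_block()
            press_time = t
        elif kind == "mouse_released" and press_time is not None:
            if block and net_displacement(block) >= MIN_POINTING_PIXELS:
                # Drag-and-drop is normalized as C-P-C.
                ops.extend([("C", press_time), ("P", block[-1][0]), ("C", t)])
            elif ops and ops[-1][0] == "C" and t - ops[-1][1] <= MULTI_CLICK_MS:
                ops[-1] = ("C", t)      # multiple click merged into a single C
            else:
                ops.append(("C", t))
            block.clear()
            press_time = None
    flush_block()
    return ops[1:]      # the first operation (start button) is removed


# Hypothetical event log for illustration.
events = [
    (0, "mouse_pressed", None), (50, "mouse_released", None),      # start button
    (100, "mouse_moved", (30, 40)), (150, "mouse_moved", (30, 40)),
    (300, "mouse_pressed", None), (380, "mouse_released", None),
    (500, "mouse_pressed", None), (560, "mouse_released", None),   # double click
    (900, "key_typed", "a"),
    (1000, "mouse_moved", (1, 0)),                                 # below 2 pixels
    (1100, "key_typed", "b"),
    (1500, "mouse_pressed", None), (1550, "mouse_moved", (50, 0)),
    (1700, "mouse_released", None),                                # drag-and-drop
]
sequence = "".join(op for op, _ in to_kpc(events))
print(sequence)  # PCKKCPC
```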

Execution times of KPC operations

For each operation of the KPC sequences we determined the following.

The execution time (or duration), defined as the difference between the end times of consecutive operations.

The end time was extracted from the event log as the time stamp of the last event of the operation.

ET(operation k) = end time(operation k) - end time(operation k-1)

The temporal resolution was about 15 ms.

Note. For the first operation of a sequence, we used the start time of the session (in fact, the end time of the click/key press on the start button).

The previous operation (or context). This parameter is important because the context affects the execution time.

For instance, in a C-K sequence, the K will be on average slower than in a K-K sequence (because there is often a hand displacement from mouse to keyboard after a click).

Note. The context is X for the first operation of a sequence, because we cannot know whether the sequence was initiated by a click or a key press on the start button.
 
The result was a set of 274315 operations.
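As an illustration of the two quantities defined above, the following Python sketch derives the execution time and the context of each operation from a list of (operation, end time) pairs. The function name, the data layout and the sample values are assumptions made for the example.

```python
def execution_times(ops, start_time_ms):
    """Return (operation, context, execution time in ms) for each operation.

    ops is a list of (operation, end time in ms); start_time_ms is the end
    time of the click or key press on the start button.
    """
    rows = []
    previous_op, previous_end = "X", start_time_ms   # context X for the first operation
    for op, end in ops:
        rows.append((op, previous_op, end - previous_end))
        previous_op, previous_end = op, end
    return rows


# Hypothetical sequence for illustration.
rows = execution_times([("P", 150), ("C", 560), ("K", 900), ("K", 1100)], 50)
for row in rows:
    print(row)
```

In this toy sequence the K that follows a C takes 340 ms and the K that follows a K takes 200 ms, which is the kind of context effect described above.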

KPC operations for all participants (N=20), tasks (N=6) and repetitions (N=40) (xls in zip)

(see legend in spreadsheet) 


Estimates of execution times for the KPC model

The KPC model uses estimates of the execution times of K, P and C operations and a reference sequence to predict how long it takes to execute a task. The estimates were obtained by averaging the execution times over a subset of operations determined as follows.

tasks 1-6. This was the rationale of the tasks: to be globally representative of the basic operations on an interface: enter text, select check boxes or radio buttons, select from scroll-down menus, copy-paste, etc.

repetitions 6-15. In this range, participants had learnt the task but were not over-trained. We determined this a posteriori from the frequency of consultation of the instructions. The consultations decreased abruptly after repetitions 1-5:

repetitions 1-5: consultation in 48% of the sequences,
repetitions 6-10: 10%,
repetitions 11-15: 5%,
repetitions 16-40: 1%

see consultations of instructions during the test (xls)

participants 1-10. This was the original experimental design: 10 participants to build the model, additional participants (as many as possible given the time constraints) to validate the model. We nonetheless verified a posteriori that 10 participants were sufficient.

We compared estimates of execution times for random subsets of 10 participants and estimates obtained with the full group of 20 participants.

We found that the differences between estimates computed with 10 vs. 20 participants were less than half the expected accuracy of the KPC model (difference for C: 8%, K: 1%, P: 0.2%, expected accuracy: 20%, based on the literature on KLM).
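The a posteriori check on the number of participants can be sketched as follows. This is an assumed procedure, not the original analysis: the pooling of operations within a subset, the number of random draws, the seed and the sample data are all hypothetical.

```python
import random
from statistics import mean


def subset_vs_full(times_by_participant, subset_size=10, draws=100, seed=0):
    """Largest relative difference between a subset estimate and the full-group estimate."""
    rng = random.Random(seed)
    participants = sorted(times_by_participant)
    full = mean(t for p in participants for t in times_by_participant[p])
    worst = 0.0
    for _ in range(draws):
        chosen = rng.sample(participants, subset_size)
        subset = mean(t for p in chosen for t in times_by_participant[p])
        worst = max(worst, abs(subset - full) / full)
    return worst


# Hypothetical execution times (ms) of one operation type for 20 participants.
times = {p: [340.0 + p, 380.0 + p] for p in range(1, 21)}
worst = subset_vs_full(times)
print(f"largest relative difference: {worst:.1%}")
```

The returned value can then be compared with the expected accuracy of the model, as done in the text.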

The subset contained 5376 Cs, 22592 Ks and 5675 Ps. On this subset, we computed the average and standard deviation of the execution times and rounded the averages to the nearest 10 ms to obtain the estimates (Table 2).

Table 2. Estimates of execution times of K, P, C operations for tasks 1-6, participants 1-10, repetitions 6-15. In milliseconds. Estimates are the averages, rounded to the nearest 10 ms.

Operation  Estimate  Average  Count  Stdev
K          360       360.55   22592  415
P          820       821.72   5675   670
C          260       257.15   5376   327
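A minimal sketch of how such estimates can be computed from a list of execution times is given below. The use of the sample standard deviation is an assumption (the text does not say which form was used), and the toy data are hypothetical; only the three averages at the end come from Table 2.

```python
from statistics import mean, stdev


def round_to_10(value_ms):
    """Round a duration to the nearest 10 ms."""
    return round(value_ms / 10) * 10


def estimate(times_ms):
    """Return (estimate, average, count, stdev) for one operation type."""
    average = mean(times_ms)
    # Assumption: sample standard deviation.
    return round_to_10(average), average, len(times_ms), stdev(times_ms)


# Hypothetical execution times (ms) for illustration.
print(estimate([300, 350, 400, 390]))

# Rounding the averages of Table 2 gives the estimates of Table 2.
table2_averages = {"K": 360.55, "P": 821.72, "C": 257.15}
estimates = {op: round_to_10(avg) for op, avg in table2_averages.items()}
print(estimates)
```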



Reference sequences in KPC and KLM

The KPC and KLM models use reference sequences to predict the execution time of a task. 

Reference sequences are presumed to be optimal, i.e., any solution adopted by an expert user or a skilled user should be roughly equivalent in terms of execution time.

Reference sequences are classically determined by an analyst. Here, the analyst was one of us (EF) who had not previously practiced the tasks.

The analyst determined his strategies during a first exploration of the interfaces. The strategies were coarse-grained, i.e., defined by their main steps, e.g., "fill field A then move with tab, then copy-paste result", etc.


KPC sequences

The strategies of the analyst were executed with the test software, and the KPC sequences were produced by the software Basic Key Logger running in parallel.

More about Basic Key Logger

Note. The KPC sequences differ slightly from the sequences recorded during the experiment, because of the way Basic Key Logger works. The most important differences are:

compound keys produce multiple Ks, and auto-repeated keys produce a different operation, named A.

double clicks produce multiple Cs.

the durations of Cs and Ks are counted between press and release (not from the previous operation).

pointing movements last while the mouse is moving, even if some other operation occurs in parallel, and small pointing movements are not filtered.

there may be pauses (Q operations) between operations and, conversely, operations can overlap.

there is an additional operation W (scroll mouse wheel), not used here.

The KPC sequences were manually filtered in order to obtain the reference sequences.

Errors and redundant operations were removed.

A operations were transformed into Ks.

Qs and Ws were removed.

A Q operation of 1 s was then inserted whenever the participant had to read a tool-tip.

The predicted execution times were obtained by adding the estimates given in Table 2 (see above) for the operations of the reference sequences.
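The prediction is a plain sum over the reference sequence, which can be sketched as follows. The K, P and C values are the estimates of Table 2 and the Q value is the 1 s tool-tip pause; the reference sequence in the example is hypothetical.

```python
# Estimates in milliseconds: K, P, C from Table 2, Q = 1 s pause to read a tool-tip.
KPC_ESTIMATES_MS = {"K": 360, "P": 820, "C": 260, "Q": 1000}


def predict_kpc_ms(reference_sequence):
    """Predicted execution time (ms) of a reference sequence such as 'PCKKQ'."""
    return sum(KPC_ESTIMATES_MS[op] for op in reference_sequence)


# Hypothetical reference sequence: point, click, type two keys, read a tool-tip.
print(predict_kpc_ms("PCKKQ"))  # 2800
```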

KLM sequences 

The KLM sequences were produced from the KPC reference sequences. 

The additional Ks of the compound keys, the additional Cs of double clicks and the Qs were removed.

Cs were transformed into Ks (the original KLM does not distinguish between them).

Hs were inserted at each transition between keyboard and mouse or vice versa. C-K, P-K, K-P and K-C became C-H-K, P-H-K, K-H-P and K-H-C.

Ms were inserted according to the heuristics of CRITIQUE (Hudson et al., 1999):

when the participant started working with an interactive component (field, button)

when the participant switched the interaction mode with a widget.

Note. In case of ambiguity in the application of these heuristics, no M was inserted.

The predictions of execution times were computed using the estimates of Card et al. (1980) for skilled typists (Table 3).

Table 3. Estimates of execution times of operations in the original KLM (Card et al., 1980), in milliseconds.

Operation  Description                                    Estimate
H          homing (hand movement)                         400
K          key press for skilled typist or mouse click    200
M          mental activity (pause)                        1350
P          pointing movement                              1100
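The conversion from a KPC reference sequence to a KLM sequence can be sketched as below. This is an illustration under assumptions: the input is taken to be already free of the additional Ks and Cs, the positions of the M operations are supplied by hand (the CRITIQUE heuristics depend on the interface and are not implemented), and placing M before H is an arbitrary choice. The reference sequence is hypothetical; the durations are those of Table 3.

```python
KLM_ESTIMATES_MS = {"H": 400, "K": 200, "M": 1350, "P": 1100}   # Table 3
MOUSE_OPS = {"P", "C"}


def kpc_to_klm(kpc_sequence, mental_before=()):
    """Convert a KPC reference sequence into a KLM sequence.

    mental_before holds the indices (in kpc_sequence) of the operations that
    are preceded by an M; it is supplied by the analyst (assumption).
    """
    klm = []
    previous = None
    for index, op in enumerate(kpc_sequence):
        if op == "Q":
            continue                      # pauses are not part of KLM
        if index in mental_before:
            klm.append("M")
        if previous is not None and (previous in MOUSE_OPS) != (op in MOUSE_OPS):
            klm.append("H")               # transition between mouse and keyboard
        klm.append("K" if op == "C" else op)
        previous = op
    return "".join(klm)


def predict_klm_ms(klm_sequence):
    """Predicted execution time (ms) of a KLM sequence."""
    return sum(KLM_ESTIMATES_MS[op] for op in klm_sequence)


# Hypothetical reference sequence for illustration.
klm = kpc_to_klm("PCQKKPC", mental_before={0, 3})
print(klm, predict_klm_ms(klm))  # MPKMHKKHPK 6500
```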


The result was a set of 12 sequences with their predicted execution times (6 tasks x 2 models).


(see legend in spreadsheet) 


Comparison of KPC and KLM

We compared the predictions of execution time obtained with KPC and KLM with the actual performance of participants 11-20 (the second half of the group) on tasks 1-6.

We grouped the trials into 4 training levels (repetitions 1-10, 11-20, 21-30 and 31-40). This was done because KPC and KLM consider different levels of training.

KLM is a model of expert users (skilled with computers and over-trained with the task and/or the interface).

KPC is a model of skilled users (skilled with computers, but with minimal practice of the task).

We then computed the bias (difference between the average execution time of participants and the model prediction) and the dispersion of the participants' execution times (dispersion = standard deviation / average).

The dispersion indicates the limit of precision of any model, biased or not.

For instance, a dispersion of 0.5 means that the average distance between the prediction of an unbiased model and the actual execution time of a trial is 50% of the predicted execution time.
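The two measures can be sketched as follows. The sign convention of the bias (observed average minus prediction), the use of the sample standard deviation and the sample values are assumptions made for the example.

```python
from statistics import mean, stdev


def bias_and_dispersion(times_s, predicted_s):
    """Return (bias, relative bias, dispersion) for one task and training level.

    bias          = average observed execution time - predicted time (assumed sign)
    relative bias = bias / average observed execution time
    dispersion    = standard deviation / average observed execution time
    """
    average = mean(times_s)
    bias = average - predicted_s
    return bias, bias / average, stdev(times_s) / average


# Hypothetical execution times (s) and prediction for illustration.
bias, relative_bias, dispersion = bias_and_dispersion([18.0, 20.0, 22.0], 19.0)
print(bias, relative_bias, dispersion)
```

With these toy values the model under-predicts by 1 s (5% of the average) while the dispersion is 10%, so the bias is smaller than the spread of the trials.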


The result is a 6 x 4 matrix (task x level of training) that contains the bias of KPC, the bias of KLM and the dispersion of participants' execution time. Figure 1 depicts the bias for each task (all levels of training together) and Figure 2 depicts the bias at different levels of training for each task, and the average (across all tasks). The numbers are in




Figure 1. Bias for each task. X-axis: task. Y-axis: relative bias (average bias of task for participants 11-20, repetitions 1-40, as a percentage of the average execution time) for KPC (light grey) and KLM (dark grey). Transparent bar: dispersion. The rightmost bars represent the grand average of bias and dispersion (averages across tasks 1-6).



Figure 2. Bias for each level of training for all tasks. X-axis: training level. Y-axis: relative bias (average bias of task i at level of training j expressed as a percentage of the average execution time of participants 11-20 on task i and level j). KPC = lozenges, KLM = squares, dispersion of execution times = dotted line. The last chart represents the marginal means of the biases and dispersion averaged across tasks 1-6.

Comments on Figure 1. At first glance, KPC was markedly more accurate than KLM. However, remember that KPC had an unfair advantage: it was tuned on these tasks, these interfaces and this computer.




All results & documents

see legends in spreadsheets


 



estimates of execution times of KPC operations: see Table 2





How to run the test


If required, install Java 1.4.2 Standard Edition (download - API documentation).

Download the test software (.zip) and the printed instructions (zip). Unzip anywhere.

Open the instruction files.

In the sub-folder blocks_training/, run script trainingBlock1.bat.

Follow the instructions to complete the screens (the instructions are straightforward, even if you do not read French).

After the tasks are completed, the logs are in the subfolders

EstimatorUI-01_20040829/log/ to EstimatorUI-06_20040829/log/


Copyright: (c) 2009 E.J. Fimbel, P-S Dube. This is open-access content distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.