Skilled Computer Users - Experiment 2



Citation: Fimbel, E.J. (2009) Skilled Computer Users. Available at http://skilledusers.googlepages.com/. Last retrieved : (mm/dd/yyyy) 

 

Objective


Measure the performance of skilled computer users on commercial software and investigate their strategies.

The data were used to validate the KPC model, i.e., a model that predicts users' performance from the number of operations required to complete a computer task.

The strategies of skilled computer users were compared to those of analysts (highly trained with the software and with a professional background). The strategies of the analysts (called reference sequences) are used to predict the execution times of the tasks.


Where and when

Institut de Cognitique, Université Bordeaux 2, March-April 2007.


Ethics statement

The experiment was conducted according to the ethical regulations of Institut de Cognitique. Participants signed informed written consent prior to the test. 

consent form.


Participants

20 healthy volunteers (average age 24.6 years, stdev 4.5 years; level of studies: college or higher, i.e. average of 18.6 years of study, stdev 3.6 years; 11 females; 3 left-handed), with no motor, sensory or neurological impairments and with normal or corrected vision and hearing. 

All participants were native French speakers familiar with computer interfaces in English. In addition, all participants were skilled computer users according to the following criteria: 

1) level of familiarity with computers self-rated as above average on a visual analog scale (VAS) (from 0=totally ignorant to 1=totally at ease, average score 0.88, stdev 0.13), 

2) regular use of computer for more than 5 years (average 13 years, stdev 4 years)

3) practice of the software used in the experiment for at least one year (Excel(r) : average 7 years, stdev 3.1 years; Powerpoint(r) : average 5 years, stdev 2.6 years; Word(r) : average 9 years, stdev 2.7 years).


Analysts

3 analysts established reference sequences for the tasks. 

Analyst 1 is a professional, highly skilled with Microsoft Office(R), who has taken several courses and uses the package every day.

Analysts 2 and 3 are university lecturers, highly familiar with Office(R) and specialists in human computer interaction.


Material and environment

The experiment was conducted in a silent room. 

Participants used a laptop computer (processor: Pentium M 733, 1.1 GHz, 1 GB RAM, Windows XP Professional(R), standard wireless keyboard and wireless 3-button mouse) equipped with a dual screen, one for the task (17" CRT screen) and one for the control of the experiment (12" LCD screen).

The experimenter used a separate laptop to capture sociodemographic data and self-ratings of the participants on line, and to take observation notes.

In addition, a digital recorder was used unless the participant refused (this occurred for 1 participant).


Test session

Participants had to execute 6 tasks several times with Microsoft Office(R). The tasks were defined by their initial and final configurations, but the steps to attain the final configuration were left to the participants' own judgment.

The session was as follows.

Participants first signed a consent form. The experimenter entered the sociodemographic data on line, then the participants self-rated their skills with computers and software by means of visual analog scales presented on the assistant's computer. 

They were then given verbal instructions (see below) and the training phase started. Participants read the printed instructions (zip) that gave the details of the tasks. Then they executed the 6 tasks in any order for as long as they wanted while consulting the instructions. When they said they were ready, the performance test started.

The performance test consisted of 5 consecutive executions of each task, for a total of 30 task executions (trials). The first execution of each task was considered additional practice and was discarded (because excessive self-confidence and/or impatience may lead some participants to shorten the training time), but participants were only told so afterward.


Pauses and duration

Participants self-paced the test by means of two buttons on the secondary monitor that respectively indicated the beginning and the completion of a trial. This dual-button control allowed participants to take a break of arbitrary duration between trials. The total duration was between 2 and 3 hours.


Test software

The software used for the test was as follows.

1) Microsoft Office 2002 in English (Excel, Powerpoint and Word).
2) Basic Key Logger , an external key logger to capture inputs from any application into key logs (low level events) and KPC logs (basic operations, i.e., Keypresses, Pointings and Clicks). 

download Basic Key Logger from here

3) Test sequencer that i) presented a control window on the secondary screen (with the description of the current trial, elapsed time, start and stop buttons), ii) produced timing information for each trial in a separate log file, and iii) tagged the logs of Basic Key Logger so that trials could easily be identified in the log files.

4) Experimenter software running on the experimenter's computer to i) capture information about the participants on line (sociodemographic data, self-ratings), ii) take observation notes on line during the session and iii) back up the test data and name them adequately.


Tasks

The 6 tasks are summarized in Table 1 and described in more detail in Sections Instructions and Screen shots. They were designed to meet the following requirements.

i) to be representative of tasks performed with common software. Microsoft Office (R) was chosen because it is commonly used by students in their academic work.

ii) to be minimally constrained, i.e.,  defined by their result and with minimal prohibitions (e.g., no use of external clipboard or documents, no use of macro-commands)

iii) to be short so that the session duration was kept reasonable.

iv) to require always the same number of operations.

The structure of the tasks was as follows.

The initial configuration was a blank document, cursor in upper left corner, menus and toolbars in the default configuration, buffers and clipboard empty.

The final configuration was a filled document. 

The printed instructions gave an example of solution to reach the final configuration.


Table 1. Tasks. The software is Office(R) 2002, in English. The steps are those of the example sequence given in the instructions. The execution time and number of operations are the averages measured over participants (1-20) and repetitions (4).

Task  Software    Number of steps  Execution time (s)  Number of operations
1     Word        10               47                  134
2     Word        9                64                  182
3     Powerpoint  7                57                  171
4     Powerpoint  5                72                  232
5     Excel       7                64                  158
6     Excel       7                52                  93


Verbal instructions 


The experimenter briefly showed the printed instructions to the participants and explained to them that they should explore the tasks for as long as they wanted, in order to "find a way of executing the tasks as fast as possible". 

They were told that the typical sequence contained in the printed instructions was "only the experimenter's own solution", and they were encouraged to find their own strategies. 

They were told that the experimenter would tell them whenever they do "something forbidden" (for instance, copy-pasting data from previous executions was not allowed). 

They were finally told that the instructions could be consulted later, between trials, and in case of necessity, during trials. When participants said they were ready, the performance test started.


Printed instructions


The initial and final configurations were presented on printed sheets, with an example sequence of operations to attain the final configuration.

The instructions were consulted openly during the practice. 

During the following executions of the tasks, instructions could be consulted between trials and in case of absolute necessity, during a trial. In order to note easily when instructions were consulted, they were placed on the desk, in a closed folder.

printed instructions (zip)



Screenshots

Tasks 1 to 6 - initial and final state



 

The KPC logs and the Basic Key Logger


The Basic Key Logger captures the events of the operating system (Windows (R), for now) and generates two log files: a key log that contains the input events, and a KPC log. 

The KPC log contains K, P and C operations and three additional symbols: W (scrolls of the mouse wheel), A (key presses generated automatically by the keyboard auto-repeat) and Q (quiet periods, i.e., no input for at least 1 s). 

Timing information 

The start time of an operation is the time stamp of its first event (temporal resolution ~ 10ms). The duration of an operation is the difference between the time stamps of its first and last event. This method allows detecting pauses between operations (or during operations, see below) and overlaps between consecutive operations. 
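The timing rule above can be illustrated with a minimal Python sketch. The Operation class, its field names and the sample values are assumptions made for this illustration and do not reflect the actual log format of Basic Key Logger. Only consecutive operations (in order of start time) are compared, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    kind: str      # "K", "P", "C", "A", "W" or "Q"
    start_ms: int  # time stamp of the first event of the operation
    end_ms: int    # time stamp of the last event of the operation

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def gaps_and_overlaps(operations):
    """Compare each operation with the next one, in order of start time.

    Returns two lists of durations in ms: pauses between operations and
    overlaps between consecutive operations.
    """
    ordered = sorted(operations, key=lambda op: op.start_ms)
    gaps, overlaps = [], []
    for current, following in zip(ordered, ordered[1:]):
        if following.start_ms > current.end_ms:
            gaps.append(following.start_ms - current.end_ms)
        elif following.start_ms < current.end_ms:
            # The overlap cannot exceed the part of `following` that runs
            # before `current` ends.
            overlaps.append(min(current.end_ms, following.end_ms) - following.start_ms)
    return gaps, overlaps


# Made-up example values.
operations = [
    Operation("P", 0, 500),
    Operation("C", 450, 550),   # starts 50 ms before the pointing ends
    Operation("K", 900, 1000),  # starts 350 ms after the click ends
]
gaps, overlaps = gaps_and_overlaps(operations)
print(gaps, overlaps)
```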

Conversion of sequences of events into operations 

The conversion rules are presented in Table 2.


Table 2. Generation of the operations from the input events in Basic Key Logger. Note that Basic Key Logger can be configured differently.

Operation  Description                                  Generation
A          automatic key press (keyboard auto-repeat)   A is generated for each key press of some key X when X was already pressed and had not been released (the previous event on X is a key press instead of a key release). A ends with the next event on key X.
C          click on mouse button                        C is generated for each press of some mouse button X. C ends with the next release event on button X.
K          key press                                    K is generated for each key press of some key X when X was not previously pressed. K ends with the next event on key X.
P          pointing                                     P is generated for each sequence of mouse move events that occur within 100 ms (counted from the previous mouse move) and/or have no intermediary event of a different type. P ends when there is an intermediary event and a delay above 100 ms between 2 mouse moves.
Q          pause                                        Q is generated when there are no input events for at least 1 s. Q ends with the next input event.
W          mouse wheel scroll                           W is generated for each sequence of mouse wheel activations that occur within 100 ms (counted from the previous mouse wheel activation) and/or have no intermediary event of a different type. W ends when there is an intermediary event and a delay above 100 ms between 2 mouse wheel activations.




Comments

The first event of a K operation is a key press and its last event is the corresponding key release (button press and release for C operations). Automatic key presses are detected when the key is "pressed" without being previously released.

Ps and Ws are processed in similar ways. A P starts when the mouse starts moving (mouse move event), and stops when the mouse stops moving and another operation is executed, i.e.

1) the delay between mouse move events is above a continuity threshold (100 ms) and

2) there is another operation (keyboard or button related)

If the mouse stops moving but there is no other operation, a pause Q is inserted but the P does not stop, i.e., the P and the Q overlap.

If some operation occurs while the mouse moves continuously, the operation is inserted but the P continues, i.e., the 2 operations overlap.
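As an illustration of the rules for K, A and C, here is a minimal Python sketch. It is not the code of Basic Key Logger: the event names ("key_down", "key_up", "button_down", "button_up") and the tuple layout are assumptions made for this example, and P, W and Q are left out.

```python
def keyboard_and_click_operations(events):
    """Turn (time_ms, event_type, code) tuples into (kind, start_ms, end_ms).

    Only the K, A and C rules are handled in this sketch.
    """
    operations = []
    open_key = {}     # key code -> (kind, start_ms) of the operation in progress
    open_button = {}  # button code -> start_ms of the click in progress
    for time_ms, event_type, code in events:
        if event_type in ("key_down", "key_up"):
            # Any event on a key ends the operation in progress on that key.
            was_down = code in open_key
            if was_down:
                kind, start_ms = open_key.pop(code)
                operations.append((kind, start_ms, time_ms))
            if event_type == "key_down":
                # A press on a key that was not released is an auto-repeat (A).
                open_key[code] = ("A" if was_down else "K", time_ms)
        elif event_type == "button_down":
            open_button[code] = time_ms
        elif event_type == "button_up" and code in open_button:
            operations.append(("C", open_button.pop(code), time_ms))
    return sorted(operations, key=lambda op: op[1])


# Made-up events: key "a" held down (two auto-repeats), then a left click.
events = [
    (0, "key_down", "a"),
    (500, "key_down", "a"),
    (530, "key_down", "a"),
    (560, "key_up", "a"),
    (700, "button_down", "left"),
    (780, "button_up", "left"),
]
operations = keyboard_and_click_operations(events)
print(operations)
```

With these events, the held key produces one K followed by two As, and the button press and release produce one C.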


Data capture and processing



Log files and KPC sequences of participants

The KPC sequences for each trial were produced as follows. 

1) The logs of the participants were recorded by means of the Basic Key Logger during the entire session. During the session, the test sequencer tagged the logs at the beginning and the end of each trial (i.e., special events startTrial and stopTrial were inserted in the logs).

2) The KPC sequences were extracted from the logs for tasks 1-6 and executions 2-5 (24 trials). Recall that the first execution was considered an additional practice.

3) The sequences were placed into a spreadsheet and the following data were computed

3.1) Performance indicators. For each sequence, the number of operations N, the total execution time T and the total execution time of the operations Tope (larger than T because of the overlaps) were determined for the current sequence

3.2) Statistics on operations. For each type of operation K, P, C, A, W, the number of operations, the total execution time, and the average, median and standard deviation of the unitary execution time were computed for the current sequence.

3.3) Pauses and overlaps. The number of pauses Q and overlaps O, their total durations, and the average, median and standard deviation of their unitary duration were computed for the current sequence.

3.4) Mouse movement statistics. The total pointing distance D (distance between initial and final position of a pointing movement), the total length L (length of the trajectory of the mouse during a pointing movement; L > D as soon as trajectory is not a straight line), and the average, median and standard deviation for the unitary distance and length were computed for the current sequence.
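The extraction and the indicators described in steps 1) to 3) can be sketched as follows in Python. This is only an illustration under stated assumptions: the log is represented as (kind, start_ms, end_ms) tuples, the tags appear as entries of kind startTrial and stopTrial, pauses Q are not counted in N, and T is measured from the first to the last event of the trial. The original computations were done in a spreadsheet.

```python
import math

COUNTED = {"K", "P", "C", "A", "W"}  # assumption: pauses Q are not counted in N


def split_trials(log):
    """Group the entries found between startTrial and stopTrial tags."""
    trials, current = [], None
    for kind, start_ms, end_ms in log:
        if kind == "startTrial":
            current = []
        elif kind == "stopTrial":
            if current is not None:
                trials.append(current)
            current = None
        elif current is not None:
            current.append((kind, start_ms, end_ms))
    return trials


def indicators(trial):
    """Return (N, T, Tope) for one trial, times in ms."""
    operations = [entry for entry in trial if entry[0] in COUNTED]
    n = len(operations)
    # Assumption: T runs from the first event to the last event of the trial.
    t = max(end for _, _, end in trial) - min(start for _, start, _ in trial)
    t_ope = sum(end - start for _, start, end in operations)
    return n, t, t_ope


def distance_and_length(points):
    """Return (D, L) for one pointing movement given its (x, y) positions."""
    d = math.dist(points[0], points[-1])
    length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return d, length


# Made-up log: one key press outside any trial, then one tagged trial.
log = [
    ("K", 0, 100),
    ("startTrial", 1000, 1000),
    ("P", 1100, 1600),
    ("C", 1550, 1650),
    ("Q", 1650, 2700),
    ("K", 2700, 2800),
    ("stopTrial", 2900, 2900),
]
trials = split_trials(log)
n, t, t_ope = indicators(trials[0])
d, length = distance_and_length([(0, 0), (3, 0), (3, 4)])
print(n, t, t_ope, d, length)
```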

  
Reference sequences in KPC and KLM

The KPC and KLM models use reference sequences to predict the execution time of a task. 

Reference sequences are presumably optimal, i.e., any solution adopted by an expert user or a skilled user should be roughly equivalent to them in terms of execution time. 

Reference sequences were determined independently by the 3 analysts.

The analysts determined their strategies during a first exploration of the interfaces. The strategies were coarse-grained, i.e., defined by their main steps, e.g., "fill field A then move with tab, then copy-paste result", etc.

The strategies of the analysts were executed and the raw KPC sequences were recorded with the software Basic Key Logger.

More about Basic Key Logger

Then, the reference sequences in KPC and KLM were obtained semi-manually by means of a spreadsheet. The result was a set of 36 sequences (6 tasks, 3 analysts, 2 models). Here is the result (and the spreadsheet)

  
Details of processing

1) The KPC logs were copy-pasted into the spreadsheet.

2) The errors and redundant operations were manually identified and a mark "useless X" was inserted in the spreadsheet.

3) The missing operations (e.g., C release not recorded in the log) were manually identified and a mark "missing X" was inserted in the spreadsheet.

4) The spreadsheet automatically produced a filtered sequence in which operations were added and/or removed, As (automatic key presses) were transformed into Ks and W (mouse wheel scrolls) were transformed into Ps. 

This was the KPC reference sequence. 
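The filtering of step 4 can be sketched in Python as follows. The encoding of the marks is an assumption made for this example: each row is a pair (operation, mark), where the mark is None, "useless" (operation to remove) or "missing" (operation that was added by hand). The original filter was a spreadsheet formula, not this code.

```python
REPLACEMENTS = {"A": "K", "W": "P"}  # auto-repeats become Ks, wheel scrolls become Ps


def kpc_reference(rows):
    """Return the filtered sequence of operations from the marked rows."""
    filtered = []
    for kind, mark in rows:
        if mark == "useless":
            continue  # errors and redundant operations are dropped
        # Rows marked "missing" are kept like any other row: they add the
        # operation that was not recorded in the log.
        filtered.append(REPLACEMENTS.get(kind, kind))
    return filtered


# Made-up marked rows.
rows = [
    ("P", None),
    ("C", None),
    ("K", "useless"),
    ("K", None),
    ("A", None),
    ("C", "missing"),
    ("W", None),
]
reference = kpc_reference(rows)
print(reference)
```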
 
In a second stage, the KLM sequence was produced. The spreadsheet automatically produced successive transformations of the reference sequence towards the final KLM sequence.

5) In the first transformation, the modifier keys (Shift, Ctrl, Alt, etc.) were removed in order to keep only simple keystrokes, the double Cs were transformed into single Cs, and the Qs (pauses) were transformed into Rs.

Note. It is possible to override these rules by manually adding "Ks" or "Cs", as seen before.

The result was a sequence composed of Cs, Ks, Ps and Rs.

6) In the next transformation, the spreadsheet determined the current entry (keyboard or mouse), added Hs at the transitions, and transformed the C into K.

The result was a sequence composed of Hs, Ks, Ps and Rs.

7) The mental activities were identified and marked. We used the description of the analyst's strategy and the heuristics of CRITIQUE (Hudson et al., 1999), i.e., an M was inserted

when the participant started working with an interactive component (field, button)

when the participant switched the interaction mode with a widget.

Note. In case of ambiguity in the application of these heuristics, no M was inserted.

The result was the KLM reference sequence composed of H K M P Rs. 
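The automatic transformations of steps 5 and 6 can be sketched in Python as follows. The representation of operations as (kind, detail) pairs, the list of modifier names, the rule that two consecutive Cs form a double click, and the absence of an H before the first operation are all assumptions made for this illustration. The insertion of Ms (step 7) was manual and is not covered.

```python
MODIFIERS = {"shift", "ctrl", "alt"}  # hypothetical key names


def simplify(operations):
    """Step 5: drop modifier keys, merge double clicks, turn pauses into R."""
    kinds = []
    for kind, detail in operations:
        if kind == "K" and detail in MODIFIERS:
            continue
        if kind == "C" and kinds and kinds[-1] == "C":
            continue  # assumption: two consecutive Cs are one double click
        kinds.append("R" if kind == "Q" else kind)
    return kinds


def add_homing(kinds):
    """Step 6: insert H when the hand changes device, then turn C into K."""
    device_of = {"K": "keyboard", "P": "mouse", "C": "mouse"}
    result, device = [], None
    for kind in kinds:
        target = device_of.get(kind)  # R is not tied to a device
        if target is not None:
            if device is not None and target != device:
                result.append("H")
            device = target
        result.append("K" if kind == "C" else kind)
    return result


# Made-up KPC reference sequence.
operations = [
    ("K", "shift"), ("K", "a"), ("K", "b"),
    ("P", None), ("C", "left"), ("C", "left"),
    ("Q", None), ("K", "enter"),
]
step5 = simplify(operations)
step6 = add_homing(step5)
print(step5)
print(step6)
```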

Prediction of the models

For KPC, we used the estimates of execution times determined in experiment 1.

Table 3. Estimates of execution times for KPC, in milliseconds

Operation  Description        Estimate
C          button click       260
K          key press          360
P          pointing movement  820
Q          forced pause       1000

For KLM, the predictions of execution times were computed using the estimates of Card et al. (1980) for skilled typists (Table 4).

Table 4. Estimates of execution times of operations in the original KLM (Card et al., 1980), in milliseconds

Operation  Description                                  Estimate
H          homing (hand movement)                       400
K          key press for skilled typist or mouse click  200
M          mental activity (pause)                      1350
P          pointing movement                            1100


Finally, for each task, we averaged the predictions of the execution times for the sequences of the 3 analysts.
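The prediction amounts to summing the estimate of each operation of a reference sequence, then averaging over the 3 analysts. A minimal Python sketch, using the estimates listed above; the three sequences are made up for the example. Operations without a listed estimate (such as R in KLM) are not handled here and would raise a KeyError.

```python
KPC_MS = {"C": 260, "K": 360, "P": 820, "Q": 1000}
KLM_MS = {"H": 400, "K": 200, "M": 1350, "P": 1100}


def predicted_time_ms(sequence, estimates):
    """Sum the estimated execution time of each operation of the sequence."""
    return sum(estimates[operation] for operation in sequence)


def model_prediction_ms(reference_sequences, estimates):
    """Average the predictions over the reference sequences of the analysts."""
    predictions = [predicted_time_ms(seq, estimates) for seq in reference_sequences]
    return sum(predictions) / len(predictions)


# Made-up KPC reference sequences of 3 analysts for one task.
references = [
    ["P", "C", "K", "K"],       # 820 + 260 + 360 + 360 = 1800 ms
    ["P", "C", "K", "K", "K"],  # 2160 ms
    ["P", "C", "P", "C", "K"],  # 2520 ms
]
kpc_prediction = model_prediction_ms(references, KPC_MS)
klm_example = predicted_time_ms(["H", "P", "K", "M", "K"], KLM_MS)
print(kpc_prediction, klm_example)
```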


Comparison of KPC and KLM

We compared the prediction of execution time obtained with KPC and KLM with the actual performance of the participants on tasks 1-6. 

We computed the average and the standard deviation of the execution time of participants 1-20 on trials 2-5 for each task.

Then we computed the bias (difference between average execution time of participants and model prediction) and the dispersion of the participants' execution time (standard deviation/average).

The dispersion indicates the limit of precision of any model, biased or not. 

For instance, a dispersion of 0.5 means that the average distance between the prediction of an unbiased model and the actual execution time of a trial is 50% of the predicted execution time.
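The two indicators can be sketched in Python as follows. The sample values are made up, and the use of the sample standard deviation (statistics.stdev) rather than the population one is an assumption, since the text does not specify it.

```python
import statistics


def bias_and_dispersion(execution_times, prediction):
    """Return (bias, relative bias, dispersion) for one task.

    bias: average execution time of the participants minus model prediction.
    relative bias: bias divided by the average execution time.
    dispersion: standard deviation divided by the average execution time.
    """
    average = statistics.mean(execution_times)
    bias = average - prediction
    dispersion = statistics.stdev(execution_times) / average
    return bias, bias / average, dispersion


# Made-up execution times (s) and model prediction (s).
bias, relative_bias, dispersion = bias_and_dispersion([40.0, 50.0, 60.0], 45.0)
print(bias, relative_bias, dispersion)
```

In this example the model underestimates the average execution time by 5 s (10% of the average), and the dispersion is 0.2.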


See result in the following spreadsheet


(see legend in spreadsheet) 

Figure 1 depicts the bias for each task (all levels of training together) and Figure 2 depicts the bias at different levels of training for each task, and the average (across all tasks).



Figure 1. Bias for each task. X-axis: task. Y-axis: relative bias (average bias of task for participants 1-20, repetitions 2-5, as a percentage of the average execution time) for KPC (light grey) and KLM (dark grey). Transparent bar: dispersion. The rightmost bars represent the grand average of bias and dispersion (averages across tasks 1-6).


Counts of the repeated/novel sequences of the participants

We first determined whether skilled users employed stereotyped sequences or not. To do so, we counted the repeated sequences in 3 ways:

identical sequences: each operation is the same, at the same place. This is a strict condition. As expected, there were few (0.4%).

identical operations: the numbers of operations K, P, C and Q are the same, whatever the order. This means that i) the KPC predicts the same execution time for these sequences and ii) permuting the order of the operations (e.g., filling fields in any order) is considered the same strategy. About 2.5% of the sequences had identical operations.

identical prediction: the execution time predicted by KPC is identical, even if the operations are different. This means that the KPC predicts that the sequences are equally efficient. There were about 10% of sequences with identical prediction.
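The three criteria can be expressed as three comparison keys, as in the following Python sketch. The counting rule is an assumption made for this example (a sequence counts as repeated when at least one other sequence in the same list has the same key), and the sequences are made up; the original counts were obtained in a spreadsheet.

```python
from collections import Counter

KPC_MS = {"C": 260, "K": 360, "P": 820, "Q": 1000}


def exact(sequence):
    """Identical sequences: same operations at the same places."""
    return tuple(sequence)


def operation_counts(sequence):
    """Identical operations: same numbers of K, P, C and Q, in any order."""
    counts = Counter(sequence)
    return tuple(counts[kind] for kind in "KPCQ")


def prediction(sequence):
    """Identical prediction: same execution time predicted by KPC."""
    return sum(KPC_MS[operation] for operation in sequence)


def repeated_fraction(sequences, key):
    """Fraction of sequences sharing their key with at least one other."""
    keys = [key(sequence) for sequence in sequences]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] > 1) / len(sequences)


# Made-up sequences, written as strings of operation symbols.
sequences = ["PCKK", "PCKK", "KKPC", "PCPC", "KKKKKK"]
print(repeated_fraction(sequences, exact))
print(repeated_fraction(sequences, operation_counts))
print(repeated_fraction(sequences, prediction))
```

Each criterion is looser than the previous one, so the fraction of repeated sequences can only grow from the first to the third.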

See the results in the spreadsheet: there is one sheet per task, with a summary on the 'result' sheet.


Optimality of the sequences of the participants

The KPC model hypothesizes that skilled computer users become expert very fast at a novel task. We thus compared the sequences of the participants with those of the analysts, i.e., the reference sequences.

If analysts are experts and participants are not, the reference sequences should be 'better' than those of the participants.

Note that it was pointless to compare execution times, because they depend markedly on individual motor skills. Therefore we compared the sequences according to their predicted execution time in KPC. 
The shorter the predicted execution time, the more efficient, in theory, the strategy underlying the sequence.

We thus computed the predicted execution time of each sequence (participants 1-20, tasks 1-6, trials 2-5) using the KPC model. 

Then, for each task, we compared the predicted execution time with:

the most efficient reference sequence (among the 3 analysts'). Sequences faster than this sequence were marked as optimal.

the prediction of KPC, i.e., the average of the 3 reference sequences. Sequences faster than this average were marked as efficient.
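The two markings can be sketched in Python as follows; the predicted times are made-up values in milliseconds.

```python
def mark_sequence(predicted_ms, reference_ms):
    """Return (optimal, efficient) for one participant sequence.

    optimal: faster than the most efficient reference sequence.
    efficient: faster than the average of the reference sequences.
    """
    best = min(reference_ms)
    average = sum(reference_ms) / len(reference_ms)
    return predicted_ms < best, predicted_ms < average


# Made-up KPC predictions for the 3 reference sequences of one task.
reference_ms = [1800, 2160, 2520]  # best = 1800, average = 2160
for predicted_ms in (1700, 2000, 2300):
    print(predicted_ms, mark_sequence(predicted_ms, reference_ms))
```

Since the best reference is never slower than the average, every optimal sequence is also efficient.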

Figure 2 depicts the proportions of optimal and efficient sequences for each task and Figure 3 depicts the distributions of predicted execution time. The numbers are in:

comparison of KPC and KLM, repeated and optimal sequences (xls) 

Note. Given the low proportion of redundant sequences, it was pointless to eliminate them: the results would have been similar.




Figure 2. Proportion of optimal (light grey) and efficient (dark grey) sequences for each task. X-axis: task. Y-axis: proportion of sequences. Rightmost columns: grand average of the proportions across the 6 tasks.

Figure 3. Histograms of predicted execution time for the participants' sequences. X-axis: predicted time in seconds. Y-axis: frequency of sequences with a given predicted execution time. Thick bar: model prediction, i.e., average of the 3 analysts' sequences. The left part of the distribution is efficient, i.e., the sequences of the participants are in theory more efficient than the analysts'.





How to run the test


The test sequencer is not provided here. You can nonetheless run the test as follows.

Download printed instructions (zip) and unzip anywhere

Download Basic Key Logger from here and follow installation instructions.

Open Word(R), Excel(R) and Powerpoint(R), run Basic Key Logger, and follow the printed instructions.

When you are done, close Basic Key Logger (deiconify and click on window). 


Two logs are produced: Key log (raw, low level events) and KPC log (log of basic operations).

By default they are in subfolder data/

Note. The paths to the logs are defined in the configuration files (.ini) in the directory of Basic Key Logger.

Open the log files with Excel (format = tab delimited or comma delimited, according to the configuration of Basic Key Logger). 

Note that without the test sequencer, you have to identify the trials' start and stop manually in the logs.
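As an alternative to Excel, a delimited log can also be read with a short script. The following Python sketch only splits the content into rows; the sample content and its columns are made up, since the actual layout of the logs depends on the configuration of Basic Key Logger.

```python
import csv
import io


def read_log(text, delimiter="\t"):
    """Parse the content of a delimited log file into a list of rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for row in reader if row]  # skip empty lines


# Made-up content; use delimiter="," for a comma-delimited log.
sample = "K\t0\t100\nP\t150\t600\n"
rows = read_log(sample)
print(rows)
```

To read an actual file, its content can be obtained with open(path, newline="").read() before calling read_log.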






Copyright: (c) 2009 E.J. Fimbel, P-S Dube. This is open-access content distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.