There are numerous statistical analysis methods, and mastering them can take years. For simple A/B tests on the web, however, it's enough to know how to apply a few simple techniques. This page shows you how.

Tools

R

The R scripts on this page make it easy to do the kinds of analysis we do in IS211. R is available for free on Windows, Macintosh, and Linux.

Running Examples

Getting Help

Other Tools

You may use JASP Notion.

You can also run your t-test in Microsoft Excel. Here is some supplementary information about running a t-test in Excel.

These tools are not supported, but have been used effectively by others in IS211.

Variable Types

Interval Variable Tests (Normal Distributions Only)

The most common case in IS211 is to compare the mean of some interval variable between version A and version B of an interface. The statistical test we use depends on the type of experiment we are running.
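As a rough illustration of how the choice of test follows the experiment type, here is a minimal sketch using base R's t.test. The task times and the variable names (time_a, time_b, res_between, res_within) are made up for this example; substitute your own data.

```r
# Hypothetical task times in seconds (made-up data for illustration only).
time_a <- c(12.1, 14.3, 11.8, 15.2, 13.4, 12.9, 14.8, 13.1)
time_b <- c(10.2, 12.5, 11.1, 12.8, 11.9, 10.7, 12.2, 11.5)

# Between-subjects: different participants used A and B.
# t.test defaults to the unpaired Welch test, which does not assume equal variances.
res_between <- t.test(time_a, time_b)

# Within-subjects: the same participants used both versions.
# The i-th value in time_a and time_b must come from the same participant.
res_within <- t.test(time_a, time_b, paired = TRUE)

print(res_between$p.value)
print(res_within$p.value)
```

A p-value below the threshold you chose in advance (commonly 0.05) suggests that the difference in means is unlikely to be due to chance alone.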

Note that these tests can only be used if the variable is normally distributed. Check the variable's histogram and use a test for ordinal variables if your variable does not look normally distributed. Variables that are often normally distributed include task time, number of clicks, and money.
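One way to check a histogram in R is sketched below. The data and the name task_time are hypothetical. The Shapiro-Wilk test at the end is an optional numeric check that goes beyond the eyeball check described above; treat it as a supplement, not a replacement for looking at the plot.

```r
# Hypothetical task times in seconds (made-up data for illustration only).
task_time <- c(12.1, 14.3, 11.8, 15.2, 13.4, 12.9,
               14.8, 13.1, 16.0, 12.4, 13.7, 14.1)

# Draw the histogram and keep the bin counts it returns.
# Look for a roughly symmetric, bell-like shape without a long tail.
h <- hist(task_time, main = "Task time", xlab = "Seconds")

# Optional: Shapiro-Wilk normality test. A small p-value is evidence
# against normality; a large one does not prove the data are normal.
sw <- shapiro.test(task_time)
print(sw$p.value)
```

With small samples, both the histogram and the test are only rough guides, so when in doubt it is safer to fall back on a test for ordinal variables.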

Ordinal Variable Tests (Also for Non-Normal Interval Variables)

Use one of the following tests to compare responses on an ordinal variable, such as Likert scale responses, between version A and version B of an interface. Again, the statistical test we use depends on the type of experiment we are running.

You should also use these tests instead of t-tests for comparing means of interval variables whenever the histogram for the variable does not look normally distributed (for example, when it is heavily skewed in one direction). Variables like number of errors are usually skewed toward 0, so we often don't bother with t-tests at all in this case.
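A minimal sketch of this kind of comparison in R follows. It assumes the rank-based Wilcoxon tests in base R (wilcox.test) are suitable stand-ins for the tests meant here; the Likert ratings and variable names are made up for illustration.

```r
# Hypothetical 5-point Likert ratings (made-up data for illustration only).
rating_a <- c(3, 4, 2, 5, 4, 3, 4, 2, 5, 3)
rating_b <- c(4, 5, 4, 5, 5, 3, 5, 4, 5, 4)

# Between-subjects: Wilcoxon rank-sum test (also called Mann-Whitney U).
# exact = FALSE requests the normal approximation, since Likert data
# almost always contain ties and an exact p-value cannot be computed.
res_between <- wilcox.test(rating_a, rating_b, exact = FALSE)

# Within-subjects: Wilcoxon signed-rank test on paired ratings.
# The i-th value in each vector must come from the same participant.
res_within <- wilcox.test(rating_a, rating_b, paired = TRUE, exact = FALSE)

print(res_between$p.value)
print(res_within$p.value)
```

These tests work on the ranks of the values, not the values themselves, which is why they tolerate skewed data and ordinal scales.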

(Are you wondering why we use the t-test at all? It's because the t-test has more power to detect significant results. In other words, it takes fewer participants to see significant results. So use the t-test if you can.)

Nominal Variable Tests

Comparing Counts

The following test compares counts of a nominal variable. This is useful for comparing how often an event occurs between two versions of an interface. For example, how many users succeed in completing a task or click a link when using Version A vs. Version B? The following test will help you check whether or not these differences are significant. This test works for both within-subjects and between-subjects experiments.
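As a sketch of how such a comparison of counts might look in R, the example below assumes a chi-squared test of independence (chisq.test in base R) on a 2x2 table. The counts and names are hypothetical.

```r
# Hypothetical counts of task outcomes (made-up data for illustration only).
# Rows are interface versions; columns are outcomes.
counts <- matrix(c(42, 58,
                   61, 39),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(version = c("A", "B"),
                                 outcome = c("success", "failure")))

# Chi-squared test of independence between version and outcome.
# For 2x2 tables, chisq.test applies Yates' continuity correction by default.
res <- chisq.test(counts)

print(res$statistic)
print(res$p.value)
```

A small p-value suggests that the success rate depends on which version the user saw.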

Comparing Preferences

In within-subjects experiments, it's fairly common to ask participants to choose between two alternatives (e.g. "Do you prefer A or B?"). A binomial test can help you determine if you have enough participants to show that this preference is significant.
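A minimal sketch with base R's binom.test is shown below. The numbers are hypothetical: suppose 15 of 20 participants say they prefer A. The null hypothesis is that there is no preference, so each choice has probability 0.5.

```r
# Hypothetical preference data (made-up numbers for illustration only).
prefer_a <- 15   # participants who chose A
n <- 20          # total participants who expressed a preference

# Two-sided exact binomial test against "no preference" (p = 0.5).
res <- binom.test(prefer_a, n, p = 0.5)

print(res$p.value)
```

If the p-value is above your threshold, the observed preference could plausibly be chance, and you may need more participants before you can claim a real preference.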