How to measure a test?

Test Design

Running tests and experiments is how we learn. There are two major questions you need to answer to design a successful test:

  • What are you trying to measure?
  • How do you know if you have been successful?

Answering the first question helps you work out what you can and cannot measure, which in turn shapes your framework for success. The second question matters because a poorly designed test is in danger of delivering false positives (reporting an effect that is not real) and false negatives (missing an effect that is real). This is a common pitfall.

Measuring Success

To measure success we need to build in adequate controls to act as a baseline reference.

There are two types of controls: control groups and control periods. Some experiments use both to evaluate the test results; others rely on just one.

A control group is a matched group of people or stores whose performance is tracked at the same time as the test group. Comparing results between the two groups helps establish whether the test has been successful. This is subject to a few conditions, one being that the test has enough statistical power to detect a real effect in the data.
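As a rough illustration of comparing a test group against a control group, the sketch below computes the difference in group means and an approximate two-sided p-value. The function name compare_groups and the sales figures are hypothetical, and the normal approximation used here is only a stand-in for whatever test your analysis actually calls for; with small samples a t-test would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_groups(test, control):
    """Return (difference in means, approximate two-sided p-value).

    Uses an unequal-variance standard error with a normal approximation,
    which is an assumption made for this sketch.
    """
    diff = mean(test) - mean(control)
    # Standard error of the difference, allowing unequal group variances
    se = sqrt(variance(test) / len(test) + variance(control) / len(control))
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, p_value


# Hypothetical weekly sales per store, measured over the same weeks
test_sales = [105, 110, 98, 120, 115, 108, 112, 109]
control_sales = [100, 102, 97, 105, 99, 101, 103, 98]

diff, p_value = compare_groups(test_sales, control_sales)
print(f"difference in means = {diff:.2f}, approximate p-value = {p_value:.4f}")
```

A small p-value suggests the gap between the groups is unlikely to be noise alone, but it says nothing about whether the groups were well matched in the first place.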

A control period is a matched time period. Typically, when we design a test we use a pre period and a post period to look at the results. The pre period is a period of time before the test began, and the post period is a period after the test began. Both should be the same length and be representative of store or customer behavior.
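When a test uses both a control group and a control period, one common way to combine them is a difference-in-differences calculation: the change in the test group from pre to post, minus the change in the control group over the same periods. The text above does not prescribe this method; the sketch below is one option, and the function name and figures are hypothetical.

```python
from statistics import mean


def diff_in_diff(test_pre, test_post, control_pre, control_post):
    """Estimate the test effect as (test change) minus (control change)."""
    test_change = mean(test_post) - mean(test_pre)
    control_change = mean(control_post) - mean(control_pre)
    return test_change - control_change


# Hypothetical weekly sales; pre and post periods are the same length
effect = diff_in_diff(
    test_pre=[100, 104, 98, 102],
    test_post=[112, 115, 109, 114],
    control_pre=[99, 101, 100, 100],
    control_post=[103, 104, 102, 103],
)
print(f"estimated effect = {effect:.2f}")
```

Subtracting the control group's change removes movement that would have happened anyway, such as seasonality, provided the two groups would otherwise have followed similar trends.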

Harnessing Statistical Power

A well-designed test is optimized to detect a difference, and is also built around enough observations to give its results a solid foundation. A power calculation can be used to work out how many observations are required, based on the magnitude of the expected outcome.
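A minimal sketch of such a calculation, assuming a two-sided comparison of two group means with equal group sizes and a known standard deviation. It uses the standard approximation n = 2 * ((z_alpha + z_power) * sd / diff)^2 per group. The function name, the 5% significance level, and the 80% power default are assumptions for illustration, not settings taken from the product.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(expected_diff, std_dev, alpha=0.05, power=0.8):
    """Approximate observations needed per group to detect expected_diff."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * std_dev / expected_diff) ** 2
    return ceil(n)


# Hypothetical inputs: the smaller the expected difference, the more data is needed
n_small_effect = required_sample_size(expected_diff=2, std_dev=10)
n_large_effect = required_sample_size(expected_diff=5, std_dev=10)
print(n_small_effect, n_large_effect)
```

Because the expected difference is squared in the denominator, halving the effect you hope to detect roughly quadruples the observations required.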

Using Extracts and Collections

These two features can be used to create an auto-updating measurement framework for tests. This takes the strain out of repeated measurement and data collection. Go here to learn more.

To specify a control group and control period, we have built a simple selector into our UI. For the control period, this is a relatively straightforward date selector. For the control group, you first need to specify a control / test group variable: a field that will be used to divide the data into two groups. You then need to specify which values of this variable correspond to test members and which correspond to control members.
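To make the selection logic concrete, the sketch below mimics it on a handful of records: a chosen field splits rows into test and control members, and a date splits them into pre and post periods. This is not the product's implementation. The function assign_cell, the field names region, date, and sales, and the sample rows are all hypothetical.

```python
from datetime import date


def assign_cell(record, group_field, test_values, control_values, test_start,
                date_field="date"):
    """Return (group, period) for a record, or None if it is in neither group."""
    value = record[group_field]
    if value in test_values:
        group = "test"
    elif value in control_values:
        group = "control"
    else:
        return None  # value was not assigned to either group
    period = "post" if record[date_field] >= test_start else "pre"
    return group, period


# Hypothetical records; "region" plays the role of the control / test group variable
records = [
    {"region": "North", "date": date(2024, 1, 10), "sales": 100},
    {"region": "North", "date": date(2024, 2, 10), "sales": 112},
    {"region": "South", "date": date(2024, 1, 10), "sales": 99},
    {"region": "South", "date": date(2024, 2, 10), "sales": 103},
    {"region": "East", "date": date(2024, 2, 10), "sales": 250},
]

cells = {}
for record in records:
    cell = assign_cell(record, "region", {"North"}, {"South"}, date(2024, 2, 1))
    if cell is not None:
        cells.setdefault(cell, []).append(record["sales"])

print(cells)
```

Rows whose value is assigned to neither group, such as the East region here, are left out of the measurement entirely.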