Rank Sum Tests

The total length of the videos in this section is approximately 20 minutes, but you will also spend time answering short questions while completing this section.

You can also watch these videos at the playlist linked here.

Rank sum test

The rank sum test is an example of a randomization/permutation test, and randomization/permutation tests are examples of "non-parametric tests." This is in contrast to "parametric" tests, such as a t-test or a regression, which estimate a quantity (an estimand) defined for a target population.

Non-parametric tests are particularly appropriate for small sample sizes or when you suspect that the assumptions underlying parametric tests (such as "normal," bell-curve-shaped data) do not hold. The rank sum test is a straightforward option when there is censoring. For randomized experiments, I prefer the randomization test even when the sample size is large and the assumptions for a parametric test may be true, because a randomization test requires no distributional assumptions: the random assignment itself justifies the test, and we can calculate the randomization distribution exactly.

RankSum.1.Intro.mp4

Question 1: If one of the data values is known only to be >12, which of the following could be used as test statistics for comparing two groups? Check all that apply.

Show answer

difference in two group medians, ratio of two group medians, difference in two group third quartiles

You can't take a mean of >12, but you do know how it compares to other observations (it's bigger). So, for any rank-based statistic, like the median or third quartile, you can move forward. Here, we introduce another popular rank-based test statistic.
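To see why a rank-based statistic survives censoring, here is a minimal sketch for a single group. The data values and the candidate "true" values for the censored observation are hypothetical, chosen only for illustration. Whatever the censored value really is (as long as it exceeds 12), the median stays the same, while the mean does not.

```python
from statistics import median

# Hypothetical fully observed values (all at most 12).
observed = [3, 7, 9, 11]

# Hypothetical true values, each consistent with "known only to be >12".
candidates = [12.5, 13, 1000]

# The median depends only on the ordering, so it is the same every time.
medians = [median(observed + [c]) for c in candidates]

# The mean depends on the unknown value, so it changes every time.
means = [sum(observed + [c]) / (len(observed) + 1) for c in candidates]

print(medians)
print(means)
```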

Rank sum test details

RankSum.2.Details.mp4

Question 2: Suppose that you observe the following data:

Group A: 10, 14

Group B: 4, 50, 100

What is the value of the rank sum statistic?

Show answer

Group A has fewer observations, so we add the ranks of the values in that group. 10 and 14 are the second and third ranked values among these five, so T = 2 + 3 = 5.
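The same calculation can be sketched in a few lines of Python. The helper name rank_sum is hypothetical, and the sketch assumes there are no tied values (ties would need average ranks, which are not handled here).

```python
def rank_sum(group, other):
    """Sum of the ranks of `group` within the pooled data (assumes no ties)."""
    pooled = sorted(group + other)
    # Rank 1 is the smallest pooled value, rank N the largest.
    return sum(pooled.index(value) + 1 for value in group)


group_a = [10, 14]
group_b = [4, 50, 100]

t_stat = rank_sum(group_a, group_b)
print(t_stat)  # 5
```

Summing the ranks of Group B instead gives 1 + 4 + 5 = 10. The two sums always add up to the total of all ranks (here 15), so either one carries the same information.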

p-values

RankSum.3.Pvalues.mp4

Question 3: The word "significance" is typically used when we decide that a p-value is small enough to reject a null hypothesis. As we've discussed, a conventional but arbitrary cutoff is 0.05. So, if we are comparing two groups, and the null hypothesis is that group status is unrelated to the values, a p-value less than 0.05 might lead us to say that the two groups are significantly different.

In the example in the video, are the two groups significantly different, according to the rank sum test?

Show answer

No. The p-value is bigger than 0.05 or any other reasonable cutoff.
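The video's data are not reproduced on this page, so the following minimal sketch uses the data from Question 2 instead. It enumerates every way of choosing which ranks belong to Group A and counts how often the rank sum is at least as extreme as the observed one. The two definitions of "extreme" shown here (lower tail, and distance from the expected value) are assumptions and may differ from the convention used in the video. The sketch also assumes no ties.

```python
from itertools import combinations

group_a = [10, 14]
group_b = [4, 50, 100]

pooled = sorted(group_a + group_b)
n_total, n_a = len(pooled), len(group_a)

# Observed rank sum for Group A (rank 1 is the smallest value).
observed = sum(pooled.index(v) + 1 for v in group_a)

# Expected rank sum if group status is unrelated to the values.
expected = (n_total + 1) * n_a / 2

# Rank sums for every possible assignment of n_a ranks to Group A.
all_sums = [sum(c) for c in combinations(range(1, n_total + 1), n_a)]

# One-sided p-value: rank sums as small as or smaller than observed.
lower_tail = sum(s <= observed for s in all_sums) / len(all_sums)

# Two-sided p-value: rank sums at least as far from the expected value.
two_sided = sum(
    abs(s - expected) >= abs(observed - expected) for s in all_sums
) / len(all_sums)

print(lower_tail, two_sided)  # 0.4 0.8
```

With only 10 possible assignments, the smallest achievable one-sided p-value is 1/10, so these five observations could never produce a p-value below 0.05.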

Why sum of ranks?

IntroToNPTests.10.WhyRankSum.mov

In case you are looking for details:

Consider summing the numbers from 1 to 4. N=4.

1 2 3 4

If we pair the numbers from the outside, we get (1 + 4) + (2 + 3) = 5 + 5 = 5 * 2.

The sum of each pair is 5 = N+1.

The number of pairs is N/2.

So, the sum of the numbers from 1 to N is (N+1) * N/2.

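The rank sum statistic is the sum of a subset of these numbers, say n1 of them. If group status is unrelated to the values, every rank is equally likely to land in the subset, so on average the subset receives a fraction n1/N of the total. Multiply the expression above by n1/N to obtain the expected value of the rank sum statistic: (N+1) * n1/2.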

For example, the sum of the numbers from 1 to 4 is 10, so if we sum any two of these numbers, the expected result (the average over all possible pairs) is 10 * 2/4 = 5.

This argument leads to the same result for odd N - feel free to try it!
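If you would rather check by brute force, the following sketch compares the formula (N+1) * n1/2 with the average rank sum over every possible subset, for both even and odd N. The function names and the particular (N, n1) pairs are hypothetical choices for illustration.

```python
from itertools import combinations
from statistics import mean


def expected_rank_sum(n_total, n_group):
    """Expected rank sum from the formula (N+1) * n1 / 2."""
    return (n_total + 1) * n_group / 2


def enumerated_mean(n_total, n_group):
    """Average rank sum over every subset of n_group ranks from 1..N."""
    subsets = combinations(range(1, n_total + 1), n_group)
    return mean(sum(subset) for subset in subsets)


# Includes even N (4) and odd N (5 and 7).
for n_total, n_group in [(4, 2), (5, 2), (7, 3)]:
    print(
        n_total,
        n_group,
        expected_rank_sum(n_total, n_group),
        enumerated_mean(n_total, n_group),
    )
```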

When to use the rank sum

IntroToNonParametricTests.11.WhenRankSum.mov

Question 4: Would the p-value from the rank sum example change if we replaced the 0 by -30 million?

Show answer

No. We say that the rank sum test is "resistant" to outliers.
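The video's data are not reproduced on this page, so this minimal sketch illustrates the same idea with the data from Question 2, replacing the smallest value (4) with -30 million. The helper name rank_sum is hypothetical, and the sketch assumes no ties. Because the ordering of the values is unchanged, the rank sum is unchanged, and so is any p-value computed from it. The difference in means, shown for contrast, changes enormously.

```python
from statistics import mean


def rank_sum(group, other):
    """Sum of the ranks of `group` within the pooled data (assumes no ties)."""
    pooled = sorted(group + other)
    return sum(pooled.index(value) + 1 for value in group)


group_a = [10, 14]
original_b = [4, 50, 100]
extreme_b = [-30_000_000, 50, 100]  # smallest value made far more extreme

# The rank sum only sees the ordering, which has not changed.
t_original = rank_sum(group_a, original_b)
t_extreme = rank_sum(group_a, extreme_b)
print(t_original, t_extreme)  # 5 5

# A mean-based statistic is not resistant to the outlier.
diff_original = mean(group_a) - mean(original_b)
diff_extreme = mean(group_a) - mean(extreme_b)
print(diff_original, diff_extreme)
```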

Question 5: Why are rank sum tests (and all randomization/permutation tests) appropriate for small sample sizes, not only large sample sizes?

Show answer

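Other methods, like t-tests, estimate the reference distribution by making assumptions. With small sample sizes, these assumptions of other methods (such as that the data follow a normal distribution) are harder to check, and violations of them matter more. A randomization test instead calculates its reference distribution exactly, whatever the sample size.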

That's it for this section.

During this tutorial you learned:


Terms and concepts:

Randomization test, rank sum test, censored data, significance, outlier, resistant