Longitudinal Rank Sum Test

The Problem

Clinical trials—especially in Alzheimer’s Disease (AD) and Parkinson’s Disease (PD)—don’t just measure one outcome. They track multiple variables (like cognition and motor function) across many time points. This is known as multivariate longitudinal data. But analyzing such data is no easy feat.

Traditional statistical tools like linear mixed models (LMMs) or univariate rank tests:

Either assume a parametric structure such as normality, which often doesn't hold,
Or analyze each outcome separately, which requires multiple-comparison corrections that can reduce statistical power.

In simpler terms, we’ve had to choose between making too many assumptions or losing power. That’s where our new method comes in.

Our Solution: The Longitudinal Rank-Sum Test (LRST)

We introduce LRST, a nonparametric, rank-based test for comparing two groups across multiple outcomes and multiple time points, all at once. Nonparametric here means it does not assume a particular distribution for the outcomes.

It’s inspired by the classic Wilcoxon rank-sum test—but extended to the world of longitudinal, multivariate clinical trials.

What Makes LRST Special?

Global View: Tests the overall treatment effect across all outcomes and time points—no need to run many individual tests.
Robust: Doesn’t assume normality or equal variances. Handles messy, real-world data like ties and skewness gracefully.
Efficient: Controls Type I error and improves power—helping detect true treatment effects with smaller sample sizes.

A Peek for Stat-Savvy Readers

The Longitudinal Rank-Sum Test (LRST) extends the Wilcoxon rank-sum test to handle multivariate longitudinal data—multiple outcomes measured repeatedly over time in two treatment groups.

Suppose you have:

Two groups: Control and Treatment
K different outcomes (e.g., cognitive score, daily function, motor skills)
T time points (e.g., weeks 13, 26, 39...)
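
To make that setup concrete, here is a minimal Python sketch of the general idea. It is an illustration under stated assumptions, not the estimator from the paper: it assumes complete data stored as subjects[i][k][t] (subject i, outcome k, time point t), assumes larger values mean better outcomes, ranks all subjects within each outcome/time cell, and averages the treatment-minus-control difference in mean ranks. The p-value comes from permuting whole-subject group labels, which is swapped in here for the paper's asymptotic normal approximation. All function names are hypothetical.

```python
import random

def midranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def rank_sum_stat(subjects, is_treated):
    """Mean rank difference (treatment minus control), averaged over
    every outcome/time cell. Assumes no missing values."""
    n_out, n_time = len(subjects[0]), len(subjects[0][0])
    n_trt = sum(is_treated)
    n_ctl = len(subjects) - n_trt
    total = 0.0
    for k in range(n_out):
        for t in range(n_time):
            r = midranks([s[k][t] for s in subjects])
            trt = sum(ri for ri, g in zip(r, is_treated) if g) / n_trt
            ctl = sum(ri for ri, g in zip(r, is_treated) if not g) / n_ctl
            total += trt - ctl
    return total / (n_out * n_time)

def permutation_pvalue(subjects, is_treated, n_perm=2000, seed=0):
    """One-sided p-value from shuffling group labels across subjects.
    Shuffling whole subjects keeps each subject's outcomes together."""
    rng = random.Random(seed)
    observed = rank_sum_stat(subjects, is_treated)
    labels = list(is_treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if rank_sum_stat(subjects, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because ranks are computed within each outcome/time cell, outcomes measured on different scales can be combined without rescaling.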

We establish the asymptotic normality of the test statistic, and extensive simulations show that the method performs on par with a linear mixed-effects model in the linear setting and outperforms it in the presence of nonlinearity.

We also provide real-data analyses of the BAPI 302 trial and the Azilect study. For more information, see our paper published in Statistics in Biopharmaceutical Research.

Multi-Arm LRST: A Powerful Nonparametric Tool for Modern Clinical Trials

In many modern randomized clinical trials (RCTs)—especially in Alzheimer's and Parkinson’s research—multiple treatment arms (e.g., varying doses) are compared against a shared control, and multiple outcomes are measured across time. Traditional approaches often struggle here due to strict assumptions and the need for multiple testing corrections.

To address this, we propose a multi-arm extension of the Longitudinal Rank-Sum Test (LRST): a robust, nonparametric method that scales beyond two groups and still leverages the longitudinal and multivariate nature of the data.

Most methods for multi-arm trials fall into two camps:

Parametric models like LMMs or GEEs, which assume normality and are sensitive to outliers.
Multiple univariate rank tests, which need heavy multiplicity correction, reducing power.

The multi-arm LRST solves both problems by:

Using U-statistics to construct pairwise rank comparisons between control and each treatment arm.
Aggregating results into a global max-type test statistic that controls Type I error without Bonferroni correction.
Capturing treatment efficacy across all time points and outcomes.
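
As a rough illustration of the pairwise rank comparisons in the list above, the sketch below computes the classic Mann-Whitney U-statistic estimate of P(control < arm) + 0.5 * P(control = arm) for each treatment arm against a shared control. It covers a single outcome at a single visit only; the aggregation over outcomes and time points used by the multi-arm LRST is not reproduced here, and the function names are hypothetical.

```python
def mann_whitney_effect(control, arm):
    """U-statistic estimate of P(control < arm) + 0.5 * P(control == arm).
    Ties contribute one half, so the estimate is 0.5 when the two
    samples are identical."""
    wins = sum((x < y) + 0.5 * (x == y) for x in control for y in arm)
    return wins / (len(control) * len(arm))

def arm_effects(control, arms):
    """Effect of each arm versus the shared control, centred so that
    0 means no difference and positive values favour the arm
    (assuming larger values are better)."""
    return {name: mann_whitney_effect(control, values) - 0.5
            for name, values in arms.items()}

# Example with made-up scores for two hypothetical dose arms.
effects = arm_effects([1, 2, 3], {"low": [1, 2, 3], "high": [4, 5, 6]})
```

Every arm is compared with the same control sample, so the resulting effects are correlated with one another, and a global test has to account for that correlation.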

For each treatment group, it computes a rank-based treatment effect compared to the control. It then takes the maximum of these standardized differences as the test statistic.

Under the null hypothesis (no treatment effect), this statistic asymptotically follows the distribution of the maximum of correlated normal variables, and p-values are computed numerically from that distribution. Extensions are provided for four or more treatment arms.
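
To show how a p-value for a max-type statistic can be computed numerically, here is a minimal sketch under a simplifying assumption that is ours, not the paper's: the standardized arm statistics are jointly standard normal with one common correlation rho, which the user must supply. In that case each statistic can be written as Z_i = sqrt(rho) * W + sqrt(1 - rho) * E_i with independent standard normal W and E_i, and the probability reduces to a one-dimensional integral over W. The function name and parameters are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal

def max_normal_pvalue(z_max, n_arms, rho, grid=4001, span=8.0):
    """P(max_i Z_i >= z_max) for n_arms standard normals that share a
    common correlation rho in [0, 1). Integrates over the shared
    factor W with the trapezoid rule on [-span, span]."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    step = 2 * span / (grid - 1)
    prob_all_below = 0.0
    for i in range(grid):
        w = -span + i * step
        weight = 0.5 if i in (0, grid - 1) else 1.0  # trapezoid end points
        # Given W = w, the Z_i are independent, so the probability that
        # all of them fall below z_max is a single CDF raised to n_arms.
        cond = _STD.cdf((z_max - rho ** 0.5 * w) / (1 - rho) ** 0.5)
        prob_all_below += weight * cond ** n_arms * _STD.pdf(w) * step
    return 1.0 - prob_all_below
```

With positive correlation this p-value is smaller than the Bonferroni bound n_arms * P(Z >= z_max), which is where a max-type test gains power over separately corrected tests.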

Across various sample sizes and up to 7 treatment arms, the method controls Type I error near the nominal level.

In power simulations, the multi-arm LRST:

Outperformed Bonferroni-corrected LRSTs consistently
Accurately identified the most effective dose
Handled unequal effects across doses with ease

For more information, please see our arXiv preprint. The paper was recently accepted by the Journal of Multivariate Analysis (JMVA).

Designing Stronger Trials: Power and Sample Size for LRST

One of the most common questions when designing a clinical trial is: How many participants do we need to detect a meaningful treatment effect? This becomes especially tricky in modern trials—like those in Alzheimer's or Parkinson’s disease—where researchers collect multiple outcomes over multiple time points.

Our previously developed Longitudinal Rank-Sum Test (LRST) offers a nonparametric, robust framework for analyzing such data. But to fully support trial design, we needed a way to calculate power and sample size specifically tailored to LRST.

This paper fills that gap.

Multivariate longitudinal trials often involve:

Multiple domains (e.g., cognition, function, motor symptoms)
Repeated measures (e.g., 6–10 visits per subject)
Non-normal or ordinal data (e.g., rating scales)

Standard power calculations—based on linear models—don’t apply here. Our goal: develop a power/sample size method that works under the rank-based, nonparametric framework of LRST.

We developed a complete toolkit to answer:

How large should the trial be?
What’s the power of LRST given real data or assumed effect sizes?

This involved:

Deriving the asymptotic distribution of LRST under both null and alternative hypotheses.
Estimating the variance-covariance structure from data.
Creating closed-form power and sample size formulas.
Validating through extensive simulations and real trial data.
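
The general shape of such a calculation can be sketched as follows. This is the generic normal-approximation formula for a one-sided test, not the closed-form LRST formulas from the paper: it assumes the estimated effect is asymptotically normal, so that sqrt(n) times the estimate has mean sqrt(n) * effect and standard deviation sd, with n subjects per group. The inputs effect and sd, and both function names, are hypothetical placeholders for quantities that would be estimated from pilot data.

```python
from math import ceil
from statistics import NormalDist

_STD = NormalDist()  # standard normal

def power_one_sided(n_per_group, effect, sd, alpha=0.05):
    """Approximate power of a one-sided level-alpha test when the
    standardized statistic is N(sqrt(n) * effect / sd, 1)."""
    z_alpha = _STD.inv_cdf(1 - alpha)
    return _STD.cdf(n_per_group ** 0.5 * effect / sd - z_alpha)

def sample_size_one_sided(effect, sd, alpha=0.05, power=0.8):
    """Smallest n per group reaching the target power under the same
    approximation: n = ((z_alpha + z_beta) * sd / effect) ** 2."""
    z_alpha = _STD.inv_cdf(1 - alpha)
    z_beta = _STD.inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sd / effect) ** 2)

# Example with assumed (not estimated) inputs.
n_needed = sample_size_one_sided(effect=0.1, sd=0.5)
```

Both functions invert the same relationship, so plugging the returned sample size back into power_one_sided gives at least the target power.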

The paper has been submitted to Statistics in Medicine and is also available on arXiv.
