Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets

Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova, Nurjahan Begum, Yifei Ding, Hoang Anh Dau, Diego Furtado Silva, Abdullah Mueen, and Eamonn Keogh

Introduction

This page was built in support for the paper Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets. The resources contained in this page include demonstration videos, source code for the proposed STAMP algorithm, source code necessary for reproducing the figures and experiments, additional results, and supporting materials.

Demo Video

Anytime STAMP

The following video shows how STAMP's anytime property can help us quickly identify the motifs we shown in Figure 8.

STAMPI

The following video shows STAMPI in action. The time series used in the video is the same time series used for plotting Figure 14 and Figure 15 in paper.

Time Series Set Similarity

The following video demonstrates the utility of STAMP in terms of discovering conserved patterns between two time series. The example we shown in the video is mentioned in the paper's Section IV.D.

Source Code

STAMP

Figure

Table

Additional Result

Additional Scalability Test Result

The table below shows STAMP's runtime difference for different "types" of time series length (i.e., power of two, non-power of two but composition number, and prime number.) The runtime on time series with power of two length is at least 14% faster than other types of length given that all the length are close to 131,272.

To demonstrate the relative runtime difference graphically, we have converted above table into the following plot.

The code for this set of experiments can be download here.

Additional Time Series Difference (TSD) Result

To further demonstrate the functionality of TSD operator defined in Section IV.D, we applied the method on a ECG telemetry data in addition to the Harry Potter example we included in the paper. We took 2.67 minutes long ECG data of an 89 year old female patient from the mitdb/207 dataset. We downsampled the data by a factor of 15 and treated the first half of the time series as TA and the second half as TB. The patient information is available here. We ran BA join on this data. From the PBA curve, the high values indicate instances present in TB but not in TA, i.e. the TSD operation. From the annotation of this data, we found two visually dissimilar occurrences of Ventricular escape beat (E) and left bundle branch block beat (L).

The code for this set of experiments can be download here.

Supporting Material