Finding and Visualizing Time Series Motifs of All Lengths Using the Matrix Profile
Scalable KInetoscopic Matrix Profile (SKIMP) is a family of algorithms that compute the Pan Matrix Profile (PMP), a new data structure that contains the nearest neighbor information for all subsequences of all lengths. This data structure enables the first truly parameter-free motif discovery algorithm in the literature.
In exploratory data mining, we may have no idea of the subsequence lengths at which patterns are conserved in a dataset, which necessitates variable-length motif discovery.
This basic problem arises in nearly all domains, because the user's choice of subsequence length limits which regularities can be found in the dataset.
In many cases, a suitable subsequence length for motif discovery is not readily apparent. This problem is exacerbated when a time series contains multiple motifs of different lengths.
In this work, we solve the motif-length sensitivity problem by introducing the Pan Matrix Profile, a data structure that contains all Matrix Profile information of a time series, and SKIMP, a family of parameter-free, anytime algorithms that quickly approximate the Pan Matrix Profile.
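To make the data structure concrete, the following is a minimal brute-force sketch of a Pan Matrix Profile: one z-normalized Matrix Profile row per candidate subsequence length. It is not SKIMP, which approximates this structure in an anytime fashion; it computes every row exhaustively and is only practical for short series. The function names, the exclusion zone of half the subsequence length, and the NaN padding of shorter rows are assumptions made for this illustration, not details taken from the text.

```python
import numpy as np


def matrix_profile_naive(ts, m):
    """Brute-force z-normalized Matrix Profile for one subsequence length m.

    Returns (profile, index): for each subsequence, the distance to its
    nearest non-trivial neighbor and that neighbor's start position.
    """
    ts = np.asarray(ts, dtype=float)
    subs = np.lib.stride_tricks.sliding_window_view(ts, m)
    mu = subs.mean(axis=1, keepdims=True)
    sd = subs.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # constant subsequences z-normalize to all zeros
    z = (subs - mu) / sd

    # All pairwise Euclidean distances between z-normalized subsequences.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)

    # Assumed exclusion zone: ignore trivial matches closer than m // 2.
    pos = np.arange(len(z))
    excl = max(1, m // 2)
    dist[np.abs(pos[:, None] - pos[None, :]) < excl] = np.inf

    return dist.min(axis=1), dist.argmin(axis=1)


def pan_matrix_profile_naive(ts, lengths):
    """Stack one Matrix Profile per length into a 2-D array.

    Longer lengths yield fewer subsequences, so their rows are padded
    with NaN (distances) and -1 (indices) on the right.
    """
    ts = np.asarray(ts, dtype=float)
    lengths = list(lengths)
    width = len(ts) - min(lengths) + 1
    pmp = np.full((len(lengths), width), np.nan)
    pmp_idx = np.full((len(lengths), width), -1, dtype=int)
    for row, m in enumerate(lengths):
        profile, index = matrix_profile_naive(ts, m)
        pmp[row, : len(profile)] = profile
        pmp_idx[row, : len(index)] = index
    return pmp, pmp_idx


# Usage: plant the same 20-point pattern at positions 10 and 70 in noise.
rng = np.random.default_rng(0)
ts = rng.standard_normal(120)
pattern = np.sin(np.linspace(0, 4 * np.pi, 20))
ts[10:30] = pattern
ts[70:90] = pattern
pmp, pmp_idx = pan_matrix_profile_naive(ts, [10, 20])
```

Each row of `pmp` can be read as an ordinary Matrix Profile for its length, so low values that persist across neighboring rows suggest a motif that is conserved over a range of lengths.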