Forest Proximities for Proximity Forests

Project Summary

RF-GAP was recently introduced as an improved random forest proximity measure. In this paper, we present PF-GAP, an extension of RF-GAP proximities to proximity forests, an accurate and efficient time series classification model. We use the forest proximities with multidimensional scaling (MDS) to obtain vector embeddings of univariate time series, and compare these embeddings with those obtained from various time series distance measures. We also use the forest proximities with Local Outlier Factor (LOF) to investigate the connection between misclassified points and outliers, comparing against nearest neighbor classifiers that use time series distance measures. Our results suggest that the forest proximities exhibit a stronger connection between misclassified points and outliers than nearest neighbor classifiers do.
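
As a rough illustration of the pipeline summarized above, the sketch below takes a precomputed proximity matrix, converts it to dissimilarities, embeds the points with MDS, and scores them with a Local Outlier Factor. It is not the code from the notebooks. The toy matrix `prox`, the helper names, the conversion "dissimilarity = 1 - proximity", the symmetrization step, and the use of classical (Torgerson) MDS with a small NumPy LOF are all assumptions made for this example; in practice the matrix would hold PF-GAP proximities computed from a trained proximity forest.

```python
import numpy as np


def proximity_to_dissimilarity(prox):
    """Symmetrize a proximity matrix and map it to dissimilarities in [0, 1]."""
    prox = np.asarray(prox, dtype=float)
    sym = (prox + prox.T) / 2.0   # assumption: average the two directions
    sym = sym / sym.max()         # scale so the largest proximity is 1
    diss = 1.0 - sym              # assumption: dissimilarity = 1 - proximity
    np.fill_diagonal(diss, 0.0)   # a point has zero dissimilarity to itself
    return diss


def classical_mds(diss, n_components=2):
    """Classical (Torgerson) MDS via eigendecomposition of the centered Gram matrix."""
    n = diss.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (diss ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)      # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


def local_outlier_factor(diss, k=3):
    """LOF scores from a dissimilarity matrix; values well above 1 suggest outliers."""
    n = diss.shape[0]
    masked = diss + np.diag(np.full(n, np.inf))        # exclude each point from its own neighbors
    nbrs = np.argsort(masked, axis=1)[:, :k]           # indices of the k nearest neighbors
    nbr_dist = np.take_along_axis(masked, nbrs, axis=1)
    k_dist = nbr_dist[:, -1]                           # distance to the k-th neighbor
    reach = np.maximum(nbr_dist, k_dist[nbrs])         # reachability distance to each neighbor
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)           # local reachability density
    return lrd[nbrs].mean(axis=1) / lrd


# Toy proximities: points 0-4 are mutually similar, point 5 is far from all of them.
prox = np.full((6, 6), 0.8)
prox[5, :] = 0.1
prox[:, 5] = 0.1
np.fill_diagonal(prox, 1.0)

diss = proximity_to_dissimilarity(prox)
emb = classical_mds(diss, n_components=2)   # one 2-D vector per time series
lof = local_outlier_factor(diss, k=3)       # one outlier score per time series
```

Here `emb` plays the role of the vector embedding and `lof` the outlier score. Comparing the highest-scoring points against a classifier's misclassified points is one way to probe the connection described above, though the exact comparison used in the paper may differ.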

Datasets

Details of the 64 datasets from the UCR 2018 archive used in this paper are available at https://www.timeseriesclassification.com/dataset.php

Jupyter Notebooks

Notebook 1: Proximity-Based Time Series Analysis with LOF and F1 Evaluation on UCR Datasets

Notebook 2: K-Nearest Neighbors Classifier Evaluation with F1 Evaluation on UCR Time Series Datasets

Notebook 3: K-Nearest Neighbors Classifier and LOF Evaluation with F1 Evaluation on UCR Time Series Datasets
