Detecting Changes in Data Streams by Testing Exchangeability
Abstract: In a data streaming setting, data points are observed sequentially, and the data-generating model may change while the data is streaming. In this paper, we propose detecting such changes by testing the exchangeability of the observed data. Our martingale approach is an efficient, non-parametric, one-pass algorithm that is effective on classification, clustering, and regression data-generating models. Experimental results show the feasibility and effectiveness of the martingale methodology in detecting changes in the data-generating model for time-varying data streams. Moreover, we show that (i) an adaptive support vector machine (SVM) utilizing the martingale methodology compares favorably against an adaptive SVM utilizing a sliding window, and (ii) a multiple-martingale video-shot change detector compares favorably against standard shot-change detection algorithms.
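To make the idea of testing exchangeability with a martingale more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' released code: it uses a randomized power martingale (one common construction in this line of work), a simple strangeness measure (distance of each point to the running mean of the current window), and assumed parameter values `epsilon=0.92` and `threshold=20.0`. The function name `detect_changes` and all parameter names are hypothetical.

```python
import math
import random


def detect_changes(stream, epsilon=0.92, threshold=20.0, seed=0):
    """Return the stream indices at which the martingale test raises an alarm.

    Illustrative sketch only: the strangeness measure, epsilon, and threshold
    are assumptions chosen for this example.
    """
    rng = random.Random(seed)          # randomness used to break ties in p-values
    log_threshold = math.log(threshold)
    alarms = []
    window = []                        # points observed since the last alarm
    total = 0.0                        # running sum of the window
    log_m = 0.0                        # log of the martingale value (M starts at 1)

    for t, x in enumerate(stream):
        window.append(x)
        total += x
        mean = total / len(window)

        # Strangeness (assumed choice): distance of each point to the window mean.
        scores = [abs(v - mean) for v in window]
        s_new = scores[-1]

        # Randomized p-value: fraction of points at least as strange as the new
        # one, with ties weighted by a uniform random number in [0, 1).
        greater = sum(1 for s in scores if s > s_new)
        equal = sum(1 for s in scores if s == s_new)   # counts the new point itself
        p = (greater + rng.random() * equal) / len(scores)
        p = max(p, 1e-12)              # guard against log(0)

        # Power martingale update: M_n = M_{n-1} * epsilon * p ** (epsilon - 1).
        log_m += math.log(epsilon) + (epsilon - 1.0) * math.log(p)

        # A large martingale value is evidence against exchangeability.
        if log_m > log_threshold:
            alarms.append(t)
            window, total, log_m = [], 0.0, 0.0   # restart after the alarm

    return alarms


if __name__ == "__main__":
    demo_rng = random.Random(1)
    data = [demo_rng.gauss(0.0, 1.0) for _ in range(200)]
    data += [demo_rng.gauss(5.0, 1.0) for _ in range(200)]   # mean shift at index 200
    print(detect_changes(data))
```

Under exchangeability the p-values are uniformly distributed, so the martingale rarely exceeds the threshold; after a change, new points tend to be the strangest, the p-values become small, and the martingale grows. For readability this sketch recomputes all strangeness scores at every step, so it is not itself a one-pass implementation.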
Patent:
S.-S. Ho and H. Wechsler, “Data Stream Change Detector”, US Patent No. 8,877,963, December 6, 2011.
Code:
Tutorial:
“Conformal Predictions for Reliable Machine Learning: Theory and Applications”, organizers/presenters: Vineeth N Balasubramanian (Arizona State University), Shen-Shyang Ho, Sethuraman Panchanathan (Arizona State University), Vladimir Vovk (Royal Holloway, University of London), Tutorial Session, IJCNN 2011, San Jose, CA, 31 July 2011.
References:
S.-S. Ho and H. Wechsler, “A Martingale Framework for Detecting Changes in Data Streams by Testing Exchangeability”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 12, pp. 2113-2127, 2010.
S.-S. Ho, “Learning from Data Streams Using Transductive Inference and Martingale”, PhD Dissertation, Jan 2007.
S.-S. Ho and H. Wechsler, “Detecting Change-Points in Unlabeled Data Streams using Martingale”, Proc. 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), Hyderabad, India, Jan. 6-12, 2007.
S.-S. Ho, “A Martingale Framework for Concept Change Detection in Time-Varying Data Streams”, Proc. Int. Conf. on Machine Learning (ICML 2005), Bonn, Germany, Aug. 7-11, 2005.
S.-S. Ho and H. Wechsler, “Adaptive Support Vector Machine for Time-Varying Data Streams Using the Martingale”, Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), Edinburgh, Scotland, July 30 - Aug. 5, 2005.
S.-S. Ho and H. Wechsler, “On the detection of concept change in time-varying data streams by testing exchangeability”, Proc. Conf. on Uncertainty in Artificial Intelligence (UAI 2005), Edinburgh, Scotland, July 26-29, 2005.