Advanced Topics on Data Stream Mining
Albert Bifet, João Gama, Ricard Gavaldà,
Georg Krempl, Mykola Pechenizkiy,
Bernhard Pfahringer, Myra Spiliopoulou, Indrė Žliobaitė
ECML PKDD 2012, Bristol, Sept. 24
Nowadays, the quantity of data created every two days is estimated to be 5 exabytes. This is similar to the amount of data created from the dawn of time up until 2003. Moreover, it was estimated that 2007 was the first year in which it was not possible to store all the data that we are producing. This massive amount of real-time streaming data opens up challenging new discovery tasks. Some of them are already addressed with mature algorithms, while new challenges emerge, including learning on not one but multiple streams. This tutorial has two parts. The first part gives an introduction to recent advances in algorithmic techniques and tools for coping with the challenges of stream mining. The second part discusses state-of-the-art research on mining multiple streams: distributed streams and interdependent relational streams.
NOTICE: This tutorial is longer than the other ECML PKDD 2012 tutorials.
UPDATE: The schedule is as follows: 9:00-10:30 ‘Mining One Stream’; 10:45-12:15 ‘Mining Multiple Streams’.
The first part (9:00 – 10:30), ‘Mining One Stream’, will be presented by Albert Bifet, Ricard Gavaldà, Mykola Pechenizkiy, Bernhard Pfahringer, and Indrė Žliobaitė.
The second part (10:45 – 12:15), ‘Mining Multiple Streams’, will be presented by João Gama, Myra Spiliopoulou, and Georg Krempl.
Albert Bifet. Researcher at Yahoo! Research Barcelona. He is the author of the book Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams. He is one of the core developers of the MOA software environment for implementing algorithms and running experiments for online learning from evolving data streams.
João Gama. Researcher at LIAAD, University of Porto, working in the Machine Learning group. His main research interest is learning from data streams. He has published more than 80 articles. He served as Co-chair of ECML 2005, DS09, ADMA09 and a series of Workshops on KDDS and Knowledge Discovery from Sensor Data with ACM SIGKDD. He is the author of a recent book on Knowledge Discovery from Data Streams.
Ricard Gavaldà. Professor at the Department of Software, U. Politècnica de Catalunya – BarcelonaTech. He has published over 70 papers and supervised 7 Ph.D. students. His current research interests are algorithmics of machine learning and data mining, with emphasis on streaming and adaptive methods. He is also working on the use of data mining in autonomic and green computing.
Mykola Pechenizkiy. Assistant Professor at the Department of Computer Science, Eindhoven University of Technology, the Netherlands. He has broad research interests in data mining and its application to various (adaptive) information systems serving industry, commerce, medicine and education. He has organized several workshops and conferences in these areas.
Bernhard Pfahringer. Associate Professor in the Computer Science Department of the University of Waikato. His main research interests are in machine learning and data mining, especially efficient algorithms, stream mining, randomization, and applications.
Myra Spiliopoulou. Professor of Information Systems in the Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, and Chair of the Knowledge Management & Discovery (KMD) lab. Her main research interest is mining in evolving systems. She was PC Co-Chair of ECML PKDD 2006 and NLDB 2008, Tutorials Co-Chair at ICDM 2010, Workshops Co-Chair at ICDM 2011, PC Co-Chair of GfKl 2012 and Demo Track Co-Chair at ECML PKDD 2012.
Indrė Žliobaitė. Lecturer in computational intelligence at Bournemouth University, UK, and a research task leader within the INFER.eu project. Her research interests and competences center on online predictive modeling, context awareness and adaptation over time, predictive analytics