PAKDD 2019

IoT Big Data Stream Mining Tutorial

Presenters: Joao Gama, Albert Bifet, and Latifur Khan.

Summary:

The challenge of deriving insights from the Internet of Things (IoT) has been recognized as one of the most exciting and important opportunities for both academia and industry. Advanced analysis of big data streams from sensors and devices is bound to become a key area of data mining research as the number of applications requiring such processing increases. Dealing with the evolution of such data streams over time, i.e., with concepts that drift or change completely, is one of the core issues in IoT stream mining. This tutorial is a gentle introduction to mining IoT big data streams. The first part introduces data stream learners for classification, regression, clustering, and frequent pattern mining. The second part deals with the scalability issues inherent in IoT applications and discusses how to mine data streams on distributed engines such as Spark, Flink, Storm, and Samza.

Content:

1. IoT Fundamentals and Stream Mining Algorithms

– IoT Stream mining setting

– Concept drift

– Classification and Regression

– Clustering

– Frequent Pattern mining

2. IoT Distributed Big Data Stream Mining

– Distributed Stream Processing Engines

– Classification

– Regression

– Open Source Tools

– Applications

Short Bios:

Joao Gama's Profile

Joao Gama received his Ph.D. degree in Computer Science from the Faculty of Sciences of the University of Porto, Portugal, in 2000. He joined the Faculty of Economy, where he holds the position of Associate Professor. He is also a senior researcher and vice-director of LIAAD, a group belonging to INESC TEC. He has worked on several national and European projects on incremental and adaptive learning systems, ubiquitous knowledge discovery, and learning from massive and structured data. He served as Program Co-Chair of ECML'2005, DS'2009, ADMA'2009, IDA'2011, and ECML PKDD'2015, and as chair of the Data Streams track at ACM SAC from 2007 to 2016. He organized a series of workshops on Knowledge Discovery from Data Streams with the ECML PKDD conferences and on Knowledge Discovery from Sensor Data with ACM SIGKDD. He is the author of several books on data mining (in Portuguese) and of a monograph on Knowledge Discovery from Data Streams. He has authored more than 250 peer-reviewed papers in areas related to machine learning, data mining, and data streams. He is a member of the editorial boards of the international journals ML, DMKD, TKDE, IDA, NGC, and KAIS.

Albert Bifet's Profile

Albert Bifet is a Professor at the University of Waikato and at LTCI, Telecom ParisTech. Previously he worked at Huawei Noah's Ark Lab in Hong Kong, Yahoo Labs in Barcelona, and UPC BarcelonaTech. He is the author of the book Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams. He is one of the leaders of the MOA and Apache SAMOA software environments for implementing algorithms and running experiments for online learning from evolving data streams. He has served as Co-Chair of the Industrial track of IEEE MDM 2016 and ECML PKDD 2015, and as Co-Chair of BigMine (2012-2019) and of the ACM SAC Data Streams Track (2012-2019).

Latifur Khan's Profile

Latifur Khan is a full Professor (tenured) in the Computer Science department at the University of Texas at Dallas, where he has been teaching and conducting research since September 2000. He received his Ph.D. and M.S. degrees in Computer Science from the University of Southern California in August 2000 and December 1996, respectively. He has received prestigious awards, including the IEEE Technical Achievement Award for Intelligence and Security Informatics. Dr. Khan is an ACM Distinguished Scientist and a Senior Member of IEEE. He has chaired several conferences and serves (or has served) as associate editor on multiple editorial boards, including that of the IEEE Transactions on Knowledge and Data Engineering (TKDE). He has conducted tutorial sessions at prominent conferences such as ACM WWW 2005, MIS2005, DASFAA 2007, WI 2008 ("Matching Words and Pictures - Problems, Applications, and Progress"), and PAKDD 2011 ("Data Stream Mining Challenges and Techniques").


PAKDD-19-IoT-Tutorial.pdf