Induction of Decision Trees

This line of research primarily explores the efficiency of algorithms for the incremental induction of decision trees.

A more recent research direction is related to hiding sensitive rules.

We are mostly concerned with devising heuristics that outperform benchmark solutions and, in some cases, come with performance guarantees.
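As a rough illustration of the incremental setting (a hypothetical sketch with assumed names, not the algorithm of any paper below): a leaf can absorb most new training examples simply by updating class counters, and the more expensive restructuring step is triggered only when the class distribution at the leaf degrades past a threshold.

```c
/* Hypothetical sketch: a leaf keeps per-class counts and its majority
 * class is maintained incrementally, so most new examples only touch
 * counters instead of triggering a tree restructuring. */

#define NUM_CLASSES 2

typedef struct {
    int count[NUM_CLASSES];  /* examples of each class seen at this leaf */
} Leaf;

/* Incorporate one new training example; return the current majority class. */
int leaf_update(Leaf *leaf, int class_label)
{
    int best = 0, c;
    leaf->count[class_label]++;
    for (c = 1; c < NUM_CLASSES; c++)
        if (leaf->count[c] > leaf->count[best])
            best = c;
    return best;
}

/* A simple trigger: restructure only when the minority share exceeds a
 * threshold, so cheap updates dominate and some local impurity is tolerated. */
int needs_restructuring(const Leaf *leaf, double threshold)
{
    int total = leaf->count[0] + leaf->count[1];
    int minority = leaf->count[0] < leaf->count[1] ? leaf->count[0]
                                                   : leaf->count[1];
    if (total == 0)
        return 0;
    return (double)minority / total > threshold;
}
```

The threshold trades update cost against tree quality: a higher threshold means fewer restructurings but a messier tree.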

The research is described in the following papers:

D. Kalles, A. Papagelis and Y.C. Stamatiou. “Consolidating a Heuristic for Incremental Decision Tree Learning through Asymptotic Analysis”, International Journal on Artificial Intelligence Tools, Vol. 20, No. 1, pp. 29-52, 2011.

D. Kalles and A. Papagelis. “Stable Decision Trees: Using Local Anarchy for Efficient Incremental Learning”, International Journal on Artificial Intelligence Tools, Vol. 9, No. 1, pp. 79-95, 2000.

D. Kalles and A. Papagelis. “Controlled Flux Results in Stable Decision Trees”, IEEE International Conference on Tools with Artificial Intelligence, Chicago, 1999.

D. Kalles and D.T. Morris. “Efficient Incremental Induction of Decision Trees”, Machine Learning, Vol. 24, No. 3, pp. 231-243, 1996.

A secondary line of research explores issues of representation in decision trees.

A novel aspect of that research is the ability to include cost considerations in the induction of decision trees. When attributes share parts of the procedure that computes them, their effective measurement costs are no longer independent, and this can change which attribute an induction algorithm should select.

The research is described in the following paper and constituted a major part of my PhD work:

D.T. Morris and D. Kalles. “Decision trees and domain knowledge in pattern recognition”, Pattern Recognition in Practice IV Conference, Vlieland, The Netherlands (published by Elsevier), 1994.
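The shared-computation idea can be sketched as follows; this is a hypothetical illustration under assumed names (`Attribute`, `effective_cost`, `select_attribute`), not the algorithm of the paper above. Once the preprocessing step shared by a group of attributes has been paid for, the remaining attributes in that group become cheaper, which can change which attribute offers the best gain-per-cost trade-off.

```c
/* Hypothetical sketch of cost-aware attribute selection: attributes that
 * share a preprocessing step are cheaper once that step has been paid for. */

typedef struct {
    double own_cost;     /* cost of computing the attribute itself */
    int shared_group;    /* id of the shared preprocessing step, -1 if none */
    double shared_cost;  /* cost of that shared step */
} Attribute;

/* Effective cost of measuring attribute a, given which shared steps
 * have already been computed (paid[g] != 0 means group g is paid for). */
double effective_cost(const Attribute *a, const int *paid)
{
    double cost = a->own_cost;
    if (a->shared_group >= 0 && !paid[a->shared_group])
        cost += a->shared_cost;
    return cost;
}

/* Pick the attribute with the best (information gain) / (cost) ratio. */
int select_attribute(const Attribute *attrs, const double *gain,
                     int n, const int *paid)
{
    int best = -1, i;
    double best_ratio = -1.0;
    for (i = 0; i < n; i++) {
        double ratio = gain[i] / effective_cost(&attrs[i], paid);
        if (ratio > best_ratio) {
            best_ratio = ratio;
            best = i;
        }
    }
    return best;
}
```

Notice that the same pair of attributes can rank differently at different tree nodes, depending on which shared steps an example has already been charged for on its path from the root.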

A further aspect of that research concerns allowing an attribute to take more than one value. This can sometimes improve classification and also widens the opportunities for post-processing, since second-best results become more reliable.

The research is described in the following papers:

D. Kalles, A. Papagelis and E. Ntoutsi. “Induction of decision trees in numeric domains using set-valued attributes”, Intelligent Data Analysis, Vol. 4, pp. 323-347, 2000.

An updated version of: D. Kalles and A. Papagelis. “Induction of Decision Trees in Numeric Domains using Set-Valued Attributes”, Workshop on “Pre- and post-processing in machine learning and data mining: theoretical aspects and applications”, Advanced Course on “Machine Learning and Applications”, Chania, Greece, 1999.

G. Orphanos, D. Kalles, A. Papagelis and D. Christodoulakis. “Decision Trees and NLP: A Case Study in POS Tagging”, Workshop on “Machine learning in human language technology”, Advanced Course on “Machine Learning and Applications”, Chania, Greece, 1999.
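One way to picture set-valued testing in a numeric domain (a hypothetical sketch with assumed names and a simple two-class tree, not the papers' actual method): a value falling close to a split threshold is routed down both branches, and the votes of all reached leaves are combined, so a second-best class emerges naturally from the vote totals.

```c
/* Hypothetical sketch of set-valued testing: a numeric value close to a
 * split threshold is sent down BOTH branches, and the class votes of the
 * reached leaves are combined, making the runner-up class meaningful. */

#define NUM_CLASSES 2

typedef struct Node {
    double threshold;        /* split point on a single numeric attribute */
    double margin;           /* values within +/- margin follow both branches */
    struct Node *left, *right;
    int votes[NUM_CLASSES];  /* class counts at a leaf (children are NULL) */
} Node;

/* Accumulate into out[] the votes of every leaf that value v can reach. */
static void collect_votes(const Node *n, double v, int *out)
{
    int c;
    if (!n->left && !n->right) {
        for (c = 0; c < NUM_CLASSES; c++)
            out[c] += n->votes[c];
        return;
    }
    if (v <= n->threshold + n->margin)
        collect_votes(n->left, v, out);
    if (v > n->threshold - n->margin)
        collect_votes(n->right, v, out);
}

int classify(const Node *root, double v)
{
    int out[NUM_CLASSES] = {0};
    int best = 0, c;
    collect_votes(root, v, out);
    for (c = 1; c < NUM_CLASSES; c++)
        if (out[c] > out[best])
            best = c;
    return best;
}
```

With margin set to zero this degenerates to an ordinary crisp decision tree, so the extra cost is only paid for borderline values.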

Ask me for the code (in C) if you want to experiment, at your own risk :-). It is available free of charge for non-profit research and teaching (tested on Unix).