Constructive Machine Learning

Constructive machine learning describes a class of related machine learning problems where the ultimate goal of learning is not to find a good model of the data, but instead to find one or more particular instances of the domain which are likely to exhibit desired properties. While traditional approaches choose these domain instances from a given set or database of unlabelled domain instances, constructive machine learning is typically iterative and searches an infinite or exponentially large instance space.

Constructive machine learning aims to automate the process of constructing virtual objects with given properties. This is different from traditional machine learning, which aims to automate the process of determining some properties of given virtual objects. Both traditional and constructive machine learning infer a model of the unknown dependency between virtual objects and their properties from a few given examples exhibiting this dependency. The type of problem considered by constructive machine learning, however, is much closer to the type of problem encountered in many real-world applications, ranging from computer games to pharmaceutical chemistry and beyond. In this project, we investigate constructive machine learning from its theoretical foundations to its real-world applications: What models allow for efficient construction? Can model learning be biased towards models that allow for efficient construction? How many examples are needed to build such a model? How effective are constructive machine learning approaches in real-world applications?
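The contrast between choosing an instance from a given pool and constructing one by search can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the publications listed on this page: instances are bit tuples, the model is a simple per-bit score estimated from a few labelled examples, and the function names (fit_bit_scores, predict, select_from_pool, construct) are hypothetical. The constructive step is a plain greedy bit-flip search under the learned model.

```python
def fit_bit_scores(examples):
    """Estimate, per bit position, how much setting the bit to 1 changes the property.

    `examples` is a list of (bit_tuple, property_value) pairs.
    """
    n = len(examples[0][0])
    scores = []
    for i in range(n):
        ones = [y for x, y in examples if x[i] == 1]
        zeros = [y for x, y in examples if x[i] == 0]
        mean_one = sum(ones) / len(ones) if ones else 0.0
        mean_zero = sum(zeros) / len(zeros) if zeros else 0.0
        scores.append(mean_one - mean_zero)
    return scores


def predict(scores, x):
    """Predicted (relative) property value of instance x under the toy model."""
    return sum(s for s, bit in zip(scores, x) if bit)


def select_from_pool(scores, pool):
    """Traditional setting: pick the most promising instance from a given pool."""
    return max(pool, key=lambda x: predict(scores, x))


def construct(scores, start):
    """Constructive setting: iteratively search the space of all bit tuples.

    Greedy local search: flip the single bit that most improves the predicted
    value and stop when no flip gives a strict improvement.
    """
    current = tuple(start)
    while True:
        neighbours = [
            current[:i] + (1 - current[i],) + current[i + 1:]
            for i in range(len(current))
        ]
        best = max(neighbours, key=lambda x: predict(scores, x))
        if predict(scores, best) <= predict(scores, current):
            return current
        current = best


if __name__ == "__main__":
    # Hypothetical data: the property counts agreements with a hidden target.
    target = (1, 0, 1, 1)
    prop = lambda x: sum(a == b for a, b in zip(x, target))
    seen = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
    scores = fit_bit_scores([(x, prop(x)) for x in seen])
    pool = [(1, 1, 0, 0), (0, 0, 1, 0), (0, 1, 1, 1)]
    print("selected from pool:", select_from_pool(scores, pool))
    print("constructed:", construct(scores, (0, 0, 0, 0)))
```

In this sketch the pool-based choice is limited to whatever instances happen to be in the pool, while the constructive search may reach instances that were never listed. For realistic structured domains both the model and the search would of course be far more involved.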

This research of Thomas was funded from 2010 to 2015 by an Emmy Noether grant of the German Research Foundation (DFG) under reference GA 1615/1-1. The topic is currently being investigated with Dino Oglic (University of Nottingham and University of Bonn) and Prof Roman Garnett (Washington University in St Louis). It is also the focus of an ongoing workshop series co-chaired by Thomas and partially sponsored by Allianz and Sony.

Key publications are:

  • Dino Oglic, Roman Garnett, and Thomas Gärtner. Active search in intensionally specified structured spaces. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017.

  • Olana Missura and Thomas Gärtner. Predicting dynamic difficulty. In Advances in Neural Information Processing Systems 24, 2011.

  • Thomas Gärtner and Shankar Vembu. On structured output training: Hard cases and an efficient alternative. Machine Learning, 76(2-3), 2009. Paper selected from the ‘European Conference on Machine Learning’.

Data-driven Life and Health Science

We investigate how advanced machine learning algorithms can help in the drug discovery process. Particular topics include active search, task selectivity, orphan screening, and fingerprint optimisation. This research of Thomas is currently being carried out with Katrin Ullrich (University of Bonn), Michael Kamp (Fraunhofer IAIS), Prof Jonathan Hirst (University of Nottingham), Prof Roman Garnett (Washington University in St Louis), and the department of life sciences at the B-IT of the University of Bonn.

Key publications are:

  • Katrin Ullrich, Michael Kamp, Thomas Gärtner, Martin Vogt, and Stefan Wrobel. Ligand-based virtual screening with co-regularised support vector regression. In 2016 IEEE 16th International Conference on Data Mining Workshops, 2016.

  • Roman Garnett, Thomas Gärtner, Martin Vogt, and Jürgen Bajorath. Introducing the ‘active search’ method for iterative virtual screening. Journal of Computer-Aided Molecular Design, 2015.

  • Hanna Geppert, Jens Humrich, Dagmar Stumpfe, Thomas Gärtner, and Jürgen Bajorath. Ligand prediction from protein sequence and small molecule information using support vector machines and fingerprint descriptors. Journal of Chemical Information and Modeling, 2009.

Computational Aspects of Mining and Learning

Significant advances in machine learning and data mining over the last few years have fundamentally changed the nature of tasks that can be tackled automatically by machines. These advances, in turn, have revealed new challenges for machine learning and data mining algorithms. It is this exciting interplay between theory and application that motivates us to aim at further narrowing the gap between what's known to be possible and what's known to be impossible.

We investigate the computational aspects of making machine learning and data mining algorithms more efficient and more effective. This research of Thomas was partially funded from 2011 to 2014 by the German Research Foundation (DFG) as ‘Effective Well-behaved Pattern Mining through Sampling’ under reference GA 1615/2-1. The topic is currently being investigated with Dino Oglic (University of Nottingham and University of Bonn) and Michael Kamp (Fraunhofer IAIS).

Key publications are:

  • Dino Oglic and Thomas Gärtner. Greedy feature construction. In Advances in Neural Information Processing Systems 29, 2016.

  • Dino Oglic, Daniel Paurat, and Thomas Gärtner. Interactive knowledge-based kernel PCA. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, 2014.

  • Mario Boley, Sandy Moens, and Thomas Gärtner. Linear space direct pattern sampling using coupling from the past. In The 18th annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012.