A near-optimal solution to the exploration/exploitation dilemma in fully observable MDPs:
Strens M J A, 2000. A Bayesian framework for reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning. http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.1701
Differential evolution operators can be used in a population MCMC framework for a very effective vector space sampler or optimizer:
Strens M J A, Bernhardt M, Everett N, 2002. Markov chain Monte Carlo sampling using direct search optimization. In Proceedings of the Nineteenth International Conference on Machine Learning.
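The population idea in the entry above can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a generic differential-evolution-style Metropolis move, in which each chain proposes a jump along the difference between two other randomly chosen chains. The function names, the standard-normal target, the step scale `gamma` and the jitter size are all illustrative assumptions.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a standard normal; an assumed toy target."""
    return -0.5 * sum(v * v for v in x)


def de_mcmc(log_p, dim, n_chains=10, n_sweeps=2000, burn_in=500, seed=0):
    """Population MCMC with a differential-evolution-style proposal (sketch)."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2 * dim)  # commonly used step-scale heuristic (assumption)
    jitter = 1e-3                      # small noise added to every proposal (assumption)
    # Start from an over-dispersed population.
    pop = [[rng.gauss(0.0, 3.0) for _ in range(dim)] for _ in range(n_chains)]
    logp = [log_p(x) for x in pop]
    samples = []
    for sweep in range(n_sweeps):
        for i in range(n_chains):
            # Pick two distinct other chains; their difference sets the jump direction.
            a, b = rng.sample([j for j in range(n_chains) if j != i], 2)
            proposal = [
                pop[i][k] + gamma * (pop[a][k] - pop[b][k]) + rng.gauss(0.0, jitter)
                for k in range(dim)
            ]
            lp = log_p(proposal)
            # The proposal is symmetric, so the plain Metropolis ratio applies.
            if math.log(1.0 - rng.random()) < lp - logp[i]:
                pop[i], logp[i] = proposal, lp
            if sweep >= burn_in:
                samples.append(list(pop[i]))
    return samples
```

Calling `de_mcmc(log_target, dim=2)` returns the post-burn-in states of all chains. Because the jump directions come from the population itself, the proposal scale adapts to the spread of the target without hand tuning.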
A principled version of genetic algorithms that can be used for discrete sampling or optimization:
Strens M J A, 2003. Evolutionary MCMC sampling and optimization in discrete spaces. In Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003).
Learning strategies where evaluation of performance (across a large set of scenarios) is expensive:
Strens M J A, Moore A W, 2003. Policy Search using Paired Comparisons. Journal of Machine Learning Research.
The competitive attentional tracker... effective track-before-detect at low signal-to-noise ratios and in clutter:
Strens M J A, Gregory I N, 2003. Tracking in Cluttered Images, Journal of Image & Vision Computing.
Sampling from a function that is an exponential of a sum, while avoiding evaluation of the full sum at every step. This is efficient for Bayesian inference from large datasets. It is a form of rejection sampling with highly directed proposals constructed from subsets of the data:
Strens M J A, 2004. Efficient hierarchical MCMC for policy search. In Proceedings of the Twenty-First International Conference on Machine Learning. http://www.machinelearning.org/proceedings/icml2004/papers/177.pdf
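A rough illustration of the subset idea described for this entry follows. It is not the paper's method. It swaps in a related, well-known two-stage scheme (delayed-acceptance Metropolis), in which a cheap log-likelihood on a fixed data subset screens each proposal and the full sum is evaluated only for proposals that pass. The Gaussian-mean model, the flat prior and all names and parameter values are assumptions chosen for the example.

```python
import math
import random


def make_data(n, true_mean, rng):
    """Draw n unit-variance Gaussian observations (assumed toy dataset)."""
    return [rng.gauss(true_mean, 1.0) for _ in range(n)]


def log_lik(theta, points):
    """Log-likelihood of mean theta: a sum over data points (flat prior assumed)."""
    return -0.5 * sum((x - theta) ** 2 for x in points)


def delayed_acceptance(data, subset_size=50, n_steps=3000, step=0.1, seed=1):
    """Two-stage Metropolis: cheap subset screen first, then full-data correction."""
    rng = random.Random(seed)
    subset = rng.sample(data, subset_size)
    theta = sum(subset) / subset_size
    cheap = log_lik(theta, subset)
    full = log_lik(theta, data)
    chain, full_evals = [], 0
    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)  # symmetric random-walk proposal
        cheap_prop = log_lik(prop, subset)
        # Stage 1: accept or reject using only the subset-based approximation.
        if math.log(1.0 - rng.random()) < cheap_prop - cheap:
            full_prop = log_lik(prop, data)  # the expensive full sum
            full_evals += 1
            # Stage 2: correct for the approximation so the chain targets the full posterior.
            if math.log(1.0 - rng.random()) < (full_prop - full) - (cheap_prop - cheap):
                theta, cheap, full = prop, cheap_prop, full_prop
        chain.append(theta)
    return chain, full_evals
```

With `data = make_data(500, 2.0, random.Random(0))`, calling `delayed_acceptance(data)` returns the chain and the number of full-sum evaluations, which is smaller than the number of steps because stage 1 filters out some proposals cheaply.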
The use of stochastic task models for dynamic replanning in multi-robot task allocation:
Strens M J A, Windelinckx N, 2005. Combining Planning with Reinforcement Learning for Multi-robot Task Allocation. Springer Lecture Notes in Computer Science, Volume 3394.
Some POMDP problems have special structure, allowing efficient solution:
Strens M J A, 2005. Learning multi-agent search strategies. Springer Lecture Notes in Computer Science, Volume 3394, pp 245-259.