English, T., and G. W. Greenwood. Intelligent Design and Evolutionary Computation.
© 2008 Springer. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from Springer.
Excerpts from Chapter 1 of Design by Evolution: Advances in Evolutionary Design, P. F. Hingston, L. C. Barone, and Z. Michalewicz (eds.).

English, T. No More Lunch: Analysis of Sequential Search.
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Note: This paper invokes results proved in "On the Structure of Sequential Search: Beyond 'No Free Lunch'."
Abstract: Sequential search algorithms of the type predicated in conservation theorems are studied in their own right. With representation of functions as strings, the sets of test functions and search results are identical. This allows sequential search algorithms to be treated as operators on distributions on functions. Certain distributions, referred to as block uniform, are fixed points for all algorithms. Sequential search preserves the identity of the nearest fixed point and the Kullback-Leibler distance to that point. In practice, distributions of test functions are not block uniform, and conservation properties hold to a degree that depends upon distance to the nearest fixed point. Randomized sequential search is also analyzed. Here the search operator generally moves the distribution closer to the nearest fixed point. A random walk transforms any distribution into the nearest fixed point.
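The conservation claim can be checked numerically on a toy space. The sketch below is not the paper's formalism: the domain {0,1,2}, codomain {0,1}, and the particular adaptive search rule are invented for illustration. It treats a deterministic exhaustive search as an operator on distributions over functions and verifies that the Kullback-Leibler distance to the nearest block-uniform distribution is unchanged.

```python
import math
import random
from itertools import product

def search(f):
    """Deterministic, adaptive, exhaustive search on domain {0,1,2}.

    Returns the search result: the values of f in visit order. The visit
    order depends on values seen, yet f -> search(f) is a bijection, and
    each result has the same value multiset as f (a permutation of f).
    """
    first = f[0]
    second = 1 if first == 0 else 2   # branch on the observed value
    third = 3 - second                # the remaining point (1 or 2)
    return (first, f[second], f[third])

funcs = list(product((0, 1), repeat=3))              # all functions {0,1,2} -> {0,1}
assert len({search(f) for f in funcs}) == len(funcs)  # the map is a bijection

rng = random.Random(0)
weights = [rng.random() for _ in funcs]
total = sum(weights)
P = {f: w / total for f, w in zip(funcs, weights)}   # arbitrary distribution of test functions
P_out = {search(f): p for f, p in P.items()}         # pushforward: distribution of search results

def block_uniform(dist):
    """Nearest block-uniform distribution: average the mass within each
    block (maximal set of mutual permutations, keyed by sorted values)."""
    mass, size = {}, {}
    for f, p in dist.items():
        b = tuple(sorted(f))
        mass[b] = mass.get(b, 0.0) + p
        size[b] = size.get(b, 0) + 1
    return {f: mass[tuple(sorted(f))] / size[tuple(sorted(f))] for f in dist}

def kl(p_dist, q_dist):
    return sum(p * math.log(p / q_dist[f]) for f, p in p_dist.items() if p > 0)

Q = block_uniform(P)
d_in, d_out = kl(P, Q), kl(P_out, Q)   # equal: the KL distance is conserved
```

The equality holds because the search operator permutes functions only within their blocks, and the block-uniform projection is constant on each block.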

English, T. On the Structure of Sequential Search: Beyond "No Free Lunch".
Draft paper submitted for review: 4th European Conference on Evolutionary Computation in Combinatorial Optimization (EvoCOP 2004). Revised paper to appear in the conference proceedings.
Note: The primary importance of this paper is that it proves some properties of sequential search assumed in "No More Lunch: Analysis of Sequential Search." I recommend reading the latter paper first.
Abstract: In deterministic, exhaustive, sequential search the algorithm permutes a test function to obtain the search result. The mapping from test functions to search results is a one-to-one correspondence. There is a partition of the set of functions into maximal blocks of functions that are permutations of one another. This permits reasoning about search to be reduced to reasoning about search within blocks of functions. It also leads to the definition of the block uniform distribution, in which functions within the same block have the same probability. It is established that the Kullback-Leibler distance to the nearest block uniform distribution is identical for the distributions of test functions and search results. The distance is zero if and only if all search algorithms have identically distributed search results. That is, a block uniform distribution of test functions is necessary and sufficient for "no free lunch."
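The block partition is easy to exhibit on a small space. The example below is illustrative only (the choice of a three-point domain and binary codomain is mine, not the paper's): it groups every function, written as its string of values, with the other functions that are permutations of it.

```python
from itertools import product
from collections import defaultdict

# Partition all functions f: {0,1,2} -> {0,1} (written as value tuples)
# into maximal blocks of mutual permutations, keyed by the sorted values.
blocks = defaultdict(list)
for f in product((0, 1), repeat=3):
    blocks[tuple(sorted(f))].append(f)

sizes = sorted(len(b) for b in blocks.values())
# four blocks, of sizes 1, 1, 3, 3: the two constant functions sit alone,
# while e.g. (0,0,1), (0,1,0), (1,0,0) together form one block
```

A block uniform distribution assigns equal probability within each such block; across blocks the probabilities may differ.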

English, T. M. Practical Implications of New Results in Conservation of Optimizer Performance.
© 2000 Springer-Verlag Berlin Heidelberg New York (published in the Lecture Notes in Computer Science series)
Abstract: Three theoretical perspectives upon conservation of performance in function optimization are outlined. In terms of statistical information, performance is conserved when the distribution of functions is such that all domain subsets of a given size have identically distributed random values. In terms of algorithmic information, performance is conserved because almost all functions are algorithmically random. Also, an optimizer's algorithmic complexity bounds the information gained or lost in its processing of the test function. The practical consequences of these theoretical results are explored, with emphasis upon the results from algorithmic information theory, which are new.

English, T. M. Ranking Models for Combination.
© 2000 The Society of Photo-Optical Instrumentation Engineers
Abstract: A considerable amount of research has addressed the methods and objectives of model combination. Very little attention has been given to the question of how to obtain a good collection of models for combination. Here a rationale for inductive inference of multiple models of time series is developed in terms of algorithmic information theory. A model-based Kolmogorov sufficient statistic is described, and is utilized in a recursive scheme for ranking models in a population. Ranks are assigned in such a way that the n top-ranked models are considered to be the best subset of n models to use in combination. The ranking scheme is appropriate for use in the selection operation of an evolutionary computation. The treatment is primarily theoretical, but several practical issues in ranking and model combination are addressed.
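A recursive ranking of this flavor can be sketched greedily. This is a hypothetical stand-in, not the paper's scheme: the paper's criterion is built on a Kolmogorov sufficient statistic, whereas the sketch uses raw error-set overlap as a crude redundancy proxy, preferring models whose validation errors are both few and little shared with higher-ranked models.

```python
def rank_models(error_sets):
    """Greedy ranking sketch: at each rank, pick the model whose
    validation errors are fewest after penalizing errors it shares
    (redundantly) with already-ranked models."""
    remaining = list(range(len(error_sets)))
    ranked, covered = [], set()
    while remaining:
        best = min(remaining,
                   key=lambda i: len(error_sets[i]) + len(error_sets[i] & covered))
        ranked.append(best)
        covered |= error_sets[best]
        remaining.remove(best)
    return ranked

# toy validation-error sets for four models (error positions in a series)
errs = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7, 8}, {3, 9}]
order = rank_models(errs)   # any prefix of `order` is the chosen subset
```

The property the abstract asks for falls out of the greedy construction: the first n entries of the ranking are the subset of n models selected for combination.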

English, T. M. Optimization is Easy and Learning is Hard in the Typical Function.
© 2000 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Abstract: Elementary results in algorithmic information theory are invoked to show that almost all finite functions are highly random. That is, the shortest program generating a given function description is rarely much shorter than the description. It is also shown that the length of a program for learning or optimization poses a bound on the algorithmic information it supplies about any description. For highly random descriptions, success in guessing values is essentially accidental, but learning accuracy can be high in some cases if the program is long. Optimizers, on the other hand, are graded according to the goodness of values in partial functions they sample. In a highly random function, good values are as common and as evenly dispersed as bad values, and random sampling of points is very efficient.
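Why random sampling is efficient when good values are evenly dispersed can be seen with one line of probability; the numbers below are my own illustration, not from the paper. If a fraction q of domain points carry top-grade values, k independent uniform samples miss all of them with probability (1 - q)**k.

```python
# Chance that k independent uniform samples of a highly random function
# reach the best fraction q of the domain at least once.
def hit_probability(q, k):
    return 1 - (1 - q) ** k

p = hit_probability(0.05, 59)   # about 0.95: 59 blind samples reach the top 5%
```

The sample count needed grows only logarithmically in the miss probability, independent of the domain size, which is the sense in which optimization of a typical function is easy.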

English, T. M. Nonlinear Combination of Neural Networks from a Diverse and Evolving Population.
Preprint of position paper appearing in Integrating Multiple Learned Models (IMLM-96).
Abstract: The objective is to predict on the basis of recent observations the behavior of a nonlinear dynamical system. An evolutionary algorithm searches a large space of model topologies and precision hyperparameters, fitting limited-precision models to older observations and assessing how well the models generalize to more recent observations. In each generation, topology-precision pairs are ranked according to an inductive criterion favoring validation errors that are not only small but low in redundancy with those of higher-ranked pairs. The fitted models associated with the highest ranks of a generation are combined in a nonlinear fashion. Their predictions are treated collectively as the state vector of the modeled system, and memory of errors in similar states of the past is used to correct and combine predictions in the present. Model combinations of different generations are themselves similarly combined. Empirical results are given for populations of recurrent neural networks modeling chaotic systems.

English, T. M. Evaluation of Evolutionary and Genetic Optimizers: No Free Lunch.
© 1996 MIT Press
Note: This is the second peer-reviewed publication on "no free lunch." It proves that Shannon information is conserved in sequential search, and that NFL is an immediate consequence. It also shows that optimization by a random walk is highly efficient for a uniform distribution of objective functions. The paper explains why the formal results do not apply to biological evolution.
Abstract: The recent "no free lunch" theorems of Wolpert and Macready indicate the need to reassess empirical methods for evaluation of evolutionary and genetic optimizers. Their main theorem states, loosely, that the average performance of all optimizers is identical if the distribution of functions is uniform. The present work generalizes the result to an uncountable set of distributions. The focus is upon the conservation of information as an optimizer evaluates points. It is shown that the information an optimizer gains about unobserved values is ultimately due to its prior information of value distributions. Inasmuch as information about one distribution is misinformation about another, there is no generally superior function optimizer. Empirical studies are best regarded as attempts to infer the prior information optimizers have about distributions; i.e., to determine which tools are good for which tasks.
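The NFL statement can be verified exhaustively on a tiny space. The setup below is my own miniature, not the paper's: three searchers (two fixed-order, one adaptive) run over every function from a three-point domain to three values, and all three induce the same distribution of search traces when functions are drawn uniformly.

```python
from itertools import product
from collections import Counter

def fixed_order(order):
    # non-adaptive searcher: visit the domain points in a fixed order
    return lambda f: tuple(f[x] for x in order)

def adaptive(f):
    # adaptive searcher: visit 0, branch on the value seen, then the rest
    second = 1 if f[0] == 0 else 2
    return (f[0], f[second], f[3 - second])

funcs = list(product((0, 1, 2), repeat=3))   # all 27 functions {0,1,2} -> {0,1,2}
searchers = [fixed_order((0, 1, 2)), fixed_order((2, 1, 0)), adaptive]
traces = [Counter(s(f) for f in funcs) for s in searchers]
# with functions drawn uniformly, every searcher yields the same
# distribution of value sequences, hence identical expected performance
# under any measure computed from the trace (best-so-far, hits, etc.)
```

Because any performance measure is a function of the trace alone, identical trace distributions force identical average performance, which is the NFL conclusion this paper derives from conservation of information.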


