Dynamic programming: Planning beyond the next trial in adaptive experiments
By Woojae Kim (Ohio State University)
Abstract: Experimentation is at the heart of scientific inquiry, whether one is interested in studying the threshold characteristics of spatial vision or understanding the neural basis of memory dysfunction. Regardless of discipline, it would be ideal to design an experiment that leads to the rapid accumulation of information about the phenomenon under study. Adaptive experimentation has the potential to accelerate scientific progress while at the same time minimizing the cost of experimentation. To date, all adaptive methods have relied on myopic, one-step-ahead strategies in which the stimulus on each trial is selected to maximize inference on the next trial only. A long-standing question in the field has been whether additional benefit would be gained by optimizing beyond the next trial; that is, by considering the consequences of all future trials, inference might be improved even further. Dynamic programming (DP), a tool for planning into the future, is ideally suited to this problem because it provides a means of solving the otherwise intractable computation involved in such full-horizon, “global” optimization. The present study provides the first demonstration of DP in adaptive behavioral experiments. Application of DP to model-based sensory threshold estimation yielded sound insight into the general conditions that will and will not benefit from DP, including when looking further ahead than the next trial is unnecessary. Implications of the results for the use of adaptive methods in this and other content areas are discussed.
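To make the contrast between myopic and full-horizon stimulus selection concrete, the sketch below is a minimal toy illustration, not the study's implementation. It assumes a logistic psychometric function on a coarse threshold grid, a small set of candidate stimulus intensities, and negative posterior entropy as the terminal utility; planning over the remaining trials is solved by backward-induction dynamic programming over the belief (posterior) state. All names and parameter values are hypothetical.

```python
import numpy as np

# Hypothetical toy setup (not the paper's actual design):
# estimate a sensory threshold theta on a coarse grid, choosing a stimulus
# intensity x on each trial to maximize information gained over the remaining
# horizon, via backward-induction dynamic programming.

THETAS = np.linspace(-2.0, 2.0, 9)    # candidate threshold values (grid)
STIMULI = np.linspace(-2.0, 2.0, 5)   # candidate stimulus intensities
SLOPE = 2.0                           # assumed psychometric slope

def p_detect(theta, x):
    """Logistic psychometric function: P(detect | theta, x)."""
    return 1.0 / (1.0 + np.exp(-SLOPE * (x - theta)))

def neg_entropy(post):
    """Utility of a posterior: higher when the posterior is sharper."""
    p = post[post > 0]
    return float(np.sum(p * np.log(p)))

def value(post, trials_left):
    """Expected terminal utility under optimal play, plus the best first stimulus.

    Backward induction: for each candidate stimulus, average the value of the
    two possible posterior updates (response = 0 or 1), weighted by the
    predictive probability of each response, then take the maximum.
    """
    if trials_left == 0:
        return neg_entropy(post), None
    best_val, best_x = -np.inf, None
    for x in STIMULI:
        lik1 = p_detect(THETAS, x)          # P(r = 1 | theta, x) over the grid
        pred1 = float(np.sum(post * lik1))  # predictive P(r = 1)
        v = 0.0
        for pred, lik in ((pred1, lik1), (1.0 - pred1, 1.0 - lik1)):
            if pred < 1e-12:
                continue
            new_post = post * lik / pred    # Bayes update for this response
            v += pred * value(new_post, trials_left - 1)[0]
        if v > best_val:
            best_val, best_x = v, x
    return best_val, best_x

if __name__ == "__main__":
    prior = np.ones_like(THETAS) / len(THETAS)
    # Myopic choice (horizon 1) vs. planning two trials ahead (horizon 2).
    print("1-step-ahead stimulus:", value(prior, 1)[1])
    print("2-step-ahead stimulus:", value(prior, 2)[1])
```

With a uniform prior and this symmetric toy setup, the one-step and two-step choices often coincide, which echoes the abstract's point that looking further ahead than the next trial is sometimes unnecessary; the horizon length and the grids here are deliberately tiny because the branching over stimulus-response histories grows exponentially without the state-space discretization that a full DP treatment would use.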