Planning Algorithms

Randomized Kinodynamic Planning
Steve LaValle and James Kuffner (IJRR 2001

This paper presents the first randomized approach to kinodynamic planning (also known as trajectory planning or trajectory design). The task is to determine control inputs to drive a robot from an initial configuration and velocity to a goal configuration and velocity while obeying physically based dynamical models and avoiding obstacles in the robot's environment. The authors consider generic systems that express the nonlinear dynamics of a robot in terms of the robot's high-dimensional configuration space. Kinodynamic planning is treated as a motion-planning problem in a higher dimensional state space that has both first-order differential constraints and obstacle-based global constraints. The state space serves the same role as the configuration space for basic path planning; however, standard randomized path-planning techniques do not directly apply to planning trajectories in the state space. The authors have developed a randomized planning approach that is particularly tailored to trajectory planning problems in high-dimensional state spaces. The basis for this approach is the construction of rapidly exploring random trees, which offer benefits that are similar to those obtained by successful randomized holonomic planning methods but apply to a much broader class of problems. Theoretical analysis of the algorithm is given. Experimental results are presented for an implementation that computes trajectories for hovercrafts and satellites in cluttered environments, resulting in state spaces of up to 12 dimensions.

A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes
Michael Kearns, Yishay Mansour, Andrew Y. Ng
(Machine Learning (49): 2-3, 2002)

A critical issue for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or infinite state spaces, traditional planning and reinforcement learning algorithms may be inapplicable, since their running time typically grows linearly with the state space size in the worst case. In this paper we present a new algorithm that, given only a generative model (a natural and common type of simulator) for an arbitrary MDP, performs on-line, near-optimal planning with a per-state running time that has no dependence on the number of states. The running time is exponential in the horizon time (which depends only on the discount factor and the desired degree of approximation to the optimal policy). Our algorithm thus provides a different complexity trade-off than classical algorithms such as value iteration: rather than scaling linearly in both horizon time and state space size, our running time trades an exponential dependence on the former in exchange for no dependence on the latter.

Our algorithm is based on the idea of sparse sampling. We prove that a randomly sampled look-ahead tree that covers only a vanishing fraction of the full look-ahead tree nevertheless suffices to compute near-optimal actions from any state of an MDP. Practical implementations of the algorithm are discussed, and we draw ties to our related recent results on nding a near-best strategy from a given class of strategies in very large partially observable MDPs [KMN00].

The Fast Downward Planning System

Malte Helmert (Journal of Artificial Intelligence Research (26), 2006)

Fast Downward is a classical planning system based on heuristic search. It can deal with general deterministic planning problems encoded in the propositional fragment of PDDL2.2, including advanced features like ADL conditions and effects and derived predicates (axioms). Like other well-known planners such as HSP and FF, Fast Downward is a progression planner, searching the space of world states of a planning task in the forward direction. However, unlike other PDDL planning systems, Fast Downward does not use the propositional PDDL representation of a planning task directly. Instead, the input is first translated into an alternative representation called multivalued planning tasks, which makes many of the implicit constraints of a propositional planning task explicit. Exploiting this alternative representation, Fast Downward uses hierarchical decompositions of planning tasks for computing its heuristic function, called the causal graph heuristic, which is very different from traditional HSP-like heuristics based on ignoring negative interactions of operators.

In this article, we give a full account of Fast Downward's approach to solving multi-valued planning tasks. We extend our earlier discussion of the causal graph heuristic to tasks involving axioms and conditional effects and present some novel techniques for search control that are used within Fast Downward's best-%rst search algorithm: preferred operators transfer the idea of helpful actions from local search to global best-%rst search, deferred evaluation of heuristic functions mitigates the negative effect of large branching factors on search performance, and multi-heuristic best-%rst search combines several heuristic evaluation functions within a single search algorithm in an orthogonal way. We also describe effcient data structures for fast state expansion (successor generators and axiom evaluators) and present a new non-heuristic search algorithm called focused iterative-broadening search, which utilizes the information encoded in causal graphs in a novel way.

Fast Downward has proven remarkably successful: It won the "classical" (i. e., propositional, non-optimising) track of the 4th International Planning Competition at ICAPS 2004, following in the footsteps of planners such as FF and LPG. Our experiments show that it also performs very well on the benchmarks of the earlier planning competitions and provide some insights about the usefulness of the new search enhancements.

A Planning Heuristic Based on Causal Graph Analysis
Malte Helmert (ICAPS 2004)

In recent years, heuristic search methods for classical planning have achieved remarkable results. Their most successful representative, the FF algorithm, performs well over a wide spectrum of planning domains and still sets the state of the art for STRIPS planning. However, there are some planning domains in which algorithms like FF and HSP perform poorly because their relaxation method of ignoring the ``delete lists'' of STRIPS operators loses too much vital information.

Planning domains which have many dead ends in the search space are especially problematic in this regard. In some domains, dead ends are readily found by the human observer yet remain undetected by all propositional planning systems we are aware of. We believe that this is partly because the STRIPS representation obscures the important causal structure of the domain, which is evident to humans.

In this paper, we propose translating STRIPS problems to a planning formalism with multi-valued state variables in order to expose this underlying causal structure. Moreover, we show how this structure can be exploited by an algorithm for detecting dead ends in the search space and by a planning heuristic based on hierarchical problem decomposition. Our experiments show excellent overall performance on the benchmarks from the international planning competitions.