Unifying Robot Optimization: Monte Carlo Tree Search with Tensor Factorization

Teng Xue, Yan Zhang, Amirreza Razmjoo, Sylvain Calinon

Idiap Research Institute and EPFL

Paper

Code

Abstract

Many robotic tasks, such as inverse kinematics, motion planning, and optimal control, can be formulated as optimization problems. Solving these problems involves addressing nonlinear kinematics, complex contact dynamics, long-horizon correlation, and multi-modal landscapes, each posing distinct challenges for state-of-the-art optimization methods. Monte Carlo Tree Search is a powerful approach that can strategically explore the solution space and can be applied to a wide range of tasks across varying scenarios. However, it typically suffers from combinatorial complexity when applied to robotics, resulting in slow convergence and high memory demands. To address this limitation, we propose Tensor Train Tree Search (TTTS), which leverages tensor factorization to exploit correlations among decision variables arising from common kinematic structures, dynamic constraints, and environmental interactions in robot decision-making. This yields a compact, linear-complexity representation that significantly reduces both computation time and storage requirements. We prove that TTTS can efficiently reach the bounded global optimum within a finite time. Experimental results across inverse kinematics, motion planning around obstacles, legged robot manipulation, multi-stage motion planning, and bimanual whole-body manipulation demonstrate the efficiency of TTTS on a diverse set of robotic tasks.

Method

We consider a general robot optimization formulation that can address diverse problems, such as inverse kinematics, motion planning, multi-stage motion planning with mode switching, and model predictive control, which corresponds to a large sets of mathematical programs, including nonlinear programming (NLP), large-scale (aka. high-dim) NLP and mixed-integer nonlinear programming (MINLP). In particular, we leverage basis functions to reduce the dimensionality for large-scale NLP formulation.

The unified mathematical formulation for our robot optimization problems is:

where,

(4) encodes the full trajectory of decision variables using basis functions.

(5) enforces allowed transitions in the discrete mode set given the current mode and the system trajectory.

(6) and (7) represent the system dynamics, consistency of different modes, and other physical constraints.

We propose TTTS to tackle the above MINLP problem. Tensor Train (TT) is introduced as an efficient representation for decision trees that significantly reduces combinatorial complexity to linear complexity. A multi-layer decision tree can be represented as a high-dimensional tensor, as shown in Figure 1(A). Since tensor factorization can be used to represent a high-dim tensor through a chain of third-order tensor cores (Figure 1(B)), resulting in TT format, a multi-layer node-based tree can be expressed as TT-Tree, which significantly reduces the combinatorial complexity to linear complexity in both computation time and storage space. With such structure, the node value and visiting info can efficiently computed, as shown in Figure 1(C), which illustrates an example about how how to compute the value of a node in the first layer of a three-layer tree using TT cores.

Figure 1: Tree-Tensor-TT transformation.