RBFS (Recursive Best First Search) algorithm to solve TSP problem
RBFS like A* is complete and optimal. It is characterized by a linear space complexity with the depth of the optimal solution O(bd). RBFS overcomes the space problem of A*, however it suffers from excessive node regeneration, which may cause exponential time complexity. Generally, RBFS with enough time can solve problems that A∗ cannot solve because it runs out of memory.
Time and space complexities of RBFS algorithm
Since we are using the minimum spanning tree (MST) as a heuristic function, which is admissible, RBFS algorithm is complete and optimal.
RBFS is characterized by a linear space complexity with the depth of the optimal solution O(bd). The time complexity of RBFS depends on the accuracy of the heuristic function and on the number of times the best path changes when nodes are expanded, which is difficult to characterize. Since RBFS uses limited memory space, it is required to regenerate (reexpand) the forgotten states, which cause overheating in time. This overheating may cause exponential increase in time complexity associated with redundant paths in graphs.
(MATLAB code can be downloaded from MATLAB File Exchange or GitHub)
function solution = solveTSP_RBFS(TSPGraph) % returns a solution (path and cost)
firstNode = TSPGraph.getFirstNode % Get the start node (randomly)
startNode ← initializeStartNode(TSPGraph, firstNode) % initialize the start node.
solution ← TSP_RBFS(TSPGraph, startNode, startNode, ∞) % call RBFS to solve TSP.
return solution % if there is no solution (failure), the solution will be empty.
function solution = TSP_RBFS(TSPGraph, startNode, currentNode, f_limit) % returns a solution
solution ← isGoal(currentNode, TSPGraph) % check if the current node is the goal.
if (solution is not empty) return solution % return the solution if node is the goal.
successors ← expandNode(TSPGraph, startNode, currentNode); % expand the current node.
if there are no successors (successors is empty), then return empty solution [failure, ∞]
for each successors do:
suc.cost ← max( suc.g_n+suc.h_n, currentNode.cost);
while(true)
best ← get the minimum cost node in the successors;
if (best.cost>f_limit) then return [failure, best.cost]
secMinCost ← get the second minimum cost in all successors;
[solution, best.cost] ← TSP_RBFS(TSPGraph, startNode, best, min(f_limit, secMinCost))
if we reach the goal (solution is not empty), then return the solution
function successors = expandNode(TSPGraph, startNode, currentNode)% returns successors
successors ← [] % create an empty successors list.
% For TSP problem (each node is visited only once), so the successors of
% currentNode are the unvisited nodes in the current path (path from the first
% node to currentNode).
unvisited ← getUnvisitedNodes(TSPGraph, currentNode)
% estimate cost for each successor
for all unvisited do:
% Estimate heuristic of successor h[n], which equal:
node.h_n ← cost of the MST of the unvisited subgraph +
minimum cost to connect MST to unvisited subgraph +
minimum cost to connect MST to start node;
% Calculate the distance from the start node to the successor g[n], equal:
node.g_n ← distance from the stat node to the current node +
the distance between the current node and successor;
% Calculate the estimated cost from the start node to goal node successor
node.cost ← node.h_n + node.g_n
node.parent ← currentNode
node.depth ← currentNode.depth+1
Add node to successors
return successors
Result of running RBFS on dataset att-48.xml
After around 2 hours, the RBFS algorithm reached depth 9 with a total number of nodes in memory 267. After 3 hours, the algorithm did not go beyond depth 9 and the number of nodes became 267 (same as in depth 6), this means that the algorithm returned to lower depths (depth six in this case). From the results, we notice that the relationship between the maximum depth so far and the number of nodes in memory is leaner. However, the relationship between the maximum depth and the execution time started linearly and then became exponential. This due to the overhead by regenerating the nodes.
Result of running RBFS on dataset a280.xml,
After around 4 hours, the RBFS algorithm reached depth 83 with total number of visited nodes was 19160. From the results, we notice that the relationship between the maximum depth so far and the number of nodes in memory is leaner. However, the relationship between the maximum depth and the execution time started linearly and then became exponential. This due to the overhead by regenerating the nodes.
RBFS overcome the space problem of A*, however it suffers from excessive node regeneration, which may cause exponential time complexity. Generally, RBFS with enough time can solve problems that A∗ cannot solve because it runs out of memory.
% read the TSP graph from xml file and convert it to matlab graph object
%[G, Gnodes] = loadTSPGraph("att48_.xml");
%[G, Gnodes] = loadTSPGraph("data2.xml");
%[G, Gnodes] = loadTSPGraph("burma14_.xml");
[G, Gnodes] = loadTSPGraph("a280_.xml");
solutions =[];
time = [];
tic
Max_depth=0;
no_expanded_nodes = 0;
for idx=1 : size(Gnodes,1) % select a node to be as start node
% initialize the startt node
strnode = initializeStartNode(G, Gnodes{idx});
solution = RBFS(G, strnode, strnode, inf, 0, 0);
if (~isempty(solution))
solution.cost
solution.path
solutions = [solutions; solution];
end
end
function [solution, f_limit, Max_depth] = RBFS(G, strnode, currentNode, f_limit, Max_depth, no_expanded_nodes)
% display the state of searching
if( Max_depth < currentNode.depth)
Max_depth = currentNode.depth;
disp( ['Node: ', currentNode.name, ...
' , Depth: ', num2str(currentNode.depth), ...
' , Max-depth: ', num2str(Max_depth), ...
' , current_nodes: ', num2str(no_expanded_nodes), ...
' , Elapsed time is ', num2str(toc) ,'seconds'] );
end
% check if the node is the goal.
% if we don't reach the goal, solution will be empty
solution = isGoal(currentNode, G);
% if we reach the goal (solution is not empty), return the solution
if (~isempty(solution))
return
end
% expand the node
nodelist = expandNode(G, strnode.name, currentNode);
no_expanded_nodes = no_expanded_nodes+size(nodelist,1);
% if there are no successors (nodelist is empty), return empty solution
if (isempty(nodelist))
f_limit = inf;
return
end
for i=1:size(nodelist,1)
suc = nodelist(i);
suc.cost = max( suc.g_n+suc.h_n, currentNode.cost);
end
while(true)
[minCost, minidx] = min(cell2mat({nodelist.cost}));
best = nodelist(minidx);
if (best.cost>f_limit)
solution = [];
f_limit = best.cost;
return
end
% calculate the second minimum cost in all successors
if (size(nodelist,1) == 1)
% if only one successor, the second minimum cost = f_limit
secMinCost = f_limit;
else
nodelist2 = nodelist;
nodelist2(minidx) = [];
secMinCost = min(cell2mat({nodelist2.cost}));
end
[solution, nodelist(minidx).cost, Max_depth] = RBFS(G, strnode, best, min(f_limit,secMinCost), Max_depth, no_expanded_nodes);
% if we reach the goal (solution is not empty), return the solution
if (~isempty(solution))
return
end
end
end
function node = initializeStartNode(G, startNode)
% remove the first node from the the graph and add it to the open list
% for the first node we have to
unvisitedGraph = rmnode(G, startNode);
h_n = estimateDist(G, unvisitedGraph, startNode, startNode);
node.name = startNode;
node.h_n = h_n;
node.g_n = 0;
node.depth = 0;
node.cost = node.h_n + node.g_n - node.depth;
node.parent = [];
end
% estimateDist function
% estimate the distance from a current Node 'currentNode' to the end node 'startNode'
function cost = estimateDist(G, unvisitedGraph, startNode, currentNode)
% caculate Minimum spanning tree using Prim’s algorithm
T = minspantree(unvisitedGraph);
% estimated the distance to visit all unvisited nodes using MST heuristic
h_n_MST = sum(T.Edges.Weight);
unvisitedNs = table2cell(unvisitedGraph.Nodes);
% caculate the minimum distance from an unvisited node to the current node
idx_to_curr_node = findedge(G, currentNode, unvisitedNs);
mindist_to_curr_node = min(G.Edges.Weight(idx_to_curr_node));
% caculate the minimum distance from an unvisited node to the start node
idx_to_str_node = findedge(G, startNode, unvisitedNs);
mindist_to_str_node = min(G.Edges.Weight(idx_to_str_node));
cost = h_n_MST + mindist_to_str_node + mindist_to_curr_node;
end
% expandNode function
% returns all the successors of a node
function nodelist = expandNode(G, startNode, currentNode)
% For TSP problem (each node is visited only once)
% here we just get the nodes that are not already in the current node path
unvisitedGraph = G;
node1 = currentNode;
while(~isempty(node1))
unvisitedGraph = rmnode(unvisitedGraph, node1.name);
node1 = node1.parent;
end
% the expanded nodes
unvisitedNs = table2cell(unvisitedGraph.Nodes);
% the expanded nodes as structure (to be returned)
nodelist=[];
% estimate cost for the expanded nodes
for i=1:size(unvisitedNs,1)
expandedNode = unvisitedNs{i}; % expanded node name as a string (char array)
% if the node is leaf (the last node in the path), the heuristic is
% the cost of connectin this node to the start node
if (size(unvisitedNs,1)==1)
idx_to_str_node = findedge(G, startNode, expandedNode);
h_n = G.Edges.Weight(idx_to_str_node);
else
unvistg = rmnode(unvisitedGraph, expandedNode);
% h_n = cost of the MST of the subgraph of expandedNode +
% minimum cost to connecte MST to expandedNode +
% minimum cost to connecte MST to start node
h_n = estimateDist(G, unvistg, startNode, expandedNode);
end
% create the successor node as structure
node.name = expandedNode; % the successor node
node.h_n = h_n; % the heuristic of the successor node
% the cost (edge value) of connecting the current node with its
% successor (the expanded node)
g_n = G.Edges.Weight(findedge(G, node.name ,currentNode.name));
% the cost of connecting the current node to the first node
% (currentNode.g_n). node.g_n is the cost of the successor node to
% the start node
node.g_n = currentNode.g_n + g_n;
node.parent = currentNode;
node.depth = currentNode.depth+1;
% the estimated cost from the current from start node to the goal
% node throw the successor node
node.cost = node.h_n + node.g_n - node.depth;
nodelist = [nodelist; node];
end
end
% isGoal function
% Test if the node is the goal node (if the goal is satisfied)
function solution = isGoal(node, G)
path=[];
snode=[];
cost=node.cost;
while(~isempty(node))
path = [path; {node.name}];
G = rmnode(G, node.name);
snode = {node.name};
node = node.parent;
end
if (isempty(G.Nodes))
path = [snode; path];
solution.path=path;
solution.cost=cost;
else
solution = [];
end
end
function [G, Gnodes]= loadTSPGraph(fileName)
% read the TSP graph from xml file and convert it to matlab structure
tsp = xml2struct(fileName);
% convert graph structure to graph matrix
nodeNo = size(tsp.graph.vertex, 2);
graphMat = zeros(nodeNo, nodeNo);
names = zeros(nodeNo,1);
for v=1 : nodeNo
names(v) = v;
vertex = tsp.graph.vertex(v);
diag = 0;
for e=1 : nodeNo-1
edge = vertex{1}.edge(e);
cost = str2double(edge{1}.Attributes.cost);
if(v==e)
diag = 1;
end
graphMat(v, e+diag) = cost;
end
end
% convert graph matrix to matlab graph object
Gnodes = cellstr(string(names));
G = graph(graphMat, Gnodes);
end