A* (star) algorithm to solve TSP problem
A* search algorithm is complete and optimal with admissible heuristic function. However, it has exponential complexity problems in computation time and memory usage. Implementing a graph search for A* improves the performance of A* in time and space. However, we need to deal carefully with the open and closed list to guarantee an optimal solution. In addition, the time complexity could grow exponentially after reaching a certain level of depth.
Time and space complexities of A* algorithm
The minimum spanning tree (MST) is a heuristic function when it is used for traveling salesman problem (TSP). MST finds the tree with minimum cost that visit all nodes at least once with zero cost for revisiting the node again. Therefore, the MST for node A will be always less than the actual cost form node A to the goal node, which make MST a heuristic function.
Since minimum spanning tree (MST) heuristic is admissible, A∗ search (tree search) is complete and optimal. However, it has limitations in computation time and storage space. The time complexity of A∗ is exponential based on the solution depth (d) and it keeps all generated nodes in memory, so the space complexity is also exponential based on (d). For the traveling salesman problem (TSP) the solution depth is at the maximum depth (n-1), so it is impractical to use A* to solve TSP if the number of nodes is high. The detailed of A* complexity analysis is given below.
The complexity analysis of A∗ depends on the state space of the problem. For a problem with a single reversible goal, the time complexity of A* is exponential in the maximum absolute error, as follow.
For MST heuristic, as for almost all heuristics [1], the absolute error is at least proportional to the path cost h*, so ϵ is constant or growing and the time complexity is exponential in d.
Moreover, for the traveling salesman problem (TSP), which has (n-1)! goal states, the search process could follow a non-optimal path and there is an additional cost that is proportional to the number of goals that have cost within a factor of the optimal cost.
We can improve the process of A* (in time and space) by using a graph search in which we create a list of visited cities (closed list) in addition to the list of unvisited cities (fringe list or open). The experiment results of this report show that the graph search of A* was able to give optimal or very near-optimal solutions in reasonable computation time and usage space.
(MATLAB code can be downloaded from MATLAB File Exchange or GitHub)
function solution = solveTSP_Astar_treesearch(TSPGraph) % returns a solution (path and cost)
firstNode ← TSPGraph.getFirstNode % Get the start node (randomly).
open ← [] % create an empty fringe list.
startNode ← initializeStartNode(TSPGraph, firstNode) % initialize the start node.
open.add(startNode) % add the start node to the open list (fringe list).
solution ← Astar(TSPGraph, startNode, open) % call A* algorithm to solve TSP.
return solution % if there is no solution (failure), the solution will be empty.
function solution = Astar(TSPGraph, startNode, open) % returns a solution (path and cost)
solution ← [] % create an empty solution (failure).
while(open list is not empty)
currentNode ← minCost(open) % get the minimum cost node in the fringe.
solution ← isGoal(currentNode, TSPGraph) % check if the current node is the goal.
if(solution is not empty) then return solution
successors ← expandNode(TSPGraph, startNode, currentNode); % expand the current node.
open.remove(currentNode) % remove the current node form the open list (fringe).
open.add(successors) % add the successors (if there are) to the open list.
return failure % return failure (empty solution) if no solution.
function successors = expandNode(TSPGraph, startNode, currentNode) % returns successors of node
successors ← [] % create an empty successors list.
% For TSP problem (each node is visited only once), so the successors of currentNode are
% the unvisited nodes in the current path (path from the first node to currentNode).
unvisited ← getUnvisitedNodes(TSPGraph, currentNode)
% estimate cost for each successor
for all unvisited do:
% Estimate heuristic of successor h[n], which equal:
node.h_n ← cost of the MST of the unvisited subgraph +
minimum cost to connect MST to unvisited subgraph +
minimum cost to connect MST to start node;
% Calculate the distance from the start node to the successor g[n], which equal:
node.g_n ← distance from the stat node to the current node +
the distance between the current node and successor;
% Calculate the estimated cost from the start node to goal node throw the successor
node.cost ← node.h_n + node.g_n
node.parent ← currentNode
node.depth ← currentNode.depth+1
Add node to successors
return successors
function solution = solveTSP_Astar_graphsearch(TSPGraph) % returns a solution
firstNode ← TSPGraph.getFirstNode % Get the start node (randomly).
open ← [] % create an empty fringe list.
closed ← [] % create an empty closed list.
startNode ← initializeStartNode(TSPGraph, firstNode) % initialize the start node.
open.add(startNode) % add the start node to the open list (fringe list).
solution ← Astar (TSPGraph, startNode, open) % call A* algorithm to solve TSP.
return solution % if there is no solution (failure), the solution will be empty.
function solution = Astar (TSPGraph, startNode, open) % returns a solution (path and cost)
solution ← [] % create an empty solution (failure).
while(open list is not empty)
currentNode ← minCost(open) % get the minimum cost node in the fringe.
Remove the current Node from open list
add the current Node to the closed list
if the current node is the goal node then return solution(currentNode, TSPGraph)
successors ← expandNode(TSPGraph, startNode, currentNode); % expand the current node.
% add/update successors in the open and closed lists based on the node cost and depth
--------------------------------------------------------------------------------------
For each node s in successors
if [s is in closed and (s.cost <= s_inClosed.cost)] or
[s is in closed and (s.cost > s_inClosed.cost) and (s.depth>s_inClosed.depth)]
then move s from closed to open
else if [s is not already in open] then add s to open
else if (s.cost<s_inOpen.cost) and (s.depth>s_inOpen.depth) then remove s from open
else if [s.cost <= s_inOpen.cost] or
[(s.cost > s_inOpen.cost) and (s.depth > s_inOpen.depth)]
then add s to open
--------------------------------------------------------------------------------------
return failure % return failure (empty solution) if no solution.
function successors = expandNode(TSPGraph, startNode, currentNode) % returns successors of node
….
% Same as the ‘expandNode’ function mentioned before
….
Result of running A* (tree search) on att48.xml dataset
After around 5 hours, A* algorithm reached depth 24 with fringe size = 570138. From the results, it is clear that the fringe is growing exponentially and the time required to reach the next depth is growing exponentially.
Result of running A* (tree search) on a280.xml dataset
After around 3 hours, A* algorithm reached depth 81 with fringe size = 858565. From the results, it is clear that the fringe is growing exponentially and the time required to reach the next depth is growing exponentially.
Result of running A* (graph search) on dataset att48.xml,
In less than 3 minutes, A* algorithm reached the optimal solution, with path cost equals to 10628 [2], starting from a random initial node named ‘2’. The complete solution tour is given below:
'2' '26' '4' '35' '45' '10' '24' '42' '5' '48' '39' '32' '21' '47' '20' '33' '46' '36' '30' '43' '17' '27' '19' '37' '6' '28' '7' '18' '44' '31' '38' '8' '1' 9' '40' '15' '12' '11' '13' '25' '14' '23' '3' '22' '16' '41' '34' '29' '2'
In order to obtain this optimal solution, A* spend 161 second and stored a maximum of 589 nodes in memory. Employing graph search improved the process of A* algorithm in both time and space. However, by using different initial states, A* does not always give the optimal solution. In fact, among the different 48 initial nodes on dataset ‘att48’, A* was able to provide the optimal solution for 8 initial nodes, ‘2’, ‘8’, ‘19’, ‘29’, ‘31’, ‘38’, ‘41’, and ‘44’. For the other nodes, it gives very near optimal solutions, which range between 10648 to 10795.
Result of running A* (graph search) on dataset a280.xml,
From the results, for a280 dataset, A* with graph search showed enhancement to the performance compared with A* with tree search. For the same a280 dataset, A* tree search reached depth 81 with fringe size = 858565 after 3.1 hours, while A* graph search reached the same depth with fringe size = 5993 after 0.4 hours. A* graph search enhanced the space and make the usage of memory bounded instead of exponential growing in the A* with tree search. However, the time complexity started to grow exponentially after reaching depth 86, which may cause improvement in time is ineffective for the problem with a large number of nodes.
A* search algorithm has exponential complexity problems in computation time and memory usage. The previous experiments showed that fringe size and the time required to reach the next depth grows exponentially. A* search was not able to reach a solution and after running for a long time it run out of memory. Therefore, A∗ is not practical for many problems including TSP.
Implementing a graph search for A* improves the performance of A* in time and space. However, we need to deal carefully with the open and closed list to guarantee an optimal solution. In addition, the time complexity could grow exponentially after reaching a certain level of depth as shown in previous experiments.
% read the TSP graph from xml file and convert it to matlab graph object
[G, Gnodes] = loadTSPGraph("att48_.xml");
%[G, Gnodes] = loadTSPGraph("data2.xml");
%[G, Gnodes] = loadTSPGraph("burma14_.xml");
%[G, Gnodes] = loadTSPGraph("a280_.xml");
solutions =[];
time = [];
tic
Max_depth=0;
no_expanded_nodes = 0;
for idx=1 : size(Gnodes,1) % select a node to be as start node
% initialize the startt node
strnode = initializeStartNode(G, Gnodes{idx});
solution = RBFS(G, strnode, strnode, inf, 0, 0);
if (~isempty(solution))
solution.cost
solution.path
solutions = [solutions; solution];
end
end
function [solution, f_limit, Max_depth] = RBFS(G, strnode, currentNode, f_limit, Max_depth, no_expanded_nodes)
% display the state of searching
if( Max_depth < currentNode.depth)
Max_depth = currentNode.depth;
disp( ['Node: ', currentNode.name, ...
' , Depth: ', num2str(currentNode.depth), ...
' , Max-depth: ', num2str(Max_depth), ...
' , current_nodes: ', num2str(no_expanded_nodes), ...
' , Elapsed time is ', num2str(toc) ,' seconds.'] );
end
% check if the node is the goal.
% if we don't reach the goal, solution will be empty
solution = isGoal(currentNode, G);
% if we reach the goal (solution is not empty), return the solution
if (~isempty(solution))
return
end
% expand the node
nodelist = expandNode(G, strnode.name, currentNode);
no_expanded_nodes = no_expanded_nodes+size(nodelist,1);
% if there are no successors (nodelist is empty), return empty solution
if (isempty(nodelist))
f_limit = inf;
return
end
for i=1:size(nodelist,1)
suc = nodelist(i);
suc.cost = max( suc.g_n+suc.h_n, currentNode.cost);
end
while(true)
[minCost, minidx] = min(cell2mat({nodelist.cost}));
best = nodelist(minidx);
if (best.cost>f_limit)
solution = [];
f_limit = best.cost;
return
end
% calculate the second minimum cost in all successors
if (size(nodelist,1) == 1)
% if only one successor, the second minimum cost = f_limit
secMinCost = f_limit;
else
nodelist2 = nodelist;
nodelist2(minidx) = [];
secMinCost = min(cell2mat({nodelist2.cost}));
end
[solution, nodelist(minidx).cost, Max_depth] = RBFS(G, strnode, best, min(f_limit,secMinCost), Max_depth, no_expanded_nodes);
% if we reach the goal (solution is not empty), return the solution
if (~isempty(solution))
return
end
end
end
function node = initializeStartNode(G, startNode)
% remove the first node from the the graph and add it to the open list
% for the first node we have to
unvisitedGraph = rmnode(G, startNode);
h_n = estimateDist(G, unvisitedGraph, startNode, startNode);
node.name = startNode;
node.h_n = h_n;
node.g_n = 0;
node.depth = 0;
node.cost = node.h_n + node.g_n - node.depth;
node.parent = [];
end
% estimateDist function
% estimate the distance from a current Node 'currentNode' to the end node 'startNode'
function cost = estimateDist(G, unvisitedGraph, startNode, currentNode)
% caculate Minimum spanning tree using Prim’s algorithm
T = minspantree(unvisitedGraph);
% estimated the distance to visit all unvisited nodes using MST heuristic
h_n_MST = sum(T.Edges.Weight);
unvisitedNs = table2cell(unvisitedGraph.Nodes);
% caculate the minimum distance from an unvisited node to the current node
idx_to_curr_node = findedge(G, currentNode, unvisitedNs);
mindist_to_curr_node = min(G.Edges.Weight(idx_to_curr_node));
% caculate the minimum distance from an unvisited node to the start node
idx_to_str_node = findedge(G, startNode, unvisitedNs);
mindist_to_str_node = min(G.Edges.Weight(idx_to_str_node));
cost = h_n_MST + mindist_to_str_node + mindist_to_curr_node;
end
% expandNode function
% returns all the successors of a node
function nodelist = expandNode(G, startNode, currentNode)
% For TSP problem (each node is visited only once)
% here we just get the nodes that are not already in the current node path
unvisitedGraph = G;
node1 = currentNode;
while(~isempty(node1))
unvisitedGraph = rmnode(unvisitedGraph, node1.name);
node1 = node1.parent;
end
% the expanded nodes
unvisitedNs = table2cell(unvisitedGraph.Nodes);
% the expanded nodes as structure (to be returned)
nodelist=[];
% estimate cost for the expanded nodes
for i=1:size(unvisitedNs,1)
expandedNode = unvisitedNs{i}; % expanded node name as a string (char array)
% if the node is leaf (the last node in the path), the heuristic is
% the cost of connectin this node to the start node
if (size(unvisitedNs,1)==1)
idx_to_str_node = findedge(G, startNode, expandedNode);
h_n = G.Edges.Weight(idx_to_str_node);
else
unvistg = rmnode(unvisitedGraph, expandedNode);
% h_n = cost of the MST of the subgraph of expandedNode +
% minimum cost to connecte MST to expandedNode +
% minimum cost to connecte MST to start node
h_n = estimateDist(G, unvistg, startNode, expandedNode);
end
% create the successor node as structure
node.name = expandedNode; % the successor node
node.h_n = h_n; % the heuristic of the successor node
% the cost (edge value) of connecting the current node with its
% successor (the expanded node)
g_n = G.Edges.Weight(findedge(G, node.name ,currentNode.name));
% the cost of connecting the current node to the first node
% (currentNode.g_n). node.g_n is the cost of the successor node to
% the start node
node.g_n = currentNode.g_n + g_n;
node.parent = currentNode;
node.depth = currentNode.depth+1;
% the estimated cost from the current from start node to the goal
% node throw the successor node
node.cost = node.h_n + node.g_n - node.depth;
nodelist = [nodelist; node];
end
end
% isGoal function
% Test if the node is the goal node (if the goal is satisfied)
function solution = isGoal(node, G)
path=[];
snode=[];
cost=node.cost;
while(~isempty(node))
path = [path; {node.name}];
G = rmnode(G, node.name);
snode = {node.name};
node = node.parent;
end
if (isempty(G.Nodes))
path = [snode; path];
solution.path=path;
solution.cost=cost;
else
solution = [];
end
end
function [G, Gnodes]= loadTSPGraph(fileName)
% read the TSP graph from xml file and convert it to matlab structure
tsp = xml2struct(fileName);
% convert graph structure to graph matrix
nodeNo = size(tsp.graph.vertex, 2);
graphMat = zeros(nodeNo, nodeNo);
names = zeros(nodeNo,1);
for v=1 : nodeNo
names(v) = v;
vertex = tsp.graph.vertex(v);
diag = 0;
for e=1 : nodeNo-1
edge = vertex{1}.edge(e);
cost = str2double(edge{1}.Attributes.cost);
if(v==e)
diag = 1;
end
graphMat(v, e+diag) = cost;
end
end
% convert graph matrix to matlab graph object
Gnodes = cellstr(string(names));
G = graph(graphMat, Gnodes);
end
% read the TSP graph from xml file and convert it to matlab graph object
[G, Gnodes] = loadTSPGraph("att48_.xml");
%[G, Gnodes] = loadTSPGraph("a280_.xml");
%[G, Gnodes] = loadTSPGraph("data2.xml");
%[G, Gnodes] = loadTSPGraph("burma14_.xml");
solutions =[];
time = [];
tic
for idx=1 : size(Gnodes,1) % select a node to be as start node
open=[];
closed=[];
% initialize the startt node
strnode = initializeStartNode(G, Gnodes{idx});
% add the start node to the open list ( finge list)
open = [open; strnode];
solution = []; % the goal strarting from node i
Max_depth=0;
no_expanded_nodes = 0;
while(~isempty(open))
no_expanded_nodes=no_expanded_nodes+1;
% get the minimum cost in the fringe
costs = cell2mat({open.cost});
depths = cell2mat({open.depth});
% get the nodes that have the minimum cost
minCNodes = find(costs==min(costs));
% if there are more than one node with the same cost, get deepest
% one
[val, index] = max(depths(minCNodes));
minidx = minCNodes(index);
currentNode = open(minidx);
% display the state of searching
if( Max_depth < max(cell2mat({open.depth})) )
Max_depth = max(cell2mat({open.depth}));
disp( ['Node: ', currentNode.name, ...
', Depth: ', num2str(currentNode.depth), ...
', Max-depth: ', num2str(max(cell2mat({open.depth}))), ...
', no_expanded_nodes: ', num2str(no_expanded_nodes), ...
', Fringe-size: ', num2str(size(open,1)), ...
' , Elapsed time is ', num2str(toc) ,' seconds.'] );
end
open(minidx)=[];
if(~ismember(currentNode.name, {open.name}))
closed = [closed; currentNode];
end
% check if the node is the goal.
solution = isGoal(currentNode, G);
if (~isempty(solution))
break;
end
% expand the node
nodelist = expandNode(G, strnode.name, currentNode);
% For graph search, we expand the node only if it was not expanded before
for i=1:size(nodelist, 1)
if (isempty(closed))
val2 = 0;
else
[val2, id2] = ismember(nodelist(i).name,{closed.name});
end
if(val2 && (nodelist(i).cost<=closed(id2).cost))
closed(id2)=[];
open = [open; nodelist(i)];
elseif(val2 && (nodelist(i).cost>closed(id2).cost))
if((nodelist(i).depth > closed(id2).depth))
closed(id2)=[];
open = [open; nodelist(i)];
end
else
id1 = find(ismember({open.name}, nodelist(i).name));
if (isempty(id1))
open = [open; nodelist(i)];
else
addnode = false;
removeidxs = [];
for s=1:size(id1,2)
if((nodelist(i).cost <= open(id1(s)).cost))
addnode = true;
if( (nodelist(i).cost < open(id1(s)).cost) && ...
(nodelist(i).depth >= open(id1(s)).depth))
removeidxs = [removeidxs; id1(s)];
end
elseif( (nodelist(i).cost > open(id1(s)).cost) && ...
(nodelist(i).depth > open(id1(s)).depth) )
addnode = true;
end
end
if(~isempty(removeidxs))
removeidxs = sort(removeidxs, 'descend');
end
for s=1:size(removeidxs,1)
open(removeidxs(s))=[];
end
if(addnode)
open = [open; nodelist(i)];
end
end
end
end
end
if (~isempty(solution))
solution.cost
solution.path(1)
solutions = [solutions; solution];
time = [time; toc];
end
end
sum(cell2mat({solutions.cost}))/size(solutions,1)
toc
function node = initializeStartNode(G, startNode)
% remove the first node from the the graph and add it to the open list
% for the first node we have to
unvisitedGraph = rmnode(G, startNode);
h_n = estimateDist(G, unvisitedGraph, startNode, startNode);
node.name = startNode;
node.h_n = h_n;
node.g_n = 0;
node.depth = 0;
node.cost = node.h_n + node.g_n;
node.parent = [];
end
% estimateDist function
% estimate the distance from a current Node 'currentNode' to the end node 'startNode'
function cost = estimateDist(G, unvisitedGraph, startNode, currentNode)
% caculate Minimum spanning tree using Prim’s algorithm
T = minspantree(unvisitedGraph);
% estimated the distance to visit all unvisited nodes using MST heuristic
h_n_MST = sum(T.Edges.Weight);
unvisitedNs = table2cell(unvisitedGraph.Nodes);
% caculate the minimum distance from an unvisited node to the current node
idx_to_curr_node = findedge(G, currentNode, unvisitedNs);
mindist_to_curr_node = min(G.Edges.Weight(idx_to_curr_node));
% caculate the minimum distance from an unvisited node to the start node
idx_to_str_node = findedge(G, startNode, unvisitedNs);
mindist_to_str_node = min(G.Edges.Weight(idx_to_str_node));
cost = h_n_MST + mindist_to_str_node + mindist_to_curr_node;
end
% expandNode function
% returns all the successors of a node
function nodelist = expandNode(G, startNode, currentNode)
% For TSP problem (each node is visited only once)
% here we just get the nodes that are not already in the current node path
unvisitedGraph = G;
node1 = currentNode;
while(~isempty(node1))
unvisitedGraph = rmnode(unvisitedGraph, node1.name);
node1 = node1.parent;
end
% the expanded nodes
unvisitedNs = table2cell(unvisitedGraph.Nodes);
% the expanded nodes as structure (to be returned)
nodelist=[];
% estimate cost for the expanded nodes
for i=1:size(unvisitedNs,1)
expandedNode = unvisitedNs{i}; % expanded node name as a string (char array)
% if the node is leaf (the last node in the path), the heuristic is
% the cost of connectin this node to the start node
if (size(unvisitedNs,1)==1)
idx_to_str_node = findedge(G, startNode, expandedNode);
h_n = G.Edges.Weight(idx_to_str_node);
else
unvistg = rmnode(unvisitedGraph, expandedNode);
% h_n = cost of the MST of the subgraph of expandedNode +
% minimum cost to connecte MST to expandedNode +
% minimum cost to connecte MST to start node
h_n = estimateDist(G, unvistg, startNode, expandedNode);
end
% create the successor node as structure
node.name = expandedNode; % the successor node
node.h_n = h_n; % the heuristic of the successor node
% the cost (edge value) of connecting the current node with its
% successor (the expanded node)
g_n = G.Edges.Weight(findedge(G, node.name ,currentNode.name));
% the cost of connecting the current node to the first node
% (currentNode.g_n). node.g_n is the cost of the successor node to
% the start node
node.g_n = currentNode.g_n + g_n;
% the estimated cost from the current from start node to the goal
% node throw the successor node
node.depth = currentNode.depth+1;
node.cost = node.h_n + node.g_n;
node.parent = currentNode;
nodelist = [nodelist; node];
end
end
% isGoal function
% Test if the node is the goal node (if the goal is satisfied)
function solution = isGoal(node, G)
path=[];
snode=[];
cost=node.cost;
while(~isempty(node))
path = [path; {node.name}];
G = rmnode(G, node.name);
snode = {node.name};
node = node.parent;
end
if (isempty(G.Nodes))
path = [snode; path];
solution.path=path;
solution.cost=cost;
else
solution = [];
end
end
function [G, Gnodes]= loadTSPGraph(fileName)
% read the TSP graph from xml file and convert it to matlab structure
tsp = xml2struct(fileName);
% convert graph structure to graph matrix
nodeNo = size(tsp.graph.vertex, 2);
graphMat = zeros(nodeNo, nodeNo);
names = zeros(nodeNo,1);
for v=1 : nodeNo
names(v) = v;
vertex = tsp.graph.vertex(v);
diag = 0;
for e=1 : nodeNo-1
edge = vertex{1}.edge(e);
cost = str2double(edge{1}.Attributes.cost);
if(v==e)
diag = 1;
end
graphMat(v, e+diag) = cost;
end
end
% convert graph matrix to matlab graph object
Gnodes = cellstr(string(names));
G = graph(graphMat, Gnodes);
end
[1] S. Rassel and P. Norwig, “Artificial intelligence: a modern approach (AIMA).” Moscow: Wiliams, 2007.
[2] “Symmetric TSPs.” [Online]. Available: http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/STSP.html. [Accessed: 08-Apr-2020].