Solving Travelling Salesman Problem TSP

using A* algorithm

A* (star) algorithm to solve TSP problem

A* search algorithm is complete and optimal with admissible heuristic function. However, it has exponential complexity problems in computation time and memory usage. Implementing a graph search for A* improves the performance of A* in time and space. However, we need to deal carefully with the open and closed list to guarantee an optimal solution. In addition, the time complexity could grow exponentially after reaching a certain level of depth.

Time and space complexities of A* algorithm

The minimum spanning tree (MST) is a heuristic function when it is used for traveling salesman problem (TSP). MST finds the tree with minimum cost that visit all nodes at least once with zero cost for revisiting the node again. Therefore, the MST for node A will be always less than the actual cost form node A to the goal node, which make MST a heuristic function.

Since minimum spanning tree (MST) heuristic is admissible, A∗ search (tree search) is complete and optimal. However, it has limitations in computation time and storage space. The time complexity of A∗ is exponential based on the solution depth (d) and it keeps all generated nodes in memory, so the space complexity is also exponential based on (d). For the traveling salesman problem (TSP) the solution depth is at the maximum depth (n-1), so it is impractical to use A* to solve TSP if the number of nodes is high. The detailed of A* complexity analysis is given below.

The complexity analysis of A∗ depends on the state space of the problem. For a problem with a single reversible goal, the time complexity of A* is exponential in the maximum absolute error, as follow.

For MST heuristic, as for almost all heuristics [1], the absolute error is at least proportional to the path cost h*, so ϵ is constant or growing and the time complexity is exponential in d.

Moreover, for the traveling salesman problem (TSP), which has (n-1)! goal states, the search process could follow a non-optimal path and there is an additional cost that is proportional to the number of goals that have cost within a factor of the optimal cost.

We can improve the process of A* (in time and space) by using a graph search in which we create a list of visited cities (closed list) in addition to the list of unvisited cities (fringe list or open). The experiment results of this report show that the graph search of A* was able to give optimal or very near-optimal solutions in reasonable computation time and usage space.

Pseudo-code

(MATLAB code can be downloaded from MATLAB File Exchange or GitHub)

Pseudo-code: (tree search)

function solution = solveTSP_Astar_treesearch(TSPGraph) % returns a solution (path and cost)

firstNode ← TSPGraph.getFirstNode % Get the start node (randomly).

open ← [] % create an empty fringe list.

startNode ← initializeStartNode(TSPGraph, firstNode) % initialize the start node.

open.add(startNode) % add the start node to the open list (fringe list).

solution ← Astar(TSPGraph, startNode, open) % call A* algorithm to solve TSP.

return solution % if there is no solution (failure), the solution will be empty.

function solution = Astar(TSPGraph, startNode, open) % returns a solution (path and cost)

solution ← [] % create an empty solution (failure).

while(open list is not empty)

currentNode ← minCost(open) % get the minimum cost node in the fringe.

solution ← isGoal(currentNode, TSPGraph) % check if the current node is the goal.

if(solution is not empty) then return solution

successors ← expandNode(TSPGraph, startNode, currentNode); % expand the current node.

open.remove(currentNode) % remove the current node form the open list (fringe).

open.add(successors) % add the successors (if there are) to the open list.

return failure % return failure (empty solution) if no solution.

function successors = expandNode(TSPGraph, startNode, currentNode) % returns successors of node

successors ← [] % create an empty successors list.

% For TSP problem (each node is visited only once), so the successors of currentNode are

% the unvisited nodes in the current path (path from the first node to currentNode).

unvisited ← getUnvisitedNodes(TSPGraph, currentNode)

% estimate cost for each successor

for all unvisited do:

% Estimate heuristic of successor h[n], which equal:

node.h_n ← cost of the MST of the unvisited subgraph +

minimum cost to connect MST to unvisited subgraph +

minimum cost to connect MST to start node;

% Calculate the distance from the start node to the successor g[n], which equal:

node.g_n ← distance from the stat node to the current node +

the distance between the current node and successor;

% Calculate the estimated cost from the start node to goal node throw the successor

node.cost ← node.h_n + node.g_n

node.parent ← currentNode

node.depth ← currentNode.depth+1

Add node to successors

return successors

Pseudo-code: (graph search)

function solution = solveTSP_Astar_graphsearch(TSPGraph) % returns a solution

firstNode ← TSPGraph.getFirstNode % Get the start node (randomly).

open ← [] % create an empty fringe list.

closed ← [] % create an empty closed list.

startNode ← initializeStartNode(TSPGraph, firstNode) % initialize the start node.

open.add(startNode) % add the start node to the open list (fringe list).

solution ← Astar (TSPGraph, startNode, open) % call A* algorithm to solve TSP.

return solution % if there is no solution (failure), the solution will be empty.

function solution = Astar (TSPGraph, startNode, open) % returns a solution (path and cost)

solution ← [] % create an empty solution (failure).

while(open list is not empty)

currentNode ← minCost(open) % get the minimum cost node in the fringe.

Remove the current Node from open list

add the current Node to the closed list

if the current node is the goal node then return solution(currentNode, TSPGraph)

successors ← expandNode(TSPGraph, startNode, currentNode); % expand the current node.

% add/update successors in the open and closed lists based on the node cost and depth

--------------------------------------------------------------------------------------

For each node s in successors

if [s is in closed and (s.cost <= s_inClosed.cost)] or

[s is in closed and (s.cost > s_inClosed.cost) and (s.depth>s_inClosed.depth)]

then move s from closed to open

else if [s is not already in open] then add s to open

else if (s.cost<s_inOpen.cost) and (s.depth>s_inOpen.depth) then remove s from open

else if [s.cost <= s_inOpen.cost] or

[(s.cost > s_inOpen.cost) and (s.depth > s_inOpen.depth)]

then add s to open

--------------------------------------------------------------------------------------

return failure % return failure (empty solution) if no solution.

function successors = expandNode(TSPGraph, startNode, currentNode) % returns successors of node

….

% Same as the ‘expandNode’ function mentioned before

….

Experiments results

Result of running A* (tree search) on att48.xml dataset

After around 5 hours, A* algorithm reached depth 24 with fringe size = 570138. From the results, it is clear that the fringe is growing exponentially and the time required to reach the next depth is growing exponentially.

Result of running A* (tree search) on a280.xml dataset

After around 3 hours, A* algorithm reached depth 81 with fringe size = 858565. From the results, it is clear that the fringe is growing exponentially and the time required to reach the next depth is growing exponentially.

Result of running A* (graph search) on dataset att48.xml,

In less than 3 minutes, A* algorithm reached the optimal solution, with path cost equals to 10628 ^[2], starting from a random initial node named ‘2’. The complete solution tour is given below:

'2' '26' '4' '35' '45' '10' '24' '42' '5' '48' '39' '32' '21' '47' '20' '33' '46' '36' '30' '43' '17' '27' '19' '37' '6' '28' '7' '18' '44' '31' '38' '8' '1' 9' '40' '15' '12' '11' '13' '25' '14' '23' '3' '22' '16' '41' '34' '29' '2'

In order to obtain this optimal solution, A* spend 161 second and stored a maximum of 589 nodes in memory. Employing graph search improved the process of A* algorithm in both time and space. However, by using different initial states, A* does not always give the optimal solution. In fact, among the different 48 initial nodes on dataset ‘att48’, A* was able to provide the optimal solution for 8 initial nodes, ‘2’, ‘8’, ‘19’, ‘29’, ‘31’, ‘38’, ‘41’, and ‘44’. For the other nodes, it gives very near optimal solutions, which range between 10648 to 10795.

Result of running A* (graph search) on dataset a280.xml,

From the results, for a280 dataset, A* with graph search showed enhancement to the performance compared with A* with tree search. For the same a280 dataset, A* tree search reached depth 81 with fringe size = 858565 after 3.1 hours, while A* graph search reached the same depth with fringe size = 5993 after 0.4 hours. A* graph search enhanced the space and make the usage of memory bounded instead of exponential growing in the A* with tree search. However, the time complexity started to grow exponentially after reaching depth 86, which may cause improvement in time is ineffective for the problem with a large number of nodes.

Conclusion

A* search algorithm has exponential complexity problems in computation time and memory usage. The previous experiments showed that fringe size and the time required to reach the next depth grows exponentially. A* search was not able to reach a solution and after running for a long time it run out of memory. Therefore, A∗ is not practical for many problems including TSP.

Implementing a graph search for A* improves the performance of A* in time and space. However, we need to deal carefully with the open and closed list to guarantee an optimal solution. In addition, the time complexity could grow exponentially after reaching a certain level of depth as shown in previous experiments.

Matlab code

Matlab code: (tree search)

% read the TSP graph from xml file and convert it to matlab graph object

[G, Gnodes] = loadTSPGraph("att48_.xml");

%[G, Gnodes] = loadTSPGraph("data2.xml");

%[G, Gnodes] = loadTSPGraph("burma14_.xml");

%[G, Gnodes] = loadTSPGraph("a280_.xml");

solutions =[];

time = [];

tic

Max_depth=0;

no_expanded_nodes = 0;

for idx=1 : size(Gnodes,1) % select a node to be as start node

% initialize the startt node

strnode = initializeStartNode(G, Gnodes{idx});

solution = RBFS(G, strnode, strnode, inf, 0, 0);

if (~isempty(solution))

solution.cost

solution.path

solutions = [solutions; solution];

end

function [solution, f_limit, Max_depth] = RBFS(G, strnode, currentNode, f_limit, Max_depth, no_expanded_nodes)

% display the state of searching

if( Max_depth < currentNode.depth)

Max_depth = currentNode.depth;

disp( ['Node: ', currentNode.name, ...

' , Depth: ', num2str(currentNode.depth), ...

' , Max-depth: ', num2str(Max_depth), ...

' , current_nodes: ', num2str(no_expanded_nodes), ...

' , Elapsed time is ', num2str(toc) ,' seconds.'] );

end

% check if the node is the goal.

% if we don't reach the goal, solution will be empty

solution = isGoal(currentNode, G);

% if we reach the goal (solution is not empty), return the solution

if (~isempty(solution))

return

end

% expand the node

nodelist = expandNode(G, strnode.name, currentNode);

no_expanded_nodes = no_expanded_nodes+size(nodelist,1);

% if there are no successors (nodelist is empty), return empty solution

if (isempty(nodelist))

f_limit = inf;

return

end

for i=1:size(nodelist,1)

suc = nodelist(i);

suc.cost = max( suc.g_n+suc.h_n, currentNode.cost);

end

while(true)

[minCost, minidx] = min(cell2mat({nodelist.cost}));

best = nodelist(minidx);

if (best.cost>f_limit)

solution = [];

f_limit = best.cost;

return

end

% calculate the second minimum cost in all successors

if (size(nodelist,1) == 1)

% if only one successor, the second minimum cost = f_limit

secMinCost = f_limit;

else

nodelist2 = nodelist;

nodelist2(minidx) = [];

secMinCost = min(cell2mat({nodelist2.cost}));

end

[solution, nodelist(minidx).cost, Max_depth] = RBFS(G, strnode, best, min(f_limit,secMinCost), Max_depth, no_expanded_nodes);

% if we reach the goal (solution is not empty), return the solution

if (~isempty(solution))

return

end

function node = initializeStartNode(G, startNode)

% remove the first node from the the graph and add it to the open list

% for the first node we have to

unvisitedGraph = rmnode(G, startNode);

h_n = estimateDist(G, unvisitedGraph, startNode, startNode);

node.name = startNode;

node.h_n = h_n;

node.g_n = 0;

node.depth = 0;

node.cost = node.h_n + node.g_n - node.depth;

node.parent = [];

end

% estimateDist function

% estimate the distance from a current Node 'currentNode' to the end node 'startNode'

function cost = estimateDist(G, unvisitedGraph, startNode, currentNode)

% caculate Minimum spanning tree using Prim’s algorithm

T = minspantree(unvisitedGraph);

% estimated the distance to visit all unvisited nodes using MST heuristic

h_n_MST = sum(T.Edges.Weight);

unvisitedNs = table2cell(unvisitedGraph.Nodes);

% caculate the minimum distance from an unvisited node to the current node

idx_to_curr_node = findedge(G, currentNode, unvisitedNs);

mindist_to_curr_node = min(G.Edges.Weight(idx_to_curr_node));

% caculate the minimum distance from an unvisited node to the start node

idx_to_str_node = findedge(G, startNode, unvisitedNs);

mindist_to_str_node = min(G.Edges.Weight(idx_to_str_node));

cost = h_n_MST + mindist_to_str_node + mindist_to_curr_node;

end

% expandNode function

% returns all the successors of a node

function nodelist = expandNode(G, startNode, currentNode)

% For TSP problem (each node is visited only once)

% here we just get the nodes that are not already in the current node path

unvisitedGraph = G;

node1 = currentNode;

while(~isempty(node1))

unvisitedGraph = rmnode(unvisitedGraph, node1.name);

node1 = node1.parent;

end

% the expanded nodes

unvisitedNs = table2cell(unvisitedGraph.Nodes);

% the expanded nodes as structure (to be returned)

nodelist=[];

% estimate cost for the expanded nodes

for i=1:size(unvisitedNs,1)

expandedNode = unvisitedNs{i}; % expanded node name as a string (char array)

% if the node is leaf (the last node in the path), the heuristic is

% the cost of connectin this node to the start node

if (size(unvisitedNs,1)==1)

idx_to_str_node = findedge(G, startNode, expandedNode);

h_n = G.Edges.Weight(idx_to_str_node);

else

unvistg = rmnode(unvisitedGraph, expandedNode);

% h_n = cost of the MST of the subgraph of expandedNode +

% minimum cost to connecte MST to expandedNode +

% minimum cost to connecte MST to start node

h_n = estimateDist(G, unvistg, startNode, expandedNode);

end

% create the successor node as structure

node.name = expandedNode; % the successor node

node.h_n = h_n; % the heuristic of the successor node

% the cost (edge value) of connecting the current node with its

% successor (the expanded node)

g_n = G.Edges.Weight(findedge(G, node.name ,currentNode.name));

% the cost of connecting the current node to the first node

% (currentNode.g_n). node.g_n is the cost of the successor node to

% the start node

node.g_n = currentNode.g_n + g_n;

node.parent = currentNode;

node.depth = currentNode.depth+1;

% the estimated cost from the current from start node to the goal

% node throw the successor node

node.cost = node.h_n + node.g_n - node.depth;

nodelist = [nodelist; node];

end

% isGoal function

% Test if the node is the goal node (if the goal is satisfied)

function solution = isGoal(node, G)

path=[];

snode=[];

cost=node.cost;

while(~isempty(node))

path = [path; {node.name}];

G = rmnode(G, node.name);

snode = {node.name};

node = node.parent;

end

if (isempty(G.Nodes))

path = [snode; path];

solution.path=path;

solution.cost=cost;

else

solution = [];

end

function [G, Gnodes]= loadTSPGraph(fileName)

% read the TSP graph from xml file and convert it to matlab structure

tsp = xml2struct(fileName);

% convert graph structure to graph matrix

nodeNo = size(tsp.graph.vertex, 2);

graphMat = zeros(nodeNo, nodeNo);

names = zeros(nodeNo,1);

for v=1 : nodeNo

names(v) = v;

vertex = tsp.graph.vertex(v);

diag = 0;

for e=1 : nodeNo-1

edge = vertex{1}.edge(e);

cost = str2double(edge{1}.Attributes.cost);

if(v==e)

diag = 1;

end

graphMat(v, e+diag) = cost;

end

% convert graph matrix to matlab graph object

Gnodes = cellstr(string(names));

G = graph(graphMat, Gnodes);

end

Matlab code: (graph search)

% read the TSP graph from xml file and convert it to matlab graph object

[G, Gnodes] = loadTSPGraph("att48_.xml");

%[G, Gnodes] = loadTSPGraph("a280_.xml");

%[G, Gnodes] = loadTSPGraph("data2.xml");

%[G, Gnodes] = loadTSPGraph("burma14_.xml");

solutions =[];

time = [];

tic

for idx=1 : size(Gnodes,1) % select a node to be as start node

open=[];

closed=[];

% initialize the startt node

strnode = initializeStartNode(G, Gnodes{idx});

% add the start node to the open list ( finge list)

open = [open; strnode];

solution = []; % the goal strarting from node i

Max_depth=0;

no_expanded_nodes = 0;

while(~isempty(open))

no_expanded_nodes=no_expanded_nodes+1;

% get the minimum cost in the fringe

costs = cell2mat({open.cost});

depths = cell2mat({open.depth});

% get the nodes that have the minimum cost

minCNodes = find(costs==min(costs));

% if there are more than one node with the same cost, get deepest

% one

[val, index] = max(depths(minCNodes));

minidx = minCNodes(index);

currentNode = open(minidx);

% display the state of searching

if( Max_depth < max(cell2mat({open.depth})) )

Max_depth = max(cell2mat({open.depth}));

disp( ['Node: ', currentNode.name, ...

', Depth: ', num2str(currentNode.depth), ...

', Max-depth: ', num2str(max(cell2mat({open.depth}))), ...

', no_expanded_nodes: ', num2str(no_expanded_nodes), ...

', Fringe-size: ', num2str(size(open,1)), ...

' , Elapsed time is ', num2str(toc) ,' seconds.'] );

end

open(minidx)=[];

if(~ismember(currentNode.name, {open.name}))

closed = [closed; currentNode];

end

% check if the node is the goal.

solution = isGoal(currentNode, G);

if (~isempty(solution))

break;

end

% expand the node

nodelist = expandNode(G, strnode.name, currentNode);

% For graph search, we expand the node only if it was not expanded before

for i=1:size(nodelist, 1)

if (isempty(closed))

val2 = 0;

else

[val2, id2] = ismember(nodelist(i).name,{closed.name});

end

if(val2 && (nodelist(i).cost<=closed(id2).cost))

closed(id2)=[];

open = [open; nodelist(i)];

elseif(val2 && (nodelist(i).cost>closed(id2).cost))

if((nodelist(i).depth > closed(id2).depth))

closed(id2)=[];

open = [open; nodelist(i)];

end

else

id1 = find(ismember({open.name}, nodelist(i).name));

if (isempty(id1))

open = [open; nodelist(i)];

else

addnode = false;

removeidxs = [];

for s=1:size(id1,2)

if((nodelist(i).cost <= open(id1(s)).cost))

addnode = true;

if( (nodelist(i).cost < open(id1(s)).cost) && ...

(nodelist(i).depth >= open(id1(s)).depth))

removeidxs = [removeidxs; id1(s)];

end

elseif( (nodelist(i).cost > open(id1(s)).cost) && ...

(nodelist(i).depth > open(id1(s)).depth) )

addnode = true;

end

if(~isempty(removeidxs))

removeidxs = sort(removeidxs, 'descend');

end

for s=1:size(removeidxs,1)

open(removeidxs(s))=[];

end

if(addnode)

open = [open; nodelist(i)];

end

if (~isempty(solution))

solution.cost

solution.path(1)

solutions = [solutions; solution];

time = [time; toc];

end

sum(cell2mat({solutions.cost}))/size(solutions,1)

toc

function node = initializeStartNode(G, startNode)

% remove the first node from the the graph and add it to the open list

% for the first node we have to

unvisitedGraph = rmnode(G, startNode);

h_n = estimateDist(G, unvisitedGraph, startNode, startNode);

node.name = startNode;

node.h_n = h_n;

node.g_n = 0;

node.depth = 0;

node.cost = node.h_n + node.g_n;

node.parent = [];

end

% estimateDist function

% estimate the distance from a current Node 'currentNode' to the end node 'startNode'

function cost = estimateDist(G, unvisitedGraph, startNode, currentNode)

% caculate Minimum spanning tree using Prim’s algorithm

T = minspantree(unvisitedGraph);

% estimated the distance to visit all unvisited nodes using MST heuristic

h_n_MST = sum(T.Edges.Weight);

unvisitedNs = table2cell(unvisitedGraph.Nodes);

% caculate the minimum distance from an unvisited node to the current node

idx_to_curr_node = findedge(G, currentNode, unvisitedNs);

mindist_to_curr_node = min(G.Edges.Weight(idx_to_curr_node));

% caculate the minimum distance from an unvisited node to the start node

idx_to_str_node = findedge(G, startNode, unvisitedNs);

mindist_to_str_node = min(G.Edges.Weight(idx_to_str_node));

cost = h_n_MST + mindist_to_str_node + mindist_to_curr_node;

end

% expandNode function

% returns all the successors of a node

function nodelist = expandNode(G, startNode, currentNode)

% For TSP problem (each node is visited only once)

% here we just get the nodes that are not already in the current node path

unvisitedGraph = G;

node1 = currentNode;

while(~isempty(node1))

unvisitedGraph = rmnode(unvisitedGraph, node1.name);

node1 = node1.parent;

end

% the expanded nodes

unvisitedNs = table2cell(unvisitedGraph.Nodes);

% the expanded nodes as structure (to be returned)

nodelist=[];

% estimate cost for the expanded nodes

for i=1:size(unvisitedNs,1)

expandedNode = unvisitedNs{i}; % expanded node name as a string (char array)

% if the node is leaf (the last node in the path), the heuristic is

% the cost of connectin this node to the start node

if (size(unvisitedNs,1)==1)

idx_to_str_node = findedge(G, startNode, expandedNode);

h_n = G.Edges.Weight(idx_to_str_node);

else

unvistg = rmnode(unvisitedGraph, expandedNode);

% h_n = cost of the MST of the subgraph of expandedNode +

% minimum cost to connecte MST to expandedNode +

% minimum cost to connecte MST to start node

h_n = estimateDist(G, unvistg, startNode, expandedNode);

end

% create the successor node as structure

node.name = expandedNode; % the successor node

node.h_n = h_n; % the heuristic of the successor node

% the cost (edge value) of connecting the current node with its

% successor (the expanded node)

g_n = G.Edges.Weight(findedge(G, node.name ,currentNode.name));

% the cost of connecting the current node to the first node

% (currentNode.g_n). node.g_n is the cost of the successor node to

% the start node

node.g_n = currentNode.g_n + g_n;

% the estimated cost from the current from start node to the goal

% node throw the successor node

node.depth = currentNode.depth+1;

node.cost = node.h_n + node.g_n;

node.parent = currentNode;

nodelist = [nodelist; node];

end

% isGoal function

% Test if the node is the goal node (if the goal is satisfied)

function solution = isGoal(node, G)

path=[];

snode=[];

cost=node.cost;

while(~isempty(node))

path = [path; {node.name}];

G = rmnode(G, node.name);

snode = {node.name};

node = node.parent;

end

if (isempty(G.Nodes))

path = [snode; path];

solution.path=path;

solution.cost=cost;

else

solution = [];

end

function [G, Gnodes]= loadTSPGraph(fileName)

% read the TSP graph from xml file and convert it to matlab structure

tsp = xml2struct(fileName);

% convert graph structure to graph matrix

nodeNo = size(tsp.graph.vertex, 2);

graphMat = zeros(nodeNo, nodeNo);

names = zeros(nodeNo,1);

for v=1 : nodeNo

names(v) = v;

vertex = tsp.graph.vertex(v);

diag = 0;

for e=1 : nodeNo-1

edge = vertex{1}.edge(e);

cost = str2double(edge{1}.Attributes.cost);

if(v==e)

diag = 1;

end

graphMat(v, e+diag) = cost;

end

% convert graph matrix to matlab graph object

Gnodes = cellstr(string(names));

G = graph(graphMat, Gnodes);

end

[1] S. Rassel and P. Norwig, “Artificial intelligence: a modern approach (AIMA).” Moscow: Wiliams, 2007.

[2] “Symmetric TSPs.” [Online]. Available: http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/STSP.html. [Accessed: 08-Apr-2020].

Contact
For any questions, feel free to contact:Hamdi Altaheri: haltaheri@ksu.edu.sa