NAG Library Function Document
nag_mip_tsp_simann (h03bbc)
1 Purpose
nag_mip_tsp_simann (h03bbc) calculates an approximate solution to a symmetric travelling salesman problem using simulated annealing via a configuration free interface.
2 Specification
#include <nag.h> 
#include <nagh.h> 

3 Description
nag_mip_tsp_simann (h03bbc) provides a probabilistic strategy for the calculation of a near optimal path through a symmetric and fully connected distance matrix; that is, a matrix for which element $\left(i,j\right)$ is the pairwise distance (also called the cost, or weight) between nodes (cities) $i$ and $j$. This problem is better known as the Travelling Salesman Problem (TSP), and symmetric means that the distance to travel between two cities is independent of which is the destination city.
In the classical TSP, which this function addresses, a salesman wishes to visit a given set of cities once only by starting and finishing in a home city and travelling the minimum total distance possible. It is one of the most intensively studied problems in computational mathematics and, as a result, has developed some fairly sophisticated techniques for getting nearoptimal solutions for large numbers of cities. nag_mip_tsp_simann (h03bbc) adopts a very simple approach to try to find a reasonable solution, for moderately large problems. The function uses simulated annealing: a stochastic mechanical process in which the heating and controlled cooling of a material is used to optimally refine its molecular structure.
The material in the TSP is the distance matrix and a given state is represented by the order in which each city is visited—the path. This system can move from one state to a neighbouring state by selecting two cities on the current path at random and switching their places; the order of the cities in the path between the switched cities is then reversed. The cost of a state is the total cost of traversing its path; the resulting difference in cost between the current state and this new proposed state is called the delta; a negative delta indicates the proposal creates a more optimal path and a positive delta a less optimal path. The random selection of cities to switch uses random number generators (RNGs) from
Chapter g05; it is thus necessary to initialize a state array for the RNG of choice (by a call to
nag_rand_init_repeatable (g05kfc) or
nag_rand_init_nonrepeatable (g05kgc)) prior to calling nag_mip_tsp_simann (h03bbc).
The simulation itself is executed in two stages. In the first stage, a series of sample searches through the distance matrix is conducted where each proposed new state is accepted, regardless of the change in cost (delta) incurred by applying the switches, and statistics on the set of deltas are recorded. These metrics are updated after each such sample search; the number of these searches and the number of switches applied in each search is dependent on the number of cities. The final collated set of metrics for the deltas obtained by the first stage are used as control parameters for the second stage. If no single improvement in cost is found during the first stage, the algorithm is terminated.
In the second stage, as before, neighbouring states are proposed. If the resulting delta is negative or causes no change the proposal is accepted and the path updated; otherwise moves are accepted based on a probabilistic criterion, a modified version of the Metropolis–Hastings algorithm.
The acceptance of some positive deltas (increased cost) reduces the probability of a solution getting trapped at a nonoptimal solution where any single switch causes an increase in cost. Initially the acceptance criteria allow for relatively large positive deltas, but as the number of proposed changes increases, the criteria become more stringent, allowing fewer positive deltas of smaller size to be accepted; this process is, within the realm of the simulated annealing algorithm, referred to as ‘cooling’. Further exploration of the system is initially encouraged by accepting nonoptimal routes, but is increasingly discouraged as the process continues.
The second stage will terminate when:
 – a solution is obtained that is deemed acceptable (as defined by supplied values);
 – the algorithm will accept no further positive deltas and a set of proposed changes have resulted in no improvements (has cooled);
 – a number of consecutive sets of proposed changes has resulted in no improvement.
4 References
Applegate D L, Bixby R E, Chvátal V and Cook W J (2006) The Traveling Salesman Problem: A Computational Study Princeton University Press
Cook W J (2012) In Pursuit of the Traveling Salesman Princeton University Press
Johnson D S and McGeoch L A The traveling salesman problem: A case study in local optimization Local search in combinatorial optimization (1997) 215–310
Press W H, Teukolsky S A, Vetterling W T and Flannery B P (2007) Numerical Recipes The Art of Scientific Computing (3rd Edition)
Rego C, Gamboa D, Glover F and Osterman C (2011) Traveling salesman problem heuristics: leading methods, implementations and latest advances European Journal of Operational Research 211 (3) 427–441
Reinelt G (1994) The Travelling Salesman. Computational Solutions for TSP Applications, Volume 840 of Lecture Notes in Computer Science Springer–Verlag, Berlin Heidelberg New York
5 Arguments
 1:
$\mathbf{nc}$ – IntegerInput

On entry: the number of cities. In the trivial cases ${\mathbf{nc}}=1$, $2$ or $3$, the function returns the optimal solution immediately with ${\mathbf{tmode}}=0$ (provided the relevant distance matrix entries are not negative).
Constraint:
${\mathbf{nc}}\ge 1$.
 2:
$\mathbf{dm}\left[{\mathbf{nc}}\times {\mathbf{nc}}\right]$ – const doubleInput

Note: the $\left(i,j\right)$th element of the matrix is stored in ${\mathbf{dm}}\left[\left(j1\right)\times {\mathbf{nc}}+i1\right]$.
On entry: the distance matrix; each ${\mathbf{dm}}\left[\left(\mathit{j}1\right)\times {\mathbf{nc}}+\mathit{i}1\right]$ is the effective cost or weight between nodes $\mathit{i}$ and $\mathit{j}$. Only the strictly upper half of the matrix is referenced.
Constraint:
${\mathbf{dm}}\left[\left(\mathit{j}1\right)\times {\mathbf{nc}}+\mathit{i}1\right]\ge 0.0$, for $\mathit{j}=2,3,\dots ,{\mathbf{nc}}$ and $\mathit{i}=1,2,\dots ,\mathit{j}1$.
 3:
$\mathbf{bound}$ – doubleInput

On entry: a lower bound on the solution. If the optimum is unknown set
bound to zero or a negative value; the function will then calculate the minimum spanning tree for
dm and use this as a lower bound (returned in
${\mathbf{alg\_stats}}\left[5\right]$). If an optimal value for the cost is known then this should be used for the lower bound. A detailed discussion of relaxations for lower bounds, including the minimal spanning tree, can be found in
Reinelt (1994).
 4:
$\mathbf{targc}$ – doubleInput

On entry: a measure of how close an approximation needs to be to the lower bound. The function terminates when a cost is found less than or equal to
${\mathbf{bound}}+{\mathbf{targc}}$. This argument is useful when an optimal value for the cost is known and supplied in
bound. It may be sufficient to obtain a path that is close enough (in terms of cost) to the optimal path; this allows the algorithm to terminate at that point and avoid further computation in attempting to find a better path.
If ${\mathbf{targc}}<0$, ${\mathbf{targc}}=0$ is assumed.
 5:
$\mathbf{path}\left[{\mathbf{nc}}\right]$ – IntegerOutput

On exit: the best path discovered by the simulation. That is,
path contains the city indices in path order. If
${\mathbf{fail}}\mathbf{.}\mathbf{code}\ne 0$ on exit,
path contains the indices
$1$ to
nc.
 6:
$\mathbf{cost}$ – double *Output

On exit: the cost or weight of
path. If
${\mathbf{fail}}\mathbf{.}\mathbf{code}\ne 0$ on exit,
cost contains the largest model real number (see
nag_real_max_exponent (X02BLC)).
 7:
$\mathbf{tmode}$ – Integer *Output

On exit: the termination mode of the function (if
${\mathbf{fail}}\mathbf{.}\mathbf{code}\ne 0$ on exit,
tmode is set to
$1$):
 ${\mathbf{tmode}}=0$
 Optimal solution found, ${\mathbf{cost}}={\mathbf{bound}}$.
 ${\mathbf{tmode}}=1$
 System temperature cooled. The algorithm returns a path and associated cost that does not attain, nor lie within targc of, the bound. This could be a sufficiently good approximation to the optimal path, particularly when ${\mathbf{bound}}+{\mathbf{targc}}$ lies below the optimal cost.
 ${\mathbf{tmode}}=2$
 Halted by cost falling within the desired targc range of the bound.
 ${\mathbf{tmode}}=3$
 System stalled following lack of improvement.
 ${\mathbf{tmode}}=4$
 Initial search failed to find a single improvement (the solution could be optimal).
 8:
$\mathbf{alg\_stats}\left[6\right]$ – doubleOutput

On exit: an array of metrics collected during the initial search. These could be used as a basis for future optimization. If
${\mathbf{fail}}\mathbf{.}\mathbf{code}\ne 0$ on exit, the elements of
alg_stats are set to zero; the first five elements are also set to zero in the trival cases
${\mathbf{nc}}=1$,
$2$ or
$3$.
 ${\mathbf{alg\_stats}}\left[0\right]$
 Mean delta.
 ${\mathbf{alg\_stats}}\left[1\right]$
 Standard deviation of deltas.
 ${\mathbf{alg\_stats}}\left[2\right]$
 Cost at end of initial search phase.
 ${\mathbf{alg\_stats}}\left[3\right]$
 Best cost encountered during search phase.
 ${\mathbf{alg\_stats}}\left[4\right]$
 Initial system temperature. At the end of stage 1 of the algorithm, this is a function of the mean and variance of the deltas, and of the distance from best cost to the lower bound. It is a measure of the initial acceptance criteria for stage 2. The larger this value, the more iterations it will take to geometrically reduce it during stage 2 until the system is cooled (below a threshold value).
 ${\mathbf{alg\_stats}}\left[5\right]$
 The lower bound used, which will be that computed internally when ${\mathbf{bound}}\le 0$ on input. Subsequent calls with different random states can set bound to the value returned in ${\mathbf{alg\_stats}}\left[5\right]$ to avoid recomputation of the minimal spanning tree.
 9:
$\mathbf{state}\left[\mathit{dim}\right]$ – IntegerCommunication Array

Note: the dimension,
$\mathit{dim}$, of this array is dictated by the requirements of associated functions that must have been previously called. This array MUST be the same array passed as argument
state in the previous call to
nag_rand_init_repeatable (g05kfc) or
nag_rand_init_nonrepeatable (g05kgc).
On entry: a valid RNG state initialized by
nag_rand_init_repeatable (g05kfc) or
nag_rand_init_nonrepeatable (g05kgc). Since the algorithm used is stochastic, a random number generator is employed; if the generator is initialized to a nonrepeatable sequence (
nag_rand_init_nonrepeatable (g05kgc)) then different solution paths will be taken on successive runs, returning possibly different final approximate solutions.
On exit: contains updated information on the state of the generator.
 10:
$\mathbf{fail}$ – NagError *Input/Output

The NAG error argument (see
Section 3.6 in the Essential Introduction).
6 Error Indicators and Warnings
 NE_ALLOC_FAIL

Dynamic memory allocation failed.
See
Section 3.2.1.2 in the Essential Introduction for further information.
 NE_BAD_PARAM

On entry, argument $\u2329\mathit{\text{value}}\u232a$ had an illegal value.
 NE_INT

On entry, ${\mathbf{nc}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{nc}}\ge 1$.
 NE_INTERNAL_ERROR

An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact
NAG for assistance.
An unexpected error has been triggered by this function. Please contact
NAG.
See
Section 3.6.6 in the Essential Introduction for further information.
 NE_INVALID_STATE

On entry,
state vector has been corrupted or not initialized.
 NE_NO_LICENCE

Your licence key may have expired or may not have been installed correctly.
See
Section 3.6.5 in the Essential Introduction for further information.
 NE_REAL_ARRAY

On entry, the strictly upper triangle of
dm had a negative element.
7 Accuracy
The function will not perform well when the average change in cost caused by switching two cities is small relative to the cost; this can happen when many of the values in the distance matrix are relatively close to each other.
The quality of results from this function can vary quite markedly when different initial random states are used. It is therefore advisable to compute a number of approximations using different initial random states. The best cost and path can then be taken from the set of approximations obtained. If no change in results is obtained after $10$ such trials then it is unlikely that any further improvement can be made by this function.
8 Parallelism and Performance
Running many instances of the function in parallel with independent random number generator states can yield a set of possible solutions from which a best approximate solution may be chosen.
Memory is internally allocated for $3\times {\mathbf{nc}}2$ integers and ${\mathbf{nc}}1$ real values.
In the case of two cities that are not connected, a suitably large number should be used as the distance (cost) between them so as to deter solution paths which directly connect the two cities.
If a city is to be visited more than once (or more than twice for the home city) then the distance matrix should contain multiple entries for that city (on rows and columns ${i}_{1},{i}_{2},\dots $) with zero entries for distances to itself and identical distances to other cities.
10 Example
An approximation to the best path through $21$ cities in the United Kingdom and Ireland, beginning and ending in Oxford, is sought. A lower bound is calculated internally.
10.1 Program Text
Program Text (h03bbce.c)
10.2 Program Data
Program Data (h03bbce.d)
10.3 Program Results
Program Results (h03bbce.r)