Note: this document is essential reading for any prospective user of the Library.
1 Summary for All Users
All users, whether familiar or unfamiliar with this Library, who are thinking of using a routine from it are asked to follow these instructions:
(a) read the whole of this Essential Introduction;
(b) select an appropriate chapter or routine;
(c) read the relevant Chapter Introduction;
(d) choose a routine, and read the routine document. If the routine does not after all meet your needs, return to step (b);
(e) read the Users' Note for your implementation;
(f) consult local documentation, which should be provided by your local support staff, about access to the Library on your computing system;
(g) obtain an online copy of the example program for the particular routine of interest and experiment with it.
You should now be in a position to include a call to the routine in a program, and to attempt to compile and run it. You may of course need to refer back to the relevant documentation in the case of difficulties, for advice on assessment of results, and so on.
As you become familiar with the Library, some of steps (a) to (g) can be omitted, but it is always essential to:
- be familiar with this Essential Introduction;
- be familiar with the Chapter Introduction;
- read the routine document;
- be aware of the Users' Note for your implementation.
2 The Library and its Documentation
2.1 Structure of the Library
The NAG Library for SMP & Multicore is the same collection of routines as those available in the NAG Library, many of which have been specially tuned to maximize their performance on Symmetric Multiprocessor (SMP) machines. The document entitled ‘Introduction to the NAG Library for SMP & Multicore’ is essential reading for NAG Library for SMP & Multicore users.
The Library is divided into chapters, each devoted to a branch of numerical analysis or statistics. Each chapter has a three-character name and a title. Exceptionally, Chapters H and S have one-character names. The chapters and their names are based on the ACM modified SHARE classification index (see ACM (1960–1976)).
All documented routines in the Library have six-character names, beginning with the characters of the chapter name. Note that the second and third characters are digits, not letters; e.g., 0 is the digit zero, not the letter O. The last letter of each routine name almost always appears as ‘F’ in the documentation. Some chapters (for example, Chapter D03) have routines whose last letter is ‘A’ rather than ‘F’. An ‘A’ version is always paired with an ‘F’ routine, the ‘A’ version being safe to use in a multithreaded environment, but otherwise having identical functionality to the ‘F’ version.
Chapter F06 (Linear Algebra Support Routines) contains all the Basic Linear Algebra Subprograms, BLAS (Blackford et al. (2002)), with NAG-style names as well as with the actual BLAS names, e.g., F06PAF (DGEMV). The names in brackets are the equivalent double precision BLAS names. Chapter F16 contains some of the routines specified by the BLAS Technical Forum (The BLAS Technical Forum Standard (2001)) and also some additional routines for integer valued vectors that are not in the standard. Some of the routines in Chapter F16 have both NAG-style names and BLAS names. Chapter F07 (Linear Equations (LAPACK)) and Chapter F08 (Least Squares and Eigenvalue Problems (LAPACK)) contain routines derived from the LAPACK project (Anderson et al. (1999)); also, Chapter F01 (Matrix Operations, Including Inversion) contains storage conversion routines derived from the LAPACK project. Like the BLAS, these routines have NAG-style names as well as LAPACK names, e.g., F07ADF (DGETRF). Details regarding these alternate names can be found in the relevant Chapter Introductions.
In order to take full advantage of machine-specific versions of BLAS and LAPACK routines provided by some computer hardware vendors, you are encouraged to use the BLAS and LAPACK names (e.g., DGEMV) rather than the corresponding NAG-style names (e.g., F06PAF) wherever possible in your programs.
2.1.1 Long Names for Library Routines
Each documented routine has, in addition to its short six-character name, a long name beginning with the root nagf_ and consisting of an underscore separated list of words. The long-name naming scheme has been chosen so that the long names group like routines together and group routines within a suite together.
The long name for each routine in a chapter is listed in the respective Chapter Contents page. The second word in the long name is fixed for each chapter, e.g., routines in Chapter D01 all have long names that begin nagf_quad_.
Each chapter has a unique second word in its set of long names with the exception of Chapters F07 and F08, which share the same second word (lapack).
Note that the long names of BLAS and LAPACK routines, such as nagf_blas_dgemm, will not take advantage of machine-specific versions of BLAS and LAPACK. As mentioned in Section 2.1, you are recommended to use the plain BLAS or LAPACK name (in this case DGEMM) for performance reasons.
Routines that are marked for withdrawal have long names that have the third word withdraw. At subsequent marks of the library, any routine that becomes marked for withdrawal will have the third word withdraw inserted into its long name; the original long name will no longer be available for the given routine at that stage.
For those chapters that have both A and F versions of a routine, the long name for the F version is the same as that of the A version, but with an additional last word (old – signifying that the F version predates the A version).
It should be noted that the long names are implemented by use of aliasing in the NAG Library interface block modules, and so long names are only accessible when calling the NAG Library from a Fortran program that USEs the nag_library module.
Please refer to Section 3.2.1 for advice on supplying alternative routine names and, possibly, simplified routine interfaces.
2.2 Structure of the Documentation
The NAG Library Manual is the principal documentation for both the NAG Library and the NAG Library for SMP & Multicore.
It has the same chapter structure as the Library: each chapter of routines in the Library has a corresponding chapter (of the same name) in the Manual. The chapters occur in alphanumeric order. General introductory documents appear at the beginning of the Manual.
Each chapter consists of the following documents:
A routine document has the same name as the routine which it describes. Within each chapter, routine documents occur in alphanumeric order. For those chapters that have both ‘A’ and ‘F’ versions of a routine, the routine descriptions are combined into one routine document.
Documentation is provided in the following formats:
- XHTML+MathML, a fully linked version of the manual using XHTML and MathML (recommended for browsing) and providing links to the PDF version of each document (recommended for printing).
- PDF, a full PDF manual browsed using the PDF bookmarks, or via HTML index files.
- Single file PDF, the manual as a single PDF file.
- Windows HTML help, Windows HTML help version as a single file.
Advice on viewing and navigating the formats available can be found in the document Online Documentation.
The most up-to-date version of the documentation is accessible via the NAG web site (see Section 5).
2.3 Implementations of the Library
The Library is available on many different computer systems. For each distinct system, an implementation of the Library is prepared by NAG, e.g., the Sun Solaris 64-bit implementation. The implementation is distributed to sites as a tested compiled library.
An implementation is usually specific to a range of machines (e.g., the SPARC systems); it may also be specific to a particular Fortran compiler, or compiler option (such as scalar or vector mode or thread safe).
Essentially the same facilities are provided in all implementations of the Library, but, because of differences in arithmetic behaviour and in the compilation system, routines cannot be expected to give identical results on different systems, especially for sensitive numerical problems.
The documentation supports all implementations of the Library, with the help of a few simple conventions, and a small amount of implementation-dependent information, which is published in a separate Users' Note for each implementation (see Section 4.4).
2.4 Library Identification
Periodically a new Mark of the NAG Library is released: new routines are added, corrections and/or improvements are made to existing routines, and occasionally routines are withdrawn if they have been superseded by improved routines.
You must know which implementation, which precision and which mark of the Library you are using or intend to use. To find out which implementation, precision and mark of the Library are available at your site, you can run a program which calls the NAG Library routine A00AAF. The program could be:
USE nag_library, ONLY: a00aaf
CALL a00aaf
END
Alternatively, the example program for A00AAF can be run using the nagexample scripts supplied with your implementation (see the Users' Note).
An example of the output is:
*** Start of NAG Library implementation details ***
Implementation title: Linux/NAG nagfor
Precision: FORTRAN double precision
Product Code: FLL6A23D9L
Mark: 23 (self-contained)
*** End of NAG Library implementation details ***
2.5 Fortran Language Standards
All routines in the Library conform to the ISO Fortran 95 Standard (ISO (1997)).
3 Using the Library
3.1 General Advice
A NAG Library routine cannot be guaranteed to return meaningful results irrespective of the data supplied to it. Care and thought must be exercised in:
(a) formulating the problem;
(b) programming the use of Library routines;
(c) assessing the significance of the results.
The Foreword to the Manual provides some further discussion of points (a) and (c); the remainder of Section 3 is concerned with (b).
3.2 Programming Advice
The Library and its documentation are designed with the assumption that you will write a calling program in Fortran (although it may be called from other languages – see Section 3.10).
When programming a call to a routine, read the routine document carefully, especially the description of the Parameters. This states clearly which parameters must have values assigned to them on entry to the routine, and which return useful values on exit. See Section 4.3 for further guidance.
The most common types of programming error in using the Library are:
- incorrect parameters in a call to a Library routine;
- calling the Library from a single precision program.
The USE of the nag_library MODULE will help detect or prevent some of these errors. For example, when using this, incorrect parameter types will be caught at compile time and using KIND=nag_wp in the type of real and complex variables will maintain consistency with the Library.
Therefore if a call to a Library routine results in an unexpected error message from the system (or possibly from within the Library), check the following:
- Has the same actual array argument been passed to different dummy arguments with different INTENTs (i.e., does an array appear more than once in the argument list)?
- Have all array parameters been dimensioned correctly?
Avoid the use of NAG-type names for your own program units or COMMON blocks: in general, do not use names which contain a three-character NAG chapter name embedded in them; they may clash with the names of an auxiliary routine or COMMON block used by the NAG Library.
3.2.1 Alternative Routine Names
If the Library is called from a Fortran program then it is possible to use alternative names for user-callable routines. This can be done via the ‘USE nag_library’ statement at the start of the (sub)program in which the Library routine is called. For example, suppose you wish to use the name BesselJ0 instead of the Library name S17AEF. In this case the line
USE nag_library, ONLY: s17aef
would be replaced by
USE nag_library, ONLY: BesselJ0 => s17aef
The (sub)program would then use the name BesselJ0 in place of S17AEF and call it with the identical interface.
If Library routines are called from other environments then many such environments offer ways of ‘aliasing’ a routine name by a preferred alternative name.
For many of the Library routines with more complex interfaces it is likely that only a subset of the functionality is required and that some parameter values will always remain unchanged or will not be referenced. In such cases it may be preferable to write your own wrapper to the Library routine with a much simpler interface and with a preferred alternative name. For example, if you wish to integrate a system of stiff ordinary differential equations without root finding or intermediate output, you could create the following simple interface wrapper to the more complicated D02EJF:
SUBROUTINE BDFsolve(fcn,xend,y)
  USE nag_library, ONLY: nag_wp, d02ejf, d02ejw, d02ejx, d02ejy
  ! fcn evaluates the derivatives; y holds the solution, advanced to xend
  EXTERNAL fcn
  REAL(kind=nag_wp) :: xend, y(:)
  ! Local variables
  REAL(kind=nag_wp) :: tol, xstart
  INTEGER :: ifail, iw, n
  CHARACTER :: relabs
  REAL(kind=nag_wp), ALLOCATABLE :: w(:)
  n = SIZE(y)
  tol = 1.0e-3_nag_wp
  relabs = 'M'
  ! Workspace required by D02EJF
  iw = (12+n)*n + 50
  ALLOCATE (w(iw))
  ! Hard fail: stop with a message if D02EJF detects an error
  ifail = 0
  xstart = 0.0_nag_wp
  ! d02ejy, d02ejx and d02ejw are the NAG dummy routines for the unused options
  CALL d02ejf(xstart,xend,n,y,fcn,d02ejy,tol,relabs,d02ejx, &
              d02ejw,w,iw,ifail)
  DEALLOCATE (w)
END SUBROUTINE BDFsolve
The above example of a user-defined wrapper would be compiled and linked with a main program that would include the simple call:
CALL BDFsolve(fcn,xend,y)
3.2.2 The NAG Fortran Environment
The environment for the NAG Library is defined by the nag_library MODULE. Certain routines require you to USE this to access definitions of NAG-defined TYPEs or named constants. It is recommended that you also USE the MODULE to enable checking of INTERFACEs in the Library.
The exact location of nag_library.mod is installation dependent; please see the Users' Note for your implementation.
3.3 Error Handling and the Parameter IFAIL
3.3.1 Errors, Failure and Warning Conditions
The error, failure or warning conditions considered here are those that can be detected by explicit coding in a Library routine. Such conditions must be anticipated by the author of the routine. They should not be confused with run-time errors detected by the compilation system, e.g., detection of overflow or failure to assign an initial value to a variable.
In the rest of this document we use the word ‘error’ to cover all types of error, failure or warning conditions detected by the routine. They fall roughly into three classes.
(i) On entry to the routine the value of a parameter is out of range. This means that it is not useful, or perhaps even meaningful, to begin computation.
(ii) During computation the routine decides that it cannot yield the desired results, and indicates a failure condition. For example, a matrix inversion routine will indicate a failure condition if it considers that the matrix is singular and so cannot be inverted.
(iii) Although the routine completes the computation and returns results, it cannot guarantee that the results are completely reliable; it therefore returns a warning. For example, an optimization routine may return a warning if it cannot guarantee that it has found a local minimum.
All three classes of errors are handled in the same way by the Library.
Each error which can be detected by a Library routine is associated with a number. Some numbers, such as those associated with a failure in dynamic memory allocation (see Section 3.6) or in detecting a valid licence (Section 3.7), are the same for all Library routines and may not be listed in individual routine documents. All other numbers, with explanations of the errors, are listed in Section 6 (Error Indicators and Warnings) in the routine document. Unless the document specifically states to the contrary, you should not assume that the routine necessarily tests for the occurrence of the errors in their order of error number, i.e., the detection of an error does not imply that other errors have or have not been detected.
3.3.2 The IFAIL Parameter
Most of the NAG Library routines which can be called directly by you have a parameter called IFAIL. This parameter is concerned with the NAG Library error trapping mechanism (and, for some routines, with controlling the output of error messages and advisory messages).
IFAIL has two purposes:
(i) to allow you to specify what action the Library routine should take if an error is detected;
(ii) to inform you of the outcome of the call of the routine.
For purpose (i), you must assign a value to IFAIL before the call to the Library routine. Since IFAIL is reset by the routine for purpose (ii), the parameter must be the name of a variable, not a literal or constant.
The value assigned to IFAIL before entry should be either 0 (hard fail option), or 1 or -1 (soft fail option). If after completing its computation the routine has not detected an error, IFAIL is reset to 0 to indicate a successful call. Control returns to the calling program in the normal way. If the routine does detect an error, its action depends on whether the hard or soft fail option was chosen.
3.3.3 Hard Fail Option
If you set IFAIL to 0 before calling the Library routine, execution of the program will terminate if the routine detects an error. Before the program is stopped, this error message is output:
** ABNORMAL EXIT from NAG Library routine XXXXXX: IFAIL = n
** NAG hard failure - execution terminated
where XXXXXX is the routine name, and n is the number associated with the detected error. An explanation of error number n is given in Section 6 of the routine document XXXXXX.
In addition, most routines output explanatory error messages immediately before the standard termination message shown above.
The hard fail option should be selected if you are in any doubt about continuing the execution of the program after an unsuccessful call to a NAG Library routine. For environments where it might be inappropriate to halt program execution when an error is detected it is recommended that the hard fail option is not used.
3.3.4 Soft Fail Option
To select this option, you must set IFAIL to 1 or -1 before calling the Library routine.
If the routine detects an error, IFAIL is reset to the associated error number; further computation within the routine is suspended and control returns to the calling program.
If you set IFAIL to 1, then no error message is output (silent exit). If the output of error messages is undesirable, then silent exit is recommended.
If you set IFAIL to -1 (noisy exit), then before control is returned to the calling program, the following error message is output:
** ABNORMAL EXIT from NAG Library routine XXXXXX: IFAIL = n
** NAG soft failure - control returned
In addition, most routines output explanatory error messages immediately before the above standard message.
It is most important to test the value of IFAIL on exit if the soft fail option is selected. A nonzero exit value of IFAIL implies that the call was not successful so it is imperative that your program be coded to take appropriate action. That action may simply be to print IFAIL with an explanatory caption and then terminate the program. Many of the example programs in Section 9 of the routine documents have IFAIL-exit tests of this form. In the more ambitious case, where you wish your program to continue, it is essential that the program can branch to a point at which it is sensible to resume computation.
The soft fail option puts the onus on you to handle any errors detected by the Library routine. With the proviso that you are able to implement it properly, it is clearly more flexible than the hard fail option since it allows computation to continue in the case of errors. In particular there are at least two cases where its flexibility is useful:
(a) where additional information about the error or the progress of computation is returned via some of the other parameters;
(b) in some routines, ‘partial’ success can be achieved, e.g., a probable solution found but not all conditions fully satisfied, so the routine returns a warning. On the basis of the advice in Section 6 and elsewhere in the routine document, you may decide that this partially successful call is adequate for certain purposes.
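The soft fail pattern described in this section can be sketched as follows. This is a minimal illustration, not taken from a routine document: it assumes that the nag_library module is available and uses S17AEF (the routine named in Section 3.2.1) as the example, called as a function of X and IFAIL; the value of x and the program name are arbitrary, and the calling sequence should be checked against the routine document.

```fortran
PROGRAM softfail_sketch
  ! Assumes the NAG Library and its nag_library module are installed.
  USE nag_library, ONLY: nag_wp, s17aef
  IMPLICIT NONE
  REAL (KIND=nag_wp) :: x, y
  INTEGER :: ifail

  x = 0.5_nag_wp            ! arbitrary illustrative argument
  ifail = -1                ! soft fail, noisy exit; use 1 for silent exit

  ! IFAIL must be a variable because the routine resets it on exit.
  y = s17aef(x, ifail)

  IF (ifail /= 0) THEN
    ! The call was not successful: report IFAIL and stop.
    WRITE (*,*) 'S17AEF returned IFAIL = ', ifail
    STOP 1
  END IF

  WRITE (*,*) 'Result = ', y
END PROGRAM softfail_sketch
```

A program that needs to continue after an error would replace the STOP with a branch to a point at which it is sensible to resume computation.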
3.3.5 Historical Note
The error handling mechanism described above was introduced into the NAG Library at Mark 12. It supersedes the earlier mechanism which for most routines allowed IFAIL to be set by you to 0 or 1 only. The new mechanism is compatible with the old except that the details of the messages output on hard failure have changed. The new mechanism also allows you to set IFAIL to -1 (soft failure, noisy exit).
A few routines (introduced mainly at Marks 7 and 8) use IFAIL in a different way to control the output of error messages, and also of advisory messages (see Chapter X04). In those routines IFAIL is regarded as a decimal integer whose least significant digits have the following significance:
- one digit selects hard failure or soft failure;
- another digit selects silent exit or noisy exit.
Details are given in the documents of the relevant routines; for those routines this alternative use of IFAIL remains valid.
3.4 Input/output in the Library
Most NAG Library routines perform no output to an external file, except possibly to output an error message. All error messages are written to a logical error message unit. This unit number (which is set by default to 6 in most implementations) can be changed by calling the Library routine X04AAF.
Some NAG Library routines may optionally output their final results, or intermediate results to monitor the course of computation. In general, output other than error messages is written to a logical advisory message unit. This unit number (which is also set by default to 6 in most implementations) can be changed by calling the Library routine X04ABF. Although it is logically distinct from the error message unit, in practice the two unit numbers may be the same. A few routines in Chapter E04 allow this unit number to be specified directly as an option.
All output from the Library is appropriately formatted.
There are only a few Library routines which perform input from an external file. Examples occur in Chapter E04, among others. The unit number of the external file is a parameter to the routine, and all input is formatted.
You must ensure that the relevant Fortran unit numbers are associated with the desired external files, either by an OPEN statement in your calling program, or by operating system commands.
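As an illustration of redirecting Library messages, the following sketch sends error and advisory messages to separate files. It assumes that X04AAF and X04ABF each take a flag and a unit number, with a flag of 1 meaning "set the unit number"; that calling sequence, the unit numbers 11 and 12 and the file names are assumptions made for this example and should be checked against the Chapter X04 routine documents.

```fortran
PROGRAM redirect_messages
  ! Assumes the NAG Library and its nag_library module are installed.
  USE nag_library, ONLY: x04aaf, x04abf
  IMPLICIT NONE
  INTEGER, PARAMETER :: iset = 1   ! assumed: flag value 1 sets the unit
  INTEGER :: nerr, nadv

  nerr = 11                        ! hypothetical unit for error messages
  nadv = 12                        ! hypothetical unit for advisory messages

  ! Associate the unit numbers with external files before they are used.
  OPEN (UNIT=nerr, FILE='nag_errors.log', STATUS='REPLACE', ACTION='WRITE')
  OPEN (UNIT=nadv, FILE='nag_advisory.log', STATUS='REPLACE', ACTION='WRITE')

  CALL x04aaf(iset, nerr)          ! error message unit
  CALL x04abf(iset, nadv)          ! advisory message unit

  ! ... calls to other Library routines go here ...

  CLOSE (nerr)
  CLOSE (nadv)
END PROGRAM redirect_messages
```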
3.5 Auxiliary Routines as External Procedure Parameters
In addition to those Library routines which are documented and are intended to be called by you directly, the Library also contains many auxiliary routines.
In general, you need not be concerned with them at all, although you may be made aware of their existence if, for example, you examine a memory map of an executable program which calls NAG routines. The only exception is that when calling some NAG Library routines you may be required or allowed to supply the name of an auxiliary routine from the NAG Library as an external procedure parameter. The routine documents give the necessary details. In such cases, you only need to supply the name of the routine; you never need to know details of its parameter list.
NAG auxiliary routines have names which are similar to the name of the documented routine(s) to which they are related, but with last letter ‘Z’, ‘Y’, and so on, e.g.,
- D01BAZ is an auxiliary routine called by D01BAF.
A few chapters contain auxiliary routines whose names are obtained by adding 50 to the second and third characters of the chapter name. For instance, Chapter E04 has an auxiliary routine with the name E54NFU which is normally used as the actual argument for the QPHESS parameter of E04NFA; the corresponding name to be used with E04NFF is E04NFU.
3.6 Dynamic Memory Allocation
Some NAG Library routines perform dynamic memory allocation to simplify their interfaces.
Where possible, the amount of memory allocated by a routine will be given in the routine document (usually as a function of routine parameters).
All memory allocated by NAG routines is deallocated before exit.
In the case where a routine detects a failure to dynamically allocate sufficient memory, the routine will set an error condition, by setting IFAIL = -999, and exit with an appropriate error message.
3.7 Licence Management
If your implementation is licence managed then your local site will have details on how the licence management is implemented; please contact your site installer for details. To determine whether a valid licence is available on your machine run the example program for A00ACF.
Should a valid licence not be found when calling licence managed routines from the Library then the routine will set an error condition, by setting IFAIL = -399, and exit with an appropriate error message. The appropriate environment variables should then be checked (e.g., NAG_KUSARI_FILE) to make sure they point to the licence file containing a valid licence, and the licence file should be checked for any obvious errors (e.g., the licence refers to a different implementation). If everything appears to be correct then please contact NAG (see Section 5).
3.8 Thread Safety
Some implementations of the Library facilitate the use of threads; that is, you can call routines from the Library from within a multithreaded application. See the Thread Safety document for more detailed guidance on using the Library in a multithreaded context. You may also need to refer to the Users' Note for details of whether your implementation of the Library has been compiled in a manner that facilitates the use of threads.
Note that in some implementations, the Library is linked with one or more vendor libraries to provide, for example, efficient BLAS routines. NAG cannot guarantee that any such vendor library is thread safe.
3.9 Performance on SMP systems
The introduction of multicore processors and the availability of more affordable multisocket systems mean that SMP systems are increasingly common. Users of the NAG Fortran Library on these systems may benefit directly from any SMP parallelism present in the underlying vendor library (e.g., in BLAS and LAPACK routines), and indirectly via any Library routine which internally uses these parallelised vendor routines. To benefit from this, you should set the appropriate environment variable (usually OMP_NUM_THREADS) to the desired number of threads. Generally this should not be more than the number of available (idle) cores on your system. You should consult the relevant vendor library documentation for further information and contact NAG (see Section 5 for details) for advice if required.
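Before experimenting with thread counts it can help to confirm what the OpenMP run-time actually sees. The short program below is a sketch that uses only the standard OpenMP module omp_lib and makes no NAG calls; it must be compiled with OpenMP enabled (for example, the -fopenmp option of gfortran). Whether a particular vendor library honours OMP_NUM_THREADS is vendor specific and is not shown here.

```fortran
PROGRAM show_threads
  ! Uses only the standard OpenMP Fortran module; no NAG routines are called.
  USE omp_lib, ONLY: omp_get_max_threads, omp_get_num_procs
  IMPLICIT NONE

  ! Number of processors the OpenMP run-time reports as available.
  WRITE (*,'(A,I0)') 'Processors available: ', omp_get_num_procs()

  ! Upper bound on the threads a parallel region would use; this reflects
  ! OMP_NUM_THREADS when that environment variable is set.
  WRITE (*,'(A,I0)') 'Maximum threads:      ', omp_get_max_threads()
END PROGRAM show_threads
```

Running it with different settings of OMP_NUM_THREADS shows whether the setting has taken effect in the environment from which your own program will be launched.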
The NAG Library for SMP & Multicore is designed to give further benefit on SMP systems. Key Library routines have been explicitly parallelised using OpenMP to offer enhanced performance over a wider range of routines compared with the NAG Fortran Library, which relies solely on the parallelism in vendor libraries. Further information is given in the Introduction to the NAG Library for SMP & Multicore.
Note that the performance increase achieved, if any, will vary depending upon which routine is called, problem sizes and other parameters, system design and operating system configuration. If you frequently call a routine with similar data sizes and other parameters, it may be worthwhile to experiment with different numbers of threads, to determine the choice that gives optimal performance.
3.10 Calling the Library from Other Languages
In general the NAG Library can be called from other computer languages (such as C and Visual Basic) provided that appropriate mappings exist between their data types.
NAG has produced C Header Files, a package comprising a set of header files that indicate the match between C and Fortran data types for various compilers, together with documentation and examples. The documentation, examples and C Header Files are available from the NAG Web sites (see Section 5).
The Dynamic Link Library (DLL) implementation can be called in a straightforward manner from a number of languages and environments, e.g., Visual Basic, Visual Basic for Applications (Excel), Delphi, C and C++. Guidance on this is provided as part of NAG Library DLLs. Further details can be found on the NAG Web sites.
3.11 Arithmetic Considerations and Reproducibility of Results
The results obtained when calling a NAG Library routine depend not only on the algorithm used to solve the problem, but also on the compiler used to build the library, compiler run-time libraries, and also the arithmetic properties of the machine on which the code is run.
Historically, different kinds of computer hardware tended to have different kinds of arithmetic. Some machines would store floating point numbers using a base 16 significand and exponent system, others would use base 2, and some even used base 8 or 10. Such differences caused major headaches for software library providers because code that worked well on one arithmetic system might not behave in exactly the same way on another. This meant that great care had to be taken to make the library code portable.
In addition, it was not unheard of for machine arithmetic to have flaws or errors where basic operations such as multiplication or division could sometimes give incorrect results, especially on numbers that were in some way ‘extreme’, such as being very large or small.
After the first of the IEEE standards for floating point arithmetic (ANSI/IEEE (1985)) was introduced in the 1980s, the situation improved greatly. Nowadays most significant hardware, and certainly most hardware that NAG libraries run on, will use IEEE-style base 2 arithmetic. This makes production of portable code easier, but there are still problems, partly due to the latitude allowed by the IEEE standards. For example, hardware which uses extra-precise 80-bit internal registers for arithmetic, as originally introduced in the Intel 8087 coprocessor in the 1980s, behaves slightly differently from hardware that uses 64-bit registers, particularly if a compiler generates optimized code which holds arithmetic subexpressions in the extra-precise registers.
Since for performance reasons computer arithmetic is generally finite precision (as is certainly the case for IEEE standard arithmetic) most of the numerical methods implemented by NAG Library routines can only return an approximation to the true solution, simply due to accumulation of rounding errors.
It should therefore be clear that running a program which calls a NAG Library routine with the same data on two different machines can give different results, due to compiler, hardware and run-time library considerations. Usually these differences are small – it may be that a result computed on one machine differs only in the last few significant bits from the same result computed on another machine – for example, when solving a well-conditioned set of linear equations on two different machines. Occasionally small differences may be magnified, for example if a conditional test depends on an imprecise result. A routine that searches for a minimum of an optimization problem may converge to a different local minimum, but in general, so long as the routine's documentation doesn't claim that the same local minimum will always be obtained, this should be acceptable. Even if an algorithm converges to the same local minimum, arithmetic differences may mean that a different number of iterations is taken to get there.
Modern hardware and optimizing compilers have introduced further scope for arithmetic quirks. An example is in the use of Streaming SIMD Extension (SSE) instructions. These low-level machine instructions allow hardware to operate on more than one number in parallel, if your compiler is smart enough to generate and use them correctly, or if you hand-code your own assembly language routines.
SSE instructions enable low-level parallelism of floating point arithmetic operations. For example, a 128-bit SSE register can hold two 64-bit double precision (or four 32-bit single precision) numbers at the same time, and operate on them all simultaneously. This can lead to big time savings when working on large amounts of data.
But this may come at a price. Efficient use of SSE instructions can sometimes depend on exactly how the memory used to store data is aligned. Some SSE instructions for moving data to and from memory need memory to be aligned on a 16-byte boundary. If it happens that the memory (for example, a pointer to an array of numbers) that a NAG routine uses is not aligned nicely, then it may not be possible to use those SSE instructions.
An optimizing compiler might well generate two instruction streams, one for when it detects that memory is aligned, and one for when it is not.
An example should serve to make things clearer. Suppose we wish to compute the inner product of two vectors, X and Y, each of length N. The inner product (or dot product) of two vectors is computed by multiplying together corresponding elements of the two vectors, and summing the individual products to get the result. A routine compiled by a good optimizing compiler would load numbers two or four at a time, multiply them together two or four at a time, and accumulate the results into the final result.
But if the memory is not nicely aligned – and it may well not be – the compiler needs to generate a different code path to deal with the situation. Here the result will take longer to get because the products must be computed and accumulated one at a time. At run-time, the code checks whether it can take the fast path or not, and works appropriately.
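The problem is that by altering the order of the accumulations, we are quite possibly changing the final result, simply due to rounding differences when working with finite precision computer arithmetic. Instead of getting the inner product
$$(\cdots((x_1 y_1 + x_2 y_2) + x_3 y_3) + \cdots) + x_N y_N$$
we may get
$$(x_1 y_1 + x_3 y_3 + \cdots) + (x_2 y_2 + x_4 y_4 + \cdots).$$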
It is likely that the result will be just as accurate either way – neither result will be precise due to finite arithmetic – but they may differ by a tiny amount. And if that tiny difference leads to a different decision made by the code that called the inner product routine, the difference may be magnified.
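The effect of the accumulation order can be reproduced without any SSE instructions at all. The following is a minimal sketch, not Library code: the module and function names are hypothetical, the second function only imitates a two-lane vectorized loop, and the data are chosen deliberately to exaggerate the effect well beyond the last few bits.

```fortran
module dot_orders
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Accumulate the products one at a time, from left to right.
  function dot_sequential(x, y) result(s)
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: s
    integer :: i
    s = 0.0_dp
    do i = 1, size(x)
      s = s + x(i)*y(i)
    end do
  end function dot_sequential

  ! Imitate a two-lane SIMD loop: odd- and even-indexed products are
  ! accumulated separately and the two partial sums combined at the end.
  function dot_two_lane(x, y) result(s)
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: s, lane(2)
    integer :: i, n2
    n2 = size(x) - mod(size(x), 2)
    lane = 0.0_dp
    do i = 1, n2, 2
      lane(1) = lane(1) + x(i)*y(i)
      lane(2) = lane(2) + x(i+1)*y(i+1)
    end do
    s = lane(1) + lane(2)
    ! Leftover element when the length is odd.
    if (n2 < size(x)) s = s + x(size(x))*y(size(x))
  end function dot_two_lane
end module dot_orders

program dot_order_demo
  use dot_orders
  implicit none
  real(dp) :: x(4), y(4)
  ! Illustrative data with severe cancellation.
  x = [1.0e16_dp, 1.0_dp, -1.0e16_dp, 1.0_dp]
  y = 1.0_dp
  print *, 'one at a time:', dot_sequential(x, y)
  print *, 'two at a time:', dot_two_lane(x, y)
end program dot_order_demo
```

Both functions perform the same multiplications and differ only in the order of the additions, so any difference between the two printed values is due purely to rounding.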
Furthermore, it is possible that the same program running with bitwise identical data on the same machine may give different results when run twice in a row simply because, when the program is loaded, by chance some piece of memory may or may not be aligned on a particular boundary. Such non-deterministic results can be frustrating if the user of the program depends on always getting identical results for the same data.
On even newer hardware, AVX instructions use 256-bit registers, and can therefore operate on more numbers at a time. For AVX instructions, memory may need to be 32-byte aligned.
Some memory used by NAG Library routines is allocated inside the NAG Library. In order to minimize differences due to effects like that described above, we can try to make sure the memory is always aligned nicely – for example, by use of more controllable memory allocation routines where available – but that is not always possible since it partly depends on the support of the compiler.
Of course, no Library routine has control over memory that you allocate yourself and then pass to the routine. If you do observe non-deterministic results which you suspect are due to memory considerations, and you are unable to accept this variation, then you are advised to make sure that any memory you allocate is aligned nicely; unfortunately, precisely how you do this is dependent on your system, but you may be able to get advice through NAG's usual support channels.
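Before changing how memory is allocated, it may help to check how an array is actually aligned at run time. The sketch below is one possible way to do so, assuming a compiler that supports the Fortran 2003 ISO_C_BINDING module; the module and function names are hypothetical and are not part of the Library.

```fortran
module alignment_check
  use, intrinsic :: iso_c_binding, only: c_ptr, c_intptr_t
  implicit none
contains
  ! Number of bytes by which address p lies beyond the previous multiple
  ! of boundary; zero means the address is aligned on that boundary.
  function alignment_offset(p, boundary) result(offset)
    type(c_ptr), intent(in) :: p
    integer, intent(in) :: boundary
    integer :: offset
    integer(c_intptr_t) :: addr
    ! Reinterpret the C address as an integer of matching size.
    addr = transfer(p, 0_c_intptr_t)
    offset = int(modulo(addr, int(boundary, c_intptr_t)))
  end function alignment_offset
end module alignment_check

program alignment_demo
  use, intrinsic :: iso_c_binding, only: c_loc, c_double
  use alignment_check
  implicit none
  real(c_double), allocatable, target :: x(:)

  allocate (x(1000))
  print *, 'offset from 16-byte boundary (SSE):', alignment_offset(c_loc(x), 16)
  print *, 'offset from 32-byte boundary (AVX):', alignment_offset(c_loc(x), 32)
end program alignment_demo
```

An offset of zero indicates that the array starts on the boundary in question. The values printed may legitimately change from one run to the next, which is the behaviour described above.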
4 Using the Documentation
4.1 Using the Manual
The Manual is designed to serve the following functions for both the NAG Library and the NAG Library for SMP & Multicore:
- to give background information about different areas of numerical and statistical computation;
- to advise on the choice of the most suitable NAG Library routine or routines to solve a particular problem;
- to give all the information needed to call a NAG Library routine correctly from a Fortran program, and to assess the results.
At the beginning of the Manual are some general introductory documents which provide some background and additional information.
A small number of documents are specific to the NAG Library or to the NAG Library for SMP & Multicore; you only need to read those specific to the library you are using. All other general introductory documents are relevant to both libraries.
The document entitled ‘Introduction to the NAG Library for SMP & Multicore’ is essential reading for NAG Library for SMP & Multicore users.
The documents entitled ‘Mark 23 NAG Fortran Library News’ and ‘Mark 23 NAG Library for SMP & Multicore News’ give details of new routines added, details of routines scheduled for withdrawal and details of routines withdrawn at this mark.
The ‘Mark 23 NAG Library for SMP & Multicore News’ also provides details of routines which have been tuned or enhanced at this mark; a full list of such routines is available in the document ‘Tuned and Enhanced Routines in the NAG Library for SMP & Multicore’.
The document entitled ‘Routines Withdrawn or Scheduled for Withdrawal’ provides full details of all routines withdrawn from the NAG Library and the document entitled ‘Advice on Replacement Calls for Withdrawn/Superseded Routines’ provides advice on how to modify your program.
The online documentation provides you with a fully linked HTML Keyword Index (a keyword index to routines) and GAMS Classification Index (a list of NAG routines classified according to the GAMS scheme).
Having found a likely chapter or routine, you should read the corresponding Chapter Introduction, which gives background information about that area of numerical computation, and recommendations on the choice of a routine, including indexes, tables or decision trees.
When you have chosen a routine, you must consult the routine document. Each routine document is essentially self-contained (it may, however, contain references to related documents). It includes a description of the method, detailed specifications of each parameter, explanations of each error exit, remarks on accuracy, and (in most cases) an example program to illustrate the use of the routine.
4.2 Structure of Routine Documents
All routine documents have the same structure consisting of nine numbered sections:
||Parameters (see Section 4.3 below)
||Error Indicators and Warnings
||Example (see Section 4.5 below)
In a few documents (notably Chapters E04) there are a further three sections:
||Description of Monitoring Information
4.3 Specification of Parameters
Section 5 of each routine document contains the specification of the parameters, in the order of their appearance in the parameter list.
4.3.1 Classification of parameters
Parameters are classified as follows.
Input: you must assign values to these parameters on or before entry to the routine, and these values are unchanged on exit from the routine.
Output: you need not assign values to these parameters before entry to the routine; the routine may assign values to them.
Input/Output: you must assign values to these parameters before entry to the routine, and the routine may then change these values.
Workspace: array parameters which are used as workspace by the routine. You must supply arrays of the correct type and dimension. In general, you need not be concerned with their contents.
Communication Array: array parameters which are used to communicate data from one routine call to another.
External Procedure: a routine which must be supplied (e.g., to evaluate an integrand or to print intermediate output). Usually it must be supplied as part of your calling program, in which case its specification includes full details of its parameter list and specifications of its parameters (all enclosed in a box). Its parameters are classified in the same way as those of the Library routine, but because you must write the procedure rather than call it, the significance of the classification is different.
- Input: values may be supplied on entry, which your procedure must not change.
- Output: you may or must assign values to these parameters before exit from your procedure.
- Input/Output: values may be supplied on entry, and you may or must assign values to them before exit from your procedure.
Occasionally, as mentioned in Section 3.5, the procedure can be supplied from the NAG Library, and then you only need to know its name.
User Workspace: array parameters which are passed by the Library routine to an external procedure parameter. They are not used by the routine, but you may use them to pass information between your calling program and the external procedure.
Dummy: a simple variable which is not used by the routine. A variable or constant of the correct type must be supplied, but its value need not be set. (A dummy parameter is usually a parameter which was required by an earlier version of the routine and is retained in the parameter list for compatibility.)
4.3.2 Constraints and suggested values
The word ‘Constraint:’ or ‘Constraints:’ in the specification of an Input parameter introduces a statement of the range of valid values for that parameter, e.g.,
- Constraint: $\mathrm{N} > 0$.
If the routine is called with an invalid value for the parameter (e.g., $\mathrm{N} = 0$), the routine will usually take an error exit, returning a nonzero value of IFAIL (see Section 3.3).
Constraints on parameters of type CHARACTER only list upper case alphabetic characters, e.g.,
- Constraint: STRING = 'A' or 'B'.
In practice, all routines with CHARACTER parameters will permit the use of lower case characters.
The phrase ‘Suggested Values:’ introduces a suggestion for a reasonable initial setting for an Input parameter (e.g., accuracy or maximum number of iterations) in case you are unsure what value to use; you should be prepared to use a different setting if the suggested value turns out to be unsuitable for your problem.
4.3.3 Array parameters
Most array parameters have dimensions which depend on the size of the problem. In Fortran terminology they have ‘adjustable dimensions’: the dimensions occurring in their declarations are integer variables which are also parameters of the Library routine.
For example, a Library routine might have the specification:
SUBROUTINE <name> (M, N, A, B, LDB)
INTEGER M, N, A(N), B(LDB,N), LDB
For a one-dimensional array parameter, such as A in this example, the specification would begin
- A(N) – INTEGER array
You must ensure that the dimension of the array, as declared in your calling (sub)program, is at least as large as the value you supply for N. It may be larger, but the routine uses only the first N elements.
For a two-dimensional array parameter, such as B in the example, the specification might be
- B(LDB,N) – INTEGER array
- On entry: the $m$ by $n$ matrix $B$.
and the parameter LDB might be described as follows:
- LDB – INTEGER
- On entry: the first dimension of the array B as declared in the (sub)program from which <name> is called.
- Constraint: $\mathrm{LDB} \geq \mathrm{M}$.
You must supply the first dimension of the array B, as declared in your calling (sub)program, through the parameter LDB, even though the number of rows actually used by the routine is determined by the parameter M. You must ensure that the first dimension of the array is at least as large as the value you supply for M. The extra parameter LDB is needed to allow the routine to act on subarrays of a larger two-dimensional array, e.g., factorizing a diagonal submatrix of a larger matrix.
You must also ensure that the second dimension of the array, as declared in your calling (sub)program, is at least as large as the value you supply for N. It may be larger, but the routine uses only the first N columns.
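The role of LDB when a routine acts on a subarray can be sketched as follows. FILL_BLOCK is a hypothetical stand-in written for this illustration, not a Library routine; it is called with the element at which the submatrix starts, together with the declared first dimension of the enclosing array.

```fortran
module block_demo
  implicit none
contains
  ! Hypothetical stand-in: sets B(i,j) = 10*i + j in the leading M by N
  ! part of B, whose declared first dimension is LDB.
  subroutine fill_block(m, n, b, ldb)
    integer, intent(in) :: m, n, ldb
    integer, intent(inout) :: b(ldb, n)
    integer :: i, j
    do j = 1, n
      do i = 1, m
        b(i, j) = 10*i + j
      end do
    end do
  end subroutine fill_block
end module block_demo

program subarray_call
  use block_demo
  implicit none
  integer :: big(6, 6), i

  big = 0
  ! Act on the 2 by 3 submatrix of BIG whose top left element is
  ! BIG(3,2). LDB is 6, the declared first dimension of BIG, so that
  ! consecutive columns of the submatrix are located correctly.
  call fill_block(2, 3, big(3, 2), 6)
  do i = 1, 6
    print '(6i4)', big(i, :)
  end do
end program subarray_call
```

Only rows 3 to 4 and columns 2 to 4 of BIG are modified. Had M been passed in place of LDB, the routine would have stepped from column to column with the wrong stride.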
A program to call the hypothetical routine used as an example in this section might include the statements:
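INTEGER AA(100), BB(100,50)
LDB = 100
M = 80
N = 20
CALL <name>(M,N,AA,BB,LDB)
or, using allocatable arrays:
INTEGER, ALLOCATABLE :: AA(:), BB(:,:)
INTEGER :: M, N, LDB
READ(5,*) M, N
LDB = M
ALLOCATE (AA(N),BB(LDB,N))
CALL <name>(M,N,AA,BB,LDB)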
Many NAG routines contain array parameters declared with the ‘assumed size’ array dimension, and would be given as
INTEGER A(*), B(LDB,*)
However, the original declaration of an array in your calling program must always have dimensions greater than or equal to the minimum values documented. The advantage of using allocatable arrays is that they can be dynamically allocated to be of the correct size when that size is not known at compile time.
Consult an expert or a textbook on Fortran if you have difficulty in calling NAG routines with array parameters.
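As a complete illustration of these conventions, the sketch below declares a stand-in routine with assumed-size array parameters and calls it with allocatable arrays. COLUMN_SUMS and its behaviour are assumptions made purely for this example; only the shape of the argument list (M, N, A, B, LDB) follows the hypothetical routine used in this section.

```fortran
module demo_routines
  implicit none
contains
  ! Hypothetical stand-in for a Library routine with the argument list
  ! (M, N, A, B, LDB): A(j) receives the sum of the first M entries of
  ! column j of B, for j = 1, ..., N.
  subroutine column_sums(m, n, a, b, ldb)
    integer, intent(in) :: m, n, ldb
    integer, intent(out) :: a(*)
    integer, intent(in) :: b(ldb, *)
    integer :: j
    do j = 1, n
      a(j) = sum(b(1:m, j))
    end do
  end subroutine column_sums
end module demo_routines

program call_with_allocatables
  use demo_routines
  implicit none
  integer, allocatable :: aa(:), bb(:, :)
  integer :: m, n, ldb

  m = 3
  n = 4
  ldb = m + 2        ! the declared first dimension may exceed M
  allocate (aa(n), bb(ldb, n))
  bb = 1
  call column_sums(m, n, aa, bb, ldb)
  print *, aa
end program call_with_allocatables
```

Because BB is filled with ones and only its first M rows are used, each element of AA is set to M; the two extra rows of BB are ignored by the routine.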
4.4 Implementation-dependent Information
In order to support all implementations of the Library, the Manual has adopted a convention of using bold italics to distinguish terms which have different interpretations in different implementations.
One bold italicised term is machine precision, which denotes the relative precision to which real floating point numbers are stored in the computer, e.g., in an implementation with approximately 16 decimal digits of precision, machine precision has a value of approximately $10^{-16}$.
The precise value of machine precision is given by the routine X02AJF. Other routines in Chapter X02 return the values of other implementation-dependent constants, such as the overflow threshold, or the largest representable integer. Refer to the X02 Chapter Introduction for more details.
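For a quick look at comparable quantities without calling the Library, the standard Fortran inquiry intrinsics can be used, as in the sketch below. This is an illustration only: the intrinsics follow the Fortran standard's own definitions, so the values need not be numerically identical to those returned by the Chapter X02 routines, and the Users' Note remains the authoritative source for your implementation.

```fortran
program machine_constants
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Spacing between 1.0 and the next larger representable number.
  print *, 'epsilon   =', epsilon(1.0_dp)
  ! Approximate number of decimal digits of precision.
  print *, 'precision =', precision(1.0_dp)
  ! Largest representable real number (cf. the overflow threshold).
  print *, 'huge real =', huge(1.0_dp)
  ! Largest representable default integer.
  print *, 'huge int  =', huge(1)
end program machine_constants
```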
The bold italicised term block size is used only in Chapters F07. It denotes the block size used by block algorithms in these chapters. You only need to be aware of its value when it affects the amount of workspace to be supplied – see the parameters WORK and LWORK of the relevant routine documents and the Chapter Introduction.
In Chapters F01, alternate routine names are available for BLAS and LAPACK derived routines. For details of the alternate routine names please refer to the relevant Chapter Introduction.
For each implementation of the Library, a separate Users' Note is published. This is a short document, revised at each mark. At most installations it is available in machine-readable form. It gives any necessary additional information which applies specifically to that implementation, in particular:
- the values returned by Chapter X02 routines;
- the default unit numbers for output (see Section 3.4);
- the meanings of the precision parameters nag_rp (reduced precision), nag_wp (basic precision) and nag_hp (additional precision).
4.5 Example Programs and Results
The example program in Section 9 of most routine documents illustrates a simple call of the routine. The programs are designed so that they can be fairly easily modified, and so serve as the basis for a simple program to solve your problem.
For each implementation of the Library, NAG distributes the example programs in machine-readable form, with all necessary modifications already applied. Many sites make the programs accessible to you in this form. Generic forms of the programs, without implementation-specific modifications, may be obtained directly from the NAG web site. The Users' Note for your implementation will mention any special changes which need to be made to the example programs.
Note that the results obtained from running the example programs may not be identical in all implementations, and may not agree exactly with the results in the Manual which were obtained from a double precision implementation (with approximately 16 digits of precision).
For many routine documents, a plot of the example program results is also provided. In some cases the example program has been modified slightly to produce larger sets of results to give a more representative plot of the solution profile produced.
5 Support from NAG
NAG Response Centres
The NAG Response Centres are available for general enquiries from all users and also for technical queries from sites that subscribe to the support service.
The Response Centres are open during office hours, but contact is possible by fax, email and telephone (answering machine) at all times. Please see the Users' Note or the NAG web site for contact details.
When contacting one of the NAG Response Centres, it helps us to deal with your query quickly if you can quote your NAG user reference and NAG product code.
NAG Web Site
The NAG web site is an information service providing items of interest to users and prospective users of NAG products and services. The information is regularly updated and reviewed, and includes implementation availability, descriptions of products, downloadable software, case studies, industry articles and technical reports. The NAG web site can be accessed via:
6 Background to NAG
Various aspects of the design and development of the NAG Library, and NAG's technical policies and organisation, are described in Ford (1982), Ford et al. (1979), Ford and Pool (1984), and Hague et al. (1982).
7 References
ACM (1960–1976) Collected algorithms from ACM index by subject to algorithms
Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J J, Du Croz J J, Greenbaum A, Hammarling S, McKenney A and Sorensen D (1999) LAPACK Users' Guide (3rd Edition) SIAM, Philadelphia http://www.netlib.org/lapack/lug
ANSI (1966) USA standard Fortran Publication X3.9 American National Standards Institute
ANSI (1978) American National Standard Fortran Publication X3.9 American National Standards Institute
ANSI/IEEE (1985) IEEE standard for binary floating-point arithmetic Std 754-1985 IEEE, New York
ANSI/IEEE POSIX (1995) POSIX Standard Thread Library ANSI/IEEE POSIX 1003.1c:1995
Basic Linear Algebra Subprograms Technical (BLAST) Forum (2001) Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard University of Tennessee, Knoxville, Tennessee http://www.netlib.org/blas/blast-forum/blas-report.pdf
Blackford L S, Demmel J, Dongarra J J, Duff I S, Hammarling S, Henry G, Heroux M, Kaufman L, Lumsdaine A, Petitet A, Pozo R, Remington K and Whaley R C (2002) An updated set of Basic Linear Algebra Subprograms (BLAS) ACM Trans. Math. Software 28
Ford B (1982) Transportable numerical software Lecture Notes in Computer Science 142
Ford B, Bentley J, Du Croz J J and Hague S J (1979) The NAG Library ‘machine’ Softw. Pract. Exper. 9 (1)
Ford B and Pool J C T (1984) The evolving NAG Library service Sources and Development of Mathematical Software (ed W Cowell) 375–397 Prentice–Hall
Hague S J, Nugent S M and Ford B (1982) Computer-based documentation for the NAG Library Lecture Notes in Computer Science 142
ISO (1997) ISO Fortran 95 programming language (ISO/IEC 1539–1:1997)
ISO/IEC (1990) Information technology – programming language C Current C Language Standard
Kernighan B W and Ritchie D M (1988) The C Programming Language (2nd Edition) Prentice–Hall
OpenMP The OpenMP Specification for Parallel Programming http://www.openmp.org
The BLAS Technical Forum Standard (2001) http://www.netlib.org/blas/blast-forum