
NAG Algorithmic Differentiation Services
Algorithmic Differentiation (AD) and Adjoint AD (AAD) are extremely powerful technologies. Applying them by hand to production-sized codes is a serious, lengthy undertaking that requires a team of specialists, and it makes code maintenance and updating more expensive and complex. For this reason, most organisations turn to AD tools to get sensitivities of their simulation codes. NAG are pioneers in AD technologies and help organisations apply AD to their computations. Blue-chip clients in finance are reaping the benefits of NAG's expertise in this field, and other industries can also benefit extensively from implementing NAG AD Solutions.
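To make the idea concrete, below is a minimal sketch of forward-mode AD (illustrative only; this is a toy, not NAG's dco tool, and the operator set is deliberately tiny). Each arithmetic operation propagates an exact derivative alongside the value via the chain rule, so there is no step size to tune and no truncation error, unlike finite differences.

```cpp
#include <cmath>
#include <cstdio>

// Minimal forward-mode AD: a "dual number" carries a value together with
// the derivative of that value with respect to one chosen input, updated
// exactly by the chain rule at every operation.
struct Dual {
    double v;  // value
    double d;  // derivative with respect to the seeded input
};

Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }
Dual operator+(Dual a, Dual b) { return {a.v + b.v, a.d + b.d}; }
Dual exp(Dual a) { double e = std::exp(a.v); return {e, e * a.d}; }

int main() {
    Dual x = {2.0, 1.0};      // seed: dx/dx = 1
    Dual y = {3.0, 0.0};      // y is treated as a constant
    Dual f = x * y + exp(x);  // f(x, y) = x*y + e^x
    // df/dx = y + e^x = 3 + e^2, exact to machine precision
    std::printf("f = %g, df/dx = %g\n", f.v, f.d);
}
```

Forward mode costs one such pass per input; the adjoint (reverse) mode discussed below instead obtains all input sensitivities from a single reverse sweep, which is what makes the case studies later in this piece feasible.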
Design, implementation and testing strategies for AD
Most businesses start exploring AD with a Proof of Concept (PoC) assigned to one, or sometimes two, 'in-house' developers. Often, however, these PoCs don't fare well because:
- the developer is new to (A)AD and has to learn a lot quickly, and usually on his/her own
- the developer must learn how to control memory use efficiently for production-size codes (our tools make this easy, but doing it effectively requires a good understanding of AD concepts and of the underlying code)
- typically, the PoCs are very time-limited: the developer must report back to the business by a given date, schedules are tight, and any hiccups or problems mean the schedules quickly slip
- the developer needs to learn how to use our AD tools: many people don't have the time to read through our documentation and example programs
- the developer often gets pulled onto other, more urgent business matters mid-way through the PoC, then has to come back later and try to remember where he/she was
- once the AD code is working, the developer then needs to explain the results: e.g., why are there zero derivatives with respect to some parameters when finite difference estimates are non-zero? Typically this is because the underlying code is not differentiable (see the sketch after this list): ideally this should be spotted right at the outset and addressed before AD is applied
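As a minimal illustration of that last point (hypothetical code, not from any client PoC): a branch makes this function piecewise constant, so AD correctly reports a zero derivative on either side of the branch point, while a finite difference that straddles the jump reports a large spurious value.

```cpp
#include <cstdio>

// A branch on the input makes the output piecewise constant in x:
// the derivative is 0 everywhere except at x = 1, where it is undefined.
double payoff(double x) {
    if (x >= 1.0) return 10.0;  // derivative 0 for x > 1
    return 0.0;                 // derivative 0 for x < 1
}

int main() {
    double x = 1.0, h = 1e-6;
    // Central finite difference straddling the jump: (10 - 0) / 2e-6 = 5e6
    double fd = (payoff(x + h) - payoff(x - h)) / (2.0 * h);
    std::printf("central FD estimate at the kink: %g\n", fd);
    std::printf("AD derivative on either branch:  0\n");
}
```

Neither answer is useful to the business until the non-differentiability itself is addressed, e.g. by smoothing the payoff, which is why we recommend checking for such constructs before applying AD.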
This is quite a daunting list. NAG's AD Proof of Concept Support service has grown through client demand: in a nutshell, NAG experts conduct the PoC with the client.
- The organisation takes one of our AD developers in-house, typically for a week.
- Our developers work with the organisation's own developers to get the entire PoC up and running as quickly as possible.
- It's a combination of coding help and on-the-job training: we answer questions, point out pitfalls, explain AD concepts, inspect the code for non-differentiabilities, point out where analytic adjoints are desirable, and generally offer advice on design, implementation and testing.
- The PoCs are completed much faster, developers learn faster because there's someone next to them to answer questions, and the resulting AD code is efficient and has predictable memory use.
AD Solutions Case Studies
Figure 2: MITgcm sensitivities of zonal ocean water flow through the Drake Passage to changes in bottom topography.
AD helps our understanding of climate change and improves weather predictions.
Figure 2 shows the sensitivity of the amount of water flowing through the Drake Passage to changes in the topography of the ocean floor. The simulation was performed with the AD-enabled MIT General Circulation Model (MITgcm) run on a supercomputer. The ocean was meshed with 64,800 grid points (Utke et al., 2008).
Obtaining the gradient through finite differences took a month and a half. The adjoint AD code obtained the gradient in less than 10 minutes.
The gradient information can be used to further refine climate prediction models and our understanding of global weather, for example the high sensitivity over the South Pacific Ridge and Indonesian Throughflows even though these are far away from the Drake Passage.
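A back-of-the-envelope cost model explains these numbers (the per-run time below is our own illustrative assumption, not a figure from the paper). One-sided finite differences need one perturbed model run per input, whereas a single adjoint sweep delivers the entire gradient at a small constant multiple of one run:

$$\mathrm{cost}_{\mathrm{FD}} \approx n \cdot \mathrm{cost}(f), \qquad \mathrm{cost}_{\mathrm{AAD}} \approx c \cdot \mathrm{cost}(f),$$

where c is a small constant independent of the number of inputs n (compare the AD/f column of Table 1 below). With n = 64,800 grid points, a model run of roughly one minute already gives 64,800 minutes, about 45 days, consistent with the month and a half quoted above, while the adjoint's cost does not grow with n.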
Utke J, Naumann U, Wunsch C, Hill C, Heimbach P, Fagan M, Tallent N and Strout M (2008). OpenAD/F: A modular, open-source tool for automatic differentiation of Fortran codes. ACM Trans. Math. Softw., 34(4), 18:1-18:36.
Figure 1: Sensitivity of the drag coefficient to the surface geometry of a car at high speed (left) and low speed (right).
AD enables sensitivity analyses of huge simulations, supporting shape optimization, intelligent design and comprehensive risk studies.
Figure 1 shows sensitivities of the drag coefficient to each point on a car's surface when it moves at high speed (left) and low speed (right). The simulation was performed with AD-enabled OpenFOAM built on top of the AD software tool dco. The surface mesh had 5.5 million cells and the gradient vector was 18GB.
The normal simulation on a top-end desktop took 44s, while the AD-enabled simulation took 273s. To obtain the same gradient information, on the same machine, by finite differences would take roughly 5 years.
The gradient information can now be used to optimize the shape of the car so as to reduce the drag.
Towara M and Naumann U (2013). A discrete adjoint model for OpenFOAM. Procedia Computer Science, Volume 18.
AD and AAD are used in finance to obtain sensitivities of complex instruments quickly, enabling real-time risk management and hedging of quantities like xVA.
Here we show some results from a paper (du Toit and Naumann, 2014) that studied two typical code patterns arising in finance: Monte Carlo and PDEs.
| n | f | cfd | AD (tape size) | AD/f | cfd/AD |
|---|---|---|---|---|---|
| **Monte Carlo** | | | | | |
| 34 | 0.5s | 29.0s | 3.0s (2.5MB) | 6.0x | 9.7x |
| 62 | 0.7s | 80.9s | 5.1s (3.2MB) | 7.3x | 15.9x |
| 142 | 1.5s | 423.5s | 12.4s (5.1MB) | 8.3x | 34.2x |
| 222 | 2.3s | 1010.7s | 24.4s (7.1MB) | 10.6x | 41.4x |
| **PDE** | | | | | |
| 34 | 0.6s | 37.7s | 11.6s (535MB) | 19.3x | 3.3x |
| 62 | 1.0s | 119.5s | 18.7s (919MB) | 18.7x | 6.4x |
| 142 | 2.6s | 741.2s | 39s (2GB) | 15.0x | 19x |
| 222 | 4.1s | 1857.3s | 60s (3GB) | 14.6x | 31x |
Table 1: Run times and memory requirements as a function of gradient size n for Monte Carlo and PDE applications.
Table 1 shows the runtimes of a first-order adjoint code using dco versus central finite differences on a typical finance application (option pricing under a local volatility model, 10,000 sample paths/spatial points and 360 time steps). The second column, f, is the normal runtime of the application; cfd is the runtime for central finite differences; and AD is the adjoint AD runtime, along with the additional memory required (tape size). Calculations were run on a laptop, so only the relative runtimes AD/f and cfd/AD are significant, the latter showing the speedup of AD over finite differences.
In finance such derivative information is often used for hedging and risk calculations, so these gradients must be computed many times per day.
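The tape sizes quoted in Table 1 refer to the memory an adjoint tool records during the forward pass so that derivatives can be propagated backwards. Below is a minimal, self-contained sketch of such a tape (illustrative only; dco's actual data structures and API differ). A single reverse sweep over the recorded operations yields the derivative of the output with respect to every input simultaneously, which is why the AD runtimes above grow so slowly with n while the cfd runtimes grow linearly.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// A minimal reverse-mode (adjoint) tape: each recorded node stores up to
// two parent indices and the local partial derivatives with respect to them.
struct Tape {
    struct Node { int p1, p2; double w1, w2; };
    std::vector<Node> nodes;
    int record(int p1, double w1, int p2 = -1, double w2 = 0.0) {
        nodes.push_back({p1, p2, w1, w2});
        return (int)nodes.size() - 1;
    }
};

static Tape tape;

struct Var {
    double v;  // value, computed in the forward pass
    int idx;   // position of this result on the tape
};

Var make_input(double v) { return {v, tape.record(-1, 0.0)}; }

Var operator*(Var a, Var b) { return {a.v * b.v, tape.record(a.idx, b.v, b.idx, a.v)}; }
Var operator+(Var a, Var b) { return {a.v + b.v, tape.record(a.idx, 1.0, b.idx, 1.0)}; }
Var sin(Var a) { return {std::sin(a.v), tape.record(a.idx, std::cos(a.v))}; }

// One reverse sweep over the tape propagates the output adjoint (seeded
// to 1) back to every input, giving the whole gradient in one pass.
std::vector<double> adjoints(Var out) {
    std::vector<double> adj(tape.nodes.size(), 0.0);
    adj[out.idx] = 1.0;
    for (int i = (int)tape.nodes.size() - 1; i >= 0; --i) {
        const Tape::Node& n = tape.nodes[i];
        if (n.p1 >= 0) adj[n.p1] += n.w1 * adj[i];
        if (n.p2 >= 0) adj[n.p2] += n.w2 * adj[i];
    }
    return adj;
}

int main() {
    Var x = make_input(1.5), y = make_input(0.5);
    Var f = sin(x * y) + x * y;  // f(x, y) = sin(xy) + xy
    std::vector<double> adj = adjoints(f);
    // df/dx = y*(cos(xy) + 1), df/dy = x*(cos(xy) + 1)
    std::printf("df/dx = %g, df/dy = %g\n", adj[x.idx], adj[y.idx]);
}
```

The tape grows with the number of operations executed, not with the number of inputs, which is where the memory figures in Table 1 come from and why controlling tape size (e.g. via checkpointing) matters for production codes.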
du Toit J and Naumann U (2014). Adjoint algorithmic differentiation tool support for typical numerical patterns in computational finance. NAG Technical Report TR3/14.