Nabla

Basic usage

Public API

Nabla's public interface is very small: it consists of a single top-level class, ForwardModeDifferentiator.

This class is an implementation of the UnivariateFunctionDifferentiator interface from Apache Commons Math. It performs algorithmic differentiation by bytecode analysis and generation, using the exact differentiation rules in order to create functions that compute exact differentials.
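To make the phrase "exact differentiation rules" concrete, here is a minimal hand-written sketch of forward-mode differentiation using a first-order dual number. It is only an illustration: the Dual class and the function f are hypothetical names introduced here, and Nabla itself works by transforming bytecode rather than by requiring the user to write such a class.

```java
/**
 * Illustrative sketch only (not part of Nabla): a first-order dual number,
 * i.e. a value carried together with its derivative.
 */
final class Dual {
    final double value;
    final double derivative;

    Dual(double value, double derivative) {
        this.value = value;
        this.derivative = derivative;
    }

    /** Independent variable: dx/dx = 1. */
    static Dual variable(double x) {
        return new Dual(x, 1.0);
    }

    /** Sum rule: (u + v)' = u' + v'. */
    Dual add(Dual o) {
        return new Dual(value + o.value, derivative + o.derivative);
    }

    /** Product rule: (u v)' = u' v + u v'. */
    Dual multiply(Dual o) {
        return new Dual(value * o.value,
                        derivative * o.value + value * o.derivative);
    }

    /** Chain rule for sine: (sin u)' = cos(u) u'. */
    Dual sin() {
        return new Dual(Math.sin(value), Math.cos(value) * derivative);
    }

    /** Hypothetical test function f(x) = x sin(x) + x. */
    static Dual f(Dual x) {
        return x.multiply(x.sin()).add(x);
    }

    public static void main(String[] args) {
        Dual r = f(variable(2.0));
        System.out.println("f(2)  = " + r.value);
        // analytic derivative for comparison: sin(x) + x cos(x) + 1
        System.out.println("f'(2) = " + r.derivative);
    }
}
```

Each elementary operation propagates the derivative with its exact rule, so the result has no truncation error, unlike a finite differences estimate.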

Differentiating user-defined functions

In order to differentiate a function using Nabla, a user-defined function must be provided as an implementation of the Apache Commons Math UnivariateFunction interface. It is passed as the single parameter to the differentiate method of a ForwardModeDifferentiator instance. If the available user class does not already implement the UnivariateFunction interface, it has to be wrapped when provided to the differentiator. The differentiate method then returns an object that implements the Apache Commons Math UnivariateDifferentiableFunction interface, i.e. the new object is able to compute derivatives even though the original user object was not.

As an example, consider the following problem. We have a model variable which is an instance of a class with a method evaluate:

double evaluate(double first, double second)

We want to compute its partial derivatives with respect to the second parameter, when the first parameter value is 2.5 and the second parameter ranges from -1 to +1. Here is a way to do that:

        UnivariateDifferentiableFunction derivative =
            new ForwardModeDifferentiator().differentiate(new UnivariateFunction() {
                // freeze the first parameter at 2.5; t plays the role of the second one
                public double value(double t) {
                    return model.evaluate(2.5, t);
                }
            });

The derivative object we get can compute the desired derivatives:

          int params = 1;
          int order  = 2;
          double t   = 3.4;
          DerivativeStructure dsT = new DerivativeStructure(params, order, 0, t);
          DerivativeStructure dsV = derivative.value(dsT);
          System.out.println("first derivative at t = " + t +
                             " = " + dsV.getPartialDerivative(1));
          System.out.println("second derivative at t = " + t +
                             " = " + dsV.getPartialDerivative(2));

Derivative structures

The UnivariateDifferentiableFunction instance created by Nabla provides a method value which computes both the value of the primitive function (as the initial instance does) and the value of its derivatives with respect to its input parameter. This method has the following signature:

DerivativeStructure value(DerivativeStructure t)

It is important to note that this method does not take a double as its input parameter but a DerivativeStructure, which is a class provided by Apache Commons Math.

The DerivativeStructure specifies both the desired derivation order and the number of free parameters. This makes it possible to chain calls, for example to compute f(atan2(y, x)): although f is a univariate function, the expression really depends on the two variables x and y, and we may be interested in third-order cross derivatives such as d³f/dx²dy. In this case, the DerivativeStructure argument carries the information that there are really two variables and that all derivatives up to order 3 must be computed.

         int params = 2;
         int order  = 3;
         double x   = 1.2;
         double y   = 4.5;

         // we arbitrarily specify variable x is variable number 0
         DerivativeStructure dsX = new DerivativeStructure(params, order, 0, x);

         // we arbitrarily specify variable y is variable number 1
         DerivativeStructure dsY = new DerivativeStructure(params, order, 1, y);

         // evaluate atan2 value and all its derivatives
         DerivativeStructure atan2 = DerivativeStructure.atan2(dsY, dsX);

         // perform differentiation
         UnivariateDifferentiableFunction differentiated =
             new ForwardModeDifferentiator().differentiate(f);

         // evaluate f value and all its derivatives,
         // using the automatically generated "value(DerivativeStructure)" method
         DerivativeStructure fXY = differentiated.value(atan2);

         // display output
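         // derivation orders are given per variable: order 2 for x (index 0), order 1 for y (index 1)
         System.out.println("d3f/dx2dy = " + fXY.getPartialDerivative(2, 1));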
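
As a reminder of what the composition propagates, the first-order partial derivatives of f(atan2(y, x)) follow from the chain rule. The symbol θ is introduced here only for readability; higher orders such as d³f/dx²dy are obtained by applying the same rule repeatedly.

```latex
\theta(x, y) = \operatorname{atan2}(y, x), \qquad
\frac{\partial \theta}{\partial x} = -\frac{y}{x^2 + y^2}, \qquad
\frac{\partial \theta}{\partial y} = \frac{x}{x^2 + y^2}

\frac{\partial}{\partial x} f(\theta) = f'(\theta)\,\frac{\partial \theta}{\partial x}, \qquad
\frac{\partial}{\partial y} f(\theta) = f'(\theta)\,\frac{\partial \theta}{\partial y}
```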

Advanced use

Updating the base and differentiated objects

The differentiate method returns a new object each time it is called. We therefore end up with two different instances of two different classes that compute closely related things: the original instance and the newly created object. If the implementation of the value method uses some attribute of the original class, then the class of the newly created object must also have a way to get this value.

An important design choice in Nabla is that the newly created instance does not copy the state of the original object at differentiation time. Instead, it is permanently and tightly linked to the original instance and reads the values it needs from it when it needs them (even if they are stored in private attributes). A direct implication is that if the state of the original object changes after differentiation, all subsequent calls to the value method of the already created differentiated instance reflect these changes. There is no need to update the differentiated instance: it is always up to date.

As an example, consider again the problem above, where we wanted the derivative of a model with respect to its second parameter. Now we want to compute the same derivative as previously but we also want to be able to change the value of the first parameter, instead of sticking to the value 2.5. Here is a way to do this:

        public class SetableFirstParameterModel implements UnivariateFunction {

            private Model model;
            private double firstParameter;

            public SetableFirstParameterModel(Model model, double firstParameter) {
                this.model = model;
                this.firstParameter = firstParameter;
            }

            public void setFirstParameter(double firstParameter) {
                this.firstParameter = firstParameter;
            }

            public double value(double t) {
                return model.evaluate(firstParameter, t);
            }

        }

When we build the derivative of an instance of this class, this derivative will keep a reference to its primitive in order to access the firstParameter private field. If this field is changed on the primitive instance by calling the setFirstParameter method, the derivative will see the change immediately.

        SetableFirstParameterModel setable = new SetableFirstParameterModel(model, 2.5);
        UnivariateDifferentiableFunction derivative =
            new ForwardModeDifferentiator().differentiate(setable);
        DerivativeStructure t = new DerivativeStructure(1, 1, 0, 2.0);

        // derivative with respect to second parameter when first parameter equals 2.5
        double der25 = derivative.value(t).getPartialDerivative(1);

        // derivative with respect to second parameter when first parameter equals 3.0
        setable.setFirstParameter(3.0);
        double der30 = derivative.value(t).getPartialDerivative(1);
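
The same linking behavior can be sketched in plain Java without Nabla. The classes below are hypothetical and the derivative is written by hand; the only point illustrated is that the derivative object holds a reference to its primitive and reads its state on every call, instead of copying that state when it is built.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only (not Nabla code): a derivative linked to its primitive. */
final class LiveLinkDemo {

    /** Hypothetical primitive with mutable state: value(t) = a * t * t. */
    static final class Scaled implements DoubleUnaryOperator {
        private double a;

        Scaled(double a) {
            this.a = a;
        }

        void setA(double a) {
            this.a = a;
        }

        @Override
        public double applyAsDouble(double t) {
            return a * t * t;
        }
    }

    /** Hand-written derivative 2 * a * t that reads a from the primitive on every call. */
    static final class ScaledDerivative implements DoubleUnaryOperator {
        private final Scaled primitive; // a reference, not a copy of the state

        ScaledDerivative(Scaled primitive) {
            this.primitive = primitive;
        }

        @Override
        public double applyAsDouble(double t) {
            return 2.0 * primitive.a * t;
        }
    }

    public static void main(String[] args) {
        Scaled f = new Scaled(2.5);
        ScaledDerivative df = new ScaledDerivative(f);
        System.out.println(df.applyAsDouble(2.0)); // prints 10.0 (a = 2.5)
        f.setA(3.0);
        System.out.println(df.applyAsDouble(2.0)); // prints 12.0 (a = 3.0, no rebuild)
    }
}
```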

Functions calling native code

Since the algorithmic differentiator can analyze only bytecode, functions calling native code cannot be handled this way by Nabla. In this case, a fallback procedure is to rely on finite differences, using the FiniteDifferencesDifferentiator class from Apache Commons Math.

This class needs the number of points and a step size at construction time. For each call to the value method of the derivative instance, it calls the value method of the primitive instance n times, where n is the number of points. The evaluation points are regularly spaced around the location defined by the parameter.

The step size and the number of points must be chosen with care, as they influence both the accuracy of the result (which is only an approximation) and the computational cost. A smaller step size improves the theoretical accuracy, up to the point where numerical cancellation due to the finite precision of double numbers exceeds the theoretical error of the finite differences model. A larger number of points improves the accuracy but requires more function evaluations, which can become prohibitive. No single choice fits all needs: the right one is problem-dependent.
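
The step size trade-off can be observed with a small plain-Java experiment. This sketch does not use FiniteDifferencesDifferentiator; it hand-codes a two-point central difference on sin(x), chosen here only because its derivative cos(x) is known, and the class and method names are hypothetical.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only: effect of the step size on a central difference. */
final class StepSizeDemo {

    /** Two-point central difference: (f(x + h) - f(x - h)) / (2 h). */
    static double centralDifference(DoubleUnaryOperator f, double x, double h) {
        return (f.applyAsDouble(x + h) - f.applyAsDouble(x - h)) / (2.0 * h);
    }

    /** Absolute error of the estimate of d/dx sin(x) at x = 1 for step h. */
    static double error(double h) {
        return Math.abs(centralDifference(Math::sin, 1.0, h) - Math.cos(1.0));
    }

    public static void main(String[] args) {
        // large step: truncation error dominates
        // tiny step: cancellation in f(x + h) - f(x - h) dominates
        for (double h : new double[] {1.0e-1, 1.0e-5, 1.0e-13}) {
            System.out.println("h = " + h + "  error = " + error(h));
        }
    }
}
```

In this experiment the intermediate step gives a smaller error than both the large and the tiny step, which is the behavior described above.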
