Nabla's public interface is very small: it consists of a single top-level class, ForwardModeDifferentiator.
This class is an implementation of the UnivariateFunctionDifferentiator interface from Apache Commons Math. It performs algorithmic differentiation by bytecode analysis and generation, using the exact differentiation rules in order to create functions that compute exact differentials.
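To give an intuition of what "exact differentiation rules" means, here is a minimal sketch of first-order forward-mode differentiation written by hand in plain Java. It is not how Nabla works internally (Nabla transforms bytecode, and the derivative types come from Apache Commons Math). The Dual and DualDemo classes are hypothetical names introduced only for this illustration.

```java
/** Hypothetical helper: a value carried together with its first derivative. */
final class Dual {
    final double value;
    final double derivative;

    Dual(double value, double derivative) {
        this.value = value;
        this.derivative = derivative;
    }

    /** The independent variable t, for which dt/dt = 1. */
    static Dual variable(double t) {
        return new Dual(t, 1.0);
    }

    /** A constant, whose derivative is 0. */
    static Dual constant(double c) {
        return new Dual(c, 0.0);
    }

    /** Sum rule: (u + v)' = u' + v'. */
    Dual add(Dual o) {
        return new Dual(value + o.value, derivative + o.derivative);
    }

    /** Product rule: (u v)' = u' v + u v'. */
    Dual multiply(Dual o) {
        return new Dual(value * o.value, derivative * o.value + value * o.derivative);
    }

    /** Chain rule for sine: (sin u)' = cos(u) u'. */
    Dual sin() {
        return new Dual(Math.sin(value), Math.cos(value) * derivative);
    }
}

final class DualDemo {

    /** f(t) = t * sin(t) + 3, written with Dual arithmetic. */
    static Dual f(Dual t) {
        return t.multiply(t.sin()).add(Dual.constant(3.0));
    }

    public static void main(String[] args) {
        Dual r = f(Dual.variable(2.0));
        System.out.println("f(2)  = " + r.value);
        System.out.println("f'(2) = " + r.derivative);
    }
}
```

Each elementary operation propagates the derivative with its exact rule, so the result carries no truncation error, only ordinary floating-point rounding.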
In order to differentiate a function using Nabla, a user-defined function must be
provided as an implementation of the Apache Commons Math UnivariateFunction interface.
It is passed as the single parameter to the differentiate method of a
ForwardModeDifferentiator instance. If the available user class does not already
implement the UnivariateFunction interface, it has to be wrapped when provided to the
differentiator. The differentiate method then returns an object that implements the
Apache Commons Math UnivariateDifferentiableFunction interface, i.e. the new object is
able to compute differentials even though the original user object was not able
to do so.
As an example, consider the following problem. We have a model variable
which is an instance of a class with a method evaluate:
double evaluate(double first, double second)
We want to compute its partial derivatives with respect to
the second parameter, when the first parameter value is 2.5
and the second parameter ranges from -1 to +1. Here is a way
to do that:
    // model must be final (or effectively final) to be captured by the anonymous class
    UnivariateDifferentiableFunction derivative =
        new ForwardModeDifferentiator().differentiate(new UnivariateFunction() {
            public double value(double t) {
                return model.evaluate(2.5, t);
            }
        });
The derivative
object we get can compute the desired derivatives:
    int params = 1;
    int order = 2;
    double t = 3.4;
    DerivativeStructure dsT = new DerivativeStructure(params, order, 0, t);
    DerivativeStructure dsV = derivative.value(dsT);
    System.out.println("first derivative at t = " + t + " = " + dsV.getPartialDerivative(1));
    System.out.println("second derivative at t = " + t + " = " + dsV.getPartialDerivative(2));
The UnivariateDifferentiableFunction instance created by Nabla provides a
method value
which computes both the value of the primitive
function (as the initial instance does) and the value of its derivatives
with respect to its input parameter. This method has the following signature:
DerivativeStructure value(DerivativeStructure t)
It is important to note this method does not use double
as its
input parameter but
DerivativeStructure, which is a class provided
by Apache Commons Math.
It is the DerivativeStructure argument which specifies the desired derivation
order as well as the number of free parameters. This makes it possible to chain
calls, as in computing f(atan2(y, x)): although f is a univariate function, the
expression really depends on two variables x and y, and we may be interested in
third order cross derivatives like d3f/dx2dy. In this case, the
DerivativeStructure argument will contain the information that there
are really two variables and that we want to compute all derivatives up to order 3.
    int params = 2;
    int order = 3;
    double x = 1.2;
    double y = 4.5;

    // we arbitrarily specify variable x is variable number 0
    DerivativeStructure dsX = new DerivativeStructure(params, order, 0, x);

    // we arbitrarily specify variable y is variable number 1
    DerivativeStructure dsY = new DerivativeStructure(params, order, 1, y);

    // evaluate atan2 value and all its derivatives
    DerivativeStructure atan2 = DerivativeStructure.atan2(dsY, dsX);

    // perform differentiation
    UnivariateDifferentiableFunction differentiated = new ForwardModeDifferentiator().differentiate(f);

    // evaluate f value and all its derivatives,
    // using the automatically generated "value(DerivativeStructure)" method
    DerivativeStructure fXY = differentiated.value(atan2);

    // display output: derivation order 2 for x (variable 0) and 1 for y (variable 1)
    System.out.println("d3f/dx2dy = " + fXY.getPartialDerivative(2, 1));
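To see why a univariate function can receive an argument that depends on several variables, here is a simplified sketch in plain Java, limited to first order. The Grad class is a hypothetical stand-in for DerivativeStructure (which supports arbitrary orders), and sin plays the role of f. Neither is part of Nabla or Apache Commons Math.

```java
/** Hypothetical first-order analogue of DerivativeStructure: a value and its gradient. */
final class Grad {
    final double value;
    final double[] partials; // partials[i] = d(value)/d(variable number i)

    Grad(double value, double[] partials) {
        this.value = value;
        this.partials = partials;
    }

    /** Free variable number index among params variables. */
    static Grad variable(int params, int index, double v) {
        double[] p = new double[params];
        p[index] = 1.0;
        return new Grad(v, p);
    }

    /** atan2 rule: d atan2(y, x) = (x dy - y dx) / (x^2 + y^2). */
    static Grad atan2(Grad y, Grad x) {
        double r2 = x.value * x.value + y.value * y.value;
        double[] p = new double[x.partials.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = (x.value * y.partials[i] - y.value * x.partials[i]) / r2;
        }
        return new Grad(Math.atan2(y.value, x.value), p);
    }

    /** Univariate function applied to a multivariate argument: chain rule on each partial. */
    Grad sin() {
        double[] p = new double[partials.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = Math.cos(value) * partials[i];
        }
        return new Grad(Math.sin(value), p);
    }
}

final class GradDemo {
    public static void main(String[] args) {
        Grad x = Grad.variable(2, 0, 1.2); // x is variable number 0
        Grad y = Grad.variable(2, 1, 4.5); // y is variable number 1
        Grad f = Grad.atan2(y, x).sin();   // f(atan2(y, x)) with f = sin
        System.out.println("df/dx = " + f.partials[0]);
        System.out.println("df/dy = " + f.partials[1]);
    }
}
```

The univariate sin method never needs to know how many variables exist: it only scales the partial derivatives already carried by its argument.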
One important thing to note is a consequence of the fact that the
differentiate
method returns a new object when
called. This implies that we end up with two different
instances of two different classes that compute roughly
similar things: the original instance and the newly created
object. If the implementation of the value
method
does use some attribute of the original class, then the class
of the newly created object should also provide a way to get
this value.
An important design choice in Nabla is that the newly
created instance does not copy the state of the
original object at derivation time, but instead is
permanently and tightly linked to this original instance and
uses it to get the values it needs when it needs them (even
if they are stored in private attributes). A direct
implication is that if the state of the original object is
changed after differentiation, all subsequent calls
to the value
method of the already created
differentiated instance will reflect these changes in their
behavior. There is no need to update the differentiated
instance: it is already up to date.
As an example, consider again the problem above, where we wanted the derivative of a model with respect to its second parameter. Now we want to compute the same derivative as previously but we also want to be able to change the value of the first parameter, instead of sticking to the value 2.5. Here is a way to do this:
    public class SetableFirstParameterModel implements UnivariateFunction {

        private Model model;
        private double firstParameter;

        public SetableFirstParameterModel(Model model, double firstParameter) {
            this.model = model;
            this.firstParameter = firstParameter;
        }

        public void setFirstParameter(double firstParameter) {
            this.firstParameter = firstParameter;
        }

        public double value(double t) {
            return model.evaluate(firstParameter, t);
        }

    }
When we build the derivative of an instance of this class,
this derivative will keep a reference to its primitive in
order to access the firstParameter
private
field. If this field is changed on the primitive instance by
calling the setFirstParameter
method, the
derivative will see the change immediately.
    SetableFirstParameterModel setable = new SetableFirstParameterModel(model, 2.5);
    UnivariateDifferentiableFunction derivative =
        new ForwardModeDifferentiator().differentiate(setable);
    DerivativeStructure t = new DerivativeStructure(1, 1, 0, 2.0);

    // derivative with respect to second parameter when first parameter equals 2.5
    double der25 = derivative.value(t).getPartialDerivative(1);

    // derivative with respect to second parameter when first parameter equals 3.0
    setable.setFirstParameter(3.0);
    double der30 = derivative.value(t).getPartialDerivative(1);
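The difference between a derivative that stays linked to its primitive and one that copies its state can be pictured with hand-written classes, independently of Nabla. In this sketch the names Quadratic, LinkedDerivative and SnapshotDerivative are hypothetical, and a getter is used for simplicity where Nabla, as described above, reads private attributes directly.

```java
/** Hypothetical primitive: f(t) = a * t * t, with a mutable coefficient a. */
final class Quadratic {
    private double a;

    Quadratic(double a) {
        this.a = a;
    }

    void setA(double a) {
        this.a = a;
    }

    double getA() {
        return a;
    }

    double value(double t) {
        return a * t * t;
    }
}

/** Hand-written derivative that stays linked to its primitive (the behavior described for Nabla). */
final class LinkedDerivative {
    private final Quadratic primitive;

    LinkedDerivative(Quadratic primitive) {
        this.primitive = primitive;
    }

    double derivative(double t) {
        return 2.0 * primitive.getA() * t; // reads a at call time
    }
}

/** Hand-written derivative that copies the state once, shown only for contrast. */
final class SnapshotDerivative {
    private final double a;

    SnapshotDerivative(Quadratic primitive) {
        this.a = primitive.getA(); // frozen at construction time
    }

    double derivative(double t) {
        return 2.0 * a * t;
    }
}
```

After calling setA(3.0) on a Quadratic created with a = 2.5, the linked derivative at t = 2.0 evaluates to 12.0, whereas the snapshot still evaluates to 10.0.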
Since the algorithmic differentiator can analyze only bytecode, functions calling native code cannot be handled this way by Nabla. In this case, a fallback procedure is to rely on finite differences, using the FiniteDifferencesDifferentiator class from Apache Commons Math.
This class needs the number of points and the step size at construction time.
For each call to the value method of the derivative instance, it calls the
value method of the primitive instance n times, where n is the number of
points of the method. The evaluations are regularly distributed around the
location defined by the parameter.
The step size and the number of points must be chosen with care as they influence both the accuracy of the result (which is only an approximation) and the computational cost. A small step size improves theoretical accuracy up to the point where numerical cancellations due to the finite precision of double numbers exceed the theoretical error due to finite differences modeling. A large number of points improves the accuracy but implies a large number of function evaluations, which can become prohibitive. There is no best choice that fits all needs: the right choice is problem-dependent.
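The step size trade-off can be observed with a short experiment. The sketch below uses a plain two-point central difference, which is simpler than the n-point scheme of FiniteDifferencesDifferentiator. ScalarFunction and CentralDifferenceDemo are hypothetical names used only for this illustration.

```java
/** Hypothetical minimal function type, standing in for UnivariateFunction. */
interface ScalarFunction {
    double value(double t);
}

final class CentralDifferenceDemo {

    /** Two-point central difference estimate of f'(t) with step size h. */
    static double derivative(ScalarFunction f, double t, double h) {
        return (f.value(t + h) - f.value(t - h)) / (2.0 * h);
    }

    /** Absolute error of the estimate for f = sin at t = 1, where the exact derivative is cos(1). */
    static double error(double h) {
        ScalarFunction f = new ScalarFunction() {
            public double value(double t) {
                return Math.sin(t);
            }
        };
        return Math.abs(derivative(f, 1.0, h) - Math.cos(1.0));
    }

    public static void main(String[] args) {
        double[] steps = {1e-1, 1e-3, 1e-5, 1e-8, 1e-11, 1e-14};
        for (double h : steps) {
            System.out.println("h = " + h + "  error = " + error(h));
        }
    }
}
```

Running the loop shows the error first shrinking as the step size decreases, then growing again once cancellation between the two nearly equal function values dominates.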