The derivatives produced by Nabla are computed by applying the exact differentiation rules to the various sub-expressions encountered while evaluating the function. The function to differentiate may contain conditional or discontinuous statements like if/then/else constructs, calls to the FastMath.floor() method, or use of the % operator. From now on, we will call branching points the values of the t parameter that trigger changes in these statements.
By construction, the derivatives produced by Nabla have the same branching points as the initial function. The overall structure of the code is the same in both functions; the only difference is that the embedded mathematical expressions are differentiated in one and not in the other.
Consider a value t0 which is not at a branching point. For small variations of the t value within the neighborhood of t0, the same sub-expressions will be encountered during derivative evaluations and the derivative will be continuous. This is true even if the point is very close to a branching point, as long as it is not exactly on it. This behavior is more robust than what occurs with finite difference schemes because these schemes involve several evaluations of the function at separate points. If a finite difference scheme is evaluated too close to a branching point, the evaluations may fall on both sides of it, leading to invalid results or computation errors.
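The finite difference pitfall can be made concrete with a small sketch. It applies a centered difference to a piecewise-linear function whose slope is 2 for negative t and 3 otherwise. The class name, the centeredDifference helper and the step size are assumptions chosen for illustration; none of them is part of Nabla.

```java
import java.util.function.DoubleUnaryOperator;

class FiniteDifferencePitfall {

    // Piecewise-linear function with a single branching point at t = 0.
    static double f(double t) {
        return (t < 0) ? 2 * t : 3 * t;
    }

    // Centered finite difference: (g(t + h) - g(t - h)) / (2h).
    static double centeredDifference(DoubleUnaryOperator g, double t, double h) {
        return (g.applyAsDouble(t + h) - g.applyAsDouble(t - h)) / (2 * h);
    }

    public static void main(String[] args) {
        double h = 1.0e-6; // illustrative step size

        // Far from the branching point, both samples use the same branch
        // and the estimate is close to the slope 3.
        System.out.println(centeredDifference(FiniteDifferencePitfall::f, 1.0, h));

        // At t = 1.0e-9 the sample at t - h is negative and the sample at
        // t + h is positive, so the estimate mixes the two slopes.
        System.out.println(centeredDifference(FiniteDifferencePitfall::f, 1.0e-9, h));
    }
}
```

In the second call the estimate is roughly 2.5, which is neither the left slope nor the right slope, although t = 1.0e-9 is not a branching point and the exact derivative there is 3.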
Consider a value t0 which is exactly at a branching point. Depending on the way the conditional or discontinuous statement is written, one of the branches is followed and the derivative is computed as if this branch were valid not only at the point itself but also in its neighborhood.
In a sense Nabla extends the validity of a derivative from the interior of a domain to its boundary.
The following example shows a branching point for a simple conditional statement.
UnivariateFunction singular = new UnivariateFunction() {
    public double value(double t) {
        if (t < 0) {
            return 2 * t;
        } else {
            return 3 * t;
        }
    }
};
In this example, there is one branching point at t = 0. When the parameter t is away from 0, the derivative is either 2 (for negative t) or 3 (for positive t). When the parameter t is exactly 0, the branch which is selected is the else branch, so the derivative with respect to t will be set to 3, just as if the parameter were positive. The validity domain of the derivative as computed by Nabla is thus extended to contain the boundary value 0.
This, however, is not the true mathematical derivative since the function is not differentiable at t = 0. It does have a left derivative and a right derivative, but since they are different there is no derivative at this point.
We can see why, from a purely mathematical standpoint, the function is considered not to be differentiable at this point. If we change the if (t < 0) statement into if (t <= 0), the value of the initial function does not change at all (both 2 * 0 and 3 * 0 produce 0), but the value of the derivative as computed by Nabla becomes 2.
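The effect of the comparison operator can be reproduced by hand. The sketch below is not code generated by Nabla and does not use its API. For each variant, it writes a derivative method with the same branch structure as the function, each branch holding the derivative of the corresponding expression. All class and method names are hypothetical.

```java
class BranchingPointSketch {

    // Function with a strict comparison: t = 0 takes the else branch.
    static double valueStrict(double t) {
        if (t < 0) { return 2 * t; } else { return 3 * t; }
    }

    // Hand-written derivative mirroring valueStrict branch by branch.
    static double derivativeStrict(double t) {
        if (t < 0) { return 2; } else { return 3; }
    }

    // Same function with a non-strict comparison: t = 0 takes the if branch.
    static double valueNonStrict(double t) {
        if (t <= 0) { return 2 * t; } else { return 3 * t; }
    }

    // Hand-written derivative mirroring valueNonStrict branch by branch.
    static double derivativeNonStrict(double t) {
        if (t <= 0) { return 2; } else { return 3; }
    }

    public static void main(String[] args) {
        // Both variants return the same function value at the branching point.
        System.out.println(valueStrict(0.0) == valueNonStrict(0.0)); // prints true
        // The derivative follows whichever branch the comparison selects.
        System.out.println(derivativeStrict(0.0));    // prints 3.0
        System.out.println(derivativeNonStrict(0.0)); // prints 2.0
    }
}
```

The two variants are the same mathematical function, yet the branch-following derivative differs at t = 0. This is exactly the signature of a point where only one-sided derivatives exist.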
The following example is only slightly different from the previous one. However, it will lead to different results.
UnivariateFunction nonSingular = new UnivariateFunction() {
    public double value(double t) {
        if (t < 0) {
            return 2 * t * t;
        } else {
            return 3 * t * t;
        }
    }
};
In this example, the derivatives computed in the two branches (4 * t and 6 * t) share the same value at t = 0, which is 0. So the function is differentiable at this point and the value computed by Nabla is the real derivative. Changing the if (t < 0) statement into if (t <= 0) does not invalidate this result.
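A hand-written check of this case is sketched below. It does not use Nabla. The derivative methods are written manually with the same branches as the function, and the one-sided difference quotients are added only as an independent sanity check. The class name, method names and step size are assumptions.

```java
class SmoothBranchingSketch {

    // Function with a branching point at t = 0 and quadratic branches.
    static double value(double t) {
        return (t < 0) ? 2 * t * t : 3 * t * t;
    }

    // Hand-written derivative, branch by branch: d(2t^2)/dt = 4t, d(3t^2)/dt = 6t.
    static double derivative(double t) {
        return (t < 0) ? 4 * t : 6 * t;
    }

    // Same derivative with the boundary assigned to the other branch.
    static double derivativeNonStrict(double t) {
        return (t <= 0) ? 4 * t : 6 * t;
    }

    public static void main(String[] args) {
        // Both branch assignments give the same derivative at t = 0.
        System.out.println(derivative(0.0));          // prints 0.0
        System.out.println(derivativeNonStrict(0.0)); // prints 0.0

        // One-sided difference quotients at 0 both shrink with h,
        // consistent with left and right derivatives equal to 0.
        double h = 1.0e-6; // illustrative step size
        System.out.println((value(0.0) - value(-h)) / h); // left quotient, about -2h
        System.out.println((value(h) - value(0.0)) / h);  // right quotient, about 3h
    }
}
```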
Nabla does not detect branching points. It inherits this behavior from the initial function, which neither keeps track of all the conditionals it encounters during its evaluation nor distinguishes conditionals largely met from conditionals almost missed.
In summary:
The design choice to let Nabla compute derivatives at branching points where the function is not differentiable may seem strange. However, there are several good reasons to do this.
Double.NaN or throwing an InvalidArgumentexception).
Considering that each branch can itself be split into two other branches, which can themselves be split further and so on, this is a very tough design to implement.
Blindly assuming a function is not differentiable simply because it has a conditional is not acceptable. The last example above is a counterexample: it contains a conditional, yet it is differentiable at its branching point.
An important family of singularities corresponds to domain boundaries: on one side the computation can be done and on the other side it cannot, either due to mathematical problems (square roots, logarithms, arc sines ...) or physical problems (collision, null pressure, phase change ...). In these cases, there is only one branch that leads to a result, the other one leading to an error. If the computation can be performed at the cut point itself, what we really want is a derivative that is consistent with the side from which we come and where the computation can be done. In other words, we want either the left or the right derivative and we want to completely ignore the other side of the fence.
This case is especially important in constrained optimization problems. For such problems, it is proven that the optimum is located either at a point where all derivatives are zero or exactly at the boundary. In order to handle these cases, which are quite common, we need to be able to compute derivatives at branching points and to be consistent with the side where computation is feasible.
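The boundary case can be illustrated with a hand-written sketch that does not use Nabla. The hypothetical cost function below is only defined for non-negative t because of its square root, and it rejects negative values with an exception. The derivative is written manually with the same branch structure, so at the boundary t = 0 it follows the feasible side. The function itself and all names are assumptions made for the example.

```java
class BoundaryDerivativeSketch {

    // Hypothetical cost, defined only on the feasible domain t >= 0.
    static double cost(double t) {
        if (t < 0) {
            throw new IllegalArgumentException("t must be non-negative");
        } else {
            return t * Math.sqrt(t) + t;
        }
    }

    // Hand-written derivative with the same branches as cost:
    // d(t * sqrt(t) + t)/dt = 1.5 * sqrt(t) + 1 on the feasible side.
    static double costDerivative(double t) {
        if (t < 0) {
            throw new IllegalArgumentException("t must be non-negative");
        } else {
            return 1.5 * Math.sqrt(t) + 1;
        }
    }

    public static void main(String[] args) {
        // At the boundary, the derivative is the one-sided derivative
        // taken from the feasible side.
        System.out.println(costDerivative(0.0)); // prints 1.0
    }
}
```

Here the minimum of the cost over the feasible domain is reached at the boundary t = 0, where the derivative is 1 and not zero. A positive derivative pointing into the feasible domain is the information an optimizer needs to conclude that it cannot improve by moving inward.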