Nabla

Overview

The derivatives produced by Nabla are computed by applying the exact differentiation rules to the various sub-expressions encountered while evaluating the function. The function to differentiate may contain conditional or discontinuous statements like if/then/else constructs, calls to the FastMath.floor() method, or use of the % operator. From now on, we will call branching points the values of the t parameter that trigger changes in these statements.

By construction, the derivatives produced by Nabla have the same branching points as the initial function. The overall structure of the code is the same in both functions; the only difference is that the embedded mathematical expressions are differentiated in the derivative and left unchanged in the initial function.

Consider a value t₀ which is not at a branching point. For small variations of t within a neighborhood of t₀, the same sub-expressions will be encountered during derivative evaluations and the derivative will be continuous. This is true even if the point is very close to a branching point, as long as it is not exactly on it. This behavior is more robust than what occurs with finite difference schemes, because these schemes involve several evaluations of the function at separate points. If a finite difference scheme is evaluated too close to a branching point, the evaluation points may fall on both sides of it, leading to invalid results or computation errors.
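The weakness of finite differences near a branching point can be illustrated with a small sketch. It assumes a centered difference scheme, a step size of 1e-6 and a two-slope function (2 * t for negative t, 3 * t otherwise); the class and method names are hypothetical and are not part of Nabla.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative only: centered finite difference evaluated close to a branching point. */
final class FiniteDifferenceNearBranch {

    /** Piecewise linear function with a branching point at t = 0. */
    static double f(double t) {
        return (t < 0) ? 2 * t : 3 * t;
    }

    /** Centered difference (g(t + h) - g(t - h)) / (2 h). */
    static double centeredDifference(DoubleUnaryOperator g, double t, double h) {
        return (g.applyAsDouble(t + h) - g.applyAsDouble(t - h)) / (2 * h);
    }

    public static void main(String[] args) {
        double h = 1.0e-6;   // assumed step size
        double t0 = 1.0e-9;  // positive, but closer to 0 than h

        // t0 - h is negative and t0 + h is positive:
        // the two samples are computed by different branches
        System.out.println(centeredDifference(FiniteDifferenceNearBranch::f, t0, h));

        // far from the branching point, both samples use the same branch
        System.out.println(centeredDifference(FiniteDifferenceNearBranch::f, 1.0, h));
    }
}
```

At t0 the exact derivative is 3, but the centered difference returns a value close to 2.5, which is neither the left slope nor the right slope. Away from the branching point the same scheme returns a value close to 3.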

Consider a value t₀ which is exactly at a branching point. Depending on the way the conditional or discontinuous statement is written, one of the branches is followed and the derivative is computed as if this branch were valid not only at the point itself but also in its neighborhood.

In a sense Nabla extends the validity of a derivative from the interior of a domain to its boundary.

Example

The following example shows a branching point for a simple conditional statement.

          UnivariateFunction singular = new UnivariateFunction() {
              public double value(double t) {
                  if (t < 0) {
                      // slope 2 on the negative side
                      return 2 * t;
                  } else {
                      // slope 3 on the positive side and at t = 0
                      return 3 * t;
                  }
              }
          };

In this example, there is one branching point at t = 0. When the parameter t is away from 0, the derivative is either 2 (for negative t) or 3 (for positive t). When the parameter t is exactly 0, the branch that is selected is the else branch, so the derivative with respect to t is 3, just as if the parameter were positive. The validity domain of the derivative as computed by Nabla is extended: it contains the boundary value 0.

This, however, is not the true mathematical derivative, since the function is not differentiable at t = 0. It does have a left derivative and a right derivative, but since they are different there is no derivative at this point.

We can see why, from a purely mathematical standpoint, the function is considered not to be differentiable at this point. If we change the if (t < 0) statement into if (t <= 0), the value of the original function does not change at all (both 2 * 0 and 3 * 0 produce 0), but the value of the derivative as computed by Nabla becomes 2.
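The effect of the comparison operator can be reproduced with a hand-written sketch that carries a value together with its derivative through the same conditional. This is only an illustration of the principle and not the code Nabla generates; the Dual class and the strict and inclusive method names are hypothetical.

```java
/** Illustrative only: derivative propagated through the same branches as the function. */
final class BranchingDerivativeSketch {

    /** Minimal pair holding a value and its first derivative with respect to t. */
    static final class Dual {
        final double value;
        final double derivative;

        Dual(double value, double derivative) {
            this.value = value;
            this.derivative = derivative;
        }

        /** Multiplication by a constant: (c * u)' = c * u'. */
        Dual times(double c) {
            return new Dual(c * value, c * derivative);
        }
    }

    /** Differentiated form of the function written with if (t < 0). */
    static Dual strict(double t) {
        Dual x = new Dual(t, 1.0); // dt/dt = 1
        if (t < 0) {
            return x.times(2);
        } else {
            return x.times(3);
        }
    }

    /** Differentiated form of the function written with if (t <= 0). */
    static Dual inclusive(double t) {
        Dual x = new Dual(t, 1.0); // dt/dt = 1
        if (t <= 0) {
            return x.times(2);
        } else {
            return x.times(3);
        }
    }

    public static void main(String[] args) {
        // same value at t = 0, different derivatives
        System.out.println(strict(0.0).value + " " + strict(0.0).derivative);
        System.out.println(inclusive(0.0).value + " " + inclusive(0.0).derivative);
    }
}
```

Both variants return the value 0 at t = 0, while the derivative is 3 for the strict comparison and 2 for the inclusive one. Away from 0 the two variants agree.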

The following example is only slightly different from the previous one. However, it will lead to different results.

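          UnivariateFunction nonSingular = new UnivariateFunction() {
              public double value(double t) {
                  if (t < 0) {
                      // derivative 4 * t, which tends to 0 at the branching point
                      return 2 * t * t;
                  } else {
                      // derivative 6 * t, which is 0 at the branching point
                      return 3 * t * t;
                  }
              }
          };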

In this example, the derivatives computed in both branches share the same value at t = 0, which is 0. So the function is differentiable at the branching point and the value computed by Nabla is the true derivative. Changing the if (t < 0) statement into if (t <= 0) does not invalidate this result.
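As a quick check, the derivatives of the two branches can be written by hand (4 * t for 2 * t * t and 6 * t for 3 * t * t) and evaluated with both forms of the comparison. The sketch below is an illustration with hypothetical names, not code produced by Nabla.

```java
/** Illustrative only: hand-written derivative of the nonSingular example. */
final class NonSingularDerivativeSketch {

    /** Derivative following the branches of if (t < 0). */
    static double derivativeStrict(double t) {
        if (t < 0) {
            return 4 * t; // d/dt (2 * t * t)
        } else {
            return 6 * t; // d/dt (3 * t * t)
        }
    }

    /** Derivative following the branches of if (t <= 0). */
    static double derivativeInclusive(double t) {
        if (t <= 0) {
            return 4 * t; // d/dt (2 * t * t)
        } else {
            return 6 * t; // d/dt (3 * t * t)
        }
    }

    public static void main(String[] args) {
        // both forms agree at the branching point
        System.out.println(derivativeStrict(0.0));
        System.out.println(derivativeInclusive(0.0));
    }
}
```

Whichever branch is selected at t = 0, the result is 0, so the way the conditional is written does not matter here.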

Nabla does not detect branching points. It inherits this behavior from the initial function, which neither keeps track of the conditionals it encounters during its evaluation nor distinguishes conditions satisfied by a wide margin from conditions only barely satisfied.

In summary:

Nabla produces derivatives at branching points,
if the mathematical derivative exists at a branching point, Nabla computes it normally,
if the mathematical derivative does not exist, Nabla computes a value by extending the domain of validity of one of the branches,
there is no special handling of branching points.

Rationale

The design choice to let Nabla compute derivatives at branching points where the function is not differentiable may seem strange. However, there are several good reasons to do this.

Complexity

Detecting singularities and handling them properly would be very difficult. We would need to check each conditional to see whether we are exactly at its cut point, and whether the derivatives on both sides are equal. This implies we would have to compute the derivatives from both branches in parallel and reconcile the results at the end. If the derivatives on both sides are equal, then the function is differentiable and we have its derivative. If they are not equal, then the function is not differentiable and an error case should be triggered (either setting the derivative to Double.NaN or throwing an IllegalArgumentException).

Considering that each branch can itself be split into two other branches, which can themselves be split further and so on, this is a very tough design to implement.

Blindly assuming a function is not differentiable simply because it contains a conditional is not acceptable. The nonSingular example above contains a conditional and is nevertheless differentiable at its branching point.
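To make the cost of such a design concrete, here is a minimal sketch of the reconciliation step for a single conditional with a cut point at 0. Nabla does not do this; the reconcile method and its parameters are hypothetical, and the sketch assumes the two branches already produce the same function value at the cut point.

```java
import java.util.function.DoubleUnaryOperator;

/** Illustrative only: the kind of check Nabla deliberately does not perform. */
final class BranchReconciliationSketch {

    /**
     * Derivative for a conditional "if (t < 0) left else right" with cut point 0.
     * Assumes both branches give the same function value at t = 0.
     */
    static double reconcile(double t,
                            DoubleUnaryOperator leftDerivative,
                            DoubleUnaryOperator rightDerivative) {
        if (t != 0) {
            // away from the cut point only one branch is relevant
            return (t < 0) ? leftDerivative.applyAsDouble(t) : rightDerivative.applyAsDouble(t);
        }
        // exactly at the cut point: both branches must be differentiated
        double left = leftDerivative.applyAsDouble(t);
        double right = rightDerivative.applyAsDouble(t);
        // equal one-sided derivatives: differentiable; otherwise signal the singularity
        return (left == right) ? left : Double.NaN;
    }

    public static void main(String[] args) {
        // branches 2 * t and 3 * t: derivatives 2 and 3 disagree at 0
        System.out.println(reconcile(0.0, t -> 2.0, t -> 3.0));
        // branches 2 * t * t and 3 * t * t: derivatives 4 * t and 6 * t agree at 0
        System.out.println(reconcile(0.0, t -> 4 * t, t -> 6 * t));
    }
}
```

Even this single-conditional version needs an extra equality test and two derivative evaluations at the cut point. With nested conditionals the same check would have to be repeated inside each branch.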

Performance

The second reason is that if such a complete handling of cut points were implemented, it would have a very bad impact on performance. The number of conditional evaluations would be doubled (adding an equality test to each existing inequality test). In addition, when branching points are encountered, the complete computation would be performed on both branches. Given that physical models often have only a few specific cut points that repeat throughout the code (0 being by far the most important one for many functions), and that conditional constructs may appear inside loops that iterate hundreds or thousands of times, it is clear that the computation could be brought to a halt by such thorough exploration of branches.

Needs

The third reason is simply that pure mathematical compliance is not always desired.

An important family of singularities corresponds to domain boundaries: on one side the computation can be done and on the other side it cannot, either due to mathematical problems (square roots, logarithms, arc sines ...) or physical problems (collision, null pressure, phase change ...). In these cases, there is only one branch that leads to a result, the other one leading to an error. If the computation can be performed on the cut point itself, what we really want is a derivative that is consistent with the side from which we come and where the computation can be done. In other words, we want either the left or the right derivative and we want to completely ignore the other side of the fence.

This case is especially important in constrained optimization problems. For such problems, the optimum is located either at a point where all derivatives are zero or exactly at the boundary. In order to handle these cases, which are quite common, we need to be able to compute derivatives at branching points and to be consistent with the side where the computation is feasible.
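A one-sided derivative at a domain boundary can be sketched as follows. The cost function, its feasible domain t >= 0 and all names are hypothetical, and the derivative is written by hand to mirror the branch structure of the function rather than produced by Nabla.

```java
/** Illustrative only: derivative at the boundary of a feasible domain. */
final class BoundaryDerivativeSketch {

    /** Hypothetical cost function, defined only for t >= 0. */
    static double cost(double t) {
        if (t < 0) {
            throw new IllegalArgumentException("t must be non-negative: " + t);
        }
        return 3 * t + t * t;
    }

    /** Derivative with the same branching structure as cost. */
    static double costDerivative(double t) {
        if (t < 0) {
            throw new IllegalArgumentException("t must be non-negative: " + t);
        }
        return 3 + 2 * t; // d/dt (3 * t + t * t)
    }

    public static void main(String[] args) {
        // at the boundary the feasible branch is followed,
        // so the result is the right derivative
        System.out.println(costDerivative(0.0));
    }
}
```

At t = 0 the derivative is 3, the right derivative of the feasible branch. Since it is positive, the cost increases when moving into the feasible domain, which tells a minimizer that the constrained optimum sits on the boundary.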
