## 4 Numerical Analysis## 4.1 OverviewThe analysis package is the parent package for algorithms dealing with real-valued functions of one real variable. It contains dedicated sub-packages providing numerical root-finding, integration, interpolation and differentiation. It also contains a polynomials sub-package that considers polynomials with real coefficients as differentiable real functions. Functions interfaces are intended to be implemented by user code to represent their domain problems. The algorithms provided by the library will then operate on these function to find their roots, or integrate them, or ... Functions can be multivariate or univariate, real vectorial or matrix valued, and they can be differentiable or not. ## 4.2 Error handling
For user-defined functions, when the method encounters an error
during evaluation, users must use their private static class LocalException extends RuntimeException { // the x value that caused the problem private final double x; public LocalException(double x) { this.x = x; } public double getX() { return x; } } private static class MyFunction implements UnivariateFunction { public double value(double x) { double y = hugeFormula(x); if (somethingBadHappens) { throw new LocalException(x); } return y; } } public void compute() { try { solver.solve(maxEval, new MyFunction(a, b, c), min, max); } catch (LocalException le) { // retrieve the x value } } As shown in this example the exception is really something local to user code and there is a guarantee Apache Commons Math will not mess with it. The user is safe. ## 4.3 Root-findingUnivariateSolver, UnivariateDifferentiableSolver and PolynomialSolver provide means to find roots of univariate real-valued functions, differentiable univariate real-valued functions, and polynomial functions respectively. A root is the value where the function takes the value 0. Commons-Math includes implementations of the several root-finding algorithms:
Some algorithms require that the initial search interval brackets the root (i.e. the function values at interval end points have opposite signs). Some algorithms preserve bracketing throughout computation and allow user to specify which side of the convergence interval to select as the root. It is also possible to force a side selection after a root has been found even for algorithms that do not provide this feature by themselves. This is useful for example in sequential search, for which a new search interval is started after a root has been found in order to find the next root. In this case, user must select a side to ensure his loop is not stuck on one root and always return the same solution without making any progress. There are numerous non-obvious traps and pitfalls in root finding. First, the usual disclaimers due to the way real world computers calculate values apply. If the computation of the function provides numerical instabilities, for example due to bit cancellation, the root finding algorithms may behave badly and fail to converge or even return bogus values. There will not necessarily be an indication that the computed root is way off the true value. Secondly, the root finding problem itself may be inherently ill-conditioned. There is a "domain of indeterminacy", the interval for which the function has near zero absolute values around the true root, which may be large. Even worse, small problems like roundoff error may cause the function value to "numerically oscillate" between negative and positive values. This may again result in roots way off the true value, without indication. There is not much a generic algorithm can do if ill-conditioned problems are met. A way around this is to transform the problem in order to get a better conditioned function. Proper selection of a root-finding algorithm and its configuration parameters requires knowledge of the analytical properties of the function under analysis and numerical analysis techniques. Users are encouraged to consult a numerical analysis text (or a numerical analyst) when selecting and configuring a solver.
In order to use the root-finding features, first a solver object must
be created by calling its constructor, often providing relative and absolute
accuracy. Using a solver object, roots of functions
are easily found using the `f(c) = 0.0`(see "function value accuracy")`min <= c <= max`(except for the secant method, which may find a solution outside the interval)
Typical usage: UnivariateFunction function = // some user defined function object final double relativeAccuracy = 1.0e-12; final double absoluteAccuracy = 1.0e-8; final int maxOrder = 5; UnivariateSolver solver = new BracketingNthOrderBrentSolver(relativeAccuracy, absoluteAccuracy, maxOrder); double c = solver.solve(100, function, 1.0, 5.0, AllowedSolution.LEFT_SIDE); Force bracketing, by refining a base solution found by a non-bracketing solver: UnivariateFunction function = // some user defined function object final double relativeAccuracy = 1.0e-12; final double absoluteAccuracy = 1.0e-8; UnivariateSolver nonBracketing = new BrentSolver(relativeAccuracy, absoluteAccuracy); double baseRoot = nonBracketing.solve(100, function, 1.0, 5.0); double c = UnivariateSolverUtils.forceSide(100, function, new PegasusSolver(relativeAccuracy, absoluteAccuracy), baseRoot, 1.0, 5.0, AllowedSolution.LEFT_SIDE);
The
The
The
The
The
The
The
## 4.4 Interpolation
A
UnivariateInterpolator is used to find a univariate real-valued
function y) yields
_{i}f(x to the best accuracy possible. The result
is provided as an object implementing the
UnivariateFunction interface. It can therefore be evaluated at any point,
including point not belonging to the original set.
Currently, only an interpolator for generating natural cubic splines and a polynomial
interpolator are available. There is no interpolator factory, mainly because the
interpolation algorithm is more determined by the kind of the interpolated function
rather than the set of points to interpolate.
There aren't currently any accuracy controls either, as interpolation
accuracy is in general determined by the algorithm.
_{i})=y_{i}Typical usage: double x[] = { 0.0, 1.0, 2.0 }; double y[] = { 1.0, -1.0, 2.0}; UnivariateInterpolator interpolator = new SplineInterpolator(); UnivariateFunction function = interpolator.interpolate(x, y); double interpolationX = 0.5; double interpolatedY = function.value(x); System.out println("f(" + interpolationX + ") = " + interpolatedY);
A natural cubic spline is a function consisting of a polynomial of
third degree for each subinterval determined by the x-coordinates of the
interpolated points. A function interpolating x.
_{N}The polynomial function returned by the Neville's algorithm is a single polynomial guaranteed to pass exactly through the interpolation points. The degree of the polynomial is the number of points minus 1 (i.e. the interpolation polynomial for a three points set will be a quadratic polynomial). Despite the fact the interpolating polynomials is a perfect approximation of a function at interpolation points, it may be a loose approximation between the points. Due to Runge's phenomenom the error can get worse as the degree of the polynomial increases, so adding more points does not always lead to a better interpolation. Loess (or Lowess) interpolation is a robust interpolation useful for smoothing univariate scaterplots. It has been described by William Cleveland in his 1979 seminal paper Robust Locally Weighted Regression and Smoothing Scatterplots. This kind of interpolation is computationally intensive but robust. Microsphere interpolation is a robust multidimensional interpolation algorithm. It has been described in William Dudziak's MS thesis. Hermite interpolation is an interpolation method that can use derivatives in addition to function values at sample points. The HermiteInterpolator class implements this method for vector-valued functions. The sampling points can have any spacing (there are no requirements for a regular grid) and some points may provide derivatives while others don't provide them (or provide derivatives to a smaller order). Points are added one at a time, as shown in the following example: HermiteInterpolator interpolator = new HermiteInterpolator; // at x = 0, we provide both value and first derivative interpolator.addSamplePoint(0.0, new double[] { 1.0 }, new double[] { 2.0 }); // at x = 1, we provide only function value interpolator.addSamplePoint(1.0, new double[] { 4.0 }); // at x = 2, we provide both value and first derivative interpolator.addSamplePoint(2.0, new double[] { 5.0 }, new double[] { 2.0 }); // should print "value at x = 0.5: 2.5625" System.out.println("value at x = 0.5: " + interpolator.value(0.5)[0]); // should print "derivative at x = 0.5: 3.5" System.out.println("derivative at x = 0.5: " + interpolator.derivative(0.5)[0]); // should print "interpolation polynomial: 1 + 2 x + 4 x^2 - 4 x^3 + x^4" System.out.println("interpolation polynomial: " + interpolator.getPolynomials()[0]);
A
BivariateGridInterpolator is used to find a bivariate real-valued
function y,_{j}f)
yields _{ij}f(x to the best accuracy
possible. The result is provided as an object implementing the
BivariateFunction interface. It can therefore be evaluated at any point,
including a point not belonging to the original set.
The arrays _{i},y_{j})=f_{ij}x and _{i}y must be
sorted in increasing order in order to define a two-dimensional grid.
_{j}In bicubic interpolation, the interpolation function is a 3rd-degree polynomial of two variables. The coefficients are computed from the function values sampled on a grid, as well as the values of the partial derivatives of the function at those grid points. From two-dimensional data sampled on a grid, the BicubicSplineInterpolator computes a bicubic interpolating function. Prior to computing an interpolating function, the SmoothingPolynomialBicubicSplineInterpolator class performs smoothing of the data by computing the polynomial that best fits each of the one-dimensional curves along each of the coordinate axes.
A
TrivariateGridInterpolator is used to find a trivariate real-valued
function y,_{j}z,
_{k}f)
yields _{ijk}f(x
to the best accuracy possible. The result is provided as an object implementing the
TrivariateFunction interface. It can therefore be evaluated at any point,
including a point not belonging to the original set.
The arrays _{i},y_{j},z_{k})=f_{ijk}x, _{i}y and
_{j}z must be sorted in increasing order in order to define
a three-dimensional grid.
_{k}In tricubic interpolation, the interpolation function is a 3rd-degree polynomial of three variables. The coefficients are computed from the function values sampled on a grid, as well as the values of the partial derivatives of the function at those grid points. From three-dimensional data sampled on a grid, the TricubicSplineInterpolator computes a tricubic interpolating function. ## 4.5 IntegrationA UnivariateIntegrator provides the means to numerically integrate univariate real-valued functions. Commons-Math includes implementations of the following integration algorithms: ## 4.6 PolynomialsThe org.apache.commons.math4.analysis.polynomials package provides real coefficients polynomials. The PolynomialFunction class is the most general one, using traditional coefficients arrays. The PolynomialsUtils utility class provides static factory methods to build Chebyshev, Hermite, Jacobi, Laguerre and Legendre polynomials. Coefficients are computed using exact fractions so these factory methods can build polynomials up to any degree. ## 4.7 DifferentiationThe org.apache.commons.math4.analysis.differentiation package provides a general-purpose differentiation framework. The core class is DerivativeStructure which holds the value and the differentials of a function. This class handles some arbitrary number of free parameters and arbitrary derivation order. It is used both as the input and the output type for the UnivariateDifferentiableFunction interface. Any differentiable function should implement this interface. The main idea behind the DerivativeStructure class is that it can be used almost as a number (i.e. it can be added, multiplied, its square root can be extracted or its cosine computed... However, in addition to computed the value itself when doing these computations, the partial derivatives are also computed alongside. This is an extension of what is sometimes called Rall's numbers. This extension is described in Dan Kalman's paper Doubly Recursive Multivariate Automatic Differentiation, Mathematics Magazine, vol. 75, no. 3, June 2002. Rall's numbers only hold the first derivative with respect to one free parameter whereas Dan Kalman's derivative structures hold all partial derivatives up to any specified order, with respect to any number of free parameters. Rall's numbers therefore can be seen as derivative structures for order one derivative and one free parameter, and primitive real numbers can be seen as derivative structures with zero order derivative and no free parameters.
The workflow of computation of a derivatives of an expression int params = 1; int order = 3; double xRealValue = 2.5; DerivativeStructure x = new DerivativeStructure(params, order, 0, xRealValue); DerivativeStructure y = f(x); System.out.println("y = " + y.getValue(); System.out.println("y' = " + y.getPartialDerivative(1); System.out.println("y'' = " + y.getPartialDerivative(2); System.out.println("y''' = " + y.getPartialDerivative(3);
In fact, there are no notions of
When we compute
This design choice is a very classical one in many algorithmic differentiation frameworks, either
based on operator overloading (like the one we implemented here) or based on code generation. It implies
the user has to
This design also allow a very interesting feature which can be explained with the following example.
Suppose we have a two arguments function int params = 2; int order = 2; double xRealValue = 2.5; double yRealValue = -1.3; DerivativeStructure x = new DerivativeStructure(params, order, 0, xRealValue); DerivativeStructure y = new DerivativeStructure(params, order, 1, yRealValue); DerivativeStructure f = DerivativeStructure.hypot(x, y); DerivativeStructure g = f.log(); System.out.println("g = " + g.getValue(); System.out.println("dg/dx = " + g.getPartialDerivative(1, 0); System.out.println("dg/dy = " + g.getPartialDerivative(0, 1); System.out.println("d2g/dx2 = " + g.getPartialDerivative(2, 0); System.out.println("d2g/dxdy = " + g.getPartialDerivative(1, 1); System.out.println("d2g/dy2 = " + g.getPartialDerivative(0, 2); There are several ways a user can create an implementation of the UnivariateDifferentiableFunction interface. The first method is to simply write it directly using the appropriate methods from DerivativeStructure to compute addition, subtraction, sine, cosine... This is often quite straigthforward and there is no need to remember the rules for differentiation: the user code only represent the function itself, the differentials will be computed automatically under the hood. The second method is to write a classical UnivariateFunction and to pass it to an existing implementation of the UnivariateFunctionDifferentiator interface to retrieve a differentiated version of the same function. The first method is more suited to small functions for which user already control all the underlying code. The second method is more suited to either large functions that would be cumbersome to write using the DerivativeStructure API, or functions for which user does not have control to the full underlying code (for example functions that call external libraries). Apache Commons Math provides one implementation of the UnivariateFunctionDifferentiator interface: FiniteDifferencesDifferentiator. This class creates a wrapper that will call the user-provided function on a grid sample and will use finite differences to compute the derivatives. It takes care of boundaries if the variable is not defined on the whole real line. It is possible to use more points than strictly required by the derivation order (for example one can specify an 8-points scheme to compute first derivative only). However, one must be aware that tuning the parameters for finite differences is highly problem-dependent. Choosing the wrong step size or the wrong number of sampling points can lead to huge errors. Finite differences are also not well suited to compute high order derivatives. Here is an example on how this implementation can be used: // function to be differentiated UnivariateFunction basicF = new UnivariateFunction() { public double value(double x) { return x * FastMath.sin(x); } }); // create a differentiator using 5 points and 0.01 step FiniteDifferencesDifferentiator differentiator = new FiniteDifferencesDifferentiator(5, 0.01); // create a new function that computes both the value and the derivatives // using DerivativeStructure UnivariateDifferentiableFunction completeF = differentiator.differentiate(basicF); // now we can compute display the value and its derivatives // here we decided to display up to second order derivatives, // because we feed completeF with order 2 DerivativeStructure instances for (double x = -10; x < 10; x += 0.1) { DerivativeStructure xDS = new DerivativeStructure(1, 2, 0, x); DerivativeStructure yDS = f.value(xDS); System.out.format(Locale.US, "%7.3f %7.3f %7.3f%n", yDS.getValue(), yDS.getPartialDerivative(1), yDS.getPartialDerivative(2)); }
Note that using
FiniteDifferencesDifferentiatora> in order to have a
UnivariateDifferentiableFunction that can be provided to a Newton-Raphson's
solver is a very bad idea. The reason is that finite differences are not really accurate and needs lots
of additional calls to the basic underlying function. If user initially have only the basic function
available and needs to find its roots, it is Another implementation of the UnivariateFunctionDifferentiator interface is under development in the related project Apache Commons Nabla. This implementation uses automatic code analysis and generation at binary level. However, at time of writing (end 2012), this project is not yet suitable for production use. |