Class AbstractMultipleLinearRegression
- java.lang.Object
  - org.apache.commons.math4.legacy.stat.regression.AbstractMultipleLinearRegression
- All Implemented Interfaces: MultipleLinearRegression
- Direct Known Subclasses: GLSMultipleLinearRegression, OLSMultipleLinearRegression
public abstract class AbstractMultipleLinearRegression extends Object implements MultipleLinearRegression
Abstract base class for implementations of MultipleLinearRegression.
- Since:
- 2.0
-
Constructor Summary
- AbstractMultipleLinearRegression()
-
Method Summary
- protected abstract RealVector calculateBeta(): Calculates the beta of multiple linear regression in matrix notation.
- protected abstract RealMatrix calculateBetaVariance(): Calculates the beta variance of multiple linear regression in matrix notation.
- protected double calculateErrorVariance(): Calculates the variance of the error term.
- protected RealVector calculateResiduals(): Calculates the residuals of multiple linear regression in matrix notation.
- protected double calculateYVariance(): Calculates the variance of the y values.
- double estimateErrorVariance(): Estimates the variance of the error.
- double estimateRegressandVariance(): Returns the variance of the regressand, i.e. Var(y).
- double[] estimateRegressionParameters(): Estimates the regression parameters b.
- double[] estimateRegressionParametersStandardErrors(): Returns the standard errors of the regression parameters.
- double[][] estimateRegressionParametersVariance(): Estimates the variance of the regression parameters, i.e. Var(b).
- double estimateRegressionStandardError(): Estimates the standard error of the regression.
- double[] estimateResiduals(): Estimates the residuals, i.e. u = y - X*b.
- protected RealMatrix getX()
- protected RealVector getY()
- boolean isNoIntercept()
- void newSampleData(double[] data, int nobs, int nvars): Loads model x and y sample data from a flat input array, overriding any previous sample.
- protected void newXSampleData(double[][] x): Loads new x sample data, overriding any previous data.
- protected void newYSampleData(double[] y): Loads new y sample data, overriding any previous data.
- void setNoIntercept(boolean noIntercept)
- protected void validateCovarianceData(double[][] x, double[][] covariance): Validates that the x data and covariance matrix have the same number of rows and that the covariance matrix is square.
- protected void validateSampleData(double[][] x, double[] y): Validates sample data.
-
Constructor Detail
-
AbstractMultipleLinearRegression
public AbstractMultipleLinearRegression()
-
Method Detail
-
getX
protected RealMatrix getX()
- Returns:
- the X sample data.
-
getY
protected RealVector getY()
- Returns:
- the Y sample data.
-
isNoIntercept
public boolean isNoIntercept()
- Returns:
- true if the model has no intercept term; false otherwise
- Since:
- 2.2
-
setNoIntercept
public void setNoIntercept(boolean noIntercept)
- Parameters:
noIntercept - true means the model is to be estimated without an intercept term
- Since:
- 2.2
-
newSampleData
public void newSampleData(double[] data, int nobs, int nvars)
Loads model x and y sample data from a flat input array, overriding any previous sample.
Assumes that rows are concatenated with y values first in each row. For example, an input data array containing the sequence of values (1, 2, 3, 4, 5, 6, 7, 8, 9) with nobs = 3 and nvars = 2 creates a regression dataset with two independent variables, as below:

  y   x[0]  x[1]
  --------------
  1     2     3
  4     5     6
  7     8     9

Note that there is no need to add an initial unitary column (column of 1's) when specifying a model including an intercept term. If isNoIntercept() is true, the X matrix will be created without an initial column of 1's; otherwise this column will be added.
Throws IllegalArgumentException if any of the following preconditions fail:
- data cannot be null
- data.length = nobs * (nvars + 1)
- nobs > nvars
- Parameters:
data - input data array
nobs - number of observations (rows)
nvars - number of independent variables (columns, not counting y)
- Throws:
NullArgumentException - if the data array is null
DimensionMismatchException - if the length of the data array is not equal to nobs * (nvars + 1)
InsufficientDataException - if nobs is less than nvars + 1
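The flat layout described above can be sketched in plain Java. This is an illustration of the indexing only, not the library's implementation: the class FlatSampleLayout and its methods are hypothetical names, and the precondition checks use a plain IllegalArgumentException rather than the Commons Math exception types.

```java
/** Illustrative sketch only; FlatSampleLayout is a hypothetical helper, not part of Commons Math. */
final class FlatSampleLayout {

    /** Extracts y: the first value of each (nvars + 1)-wide row of the flat array. */
    static double[] extractY(double[] data, int nobs, int nvars) {
        checkLayout(data, nobs, nvars);
        double[] y = new double[nobs];
        for (int i = 0; i < nobs; i++) {
            y[i] = data[i * (nvars + 1)];
        }
        return y;
    }

    /** Extracts x: the nvars values that follow y in each row (no intercept column added). */
    static double[][] extractX(double[] data, int nobs, int nvars) {
        checkLayout(data, nobs, nvars);
        double[][] x = new double[nobs][nvars];
        for (int i = 0; i < nobs; i++) {
            for (int j = 0; j < nvars; j++) {
                x[i][j] = data[i * (nvars + 1) + 1 + j];
            }
        }
        return x;
    }

    /** Mirrors the three documented preconditions using a plain JDK exception. */
    private static void checkLayout(double[] data, int nobs, int nvars) {
        if (data == null) {
            throw new IllegalArgumentException("data must not be null");
        }
        if (data.length != nobs * (nvars + 1)) {
            throw new IllegalArgumentException("data.length must equal nobs * (nvars + 1)");
        }
        if (nobs <= nvars) {
            throw new IllegalArgumentException("nobs must be greater than nvars");
        }
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        // prints [1.0, 4.0, 7.0]
        System.out.println(java.util.Arrays.toString(extractY(data, 3, 2)));
        // prints [[2.0, 3.0], [5.0, 6.0], [8.0, 9.0]]
        System.out.println(java.util.Arrays.deepToString(extractX(data, 3, 2)));
    }
}
```

Row i of the dataset starts at index i * (nvars + 1) in the flat array, which is why data.length must equal nobs * (nvars + 1).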
-
newYSampleData
protected void newYSampleData(double[] y)
Loads new y sample data, overriding any previous data.
- Parameters:
y - the array representing the y sample
- Throws:
NullArgumentException - if y is null
NoDataException - if y is empty
-
newXSampleData
protected void newXSampleData(double[][] x)
Loads new x sample data, overriding any previous data.
The input x array should have one row for each sample observation, with columns corresponding to independent variables. For example, if x = new double[][] {{1, 2}, {3, 4}, {5, 6}}, then newXSampleData(x) results in a model with two independent variables and 3 observations:

  x[0]  x[1]
  ----------
  1     2
  3     4
  5     6

Note that there is no need to add an initial unitary column (column of 1's) when specifying a model including an intercept term.
- Parameters:
x - the rectangular array representing the x sample
- Throws:
NullArgumentException - if x is null
NoDataException - if x is empty
DimensionMismatchException - if x is not rectangular
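As a rough illustration of why no column of 1's needs to be supplied, the sketch below builds a design matrix from raw x data. It assumes the intercept column is handled the same way newSampleData documents it (added unless the model has no intercept). DesignMatrixSketch and toDesignMatrix are hypothetical names, and the checks use a plain IllegalArgumentException.

```java
/** Illustrative sketch only; DesignMatrixSketch is a hypothetical helper, not part of Commons Math. */
final class DesignMatrixSketch {

    /**
     * Copies x into a design matrix, prepending a column of 1's
     * unless noIntercept is true.
     */
    static double[][] toDesignMatrix(double[][] x, boolean noIntercept) {
        if (x == null || x.length == 0) {
            throw new IllegalArgumentException("x must be non-null and non-empty");
        }
        int n = x.length;
        int k = x[0].length;
        int offset = noIntercept ? 0 : 1; // width of the intercept column, if any
        double[][] design = new double[n][k + offset];
        for (int i = 0; i < n; i++) {
            if (x[i].length != k) {
                throw new IllegalArgumentException("x must be rectangular");
            }
            if (!noIntercept) {
                design[i][0] = 1.0;
            }
            System.arraycopy(x[i], 0, design[i], offset, k);
        }
        return design;
    }
}
```

With x = {{1, 2}, {3, 4}, {5, 6}} and noIntercept = false, the result has three columns per row, the first of which is 1.0.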
-
validateSampleData
protected void validateSampleData(double[][] x, double[] y) throws MathIllegalArgumentException
Validates sample data. Checks that
- Neither x nor y is null or empty;
- The length (i.e. number of rows) of x equals the length of y;
- x has at least one more row than it has columns (i.e. there is sufficient data to estimate regression coefficients for each of the columns in x plus an intercept).
- Parameters:
x - the [n,k] array representing the x data
y - the [n,1] array representing the y data
- Throws:
NullArgumentException - if x or y is null
DimensionMismatchException - if x and y do not have the same length
NoDataException - if x or y is zero-length
MathIllegalArgumentException - if the number of rows of x is less than the number of columns + 1
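The three checks listed above can be sketched as follows. This is a minimal plain-Java approximation: SampleDataChecks is a hypothetical name, a single IllegalArgumentException stands in for the Commons Math exception types, and the row-count check follows the bulleted rule that x needs at least one more row than it has columns.

```java
/** Illustrative sketch only; SampleDataChecks is a hypothetical helper, not part of Commons Math. */
final class SampleDataChecks {

    /** Throws IllegalArgumentException if any of the documented checks fails. */
    static void validate(double[][] x, double[] y) {
        if (x == null || y == null) {
            throw new IllegalArgumentException("x and y must not be null");
        }
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same number of rows");
        }
        if (x.length == 0) {
            // Lengths are equal at this point, so y is empty as well.
            throw new IllegalArgumentException("x and y must not be empty");
        }
        if (x.length < x[0].length + 1) {
            throw new IllegalArgumentException(
                "need at least (columns + 1) rows to estimate all coefficients plus an intercept");
        }
    }
}
```

For example, a 3 x 2 array x with a length-3 y passes, while a 2 x 2 array x fails the last check because two rows cannot support two slopes plus an intercept.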
-
validateCovarianceData
protected void validateCovarianceData(double[][] x, double[][] covariance)
Validates that the x data and covariance matrix have the same number of rows and that the covariance matrix is square.
- Parameters:
x - the [n,k] array representing the x sample
covariance - the [n,n] array representing the covariance matrix
- Throws:
DimensionMismatchException - if the number of rows in x is not equal to the number of rows in covariance
NonSquareMatrixException - if the covariance matrix is not square
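The two shape conditions can be illustrated with a short plain-Java sketch. CovarianceShapeChecks is a hypothetical name, and checking every row of the covariance array for squareness is an assumption made here for safety; the library may check less. A plain IllegalArgumentException stands in for the Commons Math exception types.

```java
/** Illustrative sketch only; CovarianceShapeChecks is a hypothetical helper, not part of Commons Math. */
final class CovarianceShapeChecks {

    /** Throws IllegalArgumentException if x and covariance disagree in rows or covariance is not square. */
    static void validate(double[][] x, double[][] covariance) {
        if (x.length != covariance.length) {
            throw new IllegalArgumentException("x and covariance must have the same number of rows");
        }
        for (double[] row : covariance) {
            // Assumption: every row is checked, so ragged arrays are rejected too.
            if (row.length != covariance.length) {
                throw new IllegalArgumentException("covariance must be square");
            }
        }
    }
}
```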
-
estimateRegressionParameters
public double[] estimateRegressionParameters()
Estimates the regression parameters b.
- Specified by:
estimateRegressionParameters in interface MultipleLinearRegression
- Returns:
- The [k,1] array representing b
-
estimateResiduals
public double[] estimateResiduals()
Estimates the residuals, i.e. u = y - X*b.
- Specified by:
estimateResiduals in interface MultipleLinearRegression
- Returns:
- The [n,1] array representing the residuals
-
estimateRegressionParametersVariance
public double[][] estimateRegressionParametersVariance()
Estimates the variance of the regression parameters, i.e. Var(b).
- Specified by:
estimateRegressionParametersVariance in interface MultipleLinearRegression
- Returns:
- The [k,k] array representing the variance of b
-
estimateRegressionParametersStandardErrors
public double[] estimateRegressionParametersStandardErrors()
Returns the standard errors of the regression parameters.
- Specified by:
estimateRegressionParametersStandardErrors in interface MultipleLinearRegression
- Returns:
- standard errors of estimated regression parameters
-
estimateRegressandVariance
public double estimateRegressandVariance()
Returns the variance of the regressand, i.e. Var(y).
- Specified by:
estimateRegressandVariance in interface MultipleLinearRegression
- Returns:
- The double representing the variance of y
-
estimateErrorVariance
public double estimateErrorVariance()
Estimates the variance of the error.
- Returns:
- estimate of the error variance
- Since:
- 2.2
-
estimateRegressionStandardError
public double estimateRegressionStandardError()
Estimates the standard error of the regression.
- Returns:
- regression standard error
- Since:
- 2.2
-
calculateBeta
protected abstract RealVector calculateBeta()
Calculates the beta of multiple linear regression in matrix notation.
- Returns:
- beta
-
calculateBetaVariance
protected abstract RealMatrix calculateBetaVariance()
Calculates the beta variance of multiple linear regression in matrix notation.
- Returns:
- beta variance
-
calculateYVariance
protected double calculateYVariance()
Calculates the variance of the y values.
- Returns:
- Y variance
-
calculateErrorVariance
protected double calculateErrorVariance()
Calculates the variance of the error term.
Uses the formula var(u) = u · u / (n - k), where n and k are the row and column dimensions of the design matrix X.
- Returns:
- error variance estimate
- Since:
- 2.2
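The formula var(u) = u · u / (n - k) can be worked through in a few lines of plain Java. ErrorVarianceSketch and its method names are hypothetical; the residual vector and k are passed in directly rather than taken from a fitted model. The standardError method relies on the conventional definition of the regression standard error as the square root of this variance estimate.

```java
/** Illustrative sketch only; ErrorVarianceSketch is a hypothetical helper, not part of Commons Math. */
final class ErrorVarianceSketch {

    /**
     * Computes u . u / (n - k), where n is the number of residuals (rows of X)
     * and k is the number of columns of the design matrix X.
     */
    static double errorVariance(double[] residuals, int k) {
        int n = residuals.length;
        if (n <= k) {
            throw new IllegalArgumentException("need more observations than design-matrix columns");
        }
        double dot = 0.0;
        for (double u : residuals) {
            dot += u * u; // accumulate the dot product u . u
        }
        return dot / (n - k);
    }

    /** Square root of the error variance (conventional regression standard error). */
    static double standardError(double[] residuals, int k) {
        return Math.sqrt(errorVariance(residuals, k));
    }
}
```

For residuals (1, -1, 2, -2) and k = 2, u · u = 10 and n - k = 2, so the estimate is 5.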
-
calculateResiduals
protected RealVector calculateResiduals()
Calculates the residuals of multiple linear regression in matrix notation.
u = y - X * b
- Returns:
- The residuals [n,1] matrix
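The expression u = y - X * b can be spelled out element by element. The sketch below uses plain arrays instead of RealMatrix and RealVector, and ResidualsSketch is a hypothetical name; it assumes X already contains any intercept column and that b has one entry per column of X.

```java
/** Illustrative sketch only; ResidualsSketch is a hypothetical helper, not part of Commons Math. */
final class ResidualsSketch {

    /** Returns u where u[i] = y[i] - sum over j of x[i][j] * b[j]. */
    static double[] residuals(double[][] x, double[] y, double[] b) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same number of rows");
        }
        double[] u = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            if (x[i].length != b.length) {
                throw new IllegalArgumentException("each row of x must have b.length columns");
            }
            double fitted = 0.0;
            for (int j = 0; j < b.length; j++) {
                fitted += x[i][j] * b[j]; // row i of X times b
            }
            u[i] = y[i] - fitted;
        }
        return u;
    }
}
```

With X = {{1, 1}, {1, 2}, {1, 3}}, b = (1, 2) and y = (3, 5, 8), the fitted values are (3, 5, 7) and the residuals are (0, 0, 1).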
-