Class UnconditionedExactTest


  • public final class UnconditionedExactTest
    extends Object
    Implements an unconditioned exact test for a contingency table.

    Performs an exact test for the statistical significance of the association (contingency) between two kinds of categorical classification. A 2x2 contingency table is:

    \[ \left[ {\begin{array}{cc} a & b \\ c & d \\ \end{array} } \right] \]

    This test applies to the case of a 2x2 contingency table with one margin fixed. Note that if both margins are fixed (the row sums and column sums are not random) then Fisher's exact test can be applied.

    This implementation fixes the column sums \( m = a + c \) and \( n = b + d \). All possible tables can be created using \( 0 \le a \le m \) and \( 0 \le b \le n \). The random values \( a \) and \( b \) follow binomial distributions with probabilities \( p_0 \) and \( p_1 \) such that \( a \sim B(m, p_0) \) and \( b \sim B(n, p_1) \). The probability of a 2x2 table is the product of two binomial probabilities:

    \[ \begin{aligned} p &= Pr(a; m, p_0) \times Pr(b; n, p_1) \\ &= \binom{m}{a} p_0^a (1-p_0)^{m-a} \times \binom{n}{b} p_1^b (1-p_1)^{n-b} \end{aligned} \]

    For the binomial model, the null hypothesis is that the two nuisance parameters are equal, \( p_0 = p_1 = \pi \), where \( \pi \) is the common probability under equal proportions. The probability of any single table is then:

    \[ p = \binom{m}{a} \binom{n}{b} \pi^{a+b} (1-\pi)^{m+n-a-b} \]
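
    As an illustration only, the single-table probability under the null hypothesis can be evaluated directly from the formula above. This is a minimal sketch and not the library's implementation; the class name TableProbabilitySketch and its methods are hypothetical, and the simple multiplicative binomial coefficient is only suitable for small tables.

```java
final class TableProbabilitySketch {
    /** Binomial coefficient C(n, k) computed as a double (small n only). */
    static double binom(int n, int k) {
        double r = 1;
        for (int i = 1; i <= k; i++) {
            r = r * (n - k + i) / i;
        }
        return r;
    }

    /**
     * Probability of the table with first-row entries (a, b) when the column
     * sums are fixed at m and n and both columns share success probability pi.
     */
    static double tableProbability(int a, int b, int m, int n, double pi) {
        return binom(m, a) * binom(n, b)
            * Math.pow(pi, a + b) * Math.pow(1 - pi, m + n - a - b);
    }

    public static void main(String[] args) {
        // Table a=1, b=1 with column sums m=2, n=2 and pi=0.5.
        System.out.println(tableProbability(1, 1, 2, 2, 0.5)); // prints 0.25
    }
}
```

    Summing this probability over all \( (m+1)(n+1) \) tables gives 1 for any \( \pi \), which is a convenient sanity check.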

    The p-value of the observed table is calculated by maximising, over the domain of the nuisance parameter \( 0 \lt \pi \lt 1 \), the sum of the probabilities of all tables that are as extreme as or more extreme than the observed table:

    \[ p(a, b) = \max_{0 \lt \pi \lt 1} \sum_{i,j} \binom{m}{i} \binom{n}{j} \pi^{i+j} (1-\pi)^{m+n-i-j} \]

    where table \( (i,j) \) is as extreme as or more extreme than the observed table \( (a, b) \). The test can be configured to select more extreme tables using various methods.

    Note that the sum of the joint binomial distribution is a univariate function of the nuisance parameter \( \pi \). This function may have many local maxima, so the search enumerates the range with a configured number of points. The best candidates are optionally used as the start points for an optimized search for a local maximum.
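
    The enumeration of the nuisance parameter can be sketched as a plain grid search. This is a hedged illustration, not the library's algorithm: the class NuisanceGridSearchSketch and its methods are hypothetical, tables are ordered by the absolute difference of the sample proportions (an assumption made for this example and not necessarily one of the configurable methods), the end points are offset from 0 and 1 by an arbitrary small value, and no local optimization is applied to the best grid point.

```java
final class NuisanceGridSearchSketch {
    /** Binomial coefficient C(n, k) computed as a double (small n only). */
    static double binom(int n, int k) {
        double r = 1;
        for (int i = 1; i <= k; i++) {
            r = r * (n - k + i) / i;
        }
        return r;
    }

    /** Sum of the null probabilities of all tables at least as extreme as (a, b). */
    static double extremeSum(int a, int b, int m, int n, double pi) {
        // Assumed ordering: absolute difference of the two sample proportions.
        double observed = Math.abs((double) a / m - (double) b / n);
        double sum = 0;
        for (int i = 0; i <= m; i++) {
            for (int j = 0; j <= n; j++) {
                double d = Math.abs((double) i / m - (double) j / n);
                // Small tolerance so that ties count as "as extreme".
                if (d >= observed - 1e-12) {
                    sum += binom(m, i) * binom(n, j)
                        * Math.pow(pi, i + j) * Math.pow(1 - pi, m + n - i - j);
                }
            }
        }
        return sum;
    }

    /** Returns {best pi, maximised sum} using v evenly spaced points inside (0, 1). */
    static double[] maximise(int a, int b, int m, int n, int v) {
        double lo = 1e-6;
        double hi = 1 - 1e-6;
        double bestPi = lo;
        double best = -1;
        for (int k = 0; k < v; k++) {
            double pi = lo + k * (hi - lo) / (v - 1);
            double s = extremeSum(a, b, m, n, pi);
            if (s > best) {
                best = s;
                bestPi = pi;
            }
        }
        return new double[] {bestPi, best};
    }

    public static void main(String[] args) {
        // Observed table a=3, b=0 with column sums m=3, n=3, searched with 101 points.
        double[] r = maximise(3, 0, 3, 3, 101);
        System.out.println("pi = " + r[0] + ", p-value = " + r[1]);
    }
}
```

    For the table in main only (3, 0) and (0, 3) are as extreme as the observation, so the summed probability is \( 2 \pi^3 (1-\pi)^3 \), which peaks at \( \pi = 0.5 \). A coarse grid can miss a narrow peak, which is why a finer grid or a local optimization from the best grid points may be needed.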

    References:

    1. Barnard, G.A. (1947). Significance tests for 2x2 tables. Biometrika, 34(1-2), 123–138.
    2. Boschloo, R.D. (1970). Raised conditional level of significance for the 2 × 2-table when testing the equality of two probabilities. Statistica Neerlandica, 24(1), 1–9.
    3. Suissa, S. and Shuster, J.J. (1985). Exact Unconditional Sample Sizes for the 2 × 2 Binomial Trial. Journal of the Royal Statistical Society. Series A (General), 148(4), 317–327.
    Since:
    1.1
    See Also:
    FisherExactTest, Boschloo's test (Wikipedia), Barnard's test (Wikipedia)
    • Method Detail

      • withInitialPoints

        public UnconditionedExactTest withInitialPoints(int v)
        Return an instance with the configured number of initial points.

        The search for the nuisance parameter will use \( v \) points in the open interval \( (0, 1) \). The interval is evaluated by including start and end points approximately equal to 0 and 1. Additional internal points are enumerated using increments of approximately \( \frac{1}{v-1} \). The minimum number of points is 2. Increasing the number of points increases the precision of the search at the cost of performance.

        To approximately double the number of points, so that all existing points are retained and an additional point is sampled half-way between each adjacent pair, use \( 2p - 1 \) points, where \( p \) is the existing number of points.
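
        The relationship between \( p \) and \( 2p - 1 \) points can be checked with a small sketch. The class InitialPointsSketch is hypothetical, and for simplicity it places the end points exactly at 0 and 1, whereas the search described above uses start and end points only approximately equal to 0 and 1.

```java
final class InitialPointsSketch {
    /** Returns v evenly spaced points on [0, 1] with spacing 1 / (v - 1). */
    static double[] points(int v) {
        if (v < 2) {
            throw new IllegalArgumentException("Require at least 2 points: " + v);
        }
        double[] x = new double[v];
        for (int k = 0; k < v; k++) {
            x[k] = (double) k / (v - 1);
        }
        return x;
    }

    public static void main(String[] args) {
        int p = 5;
        double[] coarse = points(p);
        double[] fine = points(2 * p - 1);
        // Every coarse point appears at an even index of the finer grid;
        // odd indices hold the new half-way points.
        for (int k = 0; k < coarse.length; k++) {
            System.out.println(coarse[k] + " == " + fine[2 * k]);
        }
    }
}
```

        Using \( 2p \) points instead would change the spacing to \( \frac{1}{2p-1} \), so the existing interior points would no longer be sampled.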

        Parameters:
        v - Number of initial points.
        Returns:
        an instance
        Throws:
        IllegalArgumentException - if the value is < 2.
      • withOptimize

        public UnconditionedExactTest withOptimize(boolean v)
        Return an instance with the configured optimization of initial search points.

        If enabled, the initial point(s) with the highest probability are used as the start for an optimization to find a local maximum.

        Parameters:
        v - Whether to optimize the best initial search points.
        Returns:
        an instance
        See Also:
        withInitialPoints(int)
      • statistic

        public double statistic(int[][] table)
        Compute the statistic for the unconditioned exact test. The statistic returned depends on the configured method.
        Parameters:
        table - 2-by-2 contingency table.
        Returns:
        test statistic
        Throws:
        IllegalArgumentException - if the table is not a 2-by-2 table; any table entry is negative; any column sum is zero; the table sum is zero or not an integer; or the number of possible tables exceeds the maximum array capacity.
        See Also:
        with(Method), test(int[][])