Class Statistic

java.lang.Object
org.bzdev.math.stats.Statistic
Direct Known Subclasses:
ChiSquareStat, FStat, KSStat, StudentsTStat, WelchsTStat

public abstract class Statistic extends Object
Class representing a statistic.

Subclasses implement specific statistics. This class declares common methods and defines methods for computing p values and critical values. The documentation for a subclass describes the statistic that subclass implements. Please see the package description for a summary of the definitions of these quantities and some related quantities.

The following sequence of operations is typical:

  1. Create an instance of a statistic, providing some parameters and optionally providing some or all of a data set or (for some classes) multiple data sets.
  2. With a few exceptions, one can then add data to the statistic. This is typically done by calling a method whose name is add or whose name starts with the string add.
  3. One will then call one or more of the methods getValue(), getPValue(Statistic.PValueMode), or getCriticalValue(Statistic.PValueMode,double).
  4. For statistics whose distributions can be created using a noncentrality parameter, one may also want to use critical values to check the probability of a type 2 error by calling the method getBeta(double,double,double) or getBeta(double,double,boolean), which will return the probability of a type 2 error given a noncentrality parameter. The methods getNCParameter(double) or getNCParameter(double...) can be used to obtain the appropriate noncentrality parameters for a subclass of Statistic (the arguments are defined by each subclass). Instead of calling getBeta(double,double,double) or getBeta(double,double,boolean), one may call getPower(double,double,double) or getPower(double,double,boolean) to get the statistical power (defined as 1-β). For each statistic, the meaning of a noncentrality parameter, if one exists, is dependent on that statistic's distribution.
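The first three steps above can be sketched without the library. The following is a minimal, self-contained stand-in, not code from org.bzdev: the class names WorkflowSketch and ZStat, the helper phi, and the simplified signatures (no PValueMode argument) are all assumptions made for illustration. It models a z statistic for a mean with a known standard deviation and uses a two-sided test.

```java
public class WorkflowSketch {
    /** Standard normal CDF using the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double phi(double x) {
        double z = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * z);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-z * z);
        return x >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Hypothetical stand-in for a Statistic subclass (NOT an org.bzdev class). */
    static class ZStat {
        private final double mu0;    // mean under the null hypothesis
        private final double sigma;  // known standard deviation
        private double sum = 0.0;
        private long n = 0;

        // Step 1: create the statistic, providing its parameters.
        ZStat(double mu0, double sigma) { this.mu0 = mu0; this.sigma = sigma; }

        // Step 2: add data incrementally.
        void add(double x) { sum += x; n++; }

        // Step 3a: the value of the statistic.
        double getValue() {
            if (n == 0) throw new IllegalStateException("no data entered");
            return (sum / n - mu0) * Math.sqrt(n) / sigma;
        }

        // Step 3b: two-sided p-value for the current value.
        double getPValue() {
            return 2.0 * (1.0 - phi(Math.abs(getValue())));
        }

        // Step 3c: two-sided critical value, found by inverting the
        // p-value function with bisection.
        double getCriticalValue(double alpha) {
            double lo = 0.0, hi = 10.0;
            for (int i = 0; i < 100; i++) {
                double mid = 0.5 * (lo + hi);
                if (2.0 * (1.0 - phi(mid)) > alpha) lo = mid; else hi = mid;
            }
            return 0.5 * (lo + hi);
        }
    }

    public static void main(String[] args) {
        ZStat stat = new ZStat(10.0, 2.0);
        for (double x : new double[] {11, 12, 9, 13, 10, 11, 12, 10}) stat.add(x);
        double z = stat.getValue();
        double p = stat.getPValue();
        double cv = stat.getCriticalValue(0.05);
        System.out.printf("z = %.4f, p = %.4f, cv = %.4f%n", z, p, cv);
        System.out.println(Math.abs(z) > cv ? "reject H0" : "do not reject H0");
    }
}
```

With a real subclass, the constructor arguments and the add methods differ from class to class, so the subclass documentation should be consulted for the actual signatures.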

As an example of the use of a noncentrality parameter, Student's t-distribution with ν degrees of freedom is the distribution of a random variable T defined as $T = Z\sqrt{\frac{\nu}{V}}$ where Z has a normal distribution with an expected value of 0 and a variance of 1, V has a χ² distribution with ν degrees of freedom, and Z and V are independent. The noncentral t-distribution is defined as the distribution of the random variable T defined by $T = (Z + \mu)\sqrt{\frac{\nu}{V}}$ using the same assumptions for Z and V, and with μ being a constant. This effectively just shifts Z by a constant but results in a different distribution than the distribution when μ is zero.
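The definition above can be checked numerically. The sketch below is a Monte Carlo illustration, not library code: the class name NoncentralTSketch and its methods are hypothetical. It draws Z from a standard normal distribution, builds V as a sum of ν squared standard normal values (which has a χ² distribution with ν degrees of freedom), and averages T = (Z + μ)√(ν/V) over many samples.

```java
import java.util.Random;

public class NoncentralTSketch {
    /** Draw one sample of T = (Z + mu) * sqrt(nu / V). */
    static double sample(Random rng, int nu, double mu) {
        double z = rng.nextGaussian();          // Z ~ N(0, 1)
        double v = 0.0;                         // V ~ chi-square with nu d.o.f.
        for (int i = 0; i < nu; i++) {
            double g = rng.nextGaussian();
            v += g * g;
        }
        return (z + mu) * Math.sqrt(nu / v);
    }

    /** Average of count samples, using a fixed seed for repeatability. */
    static double sampleMean(long seed, int nu, double mu, int count) {
        Random rng = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += sample(rng, nu, mu);
        }
        return sum / count;
    }

    public static void main(String[] args) {
        int nu = 10;
        int count = 200_000;
        System.out.printf("mean for mu = 0: %.3f%n", sampleMean(42L, nu, 0.0, count));
        System.out.printf("mean for mu = 2: %.3f%n", sampleMean(42L, nu, 2.0, count));
    }
}
```

For ν = 10 and μ = 2 the sample mean comes out near 2.17 rather than 2, because the factor √(ν/V) has an expected value greater than 1. This is one way to see that the noncentral distribution is not simply the central distribution translated by μ.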

In some cases, one may want to estimate the dataset size needed so that type 1 and type 2 errors are within specified limits. Many of the subclasses of Statistic have constructors that take a number of parameters including the dataset size. These constructors can be used to save the state of a statistic in cases where repeated runs are necessary, but these constructors can also be used for estimating the required dataset size. Essentially, one would try a data set size, varying it until an adequate size is reached. For each data set size tried, one would compute the desired critical values given the value of α and then compute the value of β given the worst-case deviation from the expected value (and the corresponding noncentrality parameter) that one would want to detect.
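The search described above can be sketched as a simple loop. The code below is a self-contained illustration under stated assumptions, not the library's implementation: it assumes a two-sided z test with a known standard deviation, so the noncentrality parameter is taken to be δ√n/σ, and the names SampleSizeSketch, phi, criticalValue, beta, and requiredSize are hypothetical. With a Statistic subclass, one would instead construct the statistic for each trial size and call getCriticalValue, getNCParameter, and getBeta.

```java
public class SampleSizeSketch {
    /** Standard normal CDF using the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double phi(double x) {
        double z = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * z);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-z * z);
        return x >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Two-sided critical value z with 2 * (1 - phi(z)) = alpha, by bisection. */
    static double criticalValue(double alpha) {
        double lo = 0.0, hi = 10.0;
        for (int i = 0; i < 100; i++) {
            double mid = 0.5 * (lo + hi);
            if (2.0 * (1.0 - phi(mid)) > alpha) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    /** Type 2 error probability: the statistic stays in [-cv, cv] under the alternative. */
    static double beta(double nc, double cv) {
        return phi(cv - nc) - phi(-cv - nc);
    }

    /** Smallest data set size for which beta <= maxBeta when detecting a shift delta. */
    static int requiredSize(double alpha, double maxBeta, double delta, double sigma) {
        // For a z statistic the critical value does not depend on n; for
        // other statistics it would be recomputed inside the loop.
        double cv = criticalValue(alpha);
        for (int n = 2; n <= 1_000_000; n++) {
            double nc = delta * Math.sqrt(n) / sigma;  // assumed noncentrality
            if (beta(nc, cv) <= maxBeta) {
                return n;
            }
        }
        throw new IllegalArgumentException("no size up to 1,000,000 suffices");
    }

    public static void main(String[] args) {
        int n = requiredSize(0.05, 0.20, 0.5, 1.0);
        System.out.println("required data set size: " + n);
    }
}
```

For α = 0.05, β ≤ 0.20, and a worst-case deviation of half a standard deviation, this search returns 32, which agrees with the usual closed-form estimate ((1.96 + 0.84) / 0.5)² ≈ 31.4 rounded up.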

  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static enum
    Statistic.PValueMode
    Mode for P-value computations.
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    protected
    Statistic()
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    double
    getBeta(double nonCentrality, double cv, boolean upperBound)
    Get the probability β of a type 2 error given one critical value.
    double
    getBeta(double nonCentrality, double cv1, double cv2)
    Get the probability β of a type 2 error given two critical values.
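    double
    getCriticalValue(Statistic.PValueMode mode, double alpha)
    Get the critical value.
    abstract ProbDistribution
    getDistribution()
    Get the probability distribution for this statistic.
    ProbDistribution
    getDistribution(double nonCentrality)
    Get the noncentral probability distribution associated with this statistic.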
    double
    getNCParameter(double arg)
    Get the noncentrality parameter given a subclass-specific argument.
    double
    getNCParameter(double... args)
    Get the noncentrality parameter given subclass-specific arguments.
    double
    getPower(double nonCentrality, double cv, boolean upperBound)
    Get the statistical power given one critical value.
    double
    getPower(double nonCentrality, double cv1, double cv2)
    Get the statistical power given two critical values.
    double
    getPValue(Statistic.PValueMode mode)
    Get the p-value for this statistic.
    abstract double
    getValue()
    Get the value of this statistic.
    double
    optimalValue()
    Get the value for a statistic that indicates no deviation from the null hypothesis.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Statistic

      protected Statistic()
      Constructor.
  • Method Details

    • optimalValue

      public double optimalValue()
      Get the value for a statistic that indicates no deviation from the null hypothesis. The default is 0.0. It is unusual for this method to be overridden as standard statistics use 0.0 for this purpose.
      Returns:
      the value
    • getValue

      public abstract double getValue() throws IllegalStateException
      Get the value of this statistic.
      Returns:
      the value of this statistic
      Throws:
      IllegalStateException - the value cannot be computed (for example, because data has not yet been entered)
    • getDistribution

      public abstract ProbDistribution getDistribution() throws IllegalStateException
      Get the probability distribution for this statistic. The distribution is the distribution for the statistic, not the distribution for the data the statistic describes.
      Returns:
      the probability distribution
      Throws:
      IllegalStateException - the value cannot be computed (for example, because data has not yet been entered)
    • getDistribution

      public ProbDistribution getDistribution(double nonCentrality) throws UnsupportedOperationException, IllegalArgumentException, IllegalStateException
      Get the noncentral probability distribution associated with this statistic. This distribution is used for statistical-power calculations or for calculating the probability of a type 2 error (frequently denoted by the Greek letter β), and is a function of a statistic-dependent parameter.

      When it can be computed, subclasses should override this method and provide subclass-specific documentation.

      Parameters:
      nonCentrality - the subclass-specific parameter that indicates how "non-central" the statistic is.
      Returns:
      an appropriate probability distribution.
      Throws:
      UnsupportedOperationException - the operation is not supported for this statistic
      IllegalArgumentException - the argument is not allowed for this statistic
      IllegalStateException - the state of this statistic does not allow this function to return a meaningful value (e.g., because not enough data has been provided)
    • getPValue

      public double getPValue(Statistic.PValueMode mode)
      Get the p-value for this statistic. If the argument is null, a default mode is chosen: TWO_SIDED if the probability distribution is symmetric about the value returned by optimalValue(), and ONE_SIDED otherwise.

      Before using p-values, please read the "ASA Statement on Statistical Significance and P-Values," as p-values are often misinterpreted.

      Parameters:
      mode - one of the PValueMode enumeration constants POSITIVE_SIDE, NEGATIVE_SIDE, TWO_SIDED, ONE_SIDED; null for a default based on the type of statistic
      Returns:
      the p-value
      Throws:
      IllegalArgumentException - the mode is not one that is accepted for this statistic
    • getCriticalValue

      public double getCriticalValue(Statistic.PValueMode mode, double alpha)
      Get the critical value. The critical value is obtained from the inverse of the function implemented by getPValue(PValueMode).

      The modes TWO_SIDED and ONE_SIDED are supported only when the probability distribution is symmetric about a statistic-dependent value (typically zero). For the ONE_SIDED case, the optimal value for the statistic (the value when the errors are zero) must be 0.0.

      Parameters:
      mode - one of the PValueMode enumeration constants POSITIVE_SIDE, NEGATIVE_SIDE, TWO_SIDED, ONE_SIDED; null for a default based on the type of statistic
      alpha - the probability that a value at least as extreme as the returned value occurred by chance
      Returns:
      the critical value
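The inverse relation between critical values and p-values can be illustrated with a distribution whose cumulative distribution function has a closed form. The sketch below uses Student's t-distribution with one degree of freedom, for which the CDF is 1/2 + arctan(x)/π. It is an illustration only: the class CriticalValueSketch, its Mode enum, and the interpretation of each mode (upper tail, lower tail, both tails) are assumptions, and the ONE_SIDED mode is deliberately left out.

```java
public class CriticalValueSketch {
    /** Hypothetical subset of modes; not the library's PValueMode enum. */
    enum Mode { POSITIVE_SIDE, NEGATIVE_SIDE, TWO_SIDED }

    /** CDF of Student's t-distribution with one degree of freedom. */
    static double cdf(double x) {
        return 0.5 + Math.atan(x) / Math.PI;
    }

    /** Inverse of cdf for 0 < p < 1. */
    static double inverseCdf(double p) {
        return Math.tan(Math.PI * (p - 0.5));
    }

    /** p-value for an observed statistic x under the assumed mode semantics. */
    static double pValue(Mode mode, double x) {
        switch (mode) {
            case POSITIVE_SIDE: return 1.0 - cdf(x);           // upper tail
            case NEGATIVE_SIDE: return cdf(x);                 // lower tail
            default:            return 2.0 * (1.0 - cdf(Math.abs(x)));
        }
    }

    /** Critical value: the x for which pValue(mode, x) equals alpha. */
    static double criticalValue(Mode mode, double alpha) {
        switch (mode) {
            case POSITIVE_SIDE: return inverseCdf(1.0 - alpha);
            case NEGATIVE_SIDE: return inverseCdf(alpha);
            default:            return inverseCdf(1.0 - alpha / 2.0);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            double cv = criticalValue(mode, 0.05);
            // Feeding the critical value back into pValue recovers alpha.
            System.out.printf("%s: cv = %.4f, p(cv) = %.4f%n",
                              mode, cv, pValue(mode, cv));
        }
    }
}
```

For α = 0.05 the two-sided critical value is about 12.706 and the upper-tail value about 6.314, matching standard t tables for one degree of freedom.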
    • getNCParameter

      public double getNCParameter(double arg) throws IllegalArgumentException, IllegalStateException, UnsupportedOperationException
      Get the noncentrality parameter given a subclass-specific argument. The default behavior is to throw an exception. Typically the argument will be the difference between some offset value and a current value.
      Parameters:
      arg - the argument
      Returns:
      the noncentrality parameter
      Throws:
      IllegalArgumentException
      IllegalStateException
      UnsupportedOperationException
    • getNCParameter

      public double getNCParameter(double... args) throws IllegalArgumentException, IllegalStateException, UnsupportedOperationException
      Get the noncentrality parameter given subclass-specific arguments. The default behavior when one argument is provided is to return the value obtained by calling the method getNCParameter(double), although subclasses may override this behavior. Typically the arguments will be differences between some offset values and current values.
      Parameters:
      args - the arguments
      Returns:
      the noncentrality parameter.
      Throws:
      IllegalArgumentException
      IllegalStateException
      UnsupportedOperationException
    • getPower

      public double getPower(double nonCentrality, double cv1, double cv2) throws UnsupportedOperationException, IllegalArgumentException, IllegalStateException
      Get the statistical power given two critical values. The statistical power is the probability that the null hypothesis is rejected when the alternative hypothesis is in fact true. The critical values bound the range of values of the statistic for which the null hypothesis is not rejected. The noncentrality parameter determines the alternative hypothesis.
      Parameters:
      nonCentrality - the noncentrality parameter
      cv1 - the first critical value
      cv2 - the second critical value
      Returns:
      the statistical power
      Throws:
      UnsupportedOperationException - the operation is not supported for this statistic
      IllegalArgumentException - the argument is not allowed for this statistic
      IllegalStateException - the state of this statistic does not allow this function to return a meaningful value (e.g., because not enough data has been provided)
    • getPower

      public double getPower(double nonCentrality, double cv, boolean upperBound) throws UnsupportedOperationException, IllegalArgumentException, IllegalStateException
      Get the statistical power given one critical value. The statistical power is the probability that the null hypothesis is rejected when the alternative hypothesis is in fact true. The critical value bounds the range of values of the statistic for which the null hypothesis is not rejected. The noncentrality parameter determines the alternative hypothesis.
      Parameters:
      nonCentrality - the noncentrality parameter
      cv - the critical value
      upperBound - true if the critical value is an upper bound on the range of values for which the null hypothesis is assumed to be true; false otherwise
      Returns:
      the statistical power
      Throws:
      UnsupportedOperationException - the operation is not supported for this statistic
      IllegalArgumentException - the argument is not allowed for this statistic
      IllegalStateException - the state of this statistic does not allow this function to return a meaningful value (e.g., because not enough data has been provided)
    • getBeta

      public double getBeta(double nonCentrality, double cv1, double cv2) throws UnsupportedOperationException, IllegalArgumentException, IllegalStateException
      Get the probability β of a type 2 error given two critical values. The probability β is the probability that the null hypothesis is not rejected when the alternative hypothesis is in fact true. The critical values bound the range of values of the statistic for which the null hypothesis is not rejected. The noncentrality parameter determines the alternative hypothesis.
      Parameters:
      nonCentrality - the noncentrality parameter
      cv1 - the first critical value
      cv2 - the second critical value
      Returns:
      the probability of a type 2 error
      Throws:
      UnsupportedOperationException - the operation is not supported for this statistic
      IllegalArgumentException - the argument is not allowed for this statistic
      IllegalStateException - the state of this statistic does not allow this function to return a meaningful value (e.g., because not enough data has been provided)
    • getBeta

      public double getBeta(double nonCentrality, double cv, boolean upperBound) throws UnsupportedOperationException, IllegalArgumentException, IllegalStateException
      Get the probability β of a type 2 error given one critical value. The probability β is the probability that the null hypothesis is not rejected when the alternative hypothesis is in fact true. The critical value bounds the range of values of the statistic for which the null hypothesis is not rejected. The noncentrality parameter determines the alternative hypothesis.
      Parameters:
      nonCentrality - the noncentrality parameter
      cv - the critical value
      upperBound - true if the critical value is an upper bound on the range of values for which the null hypothesis is assumed to be true; false otherwise
      Returns:
      the probability of a type 2 error
      Throws:
      UnsupportedOperationException - the operation is not supported for this statistic
      IllegalArgumentException - the argument is not allowed for this statistic
      IllegalStateException - the state of this statistic does not allow this function to return a meaningful value (e.g., because not enough data has been provided)
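The relationship between the getBeta and getPower overloads can be sketched as follows. This is a stand-alone illustration, not the library's implementation: the class BetaPowerSketch is hypothetical, and the noncentral distribution is assumed to be a unit-variance normal distribution centered on the noncentrality parameter, which is appropriate only for a z-type statistic. The method names and parameter lists mirror the ones documented above.

```java
public class BetaPowerSketch {
    /** Standard normal CDF using the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double phi(double x) {
        double z = Math.abs(x) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * z);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-z * z);
        return x >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Two critical values: the null hypothesis is not rejected on [cv1, cv2]. */
    static double getBeta(double nonCentrality, double cv1, double cv2) {
        return phi(cv2 - nonCentrality) - phi(cv1 - nonCentrality);
    }

    /** One critical value that bounds the non-rejection range from above or below. */
    static double getBeta(double nonCentrality, double cv, boolean upperBound) {
        double below = phi(cv - nonCentrality);  // P(statistic <= cv) under the alternative
        return upperBound ? below : 1.0 - below;
    }

    /** Power is defined as 1 - beta. */
    static double getPower(double nonCentrality, double cv1, double cv2) {
        return 1.0 - getBeta(nonCentrality, cv1, cv2);
    }

    static double getPower(double nonCentrality, double cv, boolean upperBound) {
        return 1.0 - getBeta(nonCentrality, cv, upperBound);
    }

    public static void main(String[] args) {
        double nc = 2.5;  // assumed noncentrality parameter
        System.out.printf("two-sided: beta = %.4f, power = %.4f%n",
                          getBeta(nc, -1.96, 1.96), getPower(nc, -1.96, 1.96));
        System.out.printf("one-sided: beta = %.4f, power = %.4f%n",
                          getBeta(nc, 1.645, true), getPower(nc, 1.645, true));
    }
}
```

In this sketch the one-sided test has the higher power for the same α because its entire rejection region lies on the side toward which the alternative shifts the statistic.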