Wednesday, August 26, 2009

Degree of freedom

Meaning of degree of freedom (df):

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.[1]

Mathematically, degrees of freedom is the dimension of the domain of a random vector, or essentially the number of 'free' components: how many components need to be known before the vector is fully determined.

The number of degrees of freedom in a problem, distribution, etc., is the number of parameters which may be independently varied.

The concept of degrees of freedom is central to the principle of estimating statistics of populations from samples of them. "Degrees of freedom" is commonly abbreviated to df. In short, think of df as a mathematical restriction that we need to put in place when we estimate one statistic from an estimate of another.

In statistics, the number of degrees of freedom (d.o.f.) is the number of independent pieces of data being used to make a calculation. It is usually denoted with the Greek letter nu, ν. The number of degrees of freedom is a measure of how certain we are that our sample is representative of the entire population - the more degrees of freedom, usually the more certain we can be that we have accurately sampled the entire population. For statistics in analytical chemistry, this is usually the number of observations or measurements N made in a certain experiment.

For a set of data points in a given situation (e.g. with mean or other parameter specified, or not), degrees of freedom is the minimal number of values which should be specified to determine all the data points.

In statistics, the term degrees of freedom (df) is a measure of the number of independent pieces of information on which the precision of a parameter estimate is based.

Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom (df). In general, the degrees of freedom of an estimate is equal to the number of independent scores that go into the estimate minus the number of parameters estimated as intermediate steps in the estimation of the parameter itself.
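As a concrete illustration of this rule (a minimal Python sketch with made-up numbers): the sample variance estimates the mean as an intermediate step from the same data, so its degrees of freedom are n − 1 rather than n.

```python
import statistics

# Illustration: the sample variance uses df = n - 1 because the sample
# mean is estimated as an intermediate step from the same data.
data = [4.0, 7.0, 6.0, 3.0, 5.0]   # hypothetical observations
n = len(data)

mean = sum(data) / n                      # one parameter estimated
ss = sum((x - mean) ** 2 for x in data)   # sum of squared deviations

df = n - 1                                # one df lost to estimating the mean
sample_variance = ss / df

# The stdlib statistics.variance uses the same n - 1 divisor.
assert abs(sample_variance - statistics.variance(data)) < 1e-12
print(df, sample_variance)                # 4 2.5
```

Dividing by df = n − 1 instead of n is what makes the sample variance an unbiased estimate of the population variance.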

The df can be viewed as the number of independent parameters available to fit a model to data. Generally, the more parameters you have, the more accurate your fit will be. However, for each estimate made in a calculation, you remove one degree of freedom. This is because each assumption or approximation you make puts one more restriction on how many parameters are used to generate the model. Put another way, each intermediate estimate leaves fewer independent pieces of information available to support the remaining ones.

Another way of thinking about the restriction principle behind degrees of freedom is to imagine contingencies. For example, imagine you have four numbers (a, b, c and d) that must add up to a total of m; you are free to choose the first three numbers at random, but the fourth must be chosen so that it makes the total equal to m - thus your degree of freedom is three. Essentially, degrees of freedom are a count of the number of pieces of independent information contained within a particular analysis.
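That contingency can be sketched in a few lines of Python (the total m and the ranges are hypothetical):

```python
import random

# Four numbers must add up to a fixed total m.  Three can be chosen
# freely; the fourth is then forced by the constraint.
m = 20.0
a = random.uniform(0, 10)   # free to vary
b = random.uniform(0, 10)   # free to vary
c = random.uniform(0, 10)   # free to vary
d = m - (a + b + c)         # determined by the constraint - not free

assert abs((a + b + c + d) - m) < 1e-9   # constraint always holds
print("degrees of freedom:", 3)           # 4 values, 1 constraint
```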

The maximum number of quantities or directions whose values are free to vary before the remaining quantities are determined, or an estimate of the number of independent categories in a particular statistical test or experiment. The degrees of freedom (df) for a sample are defined as df = n - 1, where n is the number of scores in the sample.

The degrees of freedom for an estimate equal the number of observations (values) minus the number of additional parameters estimated for that calculation. As we have to estimate more parameters, the degrees of freedom available decrease. Degrees of freedom can also be thought of as the number of observations (values) that remain free to vary given the additional parameters estimated. They can be viewed in two ways: in terms of sample size, and in terms of dimensions and parameters.
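For example, a simple linear regression estimates two parameters (intercept and slope), so the residual degrees of freedom are n − 2. A minimal sketch with made-up data:

```python
# Fitting y = b0 + b1*x estimates two parameters, so the residual
# degrees of freedom are n - 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical responses
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))     # least-squares slope
b0 = y_bar - b1 * x_bar                         # intercept

df = n - 2                                      # two parameters estimated
rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
residual_variance = rss / df                    # unbiased error estimate
print(df, residual_variance)
```

Dividing the residual sum of squares by n − 2 (rather than n) accounts for the two degrees of freedom spent on estimating the slope and intercept.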

Degrees of freedom are often used to characterize various distributions. See, for example, the chi-square distribution, the t-distribution, and the F-distribution.
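One such characterization can be checked by simulation: the sum of k squared independent standard normal variables follows a chi-square distribution with k degrees of freedom, whose mean is k. A rough Monte Carlo sketch (the seed, k, and sample size are arbitrary):

```python
import random

# A chi-square variable with k degrees of freedom is the sum of k
# squared independent standard normals; its mean equals k.
rng = random.Random(0)
k = 5                         # degrees of freedom
trials = 20000

samples = [sum(rng.gauss(0, 1) ** 2 for _ in range(k))
           for _ in range(trials)]
mean = sum(samples) / trials

assert abs(mean - k) < 0.2    # sample mean should be close to k
print(round(mean, 2))
```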

In the case above, the df was n - 1 because we assumed that the sample mean is a good estimate of the population mean, so we have one less df than the number of independent observations.

In many statistical calculations you will do, such as linear regression, outlier tests, and t-tests, you will need to know or calculate the number of degrees of freedom. The degrees of freedom for each test will be explained in the section where it is required.
