Tuesday, October 18, 2011

STIS National Statistics Seminar, October 3, 2011

Here is the file from the STIS Statistics Seminar, October 3, 2011; I hope it is useful:



Monday, October 19, 2009

Statistical Significance versus Statistical Power

This title is based on a subchapter in Multivariate Data Analysis (Joseph F. Hair, Jr.; William C. Black; Barry J. Babin; Rolph E. Anderson; Ronald L. Tatham; Pearson Education International, Singapore, 2006).

A census of the entire population makes statistical inference unnecessary, because any difference or relationship, however small, is “true” and does exist. Rarely, if ever, is a census conducted, however. Therefore, the researcher is forced to draw inferences from a sample.

Types of Statistical Error and Statistical Power

Power: the probability of correctly rejecting the null hypothesis when it is false, that is, of correctly finding a hypothesized relationship when it exists.

Determined as a function of:
1. The statistical significance level set by the researcher for type I error (alpha)
2. The sample size used in the analysis
3. The effect size being examined

Interpreting statistical inferences requires that the researcher specify the acceptable level of statistical error due to using a sample (known as sampling error). The most common approach is to specify the level of type I error, also known as alpha. The type I error is the probability of rejecting the null hypothesis when it is actually true, or in simple terms, the chance of the test showing statistical significance when it is actually not present – the case of a “false positive”. By specifying an alpha level, the researcher sets the allowable limits for error and indicates the probability of concluding that significance exists when it really does not.

When specifying the level of type I error, the researcher also determines an associated error, termed the type II error or beta. The type II error is the probability of failing to reject the null hypothesis when it is actually false. An even more interesting probability is 1 – beta, termed the power of the statistical inference test. Power is the probability of correctly rejecting the null hypothesis when it should be rejected. Thus power is the probability that statistical significance will be indicated if it is present. These error probabilities can be related in the hypothetical setting of testing for the difference in two means.
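These error rates can be illustrated with a small simulation. The sketch below (illustrative only, not from the book) draws repeated samples for a two-group comparison of means: when the null hypothesis is true, the rejection rate estimates alpha; when a real difference exists, it estimates power. The sample size, effect size, and critical value are assumed for the example.

```python
import random

random.seed(42)  # fixed seed so the simulation is reproducible

def two_sample_z(n, mu1, mu2, sigma=1.0):
    """Draw two samples of size n and return the z statistic for the mean difference."""
    x = [random.gauss(mu1, sigma) for _ in range(n)]
    y = [random.gauss(mu2, sigma) for _ in range(n)]
    diff = sum(x) / n - sum(y) / n
    se = sigma * (2.0 / n) ** 0.5  # standard error of the difference (sigma known)
    return diff / se

def rejection_rate(n, mu1, mu2, trials=5000, z_crit=1.96):
    """Fraction of trials whose |z| exceeds the two-tailed critical value."""
    hits = sum(abs(two_sample_z(n, mu1, mu2)) > z_crit for _ in range(trials))
    return hits / trials

# Equal means: H0 is true, so the rejection rate estimates alpha (about .05).
alpha_hat = rejection_rate(n=30, mu1=0.0, mu2=0.0)
# Means differ by half a standard deviation: the rejection rate estimates power.
power_hat = rejection_rate(n=30, mu1=0.0, mu2=0.5)
print(f"estimated type I error rate: {alpha_hat:.3f}")
print(f"estimated power:             {power_hat:.3f}")
```

With this modest sample size and effect, power lands well below the conventional target of .80, which previews the balance discussed next.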

Although specifying alpha establishes the level of acceptable statistical significance, it is the level of power that dictates the probability of success in finding the differences if they actually exist. Then why not set both alpha and beta at acceptable levels? Because the type I and type II errors are inversely related: as the type I error becomes more restrictive (moves closer to zero), the probability of a type II error increases. Reducing the type I error therefore reduces the power of the statistical test. Thus, the researcher must strike a balance between the level of alpha and the resulting power.

Impact on Statistical Power
But why can’t high levels of power always be achieved? Power is not solely a function of alpha. It is actually determined by three factors:

1. Effect size: The probability of achieving statistical significance is based not only on statistical considerations but also on the actual magnitude of the effect of interest (e.g., a difference of means between two groups or the correlation between variables) in the population, termed the effect size. As one would expect, a larger effect is more likely to be found than a smaller effect, and thus more likely to impact the power of the statistical test. To assess the power of any statistical test, the researcher must first understand the effect being examined. Effect sizes are defined in standardized terms for ease of comparison. Mean differences are stated in terms of standard deviations, so that an effect size of .5 indicates that the mean difference is one-half of a standard deviation. For correlations, the effect size is based on the actual correlation between the variables.

2. Alpha: As noted earlier, as alpha becomes more restrictive, power decreases. Therefore, as the researcher reduces the chance of incorrectly saying an effect is significant when it is not, the probability of correctly finding an effect also decreases. Conventional guidelines suggest alpha levels of .05 or .01. The researcher must consider the impact of this decision on the power before selecting the alpha, however.

3. Sample size: At any given alpha level, increased sample sizes always produce greater power for the statistical test. A potential problem then becomes too much power. By “too much” we mean that with increasing sample size, smaller and smaller effects will be found to be statistically significant, until at very large sample sizes almost any effect is significant. The researcher must always be aware that sample size can affect the statistical test by either making it insensitive (at small sample sizes) or overly sensitive (at very large sample sizes).
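As a concrete illustration of a standardized effect size, the sketch below computes a standardized mean difference (the raw mean difference divided by the pooled standard deviation, often called Cohen's d); the two groups of scores are made-up data for the example.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores for two groups (made-up data for illustration).
group_a = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9]
group_b = [4.6, 4.4, 4.9, 4.5, 4.3, 4.8, 4.7, 4.6]

def standardized_mean_difference(x, y):
    """Mean difference divided by the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)

# An effect size of .5 would mean the means differ by half a standard deviation.
print(f"effect size = {standardized_mean_difference(group_a, group_b):.2f}")
```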

The relationships among alpha, sample size, effect size, and power are quite complicated, and a number of sources of guidance are available. To achieve a desired power level, all three factors – alpha, sample size, and effect size – must be considered simultaneously.
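Assuming a simple one-sided, one-sample z test (a simplification chosen so power has a closed form), the interplay of the three factors can be sketched as follows; the critical values and the effect size of 0.3 are illustrative.

```python
from math import erf, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# One-tailed critical z values for two common alpha levels.
Z_CRIT = {0.05: 1.645, 0.01: 2.326}

def power_one_sided(effect_size, n, alpha):
    """Power of a one-sided one-sample z test for a standardized effect size."""
    return 1.0 - phi(Z_CRIT[alpha] - effect_size * sqrt(n))

# Larger samples raise power; a stricter alpha lowers it, for the same effect.
for alpha in (0.05, 0.01):
    for n in (20, 50, 200):
        p = power_one_sided(0.3, n, alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  effect=0.3  power={p:.2f}")
```

Running the loop shows all three trade-offs at once: power climbs toward 1 as n grows, and every row for alpha = .01 sits below its alpha = .05 counterpart.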

Hypothesis Testing

The objective of statistics is to make inferences about unknown population parameters based on information contained in sample data. These inferences are phrased in two ways: as estimates of the respective parameters or as tests of hypotheses about their values.

In many ways the formal procedure for hypothesis testing is similar to the scientific method. The scientist observes nature, formulates a theory, and then tests the theory against observations. The scientist poses a theory concerning one or more population parameters – that they equal specified values – then samples the population and compares the observations with the theory. If the observations disagree with the theory, the scientist rejects the hypothesis. If not, the scientist concludes either that the theory is true or that the sample did not detect the difference between the real and hypothesized values of the population parameters.

Hypothesis tests are conducted in all fields in which theory can be tested against observation. Hypotheses can be subjected to statistical verification by comparing them with observed sample data.

The objective of a statistical test is to test a hypothesis concerning the values of one or more population parameters, called the research hypothesis. For example, suppose that a political candidate, Jones, claims that he will gain more than 50% of the votes in a city election and thereby emerge as the winner. If we do not believe Jones’s claim, we might seek to support the research hypothesis that Jones is not favored by more than 50% of the electorate. Support for this research hypothesis, also called the alternative hypothesis, is obtained by showing (using sample data as evidence) that the converse of the alternative hypothesis, the null hypothesis, is false. Thus support for one theory is obtained by showing lack of support for its converse, in a sense a proof by contradiction. Since we seek support for the alternative hypothesis that Jones’s claim is false, our alternative hypothesis is that p, the probability of selecting a voter favoring Jones, is less than .5. If we can show that the data support the rejection of the null hypothesis, p = .5 (the minimum value needed for a plurality), in favor of the alternative hypothesis, p < .5, we have achieved our research objective. Although it is common to speak of testing a null hypothesis, keep in mind that the research objective is usually to show support for the alternative hypothesis.
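The Jones example can be worked numerically. Assuming a hypothetical poll of 36 voters in which 12 favor Jones (made-up numbers), a large-sample z test of H0: p = .5 against H1: p < .5 looks like this:

```python
from math import erf, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Hypothetical poll: 12 of 36 sampled voters favor Jones (made-up numbers).
n, favoring = 36, 12
p_hat = favoring / n

# Large-sample z statistic for H0: p = .5 versus H1: p < .5,
# with the standard error computed under the null value p = .5.
z = (p_hat - 0.5) / sqrt(0.5 * 0.5 / n)
p_value = phi(z)  # lower-tail probability, matching the one-sided alternative

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

With these assumed numbers the sample proportion is far enough below .5 to reject the null hypothesis and support the research hypothesis that Jones lacks majority support.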

The elements of a statistical test:
1. Null hypothesis, Ho
2. Alternative hypothesis, H1
3. Test statistics
4. Rejection region

The functioning parts of a statistical test are the test statistic and the associated rejection region. The test statistic is a function of the sample measurements upon which the statistical decision will be based. The rejection region specifies the values of the test statistic for which the null hypothesis is rejected. If for a particular sample the computed value of the test statistic falls in the rejection region, we reject the null hypothesis H0 and accept the alternative hypothesis H1. If the value of the test statistic does not fall in the rejection region, we accept H0.
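For a two-tailed z test at alpha = .05, for instance, the rejection region is the set of statistics larger than 1.96 in absolute value; a minimal sketch, with the two example statistics chosen arbitrarily:

```python
# Two-tailed z test at alpha = 0.05: the rejection region is |z| > 1.96.
Z_CRITICAL = 1.96

def in_rejection_region(z_stat, z_crit=Z_CRITICAL):
    """True when the computed test statistic falls in the rejection region."""
    return abs(z_stat) > z_crit

print(in_rejection_region(2.40))   # falls in the rejection region: reject H0
print(in_rejection_region(-0.85))  # does not: accept H0
```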

Decisions must often be made based on sample data. The statistical procedures that guide the decision-making process are known as tests of hypotheses. Sample observations of the characteristic under consideration are made and descriptive statistics are calculated. These sample statistics are then analyzed, and the question is answered based on the results of the analysis. Because the data used to answer the question are sample data, there is always a chance that the answer will be wrong. If the sample is not truly representative of the population from which it was taken, type I and type II errors can occur. Thus, when a test of hypothesis is performed, it is essential that the confidence level – the probability that the statement is correct – be stated.

1. Stating the Hypothesis
When tests of hypothesis are to be used to answer questions, the first step is to state what is to be proved.

The statement that is to be proved is known as the null hypothesis, or H0.

A second hypothesis, inconsistent with the null hypothesis, is called the alternative hypothesis, or H1.
This statement is what the data analysis will attempt to prove or disprove. If the analysis shows that the statement is true, fine. But if the analysis indicates that the statement is not true, a fallback position is needed.

It is strongly recommended that the null hypothesis always be stated as an equality. Although this is not necessary for statistical purposes, it does make later analysis much easier. The alternative hypothesis is then expressed either as a directional (less than or greater than) inequality or as a nondirectional inequality. The wording of the initial question determines the nature of the inequality used in the statement of the alternative hypothesis. A question involving “better than”, “faster than”, “stronger than”, or similar terminology would require a directional inequality. The phrase “same as” or “not any different than” would imply a nondirectional inequality. The statement of the alternative hypothesis must be consistent with the observed sample data.

When the alternative hypothesis is stated as a directional inequality, the procedure is called a one-tailed test of hypothesis.

A nondirectional inequality in the alternative hypothesis signifies a two-tailed test of hypothesis.
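The difference shows up in the critical values: a one-tailed test puts all of alpha in one tail, while a two-tailed test splits it evenly between both. A small sketch using the standard normal distribution:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution
alpha = 0.05

# One-tailed test: all of alpha sits in a single tail.
one_tailed = std_normal.inv_cdf(1 - alpha)
# Two-tailed test: alpha is split evenly between the two tails.
two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed critical z:  {one_tailed:.3f}")
print(f"two-tailed critical z: ±{two_tailed:.3f}")
```

The familiar table values 1.645 and 1.96 fall out directly, which is why a two-tailed test needs a larger statistic to reach significance at the same alpha.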

2. Specifying the Confidence Level
After both the null and the alternative hypotheses have been stated, the second step is to specify the confidence level. Usually the selection is arbitrary. However, there may be organizational guidelines that specify the confidence level. Common confidence levels are 90 percent, 95 percent, and 99 percent. A brief statement or an equation defining the confidence level in terms of alpha is usually sufficient; for example, the notation alpha = 0.05 might appear after the hypothesis. This would designate 95 percent confidence.

3. Collecting Sample Data
The third step in testing a hypothesis is the collection of sample data. After the null hypothesis has been identified – the equality of means, proportions, standard deviations, or whatever – the nature of the required data can be specified. The data must then be collected, and the appropriate sample descriptive statistics must be calculated.

4. Calculating Test Statistics
After the sample descriptive statistics have been calculated, the appropriate test statistic must be calculated. There are many test statistics that may be calculated; the specific test statistic used will depend on the nature of the null and alternative hypotheses.

5. Identifying Table Statistics or Using P-value
After the test statistic is calculated, the table statistic is determined. The nature of the alternative hypothesis, the sample size, and the specific statistic being tested will determine which of the standard distribution tables, such as the normal curve, Student’s t, or chi-square, should be used.

6. Decision Making
The following rules will govern all of the decisions, provided common sense is applied.
-. If the absolute value of the test statistic is less than or equal to the table statistic, or if the p-value is greater than alpha, then there is not sufficient evidence to reject the null hypothesis – the null hypothesis is accepted as being true.
-. If the absolute value of the test statistic is greater than the table statistic, or if the p-value is less than alpha, then there is sufficient evidence to reject the null hypothesis – this would imply that the alternative hypothesis must be true.
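Both forms of the rule can be checked side by side for a two-tailed z test; the test statistics below are arbitrary illustrative values, and the table value 1.96 corresponds to alpha = .05.

```python
from math import erf, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def decision(test_stat, table_stat=1.96, alpha=0.05):
    """Apply both forms of the rule for a two-tailed z test."""
    p_value = 2.0 * (1.0 - phi(abs(test_stat)))
    reject_by_table = abs(test_stat) > table_stat
    reject_by_pvalue = p_value < alpha
    # Away from the boundary the two rules always agree, since the table
    # statistic is just the value whose p-value equals alpha.
    assert reject_by_table == reject_by_pvalue
    return ("reject H0" if reject_by_table else "accept H0"), round(p_value, 4)

print(decision(2.40))   # |z| above the table value, p-value below alpha
print(decision(-0.85))  # |z| below the table value, p-value above alpha
```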

-. Mathematical Statistics with Applications, William Mendenhall, Richard L. Scheaffer, Dennis D. Wackerly
-. Fundamentals of Industrial Quality Control, 3rd edition, Lawrence S. Aft, St. Lucie Press, London, 1998