Analysis of Variance – ANOVA

The Analysis of Variance (ANOVA) procedure is one of the most powerful statistical techniques. ANOVA is a general technique that can be used to test the hypothesis that the means among two or more groups are equal, under the assumption that the sampled populations are normally distributed. Suppose we wish to study the effect of temperature on a passive component such as a resistor. We select three different temperatures and observe their effect on the resistors. This experiment can be conducted by measuring all the participating resistors and then placing a group of them in each of three different ovens. Each oven is heated to a selected temperature. After, say, 24 hours we measure the resistors again and analyze the responses, which are the differences between the measurements before and after exposure to the temperatures. The temperature is called a factor. The different temperature settings are called levels. In this example, there are three levels or settings of the factor Temperature.

A factor is an independent treatment variable whose settings (values) are controlled and varied by the experimenter. The intensity setting of a factor is the level. Levels may be quantitative numbers or, in many cases, simply “present” or “not present” (“0” or “1”). For example, the temperature settings in the resistor experiment may be 100°F, 200°F, and 300°F. We can simply call them Level 1, Level 2, and Level 3.

  • The 1-way ANOVA
    In the experiment above, there is only one factor, temperature, and the analysis of variance that we will be using to analyze the effect of temperature is called a one-way or one-factor ANOVA.
  • The 2-way or 3-way ANOVA
    We could have opted to also study the effect of positions in the oven. In this case, there would be two factors, temperature and oven position. Here we speak of a two-way or two-factor ANOVA. Furthermore, we may be interested in a third factor, the effect of time. Now we deal with a three-way or three-factor ANOVA. In each of these ANOVAs, we test a variety of hypotheses of equality of means (or average responses when the factors are varied).

ANOVA is defined as a technique where the total variation present in the data is partitioned into two or more components, each attributed to a specific source of variation. In the analysis, it is possible to determine the contribution of each of these sources of variation to the total variation. It is designed to test whether the means of more than two quantitative populations are equal. It consists of classifying and cross-classifying statistical results and helps in determining whether the given classifications are important in affecting the results.
The assumptions in the analysis of variance are:

  • Normality
  • Homogeneity
  • Independence of error

Whenever any of these assumptions are not met, the analysis of variance technique cannot be employed to yield valid inferences.

With the analysis of variance, the variations in response measurement are partitioned into components that reflect the effects of one or more independent variables. The variability of a set of measurements is proportional to the sum of squares of deviations used to calculate the variance:

Sum of squares of deviations = Σ(Xi – X̄)², with sample variance s² = Σ(Xi – X̄)² / (n – 1)

Analysis of variance partitions the sum of squares of deviations of individual measurements from the grand mean (called the total sum of squares) into parts: the sum of squares of treatment means plus a remainder which is termed the experimental or random error. When an experimental variable is highly related to the response, its part of the total sum of squares will be highly inflated. This condition is confirmed by comparing the variable sum of squares with the random error sum of squares using an F test.

Why Use ANOVA and Not Repeated t-tests?

  • The t-test, which is based on the standard error of the difference between two means, can only be used to test differences between two means
  • With more than two means, one could compare each mean with every other mean using t-tests
  • Conducting multiple t-tests can lead to severe inflation of the Type I error rate (false positives) and is NOT RECOMMENDED.
  • ANOVA is used to test for differences among several means without increasing the Type I error rate
  • The ANOVA uses data from all groups to estimate standard errors, which can increase the power of the analysis
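To see how quickly the Type I error rate can inflate, the sketch below assumes the pairwise t-tests are independent. This is an idealization, since tests that share a group are correlated, but under it the chance of at least one false positive across m tests at level α is 1 – (1 – α)^m. The function name is illustrative only.

```python
from math import comb

def familywise_error(k_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across all pairwise tests,
    assuming (as an idealization) that the tests are independent."""
    m = comb(k_groups, 2)            # number of pairwise comparisons
    return 1.0 - (1.0 - alpha) ** m

for k in (3, 5, 7):
    print(k, comb(k, 2), round(familywise_error(k), 3))
```

Under this assumption, three groups (3 comparisons) already give a rate of about 0.14, and seven groups (21 comparisons) give about 0.66.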

Why Look at Variance When Interested in Means?

  • Three groups tightly spread about their respective means, the variability within each group is relatively small
  • Easy to see that there is a difference between the means of the three groups
  • Three groups have the same means as in previous figure but the variability within each group is much larger
  • Not so easy to see that there is a difference between the means of the three groups
  • To distinguish between the groups, the variability between (or among) the groups must be greater than the variability of, or within, the groups
  • If the within-groups variability is large compared with the between-groups variability, any difference between the groups is difficult to detect
  • To determine whether or not the group means are significantly different, the variability between groups and the variability within groups are compared


Suppose there are k populations, each normally distributed with unknown parameters. A random sample is taken from each of these populations (X1, X2, X3, …, Xk), and the assumptions above are taken to hold. If μ1, μ2, μ3, …, μk are the k population means, the hypotheses are:
H0 : μ1 = μ2 = μ3 = … = μk (i.e. all means are equal)
HA : at least one μj differs from the others (i.e. not all means are equal)

The steps in carrying out the analysis are:

  1.  Calculate variance between the samples
    The variance between samples measures the difference between the sample mean of each group and the overall mean. It also measures the difference from one group to another. The sum of squares between the samples is denoted by SSB. For calculating variance between the samples, take the total of the square of the deviations of the means of various samples from the grand average and divide this total by the degree of freedom, k-1, where k = no. of samples.
  2. Calculate variance within samples
    The variance within samples measures the intra-sample (within-sample) differences due to chance only. It also measures the variability around the mean of each group. The sum of squares within the samples is denoted by SSW. For calculating variance within the samples, take the total sum of squares of the deviation of various items from the mean values of the respective samples and divide this total by the degree of freedom, n-k, where n = total number of all the observations and k = number of samples.
  3. Calculate the total variance
    The total variance measures the overall variation of the individual observations about the grand average. The total sum of squares of variation is denoted by SST. The total variation is calculated by taking the squared deviation of each item from the grand average and dividing this total by the degree of freedom, n-1, where n = total number of observations.
  4.  Calculate the F ratio
    It measures the ratio of the between-samples (between-column) variance to the within-samples (within-column) variance. If there is a real difference between the groups, the variance between groups will be significantly larger than the variance within the groups.
    F = (Variance between the Groups) / (Variance within the Groups)
    F = [SSB / (k – 1)] / [SSW / (n – k)]
  5. Decision Rule
    At a given level of significance, α = 0.05, and with k-1 (numerator) and n-k (denominator) degrees of freedom, the critical value of F is read from the F table. If the calculated value is greater than the tabulated value, reject the null hypothesis. That means the test is significant, or there is a significant difference between the sample means.
  6. Applicability of ANOVA
    Analysis of variance has wide applicability in the analysis of experiments. It is used for two different purposes:
    • It is used to estimate and test hypotheses about population means.
    • It is used to estimate and test hypotheses about population variances.
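The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package: the function name and the three small groups of readings are hypothetical and are not data from this article. Note that the between-samples sum of squares weights each group mean's squared deviation by the group size.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, MSB, MSW, F) for a list of samples."""
    n = sum(len(g) for g in groups)          # total number of observations
    k = len(groups)                          # number of samples
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-samples SS: group size times squared deviation of each group mean
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-samples SS: squared deviations of items from their own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return ssb, ssw, msb, msw, msb / msw

# Hypothetical data (not from the article): three groups of four readings
data = [[4, 5, 6, 5], [7, 8, 9, 8], [5, 6, 7, 6]]
ssb, ssw, msb, msw, f_ratio = one_way_anova(data)
print(f"SSB={ssb:.2f} SSW={ssw:.2f} MSB={msb:.2f} MSW={msw:.2f} F={f_ratio:.2f}")
```

The calculated F would then be compared with the tabulated F for k-1 and n-k degrees of freedom (here 2 and 9).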

An analysis of variance to detect a difference in three or more population means first requires obtaining some summary statistics for calculating the variance of a set of data, as shown below:


  1. ΣX² is called the crude sum of squares
  2. (ΣX)² / N is the CM (correction for the mean), or CF (correction factor)
  3. ΣX² – (ΣX)² / N is termed SS (total sum of squares, or corrected SS).

  4. In the one-way ANOVA, the total variation in the data has two parts: the variation among treatment means and the variation within treatments.
  5. The grand average is GM = ΣX/N.
  6. The total sum of squares (Total SS) is then:
    Total SS = Σ(Xi – GM)², where Xi is any individual measurement.
  7. Total SS = SST + SSE, where SST is the treatment sum of squares and SSE is the experimental error sum of squares. (In this list, SST denotes the treatment sum of squares, not the total.)
  8. SST is the sum of the squared deviations of each treatment average from the grand average (grand mean), with each treatment weighted by its number of observations.
  9. SSE is the sum of the squared deviations of each individual observation within a treatment from the treatment average.

For the ANOVA calculations:

  10. Total treatment CM: Σ(TCM) = Σ(Tj²/nj), where Tj is the total of treatment j and nj is the number of observations in treatment j
  11. SST = Σ(TCM) – CM
  12. SSE = Total SS – SST (always obtained by difference)
  13. Total DF = N – 1 (total degrees of freedom)
  14. TDF = K – 1 (treatment DF = number of treatments minus 1)
  15. EDF = (N – 1) – (K – 1) = N – K (error DF, always obtained by difference)
  16. MST = SST/TDF = SST/(K – 1) (mean square treatments)
  17. MSE = SSE/EDF = SSE/(N – K) (mean square error)

To test the null hypothesis:

  18. H0 : μ1 = μ2 = μ3 = … = μk    H1 : at least one mean is different
  19. F = MST/MSE. When F > Fα, reject H0.

Example: As a comparison of three means, consider a single-factor experiment. The following coded results were obtained from a single-factor randomized experiment in which the outputs of three machines were compared. Determine if there is a significant difference in the results (α = 0.05).
ΣX = 30    N = 15    Total DF = N – 1 = 15 – 1 = 14
GM = ΣX/N = 30/15 = 2.0
ΣX² = 222    CM = (ΣX)²/N = (30)²/15 = 60
Total SS = ΣX² – CM = 222 – 60 = 162
Σ(TCM) = 197.2
SST = Σ(TCM) – CM = 197.2 – 60 = 137.2
SSE = Total SS – SST = 162 – 137.2 = 24.8

The completed ANOVA table is:

Since the computed value of F (33.2) exceeds the critical value of F, the null hypothesis is rejected. Thus, there is evidence that a real difference exists among the machine means.
σe is the pooled standard deviation of within treatments variation. It can also be considered the process capability sigma of individual measurements. It is the variation within measurements which would still remain if the difference among treatment means were eliminated.
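The arithmetic of the machine example can be checked with a short script. The sketch below uses only the summary statistics quoted in the example (ΣX, ΣX², N, Σ(TCM), and K = 3 machines); the variable names are illustrative.

```python
# Summary statistics quoted in the machine example
sum_x, sum_x2, n, k = 30, 222, 15, 3
sum_tcm = 197.2            # sum of (treatment total)^2 / (treatment size)

cm = sum_x ** 2 / n        # correction for the mean
total_ss = sum_x2 - cm
sst = sum_tcm - cm         # treatment sum of squares
sse = total_ss - sst       # error sum of squares, obtained by difference

mst = sst / (k - 1)        # mean square treatments
mse = sse / (n - k)        # mean square error
f_ratio = mst / mse
sigma_e = mse ** 0.5       # pooled within-treatment standard deviation

print(f"MST={mst:.2f} MSE={mse:.3f} F={f_ratio:.1f} sigma_e={sigma_e:.2f}")
```

The F ratio agrees with the 33.2 quoted above, with 2 and 12 degrees of freedom.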

EXAMPLE: The bursting strengths of diaphragms were determined in an experiment. Use analysis of variance techniques to determine if there is a difference at a level of 0.05.

These data could similarly originate from measurements of

  • Parts manufactured by 7 different operators
  • Parts manufactured on 7 different machines
  • Time for purchase order requests from 7 different sites
  • Delivery time of 7 different suppliers

An analysis of variance tests the hypothesis for equality of treatment means, or it tests that the treatment effects are zero, which is expressed as

H0 : μ1 = μ2 = μ3………… = μk            H1 : At least one mean different

This analysis indicates that rejection of the null hypothesis is appropriate because the p-value is lower than 0.05. The probability values for the test of homogeneity of variances indicate that there is not enough information to reject the null hypothesis of equality of variances. No pattern or outlier data are apparent in either the “residuals versus order of the data” or “residuals versus fitted values” plots. The normal probability plot and histogram indicate that the residuals may not be normally distributed. Perhaps a transformation of the data could improve this fit; however, it is doubtful that any difference would be large enough to be of practical importance.


The analysis of variance indicated that there was a significant difference in the bursting strengths of seven different types of rubber diaphragms (k = 7). We will now determine which diaphragms differ from the grand mean. A data summary of the mean and variance for each rubber type, each having four observations (n = 4), is

The overall mean is

The pooled estimate for the standard deviation is

The number of degrees of freedom is (n – 1)k = (4 – 1)(7) = 21. For a significance level of 0.05 with 7 means and 21 degrees of freedom, it is determined by interpolation from the table below that h0.05 = 2.94. The upper and lower decision lines are then
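In general form, ANOM decision lines for k groups of equal size n are x̄ ± hα · s · √((k – 1)/(kn)), where x̄ is the overall mean and s is the pooled standard deviation (the square root of the average within-group variance). The sketch below applies that form with the h0.05 = 2.94, k = 7, and n = 4 quoted above. The means and variances in it are hypothetical placeholders, not the diaphragm data, and the function name is illustrative.

```python
from math import sqrt

def anom_decision_lines(means, variances, n, h):
    """Lower and upper ANOM decision lines for k groups of equal size n."""
    k = len(means)
    grand_mean = sum(means) / k
    # Pooled standard deviation: root of the average within-group variance
    s_pooled = sqrt(sum(variances) / k)
    half_width = h * s_pooled * sqrt((k - 1) / (k * n))
    return grand_mean - half_width, grand_mean + half_width

# Hypothetical summary values (placeholders, not the article's data table)
means = [10.0, 12.0, 11.0, 9.0, 13.0, 10.5, 11.5]
variances = [1.0, 1.5, 0.8, 1.2, 1.1, 0.9, 1.5]
lower, upper = anom_decision_lines(means, variances, n=4, h=2.94)

# Groups whose mean falls outside the decision lines differ from the grand mean
outside = [i + 1 for i, m in enumerate(means) if not lower <= m <= upper]
print(round(lower, 2), round(upper, 2), outside)
```

Any group mean plotted outside the two lines is judged to differ from the grand mean at the chosen significance level.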


It will be seen that the two-way analysis procedure is an extension of the patterns described in the one-way analysis. Recall that a one-way ANOVA has two components of variance: Treatments and experimental error (may be referred to as columns and error or rows and error). In the two-way ANOVA, there are three components of variance: Factor A treatments, Factor B treatments, and experimental error (may be referred to as columns, rows, and errors).

In a two-way analysis of variance, the treatments constitute different levels affected by more than one factor. For example, sales of car parts, in addition to being affected by the point of sale display, might also be affected by the price charged, the location of the store, and the number of competitive products. When two independent factors have an effect on the dependent factor, analysis of variance can be used to test for the effects of two factors simultaneously. Two sets of hypotheses are tested with the same data at the same time.
Suppose there are k populations, each normally distributed with unknown parameters. A random sample is taken from each of these populations (X1, X2, X3, …, Xk), and the assumptions are taken to hold. The null hypothesis is that all population means are equal, against the alternative that the members of at least one pair are not equal. The hypotheses follow:
H0 : μ1 = μ2 = μ3 = … = μk
HA : Not all means μj are equal.

Equivalently, if the population means are equal, each population effect is equal to zero. In terms of effects, the test hypotheses are

H0 : β1 = β2 = β3 = … = βk = 0
HA : Not all effects βj are zero.

  1. Calculate variance between the rows
    The variance between rows measures the difference between the sample mean of each row and the overall mean. It also measures the difference from one row to another. The sum of squares between the rows is denoted by SSR. For calculating variance between the rows, take the total of the square of the deviations of the means of various sample rows from the grand average and divide this total by the degree of freedom, r-1 , where r= no. of rows.
  2. Calculate variance between the columns
    The variance between columns measures the difference between the sample mean of each column and the overall mean. It also measures the difference from one column to another. The sum of squares between the columns is denoted by SSC. For calculating variance between the columns, take the total of the square of the deviations of the means of various sample columns from the grand average and divide this total by the degree of freedom, c-1, where c = no. of columns.
  3. Calculate the total variance
    The total variance measures the overall variation of the individual observations about the grand average. The total sum of squares of variation is denoted by SST. The total variation is calculated by taking the squared deviation of each item from the grand average and dividing this total by the degree of freedom, n-1, where n = total number of observations.
  4.  Calculate the variance due to error
    The variance due to error or Residual Variance in the experiment is by chance variation. It occurs when there is some error in taking observations, or making calculations, or sometimes due to lack of information about the data. The sum of squares due to error is denoted by SSE. It is calculated as:
    Error Sum of Squares = Total Sum of Squares – Sum of Squares between Columns – Sum of Squares Between Rows.
    The degree of freedom, in this case, will be (c-1)(r-1).
  5. Calculate the F Ratio
    It measures the ratio of the between-column variance, and of the between-row variance, to the variance due to error. Each variance (mean square) is its sum of squares divided by its degrees of freedom.
    F = Variance between the Columns / Variance due to Error
    F = [SSC / (c – 1)] / [SSE / ((c – 1)(r – 1))]
    F = Variance between the Rows / Variance due to Error
    F = [SSR / (r – 1)] / [SSE / ((c – 1)(r – 1))]
  6. Decision Rule
    At a given level of significance, α = 0.05, the critical value of F is read from the F table with c-1 and (c-1)(r-1) degrees of freedom for columns, and with r-1 and (c-1)(r-1) degrees of freedom for rows. If the calculated value is greater than the tabulated value, reject the null hypothesis. This means that the test is significant, or there is a significant difference between the sample means.

Example: Three different subjects were taught by two different instructors to three different students, with the following results. The responses are examination results as a percentage. The null hypothesis: instructor and subject means do not differ.

ΣX = 1190    N = 18    Total DF = N – 1 = 18 – 1 = 17
GM = ΣX/N = 1190/18 = 66.11
ΣX² = 81844    CM = (ΣX)²/N = (1190)²/18 = 78672.22
Total SS = ΣX² – CM = 81844 – 78672.22 = 3171.78
ColSq = column total squared and divided by the no. of observations in the column
RowSq = row total squared and divided by the no. of observations in the row

SSCol = ΣColSq – CM = 79544.67 – 78672.22 = 872.44
SSRow = ΣRowSq – CM = 80677.78 – 78672.22 = 2005.56
SSE = Total SS – SSCol – SSRow = 3171.78 – 872.44 – 2005.56 = 293.78
The next step is to construct the ANOVA table.

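ANOVA table (no interaction):
Source | SS | DF | MS | F | F(0.05)
Columns (subjects) | 872.44 | 2 | 436.22 | 20.79 | 3.74
Rows (instructors) | 2005.56 | 1 | 2005.56 | 95.59 | 4.60
Error | 293.78 | 14 | 20.98 | |
Total | 3171.78 | 17 | | |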

If no interaction: Col DF = Col – 1 = 3 – 1 = 2    Row DF = Row – 1 = 2 – 1 = 1
Error DF = Total DF – Col DF – Row DF = 17 – 2 – 1 = 14
Col F = MSCol/MSE = 436.22/20.98 = 20.79. This is larger than critical F = 3.74. Therefore, the null hypothesis of equal subject means is rejected.

Row F = MSRow/MSE = 2005.56/20.98 = 95.59. This is larger than critical F = 4.60. Therefore, the null hypothesis of equal instructor means is rejected.
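The no-interaction calculation can be reproduced from the quoted summary values alone. The following sketch takes ΣX, ΣX², N, ΣColSq, and ΣRowSq from the example; the variable names are illustrative, and small differences in the last digit arise because the text rounds MSE to 20.98 before dividing.

```python
# Summary values quoted in the instructor/subject example
sum_x, sum_x2, n = 1190, 81844, 18
n_cols, n_rows = 3, 2                          # subjects, instructors
sum_col_sq, sum_row_sq = 79544.67, 80677.78    # quoted ΣColSq and ΣRowSq

cm = sum_x ** 2 / n                            # correction for the mean
total_ss = sum_x2 - cm
ss_col = sum_col_sq - cm
ss_row = sum_row_sq - cm
ss_err = total_ss - ss_col - ss_row            # error SS by difference

df_col, df_row = n_cols - 1, n_rows - 1
df_err = (n - 1) - df_col - df_row
ms_col, ms_row, ms_err = ss_col / df_col, ss_row / df_row, ss_err / df_err
f_col, f_row = ms_col / ms_err, ms_row / ms_err

total_sigma = (total_ss / (n - 1)) ** 0.5
error_sigma = ms_err ** 0.5
print(f"grand mean = {sum_x / n:.2f}")
print(f"Col F = {f_col:.2f}, Row F = {f_row:.2f}")
print(f"total sigma = {total_sigma:.2f}, error sigma = {error_sigma:.2f}")
```

Both F ratios are far above their critical values (3.74 and 4.60), so the conclusion does not depend on the rounding.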

The difference between total sigma (13.66) and error sigma (4.58) is due to the significant difference in instructor means and subject means. If the instructor and subject differences were only due to chance causes, the sigma variation in the data would be equal to σe, the square root of the mean square error.

It should be noted, in the example above, the data was listed in six cells. That is six experimental combinations. There were also 3 replications (students) in each cell (k = 3). When k is greater than 1 in a 2 factor ANOVA, there is the opportunity to analyze for a possible interaction between the two factors.

Interaction effect:

A similar analysis pattern is noted here. The data in each cell is summed, and that total is divided by the number of observations in that cell.

CellSq = (Sumcell)²/k    InterSq = Σ(CellSq)
SSInter = InterSq – CM – SSCol – SSRow

For the sum of squares interaction (SSInter), it is not enough to just subtract the correction for the mean (CM), as was done to determine the main effects SSCol and SSRow. This is because the data in replicated cells are affected by the treatment levels of the two factors of which each cell is a part, as well as by a possible interaction effect. To net out the interaction effect, it is necessary to also subtract the column and row sums of squares previously calculated. The cell-by-cell calculations are shown below.

Σ(CellSq) = (264)²/3 + (202)²/3 + (224)²/3 + (186)²/3 + (146)²/3 + (168)²/3
Σ(CellSq) = 23232 + 13601.33 + 16725.33 + 11532 + 7105.33 + 9408
Σ(CellSq) = 81604

SSInter = 81604 – 78672.22 – 872.44 – 2005.56 = 53.78
SSError = TotSS – SSCol – SSRow – SSInter
SSError = 3171.78 – 872.44 – 2005.56 – 53.78 = 240
The null hypothesis for the interaction effect is that there is no interaction. See the revised ANOVA table below:
With interaction: Replications per cell = k = 3
Col DF = Col – 1 = 3 – 1 = 2    Row DF = Row – 1 = 2 – 1 = 1
Inter DF = (Col – 1)(Row – 1) = (3 – 1)(2 – 1) = 2
Error DF = Total DF – Col DF – Row DF – Inter DF = 17 – 2 – 1 – 2 = 12
The interaction calculated F (1.34) is less than critical F (3.89). The null hypothesis of no interaction is not rejected. There is an advantage in analyzing for possible interaction if the opportunity exists. The more effects which are significant, the greater the amount of total variation which is explained and the smaller the MS error (unexplained variation). As the MS error is the divisor in the F ratio, a smaller MS error increases the sensitivity of testing effects.
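The interaction calculation can be sketched directly from the six cell totals. One assumption is labeled in the code: the arrangement of the totals into 2 rows (instructors) by 3 columns (subjects) is inferred from the quoted ΣRowSq and ΣColSq, since the original data table is not shown here. Variable names are illustrative.

```python
# Cell totals (3 replications each); rows = instructors, columns = subjects.
# ASSUMPTION: this row/column arrangement is inferred from the quoted sums.
cells = [[264, 202, 224],
         [186, 146, 168]]
k = 3                      # replications per cell
sum_x2 = 81844             # quoted sum of squared observations

n_rows, n_cols = len(cells), len(cells[0])
n = k * n_rows * n_cols
grand_total = sum(map(sum, cells))
cm = grand_total ** 2 / n  # correction for the mean

row_totals = [sum(r) for r in cells]
col_totals = [sum(c) for c in zip(*cells)]
ss_row = sum(t ** 2 for t in row_totals) / (k * n_cols) - cm
ss_col = sum(t ** 2 for t in col_totals) / (k * n_rows) - cm
# Interaction SS: cell SS net of the correction and both main effects
ss_inter = sum(t ** 2 for r in cells for t in r) / k - cm - ss_col - ss_row
ss_err = (sum_x2 - cm) - ss_col - ss_row - ss_inter

df_inter = (n_cols - 1) * (n_rows - 1)
df_err = (n - 1) - (n_cols - 1) - (n_rows - 1) - df_inter
ms_err = ss_err / df_err
f_inter = (ss_inter / df_inter) / ms_err
print(f"SSInter={ss_inter:.2f} SSError={ss_err:.2f} MSE={ms_err:.2f} F_inter={f_inter:.2f}")
```

With the error mean square reduced to 20, the main-effect F ratios would be recomputed against this smaller divisor.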

Components of Variance

The analysis of variance can be extended with a determination of the COV (components of variance). The COV table uses the MS (mean square), F, and F(alpha) columns from the previous ANOVA table and adds columns for EMS (expected mean square), variance, adjusted variance, and percent contribution to design data variation. The model for the ANOVA is:

Xijk = μ + Mi + Ij + (MI)ij + εk(ij)

The model states that any measurement (X) represents the combined effect of the population mean (μ), the different subjects (M), the different instructors (I), the subject/instructor interaction (MI), and the experimental error (ε). Here i represents subjects at 3 levels, j represents instructors at 2 levels, and k represents 3 replications per cell.

The variance coefficients are equal to the number of values used in calculating the respective MS: Subject coef = k x Row = 3 x 2 = 6, Instructor coef = k x Col = 3 x 3 = 9, Interaction coef = k = 3. The general variance equation is given by:
Effect Variance = (MS Effect – MS Error)/(Variance Coefficient)
M Var = (436.22 – 20)/6 = 69.37
I Var = (2005.56 – 20)/9 = 220.62
MI Var = (26.89 – 20)/3 = 2.30
Error Var = 20

Subject differences are significant and account for 22.21% of the variation in the data. Instructor differences are significant and account for 70.65% of the variation in the data. The subject/instructor interaction is not significant and shows as a negligible contribution. Experimental error accounts for only 6.40% of the total variation. The reason for the adjusted variance column is that the variance calculation is negative when the mean square effect is less than the mean square error. Negative variance estimates are considered to have a value of 0. Knowing the percent contribution aids in establishing priorities when taking improvement actions.
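The percent contributions follow directly from the variance equation. A minimal sketch, using the mean squares and coefficients quoted above (the dictionary names are illustrative):

```python
# Mean squares and variance coefficients quoted in the text
ms = {"Subject": 436.22, "Instructor": 2005.56, "Interaction": 26.89}
coef = {"Subject": 6, "Instructor": 9, "Interaction": 3}
ms_error = 20.0

# Effect variance = (MS effect - MS error) / variance coefficient
variance = {name: (ms[name] - ms_error) / coef[name] for name in ms}
variance["Error"] = ms_error

# Adjusted variance: negative estimates are treated as zero
adjusted = {name: max(v, 0.0) for name, v in variance.items()}
total = sum(adjusted.values())
pct = {name: 100.0 * v / total for name, v in adjusted.items()}

for name in pct:
    print(f"{name:12s} var={adjusted[name]:7.2f}  contribution={pct[name]:5.2f}%")
```

The interaction term, which the text describes as negligible, comes out below 1% of the total.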

EXAMPLE: A battery is to be used within a device that is subjected to extreme temperature variations. At some point in time during development, an engineer can only select one of three plate material types. After product shipment the engineer has no control over temperature; however, he/she believes that temperature could degrade the effective life of the battery. The engineer would like to determine if one of the material types is robust to temperature variations. The table below describes the observed effective life (hours) of this battery at controlled temperatures within a laboratory

Using an α = 0.05 criterion, we conclude that there is a significant interaction between material types and temperature because its probability value is less than 0.05 [and F0 > (F0.05,4,27 = 2.73)]. We also conclude that the main effects of material type and temperature are significant because each of their probabilities is less than 0.05 [and F0 > (F0.05,2,27 = 3.35)].

A plot of the average response at each factor level is shown in Figure above, which aids the interpretation of experimental results. The significance of the interaction term in our model is shown as the lack of parallelism of these lines. From this plot, we note a degradation in life with an increase in temperature regardless of material type. If it is desirable for this battery to experience less loss of life at elevated temperature, type 3 material seems to be the best choice of the three materials. Whenever there is a difference in the rows’ or columns’ means, it can be beneficial to make additional comparisons. This analysis shows these differences; however, the significance of the interaction can obscure comparison tests. One approach to address this situation is to apply the test at only one level of a factor at a time.
Using this strategy, let us examine the data for significant differences at 70°F (i.e., level 2 of temperature). We can use ANOM techniques to gain insights into factor levels relative to the grand mean. The ANOM output shown in the Figure below indicates that material types 1 and 3 are different from the grand mean.

Tukey’s multiple comparison test shown below indicates that, for a temperature level of 70°F, the mean battery lives for material types 2 and 3 cannot be shown to differ. In addition, the mean battery life for material type 1 is significantly lower than that of both material types 2 and 3.

The coefficient of determination (R2) can help describe the amount of variability in battery life explained by battery material, temperature, and the interaction of the material with temperature. From the analysis of variance output, we note

SSmodel = SSmaterial + SStemperature + SSinteraction
= 10,683 + 39,118 + 9,613
= 59,414

which results in R² = SSmodel / SStotal ≈ 0.77

From this, we conclude that about 77% of the variability is described by our model factors. The adequacy of the underlying model should be checked before the adoption of conclusions. The figure below gives a normal plot of the residuals and a plot of residuals versus the fitted values for the analysis of variance. The normal probability plot of the residuals does not reveal anything of particular concern. The plot of residuals versus fitted values seems to indicate a mild tendency for the variance of the residuals to increase as battery life increases. The residual plots of battery type and temperature seem to indicate that material type 1 and low temperature might have more variability. However, these problems, in general, do not appear to be large enough to have a dramatic impact on the analysis and conclusions.

ANOVA Table for an A x B Factorial Experiment

In a factorial experiment involving factor A at a levels and factor B at b levels, the total sum of squares can be partitioned into:
Total SS = SS(A) + SS(B) + SS(AB) + SSE

ANOVA Table for a Randomized Block Design

The randomized block design implies the presence of two independent variables: blocks and treatments. The total sum of squares of the response measurements can be partitioned into three parts: the sums of squares for the blocks, the treatments, and the error. The analysis of a randomized block design is less complex than that of an A x B factorial experiment.

Goodness-of-Fit Tests

GOF (goodness-of-fit) tests are part of a class of procedures that are structured in cells. In each cell, there is an observed frequency (Fo). From the nature of the problem, one either knows the expected or theoretical frequency (Fe) or can calculate it. Chi-square (χ²) is then summed across all cells according to the formula:

χ² = Σ (Fo – Fe)² / Fe

The calculated chi-square is then compared to the chi-square critical value for the appropriate degrees of freedom, which depend on the distribution being tested.

Uniform Distribution (GOF):

Example: Is a game die balanced? The null hypothesis, H0, states the die is honest and balanced. When a die is rolled, the expectation is that each side should come up an equal number of times. It is obvious there will be random departures from this theoretical expectation if the die is honest. A die was tossed 48 times with the following results:

The calculated chi-square is 8.75. The critical chi-square is χ²(0.05, 5) = 11.07. The calculated chi-square does not exceed the critical chi-square. Therefore, the hypothesis of an honest die cannot be rejected. The random departures from theoretical expectation could well be explained by chance causes.
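The uniform goodness-of-fit calculation can be sketched as follows. The observed counts are hypothetical (they are not the 48 tosses tabulated in this example, so the statistic differs from 8.75); the critical value is the one quoted above, and the function name is illustrative.

```python
def chi_square_gof(observed, expected):
    """Pearson chi-square statistic summed over all cells."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Hypothetical counts for faces 1..6 over 48 tosses (not the article's data)
observed = [5, 9, 12, 6, 10, 6]
expected = [sum(observed) / 6] * 6      # a balanced die: 8 per face
chi2 = chi_square_gof(observed, expected)

critical = 11.07                        # chi-square(0.05, 5), quoted in the text
df = len(observed) - 1                  # one DF lost to the fixed total
print(chi2, df, "reject H0" if chi2 > critical else "fail to reject H0")
```

Only one degree of freedom is lost here because no distribution parameter is estimated from the data.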

Normal Distribution (GOF):

Example: The following data (105 observations) is taken from an X̄ – R chart. There is sufficient data for ten cells. The alternative would be six cells, which are too few. Twelve integer cells fit the range of the data. The null hypothesis: the data was obtained from a normal distribution.

X̄ = 15.4, sigma = 1.54, number of effective cells = 6, DF = 3 and χ²(0.05, 3) = 7.81

One degree of freedom is lost because X̄ estimates μ. The second degree of freedom is lost because SD estimates sigma. The third degree of freedom is lost because sample N represents the population.

  • Col A: The cell boundaries are one half unit from the cell midpoint.
  • Col B: The cell middle values are integers.
  • Col C: The observed frequencies in each cell are Fo.
  • Col D: Distances from X̄ are measured from cell boundaries.
  • Col E: Distances from X̄ are divided by SD to transform distances into Z units.
  • Col F: Z units are converted into cumulative normal distribution probabilities.
  • Col G: The theoretical probability in each cell is obtained by taking the difference between cumulative probabilities in Column F. The top cell theoretical  probability boundary is 1.0000.
  • Col H: The theoretical frequency in each cell is the product of N and Column G.
  • Col I: Each cell is required to have a theoretical frequency equal to or greater than four. Therefore, the top four cells must be added to the cell whose midpoint is 18. The bottom three cells must be added to the cell whose midpoint is 13. Thus, there are six effective cells, all of which have a theoretical frequency equal to or greater than four.
  • Col J: The observed frequency cells must be pooled to match the theoretical frequency cells. It does not matter if the observed frequencies are less than four.
  • Col K: The contributions to chi-square are obtained by squaring the difference between Column I and Column J and dividing by Column I.

Conclusion: Since the calculated chi-square, 6.057, is less than the critical chi-square, 7.81, we fail to reject the null hypothesis of normality. The data are therefore consistent with a normal distribution.
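Columns D through I can be sketched with the standard library alone. The script below uses the X̄, sigma, and N quoted in the example and the six pooled cells with midpoints 13 to 18. It treats both end cells as open-ended tails, which is an assumption about how the lost table handled the lower end; the observed frequencies are not reproduced, so only the theoretical frequencies are computed.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Cumulative normal probability via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

mean, sd, n = 15.4, 1.54, 105            # values quoted in the example
# Upper boundaries of the pooled cells with midpoints 13..17;
# the cell at 18 is open-ended (its cumulative probability is 1.0)
boundaries = [13.5, 14.5, 15.5, 16.5, 17.5]

cum = [normal_cdf(b, mean, sd) for b in boundaries] + [1.0]
# Cell probability = difference between successive cumulative probabilities
probs = [cum[0]] + [hi - lo for lo, hi in zip(cum, cum[1:])]
expected = [n * p for p in probs]

for midpoint, fe in zip(range(13, 19), expected):
    print(midpoint, round(fe, 2))
```

Each of the six pooled cells ends up with a theoretical frequency above the minimum of four, as Col I requires.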

Poisson Distribution (GOF)

Example: The bead drum is an attribute variable, random sample generating device, which was used to obtain the following data. In this exercise red beads represent defects. Seventy-five constant size samples were obtained. The goodness-of-fit test is analyzed based on sample statistics. The null hypothesis is that the bead drum samples represent a Poisson distribution.
N = 75
Sample Avg = 269/75 = 3.59
DF = 7 – 2 = 5
χ²(0.05, 5) = 11.07
One degree of freedom is lost because X̄ (sample average = 3.59) estimates μ. The second degree of freedom is lost because N (number of samples) estimates the population.

  • Col A: Values of c which matched the actual distribution of sample defects found.
  • Col B: The probability that c defects would occur given the average value of the samples.
  • Col C: The theoretical number of defects that would occur (N x Col B).
  • Col D: The observed frequency of each number of defects.
  • Col E: The required minimum frequency of four for each effective cell resulted in pooling at both tails of the theoretical Poisson distribution.
  • Col F: The observed frequency distribution of defects must also be pooled to match the effective theoretical distribution.
  • Col G: The contributions to chi square are obtained from squaring the difference between Fe and Fo and dividing the result by Fe.
  • Col H: Total defects found result from the product of number of defects and observed frequency.

Conclusion: Since the calculated chi-square of 4.47 is less than the critical chi-square value of 11.07 at the 95% confidence level, we fail to reject the null hypothesis that the bead drum samples represent a Poisson distribution.
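Columns B, C, and E can be sketched as follows, using the sample average 269/75 and N = 75 from the example. The choice to tabulate c = 0 to 9 plus an open-ended upper cell is an assumption (the original Col A is not shown), as are the function names; the observed frequencies are not reproduced, so the chi-square itself is not recomputed.

```python
from math import exp, factorial

def poisson_pmf(c, lam):
    """Probability of exactly c defects for a Poisson mean lam."""
    return exp(-lam) * lam ** c / factorial(c)

def pooled_expected(lam, n, c_max=10, minimum=4.0):
    """Expected frequencies for c = 0..c_max-1 plus an open 'c_max or more'
    cell, pooled from both tails until each end cell reaches the minimum."""
    probs = [poisson_pmf(c, lam) for c in range(c_max)]
    probs.append(1.0 - sum(probs))                    # open-ended upper cell
    cells = [([c], n * p) for c, p in enumerate(probs)]
    while len(cells) > 1 and cells[0][1] < minimum:   # pool the lower tail
        (a, fa), (b, fb) = cells[0], cells[1]
        cells[:2] = [(a + b, fa + fb)]
    while len(cells) > 1 and cells[-1][1] < minimum:  # pool the upper tail
        (a, fa), (b, fb) = cells[-2], cells[-1]
        cells[-2:] = [(a + b, fa + fb)]
    return cells

cells = pooled_expected(lam=269 / 75, n=75)
for values, fe in cells:
    print(values, round(fe, 2))
print("DF =", len(cells) - 2)
```

Pooling at both tails leaves seven effective cells, consistent with DF = 7 – 2 = 5 in the example.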

Binomial Distribution (GOF)

Example: The null hypothesis states that the following industrial sample data comes from a binomial population of defectives (N = 80). In this case, we will estimate the probability of a defective from the sample data, p = 0.025625.
One degree of freedom is lost because the total sample frequency represents the population. The second degree of freedom is lost because the sample estimate is used to estimate μ:

  • Col A: The range of defectives matching the observed sample data.
  • Col B: The probability of observed cell defective count given sample size N and d.
  • Col C: The expected theoretical frequency (cell probability)(N).
  • Col D: The observed cell frequency count from the 80 samples.
  • Col E: Theoretical frequency with cells pooled to meet n = 4 minimum.
  • Col F: Observed cell frequency pooled to match theoretical frequency pooled cells.
  • Col G: Contributions to chi square, (Fe – Fo)²/Fe.
  • Col H: The count of defectives by cell (d)(Fo).

Conclusion: The calculated chi square = 13.30. The critical chi square = 9.49. Since the calculated value is greater than the critical value, the null hypothesis that the sample data represents the binomial distribution is rejected at the 95% confidence level.
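
As a rough sketch of how the theoretical frequencies in Col B and Col C might be computed, the following Python snippet evaluates binomial cell probabilities and scales them by the number of samples. The number of items per sample (n = 20) is purely an assumption for illustration, since the text does not state it, and the function names are hypothetical.

```python
import math

def binomial_pmf(d, n, p):
    """Probability of exactly d defectives in a sample of n items."""
    return math.comb(n, d) * p ** d * (1 - p) ** (n - d)

def binomial_expected(n_samples, n, p, max_d):
    """Expected number of samples containing d = 0..max_d defectives."""
    return [n_samples * binomial_pmf(d, n, p) for d in range(max_d + 1)]

# 80 samples and p = 0.025625 come from the example; n = 20 is an assumption.
expected = binomial_expected(n_samples=80, n=20, p=0.025625, max_d=5)
for d, fe in enumerate(expected):
    print(f"d={d}  expected frequency={fe:.2f}")
```

Cells whose expected frequency falls below the minimum of four would then be pooled before the chi-square contributions are calculated.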

Contingency Tables

A two-way classification table (rows and columns) containing original frequencies can be analyzed to determine whether the two variables (classifications) are independent or significantly associated. R. A. Fisher determined that when the marginal totals (of rows and columns) are analyzed in a certain way, the chi-square procedure tests whether there is a dependency between the two classifications. In addition, a contingency coefficient (correlation) can be calculated. If the chi-square test shows a significant dependency, the contingency coefficient shows the strength of the correlation.

Results obtained in samples do not always agree exactly with the theoretically expected results according to the rules of probability. A measure of the difference between observed and expected frequencies is supplied by the chi-square statistic, χ², where:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

with O the observed frequency and E the expected (theoretical) frequency of each cell.

If χ² = 0, the observed and theoretical frequencies agree exactly. If χ² > 0, they do not agree exactly. The larger the value of χ², the greater the discrepancy between observed and theoretical frequencies. The chi-square distribution is an appropriate reference distribution for critical values when the expected frequencies are at least equal to 5.
Example: The calculation for the E (expected or theoretical) frequency will be demonstrated in the following example. Five hospitals tried a new drug to alleviate the symptoms of emphysema. The results were classified at three levels: no change, slight improvement, marked improvement. The percentage matrix is shown in the table below. While the results expressed as percentages do suggest differences among hospitals, ratios presented as percentages can be misleading.
A proper analysis requires that original data be considered as frequency counts. The table below lists the original data on which the percentages are based. The calculation of expected, or theoretical, frequencies is based on the marginal totals. The marginal totals for the frequency data are the column totals, the row totals, and the grand total. The null hypothesis is that all hospitals have the same proportions over the three levels of classification. Calculating the expected frequencies for each of the 15 cells under the null hypothesis requires the manipulation of the marginal totals, as illustrated by the following calculation for one cell. Consider the count of 15 for the Hospital A / no change cell. The expected value, E, is the product of that cell's row total and column total, divided by the grand total:

$$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$

The same procedure repeated for the other 14 cells yields

Each of these 15 cells makes a contribution to chi-square (χ²). For the same selected (illustrative) cell, the contribution is (O – E)²/E. The chi-square statistic is the sum of these contributions over all cells.

Assume alpha to be 0.01. The degrees of freedom for a contingency table are: d.f. = (rows – 1) x (columns – 1).

For this example: d.f. = (5 – 1) x (3 – 1) = 8

The critical chi square: $\chi^2_{0.01,8} = 20.09$
The calculated chi-square is larger than the critical chi-square. Therefore, one rejects the null hypothesis of hospital equality of results. The alternative hypothesis is that hospitals differ.
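
The expected-frequency and chi-square calculations for a contingency table can be sketched as follows. The 5 x 3 table of counts is hypothetical and is not the hospital data from the example; the function names are likewise invented for this illustration.

```python
def expected_frequencies(table):
    """Expected cell counts: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a table."""
    expected = expected_frequencies(table)
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, dof

# Hypothetical counts: 5 rows (hospitals) x 3 columns (outcome levels).
counts = [
    [15, 20, 10],
    [12, 18, 15],
    [20, 10, 12],
    [8, 25, 14],
    [16, 14, 11],
]
chi_sq, dof = chi_square_independence(counts)
critical = 20.09  # tabled chi-square for alpha = 0.01 and 8 d.f.
print(f"chi-square={chi_sq:.2f}  df={dof}  reject H0: {chi_sq > critical}")
```

For any 5 x 3 table the function returns 8 degrees of freedom, matching the calculation above.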

Coefficient of Contingency (C)

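The degree of relationship, association or dependence of the classifications in a contingency table is given by the coefficient of contingency:

$$C = \sqrt{\frac{\chi^2}{\chi^2 + N}}$$

where N equals the grand frequency total. For the example data, C is obtained by substituting the calculated chi-square and the grand total into this formula.

The maximum value of C is never greater than 1.0 and depends on the number of rows and columns:

$$C_{max} = \sqrt{\frac{k - 1}{k}}$$

where k = min of (r, c), r = rows and c = columns. For the example data, k = min(5, 3) = 3, so the maximum coefficient of contingency is $\sqrt{2/3} \approx 0.816$.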
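
A short sketch of these two quantities, assuming the standard definitions C = sqrt(χ² / (χ² + N)) and a maximum of sqrt((k – 1)/k) with k = min(r, c). The function names and the chi-square and N values in the usage lines are hypothetical.

```python
import math

def contingency_coefficient(chi_sq, n_total):
    """Coefficient of contingency C for a chi-square value and grand total N."""
    return math.sqrt(chi_sq / (chi_sq + n_total))

def max_contingency_coefficient(rows, cols):
    """Upper bound on C for a table with the given numbers of rows and columns."""
    k = min(rows, cols)
    return math.sqrt((k - 1) / k)

# Hypothetical chi-square and grand total; 5 x 3 matches the table shape above.
c = contingency_coefficient(chi_sq=25.0, n_total=300)
c_max = max_contingency_coefficient(rows=5, cols=3)
print(f"C={c:.3f}  C_max={c_max:.3f}")
```

Comparing C with its maximum gives a sense of how strong the association is relative to the largest value the table shape allows.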
A Yates correction for continuity can be applied when the contingency table has exactly two rows and two columns, that is, when the degrees of freedom equal 1.
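
The text does not give the formula for the correction. A common form subtracts 0.5 from each absolute difference |O – E| before squaring, and the sketch below assumes that form for a 2 x 2 table. The function name and the counts are hypothetical.

```python
def yates_chi_square(table):
    """Chi-square for a 2 x 2 table with the Yates continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each deviation by 0.5, never letting it go below zero.
            deviation = max(abs(observed - expected) - 0.5, 0.0)
            chi_sq += deviation ** 2 / expected
    return chi_sq

# Hypothetical 2 x 2 counts; the result is compared with chi-square at 1 d.f.
print(round(yates_chi_square([[10, 20], [30, 40]]), 4))
```

The corrected statistic is always less than or equal to the uncorrected one, which makes the test slightly more conservative.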

Correlation of Attributes

Contingency table classifications often describe characteristics of objects or individuals. Thus, they are often referred to as attributes, and the degree of dependence, association, or relationship is called correlation of attributes. For (k = r = c) tables, the correlation coefficient, φ, is defined as:

$$\phi = \sqrt{\frac{\chi^2}{N(k - 1)}}$$

The value of φ falls between 0 and 1. If the calculated value of chi-square is significant, then φ is significant. In the above example, the numbers of rows and columns are not equal, so the correlation calculation is not applied.
