Type I and Type III Sums of Squares

For a good general overview of ANOVA procedures, the four types of estimable functions, and their associated Sums of Squares, see the introductory chapters of the SAS/STAT guide.

As a general rule we want the Type III (Marginal) Sums of Squares for a factor, i.e. corrected for as many other factors in the model as possible. Type III Sums of Squares also provide estimates which are not a function of the frequency of observations in any group: for unbalanced data structures, where we have unequal numbers of observations in each group, the group(s) with more observations do not per se carry more weight than the group(s) with fewer observations. For purely nested designs, some polynomial regressions, and some models involving balanced data fitted in the right order, we sometimes need Type I (Sequential) Sums of Squares; more often, however, we should be using a nested or mixed models procedure in such cases.

Type I (Sequential) Sums of Squares

The Sums of Squares obtained by fitting effects in the order specified in the model.

Y_i = b₀ + b₁ X_i1 + b₂ X_i2 + b₃ X_i3 + e_i

The Type I Sums of Squares for b₁ are the Sums of Squares obtained from fitting b₁ over and above the mean, i.e. R(b₁ | µ). They are the 'marginal' Sums of Squares for b₁ if one fitted the model

Y_i = b₀ + b₁ X_i1 + e_i

The Type I Sums of Squares for b₂ are the Sums of Squares obtained from fitting b₂ after b₁, i.e. R(b₂ | b₁, µ). They are corrected for b₁, but not for any other factors we may have measured and be including in our model. They are the 'marginal' Sums of Squares for b₂ if one fitted the model

Y_i = b₀ + b₁ X_i1 + b₂ X_i2 + e_i

Note that in the above model

SSR_m = R( b₂, b₁ | µ)

and R( b₂, b₁ | µ) = R( b₁ | µ) + R( b₂ | b₁, µ)

i.e. the Sequential Sums of Squares sum to the Sums of Squares for the model corrected for the mean.

Similarly, the Type I Sums of Squares for b₃ are the Sums of Squares obtained from fitting b₃ after b₂ and b₁, i.e. R( b₃ | b₂, b₁, µ). Thus we have 'corrected' for the effect of b₁ and b₂. They are the marginal Sums of Squares for b₃ if one fitted the model

Y_i = b₀ + b₁ X_i1 + b₂ X_i2 + b₃ X_i3 + e_i

Thus R(b₁, b₂, b₃ | µ) = R(b₁ | µ) + R( b₂ | b₁, µ) + R(b₃ | b₂, b₁, µ)

The Type I, Sequential, Sums of Squares for each effect will change if the order of the effects in the model is changed!
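
The sequential decomposition and its dependence on order can be checked numerically. What follows is a minimal Python sketch, not the SAS PROC GLM analysis these notes refer to. It uses the maize data from the example below, with Nitrogen, Phosphorus and Potassium as linear covariates as in the regression model above. Because order only matters for unbalanced data, the three plots with Nitrogen = 10 and Phosphorus = 10 are dropped; that subset is a hypothetical choice made for illustration, not the unbalanced data set supplied with these notes. The helper names residual_ss and type1 are likewise illustrative.

```python
from fractions import Fraction
from itertools import product

def residual_ss(columns, y):
    """Residual SS after least-squares regression of y on an intercept plus columns."""
    X = [[Fraction(1)] + [Fraction(col[i]) for col in columns] for i in range(len(y))]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y], solved exactly by Gauss-Jordan
    # elimination (assumes the columns are linearly independent).
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(p)]
    for k in range(p):
        pivot = next(r for r in range(k, p) if A[r][k] != 0)
        A[k], A[pivot] = A[pivot], A[k]
        d = A[k][k]
        A[k] = [v / d for v in A[k]]
        for r in range(p):
            if r != k:
                f = A[r][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] for r in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

# Maize data: (N, P, K) in the row order of the table, K varying fastest.
design = list(product((10, 20, 30), repeat=3))
yields = [65, 80, 104, 87, 108, 126, 107, 126, 148,
          86, 107, 129, 107, 126, 148, 125, 144, 168,
          108, 129, 141, 125, 143, 168, 149, 163, 184]

# ASSUMPTION: hypothetical unbalanced subset, dropping plots with N = 10, P = 10.
keep = [i for i, x in enumerate(design) if x[:2] != (10, 10)]
cols = {name: [design[i][j] for i in keep] for j, name in enumerate("NPK")}
y = [yields[i] for i in keep]

def type1(order):
    """Sequential SS: reduction in residual SS as each term enters in `order`."""
    ss, fitted, previous = {}, [], residual_ss([], y)
    for name in order:
        fitted.append(cols[name])
        current = residual_ss(fitted, y)
        ss[name] = previous - current
        previous = current
    return ss

# Model SS corrected for the mean, R(b1, b2, b3 | mu).
model_ss = residual_ss([], y) - residual_ss(list(cols.values()), y)

for order in ("NPK", "PNK"):
    ss = type1(order)
    print(order, {k: round(float(v), 2) for k, v in ss.items()},
          "sum =", round(float(sum(ss.values())), 2))
print("model SS =", round(float(model_ss), 2))
```

In either order the sequential Sums of Squares add up to the model Sums of Squares, but the value attributed to Nitrogen changes depending on whether it is fitted before or after Phosphorus.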

If one has a 'balanced' experiment, in which every amount of X₁ occurs equally often with every amount of X₂ and X₃, then the Type I (Sequential) Sums of Squares for each effect will also equal the Type III (Marginal) Sums of Squares.

An example with plots of maize and Nitrogen, Phosphorus and Potassium fertilisers.

Nitrogen	Phosphorus	Potassium	Maize Yield
10	10	10	65
10	10	20	80
10	10	30	104
10	20	10	87
10	20	20	108
10	20	30	126
10	30	10	107
10	30	20	126
10	30	30	148
20	10	10	86
20	10	20	107
20	10	30	129
20	20	10	107
20	20	20	126
20	20	30	148
20	30	10	125
20	30	20	144
20	30	30	168
30	10	10	108
30	10	20	129
30	10	30	141
30	20	10	125
30	20	20	143
30	20	30	168
30	30	10	149
30	30	20	163
30	30	30	184

SAS code, balanced experiment
SAS code, unbalanced experiment

Take both these data sets and SAS code and run them through SAS. Examine the outputs, paying particular attention to the PROC GLM analyses, the Type I and Type III Sums of Squares for each of the analyses.
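
For readers without SAS to hand, the balanced comparison can be approximated in Python. This is a sketch under stated assumptions, not a reproduction of the PROC GLM output: it assumes the three fertiliser amounts enter as linear covariates, as in the regression model earlier in this section, and the helper residual_ss is an illustrative name. If the SAS code treats the fertilisers differently (for example as CLASS factors) the numbers will differ.

```python
from fractions import Fraction
from itertools import product

def residual_ss(columns, y):
    """Residual SS after least-squares regression of y on an intercept plus columns."""
    X = [[Fraction(1)] + [Fraction(col[i]) for col in columns] for i in range(len(y))]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y], solved exactly by Gauss-Jordan
    # elimination (assumes the columns are linearly independent).
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(p)]
    for k in range(p):
        pivot = next(r for r in range(k, p) if A[r][k] != 0)
        A[k], A[pivot] = A[pivot], A[k]
        d = A[k][k]
        A[k] = [v / d for v in A[k]]
        for r in range(p):
            if r != k:
                f = A[r][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] for r in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

# Balanced maize data: (N, P, K) in the row order of the table, K varying fastest.
design = list(product((10, 20, 30), repeat=3))
yields = [65, 80, 104, 87, 108, 126, 107, 126, 148,
          86, 107, 129, 107, 126, 148, 125, 144, 168,
          108, 129, 141, 125, 143, 168, 149, 163, 184]
cols = [[x[j] for x in design] for j in range(3)]
full = residual_ss(cols, yields)

table = {}
for j, name in enumerate(("Nitrogen", "Phosphorus", "Potassium")):
    # Type I: fitted after the terms that precede it in the model.
    ss1 = residual_ss(cols[:j], yields) - residual_ss(cols[:j + 1], yields)
    # Type III: fitted after all the other terms in the model.
    ss3 = residual_ss(cols[:j] + cols[j + 1:], yields) - full
    table[name] = (ss1, ss3)
    print(f"{name:<11} Type I = {float(ss1):10.2f}   Type III = {float(ss3):10.2f}")
```

With every combination of amounts equally represented, the two columns agree for each fertiliser, which is the balanced-data result stated above.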

Type III (Marginal) Sums of Squares

The Sums of Squares obtained by fitting each effect after all the other terms in the model, i.e. the Sums of Squares for each effect corrected for the other terms in the model. The marginal (Type III) Sums of Squares do not depend upon the order in which effects are specified in the model.

Y_i = b₀ + b₁ X_i1 + b₂ X_i2 + b₃ X_i3 + e_i

The marginal Sums of Squares do NOT, in general, sum to the Sums of Squares for the model corrected for the mean, i.e.

SSR_m = R(b₁, b₂, b₃ | µ)

R(b₁, b₂, b₃ | µ) ≠ R(b₁ | µ, b₂, b₃) + R(b₂ | µ, b₁, b₃) + R(b₃ | µ, b₁, b₂)

The marginal (Type III) Sums of Squares are preferable in most cases since they correspond to the variation attributable to an effect after correcting for any other effects in the model. They are unaffected by the frequency of observations.
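
Both properties of the marginal Sums of Squares (independence of model order, and failure to add up to the model Sums of Squares when the data are unbalanced) can be illustrated with a short Python sketch. It is not the SAS analysis itself: the fertilisers are assumed to be linear covariates, the unbalanced data are a hypothetical subset of the maize table (the plots with Nitrogen = 10 and Phosphorus = 10 are dropped), and residual_ss and type3 are illustrative names.

```python
from fractions import Fraction
from itertools import product

def residual_ss(columns, y):
    """Residual SS after least-squares regression of y on an intercept plus columns."""
    X = [[Fraction(1)] + [Fraction(col[i]) for col in columns] for i in range(len(y))]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y], solved exactly by Gauss-Jordan
    # elimination (assumes the columns are linearly independent).
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(p)]
    for k in range(p):
        pivot = next(r for r in range(k, p) if A[r][k] != 0)
        A[k], A[pivot] = A[pivot], A[k]
        d = A[k][k]
        A[k] = [v / d for v in A[k]]
        for r in range(p):
            if r != k:
                f = A[r][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] for r in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

design = list(product((10, 20, 30), repeat=3))
yields = [65, 80, 104, 87, 108, 126, 107, 126, 148,
          86, 107, 129, 107, 126, 148, 125, 144, 168,
          108, 129, 141, 125, 143, 168, 149, 163, 184]

# ASSUMPTION: hypothetical unbalanced subset, dropping plots with N = 10, P = 10.
keep = [i for i, x in enumerate(design) if x[:2] != (10, 10)]
cols = {name: [design[i][j] for i in keep] for j, name in enumerate("NPK")}
y = [yields[i] for i in keep]

def type3(order):
    """Marginal SS: each term fitted after all the other terms in the model."""
    full = residual_ss([cols[n] for n in order], y)
    return {n: residual_ss([cols[m] for m in order if m != n], y) - full
            for n in order}

model_ss = residual_ss([], y) - residual_ss(list(cols.values()), y)
marginal = type3("NPK")
print({k: round(float(v), 2) for k, v in marginal.items()})
print("sum of Type III SS =", round(float(sum(marginal.values())), 2))
print("model SS           =", round(float(model_ss), 2))
print("same in another order:", marginal == type3("KPN"))
```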

A case where they are not preferable is a purely nested design. In this case the main effect within which another effect is nested should be assessed using the Type I Sums of Squares for that main effect, in a model where any other effects precede the main effect and the nested effect.

For example, consider a nested design with 3 treatments applied to apple trees, where we then weigh 6 apples from each tree. We have 12 trees, 4 per treatment. Trees are the experimental unit. The model will be

Y_ijk = µ + trt_i + tree_ij + apple_ijk

Then the Sums of Squares that we compute will be

R(trt, tree_within_trt | µ), R( tree_within_trt | µ, trt) and R(trt | µ).

Thus we can see that we need the Type I Sums of Squares for treatment (over and above the mean) and for trees within treatments (over and above the effect of treatments). This is a purely nested design. Other than in this type of case, we should stick to Type III Sums of Squares.
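
As a worked check on the apple example, the partition and its degrees of freedom can be written out as follows. This is a sketch that assumes the usual analysis for such a design, with trees within treatments serving as the error term for treatments (consistent with trees being the experimental unit); the notation tree(trt) and the symbol F_trt are introduced here for illustration.

```latex
\begin{align*}
R(\text{trt}, \text{tree(trt)} \mid \mu)
  &= R(\text{trt} \mid \mu) + R(\text{tree(trt)} \mid \mu, \text{trt}) \\[4pt]
\text{df:}\qquad 12 - 1 = 11
  &= (3 - 1) + 3\,(4 - 1) = 2 + 9 \\[4pt]
\text{apples within trees: df}
  &= 12\,(6 - 1) = 60, \qquad 2 + 9 + 60 = 71 = 72 - 1 \\[4pt]
F_{\text{trt}}
  &= \frac{R(\text{trt} \mid \mu)\,/\,2}{R(\text{tree(trt)} \mid \mu, \text{trt})\,/\,9}
  \quad \text{on } (2,\, 9) \text{ df}
\end{align*}
```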
