See Steel, Torrie and Dickey, Chapter 8, and Westfall et al., Multiple Comparisons and Multiple Tests Using the SAS System.
Pre-planned comparisons that are linearly independent of one another, and hence involve no more contrasts than the appropriate degrees of freedom, can be made using either t-tests or F tests (although it should be noted that this does not, of itself, address the issue of multiple comparisons). If we want to test all the possible differences, or if we wish to make tests suggested by the data, then simple t-tests or F tests are no longer appropriate, since the overall probability of a Type I error will likely be too high (the risk of false positives). We need tests appropriate to multiple comparisons; such tests are characterised by taking account of the number of tests that could be made.
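To see why, suppose we were to carry out m tests, each at the 5% level; even treating the tests as if they were independent, the probability of at least one false positive is

$1 - (1 - 0.05)^m$

so for the 15 possible pairwise comparisons among 6 treatments this is roughly $1 - 0.95^{15} \approx 0.54$, far above the nominal 5% level.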
There are a number of multiple comparison tests. Here we look at Scheffé's test, because it is a valid, fairly conservative test, sufficiently generalised to be applicable to unbalanced designs. It is very general, so that all possible contrasts can be tested for significance, or confidence intervals constructed for the corresponding linear functions of parameters.
Consider the example from the Completely Randomised Design, One-way ANOVA.
| Treatments | 3DOk1 | 3DOk5 | 3DOk4 | 3DOk7 | 3DOk13 | composite |
|---|---|---|---|---|---|---|
| means | 28.82 | 23.90 | 14.35 | 19.92 | 13.26 | 18.70 |
We shall look at the multiple comparison test due to Scheffé. We can use it to compute a Confidence Interval, and also a Critical Difference, which allows us to determine whether a difference can be considered statistically significant or not.
First of all we need to decide on our hypotheses: our Null Hypothesis (Ho) and our Alternative Hypothesis (HA).
Our Null Hypothesis (Ho) will be that there is no difference, i.e. that the difference = Zero.
Our Alternative Hypothesis (HA) will be that there is a difference, i.e. that the difference is not equal to Zero.
Basically it involves estimating the difference for any particular desired contrast, together with its standard error, using the appropriate methodology, i.e. using our contrast vector k′.
We then need to determine the tabulated F value, which will depend on our probability level and on the numerator and denominator degrees of freedom; hence the importance of being able to compute the correct tabulated F value for any given numerator and denominator degrees of freedom (see the section on computing F values).
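In SAS, for example, the tabulated F value can be obtained directly with the FINV quantile function rather than from printed tables; a minimal sketch for the 5% level with 5 numerator and 22 denominator degrees of freedom:

```sas
data ftab;
   Ftab = finv(0.95, 5, 22);  /* F quantile: 5% level, 5 numerator and 22 denominator d.f. */
   put Ftab=;                 /* approximately 2.66 */
run;
```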
Then we compute the critical difference as:

$\text{critical difference} = s.e. \times \sqrt{(s-1)\,F_{\alpha;\,s-1,\,df_{error}}}$

where s is the number of treatments, so that s − 1 is the treatment degrees of freedom.
Then, if the absolute estimated difference is greater than the critical difference, we can declare that the difference is statistically significant, i.e. reject Ho, where Ho is that the difference is zero.
Thus, for example, suppose that we have analysed these data and see that there appears to be a difference between the 5 straight inoculants and the mixture. So, we wish to compare the average of the 5 inoculants vs. the 6th (which is the combined mixture of the 5 inoculants); then the difference is:

$\frac{28.82 + 23.90 + 14.35 + 19.92 + 13.26}{5} - 18.70 = 20.05 - 18.70 = 1.35$

with a standard error of 1.77.
The tabulated t value for 22 residual degrees of freedom at the 5% level is 2.074. Thus, using a simple (inappropriate) t statistic, we would require a critical difference of at least t × s.e., i.e. 2.074 × 1.77 = 3.67.
If we did not have this as a pre-planned comparison, but rather noted this difference after our analysis and wanted to know whether it was significant, we should use a multiple comparison test, e.g. Scheffé's test. The calculations are as above, to compute the estimate of the difference and the standard error. We have 6 treatments, so s = 6, and s − 1 = 5, the degrees of freedom for treatments.
F (5%; 5, 22 d.f.) = 2.66

$\sqrt{(s-1) \times F} = \sqrt{5 \times 2.66} = 3.6469$
Then the critical difference = 1.77 × 3.6469 = 6.45. Thus the estimated difference must exceed 6.45 to be considered statistically significant at the 5% level. Since our estimate (1.35) does not, in this case we would accept the null hypothesis, that the treatments do not in fact differ significantly from one another.
We can also use Scheffé's multiple comparison method to compute a Confidence Interval, which will simply be our estimate ± the Critical Difference. This applies equally well to an estimate of a difference or to a Least Squares mean.
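For the example above, the 95% Scheffé confidence interval for the difference between the average of the 5 inoculants and the composite would be 1.35 ± 6.45, i.e. (−5.10, 7.80); the interval includes zero, in agreement with the test.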
How do we do this with SAS (or any other statistical package)?
With SAS we can use the estimate statement, after the model statement, to compute the estimate of the difference between two 'treatments' or levels, together with the standard error of the estimate. Then a little bit of work by hand with a calculator will give us the critical difference.
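If preferred, the same hand calculation can be carried out in a short SAS data step; a minimal sketch for the CRD example above, with the values of se, s and dfe taken from that example:

```sas
data scheffe_cd;
   se  = 1.77;                      /* standard error of the contrast, from the ESTIMATE statement */
   s   = 6;                         /* number of treatments */
   dfe = 22;                        /* residual (error) degrees of freedom */
   Ftab = finv(0.95, s-1, dfe);     /* tabulated F at the 5% level */
   cd   = se * sqrt((s-1) * Ftab);  /* Scheffe's critical difference, approximately 6.45 */
   put Ftab= cd=;
run;
```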
In addition, if we have carried out an experiment, made the statistical analysis, and then want to compare all the treatments (for example in the above experiment), we can have SAS carry out Scheffé's test for us when we compare the Least Squares Means. We shall not obtain the estimates of the differences, nor a critical difference; rather, SAS will provide us with a table of the probabilities of the differences, adjusted for the multiple comparisons using the method of Scheffé. The SAS statements for the CRD would be:
```sas
proc glm data=crd1;
   classes trt;
   model y = trt;
   lsmeans trt / stderr pdiff;
   lsmeans trt / stderr pdiff adjust=scheffe;  /* Scheffe's test */
   lsmeans trt / stderr pdiff adjust=bon;      /* Bonferroni's test */
   estimate 'trt1-trt2'   trt 1 -1  0  0  0  0;
   estimate 'trt1-trt3'   trt 1  0 -1  0  0  0;
   estimate 'trt1-trt4'   trt 1  0  0 -1  0  0;
   estimate 'trt1-trt5'   trt 1  0  0  0 -1  0;
   estimate 'trt1-trt6'   trt 1  0  0  0  0 -1;
   estimate 'trt2-trt3'   trt 0  1 -1  0  0  0;
   estimate 'trt2-trt4'   trt 0  1  0 -1  0  0;
   estimate 'trt2-trt5'   trt 0  1  0  0 -1  0;
   estimate 'trt2-trt6'   trt 0  1  0  0  0 -1;
   estimate 'trt3-trt4'   trt 0  0  1 -1  0  0;
   estimate 'trt3-trt5'   trt 0  0  1  0 -1  0;
   estimate 'trt3-trt6'   trt 0  0  1  0  0 -1;
   estimate 'trt4-trt5'   trt 0  0  0  1 -1  0;
   estimate 'trt4-trt6'   trt 0  0  0  1  0 -1;
   estimate 'trt5-trt6'   trt 0  0  0  0  1 -1;
   estimate 'trt1-5 vs 6' trt .2 .2 .2 .2 .2 -1;
run;
```
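As an aside, PROC GLM can also attach Scheffé groupings to the raw (arithmetic) treatment means via the MEANS statement, although with unequal replication, as here, the Least Squares Means approach above is generally preferable:

```sas
proc glm data=crd1;
   classes trt;
   model y = trt;
   means trt / scheffe;   /* Scheffe groupings of the raw treatment means */
run;
```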
The output produced from the above analysis is shown here:
The SAS System
The GLM Procedure

Class Level Information

| Class | Levels | Values |
|---|---|---|
| trt | 6 | 1 2 3 4 5 6 |

Number of observations: 28
Dependent Variable: y
| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 5 | 812.674500 | 162.534900 | 12.72 | <.0001 |
| Error | 22 | 281.118000 | 12.778091 | | |
| Corrected Total | 27 | 1093.792500 | | | |
| R-Square | Coeff Var | Root MSE | y Mean |
|---|---|---|---|
| 0.742988 | 17.98564 | 3.574646 | 19.87500 |
| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| trt | 5 | 812.6745000 | 162.5349000 | 12.72 | <.0001 |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| trt | 5 | 812.6745000 | 162.5349000 | 12.72 | <.0001 |
Least Squares Means

LSMeans = fitted values, µ + trti
Pr > |t| = the probability level for testing Ho: µ + trti = 0
| trt | y LSMEAN | Standard Error | Pr > \|t\| | LSMEAN Number |
|---|---|---|---|---|
| 1 | 28.8200000 | 1.5986301 | <.0001 | 1 |
| 2 | 23.9000000 | 1.7873228 | <.0001 | 2 |
| 3 | 14.3500000 | 1.7873228 | <.0001 | 3 |
| 4 | 19.9200000 | 1.5986301 | <.0001 | 4 |
| 5 | 13.2600000 | 1.5986301 | <.0001 | 5 |
| 6 | 18.7000000 | 1.5986301 | <.0001 | 6 |
Least Squares Means for effect trt
Pr > |t| for H0: LSMean(i) = LSMean(j)
Dependent Variable: y (unadjusted probabilities)

| i/j | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | | 0.0523 | <.0001 | 0.0007 | <.0001 | 0.0002 |
| 2 | 0.0523 | | 0.0010 | 0.1112 | 0.0002 | 0.0412 |
| 3 | <.0001 | 0.0010 | | 0.0298 | 0.6539 | 0.0833 |
| 4 | 0.0007 | 0.1112 | 0.0298 | | 0.0075 | 0.5949 |
| 5 | <.0001 | 0.0002 | 0.6539 | 0.0075 | | 0.0250 |
| 6 | 0.0002 | 0.0412 | 0.0833 | 0.5949 | 0.0250 | |
Adjustment for Multiple Comparisons: Scheffe

Least Squares Means for effect trt
Pr > |t| for H0: LSMean(i) = LSMean(j)
Dependent Variable: y

| i/j | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | | 0.5345 | 0.0004 | 0.0288 | <.0001 | 0.0098 |
| 2 | 0.5345 | | 0.0391 | 0.7360 | 0.0106 | 0.4745 |
| 3 | 0.0004 | 0.0391 | | 0.3990 | 0.9989 | 0.6587 |
| 4 | 0.0288 | 0.7360 | 0.3990 | | 0.1683 | 0.9975 |
| 5 | <.0001 | 0.0106 | 0.9989 | 0.1683 | | 0.3607 |
| 6 | 0.0098 | 0.4745 | 0.6587 | 0.9975 | 0.3607 | |
Adjustment for Multiple Comparisons: Bonferroni

Least Squares Means for effect trt
Pr > |t| for H0: LSMean(i) = LSMean(j)
Dependent Variable: y

| i/j | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | | 0.7843 | <.0001 | 0.0106 | <.0001 | 0.0028 |
| 2 | 0.7843 | | 0.0155 | 1.0000 | 0.0031 | 0.6181 |
| 3 | <.0001 | 0.0155 | | 0.4475 | 1.0000 | 1.0000 |
| 4 | 0.0106 | 1.0000 | 0.4475 | | 0.1121 | 1.0000 |
| 5 | <.0001 | 0.0031 | 1.0000 | 0.1121 | | 0.3744 |
| 6 | 0.0028 | 0.6181 | 1.0000 | 1.0000 | 0.3744 | |
Dependent Variable: y

| Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| trt1-trt2 | 4.9200000 | 2.39794514 | 2.05 | 0.0523 |
| trt1-trt3 | 14.4700000 | 2.39794514 | 6.03 | <.0001 |
| trt1-trt4 | 8.9000000 | 2.26080436 | 3.94 | 0.0007 |
| trt1-trt5 | 15.5600000 | 2.26080436 | 6.88 | <.0001 |
| trt1-trt6 | 10.1200000 | 2.26080436 | 4.48 | 0.0002 |
| trt2-trt3 | 9.5500000 | 2.52765612 | 3.78 | 0.0010 |
| trt2-trt4 | 3.9800000 | 2.39794514 | 1.66 | 0.1112 |
| trt2-trt5 | 10.6400000 | 2.39794514 | 4.44 | 0.0002 |
| trt2-trt6 | 5.2000000 | 2.39794514 | 2.17 | 0.0412 |
| trt3-trt4 | -5.5700000 | 2.39794514 | -2.32 | 0.0298 |
| trt3-trt5 | 1.0900000 | 2.39794514 | 0.45 | 0.6539 |
| trt3-trt6 | -4.3500000 | 2.39794514 | -1.81 | 0.0833 |
| trt4-trt5 | 6.6600000 | 2.26080436 | 2.95 | 0.0075 |
| trt4-trt6 | 1.2200000 | 2.26080436 | 0.54 | 0.5949 |
| trt5-trt6 | -5.4400000 | 2.26080436 | -2.41 | 0.0250 |
| trt1-5 vs 6 | 1.3500000 | 1.76574465 | 0.76 | 0.4527 |
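Note that the last row, 'trt1-5 vs 6', reproduces our hand calculation above: an estimated difference of 1.35 with a standard error of approximately 1.77. Its p-value of 0.4527 is the unadjusted one; it makes no allowance for this being a comparison suggested by the data.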
Consider the example from the Factorial Design.
We have a simple Factorial Design, with both factors being fixed effects, Diet with 3 levels and Sex with 2 levels, so that the effects are all tested against the Residual Mean Square. The interaction effect was statistically significant (Pr = 0.0034). Thus the main effects lose their importance and we should be looking at the 'Simple Effects'. From this 3 × 2 Factorial we therefore have 6 Diet × Sex combinations, or 'Simple Effects'. Therefore, in terms of 'Simple Effects', if we wish to make multiple comparisons or a posteriori tests, we have 6 'levels', hence 5 degrees of freedom. These 5 degrees of freedom are equal to the 2 d.f. for Diet + 1 d.f. for Sex + the 2 d.f. for the Diet × Sex interaction.
| Simple effects | Estimate ± s.e. |
|---|---|
| µ + a1 + b1 + ab11 | 105.25 ± 1.40 |
| µ + a1 + b2 + ab12 | 95.00 ± 1.40 |
| µ + a1 + b3 + ab13 | 91.25 ± 1.40 |
| µ + a2 + b1 + ab21 | 102.25 ± 1.40 |
| µ + a2 + b2 + ab22 | 102.25 ± 1.40 |
| µ + a2 + b3 + ab23 | 89.50 ± 1.40 |
Using the same principles as above, for a multiple comparison test using Scheffé's test, we have 6 'treatments' (combinations); therefore s = 6, and s-1 = 5. The residual degrees of freedom are 18.
F (5%; 5, 18 d.f.) = 2.77

$\sqrt{(s-1) \times F} = \sqrt{5 \times 2.77} = 3.7216$
Consider the difference between A1B2 (95.00) and A2B2 (102.25): the estimated difference is 102.25 − 95.00 = 7.25, and the standard error of the difference is $\sqrt{1.40^2 + 1.40^2} = 1.98$.
Then the critical difference is 1.98 × 3.7216 = 7.369.
Thus an a posteriori test, using Scheffé's method, would accept the null hypothesis that there is no difference, since the estimated difference (7.25) is less than the critical difference (7.369); whereas a simple t-test would reject the null hypothesis:
t-calculated = 7.25/1.98 = 3.66
and the tabulated t value for 5% and 18 d.f. is 2.101, which is less than our computed t value.
How do we do this using SAS? The approach is much the same as that described above for the CRD: we use the SAS estimate statement to have SAS compute the estimate and standard error of the particular contrast that we are interested in; the rest we do by hand!
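For example, assuming the factorial data are in a data set fct1, with factors named a (2 levels) and b (3 levels) as in the table of simple effects above (these names are illustrative only), the A1B2 vs A2B2 contrast could be requested as:

```sas
proc glm data=fct1;
   classes a b;
   model y = a b a*b;
   /* A1B2 - A2B2 = (a1 - a2) + (ab12 - ab22);
      interaction cells are ordered a1b1 a1b2 a1b3 a2b1 a2b2 a2b3 */
   estimate 'A1B2 vs A2B2' a 1 -1  a*b 0 1 0 0 -1 0;
run;
```

This should return an estimate of −7.25 with a standard error of 1.98, matching the hand calculation above.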
The above approaches can be extended to Nested Designs and Analyses; effectively the only difference is that in a Nested (Subsampling) Analysis the Residual Mean Square (Error) is replaced by the appropriate Mean Square, the same one as used in the Analysis of Variance.