Chapter 6 The 2k Factorial Design
1
6.1 Introduction • The special cases of the general factorial design (Chapter 5) • k factors and each factor has only two levels • Levels: – quantitative (temperature, pressure,…), or qualitative (machine, operator,…) – High and low – Each replicate has 2 × … × 2 = 2k observations 2
• Assumptions: (1) the factor is fixed, (2) the design is completely randomized and (3) the usual normality assumptions are satisfied • Wildly used in factor screening experiments
3
6.2 The 22 Factorial Design • Two factors, A and B, and each factor has two levels, low and high. • Example: the concentration of reactant v.s. the amount of the catalyst (Page 219)
4
• “-” And “+” denote the low and high levels of a factor, respectively • Low and high are arbitrary terms • Geometrically, the four runs form the corners of a square • Factors can be quantitative or qualitative, although their treatment in the final model will be different
5
• Average effect of a factor = the change in response produced by a change in the level of that factor averaged over the levels if the other factors. • (1), a, b and ab: the total of n replicates taken at the treatment combination. • The main effects: 1 1 A = {[ ab − b] + [a − (1)]} = [ab + a − b − (1)] 2n 2n ab + a b + (1) = − = y A+ − y A− 2n 2n 1 1 B = {[ ab − a ] + [b − (1)]} = [ab + b − a − (1)] 2n 2n ab + b a + (1) = − = y B+ − y B− 2n 2n 6
• The interaction effect:
1 1 AB = {[ ab − b] − [a − (1)]} = [ab + (1) − a − b] 2n 2n ab + (1) b + a = − 2n 2n
• In that example, A = 8.33, B = -5.00 and AB = 1.67 • Analysis of Variance • The total effects: Contrast A = ab + a − b − (1) Contrast B = ab + b − a − (1) Contrast AB = ab + (1) − a − b 7
• Sum of squares: [ab + a − b − (1)]2 SS A = 4n [ab + b − a − (1)]2 SS B = 4n [ab + (1) − b − a ] 2 SS AB = 4n 2 2 2 n y 2 SS T = ∑∑∑ y ijk − ⋅⋅⋅ 4n i =1 j =1 k =1 SS E = SS T − SS A − SS B − SS AB
8
Response:Conversion ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Source Squares Model 291.67 A 208.33 B 75.00 AB 8.33 Pure Error 31.33 Cor Total 323.00
DF 3 1 1 1 8 11
Mean Square 97.22 208.33 75.00 8.33 3.92
F Value 24.82 53.19 19.15 2.13
Prob > F 0.0002 < 0.0001 0.0024 0.1828
Std. Dev. Mean C.V.
1.98 27.50 7.20
R-Squared Adj R-Squared Pred R-Squared
0.9030 0.8666 0.7817
PRESS
70.50
Adeq Precision
11.669
The F-test for the â&#x20AC;&#x153;modelâ&#x20AC;? source is testing the significance of the overall model; that is, is either A, B, or AB or some combination of these effects important?
9
• Table of plus and minus signs: I
A
B
AB
(1)
+
–
–
+
a
+
+
–
–
b
+
–
+
–
ab
+
+
+
+
10
• The regression model:
y = β 0 + β1 x1 + β 2 x 2 + ε
– x1 and x2 are coded variables that represent the two factors, i.e. x1 (or x2) only take values on –1 and 1. – Use least square method to get the estimations of the coefficients – For that example,
8.33 − 5.00 yˆ = 27.5 + x1 + x2 2 2
– Model adequacy: residuals (Pages 224~225) and normal probability plot (Figure 6.2) 11
• Response surface plot:
yˆ = 18.33 + 0.8333Conc − 5.00Catalyst – Figure 6.3
12
6.3 The 23 Design • Three factors, A, B and C, and each factor has two levels. (Figure 6.4 (a)) • Design matrix (Figure 6.4 (b)) • (1), a, b, ab, c, ac, bc, abc • 7 degree of freedom: main effect = 1, and interaction = 1
13
14
• Estimate main effect:
1 A= [a − (1) + ab − b + ac − c + abc − bc] 4n = y A+ − y A− a + ab + ac + abc (1) + b + c + bc = − 4n 4n 1 = [a + ab + ac + abc − (1) − b − c − bc] 4n
• Estimate two-factor interaction: the difference between the average A effects at the two levels of 1 B AB = [abc − bc + ab − b − ac + c − a + (1)] 4n abc + ab + c + (1) bc + b + ac + a = − 4n 4n 15
• Three-factor interaction:
1 ABC = {[ abc − bc] − [ac − c] − [ab − b] + [a − (1)]} 4n 1 = [abc − bc − ac + c − ab + b + a − (1)] 4n
• Contrast: Table 6.3 – Equal number of plus and minus – The inner product of any two columns = 0 – I is an identity element – The product of any two columns yields another column – Orthogonal design • Sum of squares: SS = (Contrast)2/8n 16
Table of – and + Signs for the 23 Factorial Design (pg. 231) Factorial Effect Treatment Combination
I
A
B
AB
C
AC
BC
ABC
(1) a
+
–
–
+
–
+
+
–
+
+
–
–
–
–
+
+
b
+
–
+
–
–
+
–
+
ab
+
+
+
+
–
–
–
–
c
+
–
–
+
+
–
–
+
ac
+
+
–
–
+
+
–
–
bc
+
–
+
–
+
–
+
–
abc
+
+
+
+
+
+
+
+
Contrast
24
18
6
14
2
4
4
Effect
3.00
2.25
0.75
1.75
0.25
0.50
0.50
17
â&#x20AC;˘ Example 6.1
A = carbonation, B = pressure, C = speed, y = fill deviation 18
â&#x20AC;˘ Estimation of Factor Effects
Model Error Error Error Error Error Error Error Error Error
Term Effect Intercept A 3 B 2.25 C 1.75 AB 0.75 AC 0.25 BC 0.5 ABC 0.5 LOF 0 P Error
SumSqr % Contribution
Lenth's ME Lenth's SME
1.25382 1.88156
36 20.25 12.25 2.25 0.25 1 1
46.1538 25.9615 15.7051 2.88462 0.320513 1.28205 1.28205
5
6.41026
19
• ANOVA Summary – Full Model Response:Fill-deviation ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Source Squares Model 73.00 A 36.00 B 20.25 C 12.25 AB 2.25 AC 0.25 BC 1.00 ABC 1.00 Pure Error 5.00 Cor Total 78.00
DF 7 1 1 1 1 1 1 1 8 15
Mean Square 10.43 36.00 20.25 12.25 2.25 0.25 1.00 1.00 0.63
F Value 16.69 57.60 32.40 19.60 3.60 0.40 1.60 1.60
Prob > F 0.0003 < 0.0001 0.0005 0.0022 0.0943 0.5447 0.2415 0.2415
Std. Dev. Mean C.V.
0.79 1.00 79.06
R-Squared 0.9359 Adj R-Squared Pred R-Squared
0.8798 0.7436
PRESS
20.00
Adeq Precision
13.416
20
• The regression model and response surface: – The regression model: 3.00 2.25 1.75 0.75 yˆ = 1.00 + x1 + x2 + x3 + x1 x 2 2 2 2 2
– Response surface and contour plot (Figure 6.7) Coefficient Factor Estimate Intercept 1.00 A-Carbonation 1.50 B-Pressure 1.13 C-Speed 0.88 AB 0.38
Standard 95% CI 95% CI DF Error Low High 1 0.20 0.55 1.45 1 0.20 1.05 1.95 1 0.20 0.68 1.57 1 0.20 0.43 1.32 1 0.20 -0.072 0.82
21
â&#x20AC;˘ Contour & Response Surface Plots â&#x20AC;&#x201C; Speed at the High Level 30.00
F ill-d e via tio n
2
2
4.875 3.5625 2.25
3.125
F i l l -d e v i a ti o n
B : P re s s u re
28.75
0.9375 -0 . 3 7 5
2.25 27.50
1.375
0.5 26.25
30.00 12.00
28.75
25.00
2
10.00
2
11.50
27.50
11.00
B : P re ssu re 2 6 . 2 5 10.50
11.00
11.50
10.50
A : C a rb o n a ti o n
12.00
25.00
A : C a r b o n a tio n
22
10.00
• Refine Model – Remove Nonsignificant Factors Response: Fill-deviation ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Source Squares Model 70.75 A 36.00 B 20.25 C 12.25 AB 2.25 Residual 7.25 LOF 2.25 Pure E 5.00 C Total 78.00
DF 4 1 1 1 1 11 3 8 15
Mean Square 17.69 36.00 20.25 12.25 2.25 0.66 0.75 0.63
F Value 26.84 54.62 30.72 18.59 3.41
Prob > F < 0.0001 < 0.0001 0.0002 0.0012 0.0917
1.20
0.3700
Std. Dev. 0.81 Mean 1.00 C.V. 81.18
R-Squared Adj R-Squared Pred R-Squared
0.9071 0.8733 0.8033
PRESS
Adeq Precision
15.424
15.34
23
6.4 The General 2k Design • k factors and each factor has two levels • Interactions • The standard order for a 24 design: (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd k two-factor interactions 2 k three-factor interactions 3 M 1 k − factor interaction
24
• The general approach for the statistical analysis: – Estimate factor effects – Form initial model (full model) – Perform analysis of variance (Table 6.9) – Refine the model – Analyze residual – Interpret results • Contrast = (a ± 1)(b ± 1) (k ± 1) ABC ... K
2 ABC K = k Contrast ABCK n2 1 SS ABCK = k (Contrast ABCK ) 2 n2
25
6.5 A Single Replicate of the 2k Design • These are 2k factorial designs with one observation at each corner of the “cube” • An unreplicated 2k factorial design is also sometimes called a “single replicate” of the 2k • If the factors are spaced too closely, it increases the chances that the noise will overwhelm the signal in the data
26
• Lack of replication causes potential problems in statistical testing – Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error) – With no replication, fitting the full model results in zero degrees of freedom for error • Potential solutions to this problem – Pooling high-order interactions to estimate error (sparsity of effects principle) – Normal probability plotting of effects (Daniels, 1959) 27
• Example 6.2 (A single replicate of the 24 design) – A 24 factorial was used to investigate the effects of four factors on the filtration rate of a resin – The factors are A = temperature, B = pressure, C = concentration of formaldehyde, D= stirring rate
28
29
â&#x20AC;˘ Estimates of the effects Model Error Error Error Error Error Error Error Error Error Error Error Error Error Error Error
Term Intercept A B C D AB AC AD BC BD CD ABC ABD ACD BCD ABCD
Effect
SumSqr % Contribution
21.625 3.125 9.875 14.625 0.125 -18.125 16.625 2.375 -0.375 -1.125 1.875 4.125 -1.625 -2.625 1.375
1870.56 39.0625 390.062 855.563 0.0625 1314.06 1105.56 22.5625 0.5625 5.0625 14.0625 68.0625 10.5625 27.5625 7.5625
Lenth's ME Lenth's SME
32.6397 0.681608 6.80626 14.9288 0.00109057 22.9293 19.2911 0.393696 0.00981515 0.0883363 0.245379 1.18763 0.184307 0.480942 0.131959
6.74778 13.699
30
â&#x20AC;˘ The normal probability plot of the effects N o rm a l p lo t 99
A
N o rm a l % p r o b a b ility
95 90
AD
80
C
70
D
50 30 20 10 5
AC 1
- 18.12
- 8.19
1.75
E ffe c t
11.69
21.62
31
Inte ra c tio n G ra p h
Inte ra c tio n G ra p h
C : C o n c e n tra tio n
104
D : S tirrin g R a te
104
88.4426
F iltr a tio n R a te
F iltra tio n R a te
88.75
72.8851
73.5
57.3277
58.25
41.7702
43
- 1.00
- 0.50
0.00
0.50
1.00
- 1.00
- 0.50
A : T e m p e r a tu re
0.00
0.50
A : T e m p e ra tu re
32
1.00
• B is not significant and all interactions involving B are negligible • Design projection: 24 design => 23 design in A,C and D • ANOVA table (Table 6.13)
33
Response:Filtration Rate ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Source Model A C D AC AD Residual Cor Total
Sum of Squares 5535.81 1870.56 390.06 855.56 1314.06 1105.56 195.12 5730.94
Std. Dev. Mean C.V.
4.42 70.06 6.30
R-Squared 0.9660 Adj R-Squared Pred R-Squared
0.9489 0.9128
PRESS
499.52
Adeq Precision
20.841
DF 5 1 1 1 1 1 10 15
Mean Square 1107.16 1870.56 390.06 855.56 1314.06 1105.56 19.51
F Value 56.74 95.86 19.99 43.85 67.34 56.66
Prob >F < 0.0001 < 0.0001 0.0012 < 0.0001 < 0.0001 < 0.0001
34
• The regression model: Final Equation in Terms of Coded Factors: Filtration Rate = +70.06250 +10.81250 * Temperature +4.93750 * Concentration +7.31250 * Stirring Rate -9.06250 * Temperature * Concentration +8.31250 * Temperature * Stirring Rate
• Residual Analysis (P. 251) • Response surface (P. 252)
35
N o rm a l p lo t o f re s id ua ls 99
N o rm a l % p ro b a b ility
95 90 80 70 50 30 20 10 5 1
- 1.83
- 0.96
- 0.09
0.78
1.65
S tu d e n tize d R e s id u a ls
36
â&#x20AC;˘ Half-normal plot: the absolute value of the effect estimates against the cumulative normal probabilities. H a lf N o rm a l p lo t
H a l f N o r m a l % p r o b a b i l i ty
99
97
A
95 90
AC
85
AD
80 70
C
D
60 40 20 0
0 .00
5 .41
1 0.81
|E ffe c t|
1 6.22
2 1.63
37
â&#x20AC;˘ Example 6.3 (Data transformation in a Factorial Design)
A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill 38
â&#x20AC;˘ The normal probability plot of the effect estimates H a lf N o rm a l p lo t 99
H a lf N o rm a l % p ro b a b ility
97
B
95 90
C
85
D
80
BD BC
70 60 40 20 0
0.00
1.61
3.22
4.83
6.44
|E ffe c t|
39
â&#x20AC;˘ Residual analysis R e s id ua ls vs . P re d ic te d
N o rm a l p lo t o f re s id ua ls 2.58625 99
1.44875
90 80 70
R e s id u a ls
N o rm a l % p r o b a b ility
95
50
0.31125
30 20 10
- 0.82625
5 1
- 1.96375 - 1.96375
- 0.82625
0.31125
R e s id u a l
1.44875
2.58625
1.69
4 .70
7.70
10.71
P re d ic te d
40
13.71
• The residual plots indicate that there are problems with the equality of variance assumption • The usual approach to this problem is to employ a transformation on the response • In this example,
y* = ln y
41
H a lf N o rm a l p lo t 99
H a lf N o rm a l % p ro b a b ility
97
B
95 90
No indication of large interaction effects
C
85
D
80
Three main effects are large
70
What happened to the interactions?
60 40 20 0
0.00
0.29
0.58
0.87
1.16
|E ffe c t|
42
Response: adv._rate Transform: Natural log Constant: 0.000 ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model 7.11 3 2.37 164.82 < 0.0001 B 5.35 1 5.35 371.49 < 0.0001 C 1.34 1 1.34 93.05 < 0.0001 D 0.43 1 0.43 29.92 0.0001 Residual 0.17 12 0.014 Cor Total 7.29 15 Std. Dev. 0.12 Mean 1.60 C.V. 7.51
R-Squared Adj R-Squared Pred R-Squared
0.9763 0.9704 0.9579
PRESS
Adeq Precision
34.391
0.31
43
â&#x20AC;˘ Following Log transformation Final Equation in Terms of Coded Factors: Ln(adv._rate) = +1.60 +0.58 * B +0.29 * C +0.16 * D
44
N o rm a l p lo t o f re s id ua ls
R e s id ua ls vs . P re d ic te d 0.194177
99
0.104087
90 80
R e s id u a ls
N o rm a l % p r o b a b ility
95
70 50
0.0139965
30 20 10
- 0.0760939
5 1
- 0.166184
- 0.166184
- 0.0760939
0.0139965
0.104087
0.194177
0.57
1.08
R e s id u a l
1.60
P r e d ic te d
45
2.11
2.63
• Example 6.4: – Two factors (A and D) affect the mean number of defects – A third factor (B) affects variability – Residual plots were useful in identifying the dispersion effect – The magnitude of the dispersion effects: 2 + S (i ) Fi* = ln 2 − S (i )
– When variance of positive and negative are equal, this statistic has an approximate normal distribution 46
6.6 The Addition of Center Points to the 2k Design • Based on the idea of replicating some of the runs in a factorial design • Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models: k
k
k
First-order model (interaction) y = β 0 + ∑ β i xi + ∑∑ β ij xi x j + ε i =1
k
k
i =1 j >i
k
k
Second-order model y = β 0 + ∑ β i xi + ∑∑ β ij xi x j + ∑ β ii xi2 + ε i =1
i =1 j >i
i =1
47
yF = yC ⇒ no "curvature" The hypotheses are: k
H 0 : ∑ β ii = 0 i =1 k
H1 : ∑ β ii ≠ 0 i =1
SS Pure Quad
nF nC ( yF − yC ) 2 = nF + nC
This sum of squares has a single degree of freedom 48
â&#x20AC;˘ Example 6.6
nC = 5 Usually between 3 and 6 center points will work well Design-Expert provides the analysis, including the F-test for pure quadratic curvature 49
Response: yield ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Source Model A B AB Curvature Pure Error Cor Total
Sum of Squares 2.83 2.40 0.42 2.500E-003 2.722E-003 0.17 3.00
Std. Dev. Mean
0.21 40.44
R-Squared Adj R-Squared
C.V.
0.51
Pred R-Squared
N/A
PRESS
N/A
Adeq Precision
14.234
DF 3 1 1 1 1 4 8
Mean Square 0.94 2.40 0.42 2.500E-003 2.722E-003 0.043
F Value 21.92 55.87 9.83 0.058 0.063
Prob > F 0.0060 0.0017 0.0350 0.8213 0.8137
0.9427 0.8996
50
â&#x20AC;˘ If curvature is significant, augment the design with axial runs to create a central composite design. The CCD is a very effective design for fitting a second-order response surface model
51