Response Surface Methods for Optimization
The experiment designs mentioned in Two Level Factorial Experiments and Highly Fractional Factorial Designs help the experimenter identify factors that affect the response. Once the important factors have been identified, the next step is to determine the settings for these factors that result in the optimum value of the response. The optimum value of the response may either be a maximum value or a minimum value, depending upon the product or process in question. For example, if the response in an experiment is the yield from a chemical process, then the objective might be to find the settings of the factors affecting the yield so that the yield is maximized. On the other hand, if the response in an experiment is the number of defects, then the goal would be to find the factor settings that minimize the number of defects. Methodologies that help the experimenter reach the goal of optimum response are referred to as response surface methods. These methods are exclusively used to examine the "surface," or the relationship between the response and the factors affecting the response. Regression models are used for the analysis of the response, as the focus now is on the nature of the relationship between the response and the factors, rather than identification of the important factors.
Response surface methods usually involve the following steps:
- The experimenter needs to move from the present operating conditions to the vicinity of the operating conditions where the response is optimum. This is done using the method of steepest ascent in the case of maximizing the response. The same method can be used to minimize the response and is then referred to as the method of steepest descent.
- Once in the vicinity of the optimum response the experimenter needs to fit a more elaborate model between the response and the factors. Special experiment designs, referred to as RSM designs, are used to accomplish this. The fitted model is used to arrive at the best operating conditions that result in either a maximum or minimum response.
- It is possible that a number of responses may have to be optimized at the same time. For example, an experimenter may want to maximize strength, while keeping the number of defects to a minimum. The optimum settings for each of the responses in such cases may lead to conflicting settings for the factors. A balanced setting has to be found that gives the most appropriate values for all the responses. Desirability functions are useful in these cases.
Method of Steepest Ascent
The first step in obtaining the optimum response settings, after the important factors have been identified, is to explore the region around the current operating conditions to decide what direction needs to be taken to move towards the optimum region. Usually, a first order regression model (containing just the main effects and no interaction terms) is sufficient at the current operating conditions because the operating conditions are normally far from the optimum response settings. The experimenter needs to move from the current operating conditions to the optimum region in the most efficient way by using the minimum number of experiments. This is done using the method of steepest ascent. In this method, the contour plot of the first order model is used to decide the settings for the next experiment, in order to move towards the optimum conditions. Consider a process where the response has been found to be a function of two factors. To explore the region around the current operating conditions, the experimenter fits the following first order model between the response and the two factors:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$
The response surface plot for the model, along with the contours, is shown in the figure below. It can be seen in the figure that in order to maximize the response, the most efficient direction in which to move the experiment is along the line perpendicular to the contours. This line, also referred to as the path of steepest ascent, is the line along which the rate of increase of the response is maximum. The steps along this line to move towards the optimum region are proportional to the regression coefficients, $\hat{\beta}_1$ and $\hat{\beta}_2$, of the fitted first order model.
Experiments are conducted along each step of the path of steepest ascent until an increase in the response is not seen. Then, a new first order model is fit at the region of the maximum response. If the first order model shows a lack of fit, then this indicates that the experimenter has reached the vicinity of the optimum. RSM designs are then used to explore the region thoroughly and obtain the point of the maximum response. If the first order model does not show a lack of fit, then a new path of steepest ascent is determined and the process is repeated.
The yield from a chemical process is found to be affected by two factors: reaction temperature and reaction time. The current reaction temperature is 230 and the reaction time is 65 minutes. The experimenter wants to determine the settings of the two factors such that maximum yield can be obtained from the process. To explore the region around the current operating conditions, the experimenter decides to use a single replicate of the $2^2$ design. The ranges of the factors for this design are chosen to be (225, 235) for the reaction temperature and (55, 75) minutes for the reaction time. The unreplicated $2^2$ design is also augmented with five runs at the center point to estimate the error sum of squares, $SS_E$, and check for model adequacy. The response values obtained for this design are shown next.
In DOE++, this design can be set up using the properties shown next.
The resulting design and the analysis results are shown next.
Note that the results shown are in terms of the coded values of the factors (taking -1 as the value of the lower settings for reaction temperature and reaction time and +1 as the value for the higher settings for these two factors). The results show that the factors, $A$ (temperature) and $B$ (time), affect the response significantly but their interaction does not affect the response. Therefore the interaction term can be dropped from the model for this experiment. The results also show that Curvature is not a significant factor. This indicates that the first order model is adequate for the experiment at the current operating conditions. Using these two conclusions, the model for the current operating conditions, in terms of the coded variables, is:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$
where $\hat{y}$ represents the yield and $x_1$ and $x_2$ are the predictor variables for the two factors, $A$ and $B$, respectively. To further confirm the adequacy of the model of the equation given above, the experiment can be analyzed again after dropping the interaction term, $AB$. The results are shown next.
The results show that the lack-of-fit for this model (because of the deficiency created in the model by the absence of the interaction term) is not significant, confirming that the model is adequate.
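The checks described above can be sketched numerically. The following is a minimal illustration, assuming a $2^2$ design in coded units with replicated center points; the yield values are made up for the sketch (they are not the data of this example), and the single-degree-of-freedom curvature statistic shown is the standard textbook form, which may differ in detail from what DOE++ reports.

```python
# Sketch: first order coefficients and a curvature check for a 2^2 design
# with center points. All yield values below are hypothetical.

def first_order_fit_with_curvature(factorial_runs, center_yields):
    """factorial_runs: list of (x1, x2, y) with x1, x2 at coded -1/+1."""
    n_f = len(factorial_runs)
    n_c = len(center_yields)
    # In coded units the columns are orthogonal, so each coefficient is a
    # simple contrast divided by the number of factorial runs.
    b1 = sum(x1 * y for x1, _, y in factorial_runs) / n_f
    b2 = sum(x2 * y for _, x2, y in factorial_runs) / n_f
    b12 = sum(x1 * x2 * y for x1, x2, y in factorial_runs) / n_f
    mean_f = sum(y for _, _, y in factorial_runs) / n_f
    mean_c = sum(center_yields) / n_c
    # Intercept of the first order model is the mean of all runs.
    b0 = (n_f * mean_f + n_c * mean_c) / (n_f + n_c)
    # Curvature: compare the factorial mean with the center point mean.
    ss_curvature = n_f * n_c * (mean_f - mean_c) ** 2 / (n_f + n_c)
    # Pure error is estimated from the replicated center points only.
    ms_pure_error = sum((y - mean_c) ** 2 for y in center_yields) / (n_c - 1)
    return {
        "b0": b0, "b1": b1, "b2": b2, "b12": b12,
        "ss_curvature": ss_curvature,
        "f_curvature": ss_curvature / ms_pure_error,
    }

# Hypothetical data: four factorial runs and five center point runs.
runs = [(-1, -1, 35.1), (1, -1, 37.3), (-1, 1, 36.0), (1, 1, 38.4)]
centers = [36.6, 36.9, 36.7, 37.0, 36.8]
result = first_order_fit_with_curvature(runs, centers)
print(result)
```

The curvature F ratio would be compared against an F distribution with 1 and $n_c - 1$ degrees of freedom; a small value supports keeping the first order model.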
Path of Steepest Ascent
The contour plot for the model used in the above example is shown next.
The regression coefficients for the model are $\hat{\beta}_1 = 1.1625$ and $\hat{\beta}_2 = 0.4875$. To move towards the optimum, the experimenter needs to move along the path of steepest ascent, which lies perpendicular to the contours. This path is the line through the center point of the current operating conditions ($x_1 = 0$, $x_2 = 0$) with a slope of $\hat{\beta}_2/\hat{\beta}_1$. Therefore, in terms of the coded variables, the experiment should be moved 1.1625 units in the $x_1$ direction for every 0.4875 units in the $x_2$ direction. To move along this path, the experimenter decides to use a step-size of 10 minutes for the reaction time, $B$. The coded value for this step size can be obtained as follows. Recall from Multiple Linear Regression Analysis that the relationship between coded and actual values is:
$$\text{coded value} = \frac{\text{actual value} - \text{mean}}{\text{half of range}}$$
Because a step is a difference between two settings, the mean cancels and only the half-range matters. Thus, for a step-size of 10 minutes, the equivalent step size in coded value for $x_2$ is:
$$\Delta x_2 = \frac{10}{(75 - 55)/2} = 1$$
In terms of the coded variables, the path of steepest ascent requires a move of 1.1625 units in the $x_1$ direction for every 0.4875 units in the $x_2$ direction. The step-size for $x_1$, in terms of the coded value corresponding to any step-size in $x_2$, is:
$$\Delta x_1 = \frac{\hat{\beta}_1}{\hat{\beta}_2} \Delta x_2$$
Therefore, the step-size for the reaction temperature, $A$, in terms of the coded variables is:
$$\Delta x_1 = \frac{1.1625}{0.4875} (1) = 2.3846$$
This corresponds to a step of approximately 12 for temperature in terms of the actual value as shown next:
$$\Delta A = 2.3846 \times \frac{235 - 225}{2} = 11.92 \approx 12$$
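The step-size arithmetic above can be reproduced with a short script. This is a minimal sketch that uses the coefficients (1.1625 and 0.4875), the center point (230, 65) and the factor ranges quoted in this example; the function and variable names are illustrative and not part of any package.

```python
# Sketch: step sizes and settings along the path of steepest ascent.

def to_coded(actual, low, high):
    """Convert an actual factor value to coded units (-1 at low, +1 at high)."""
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (actual - center) / half_range

def steepest_ascent_path(b1, b2, center, half_range, step2_actual, n_steps):
    """Return the coded and actual step in factor 1 and the path settings.

    The step in factor 2 is fixed in actual units; the step in factor 1
    follows from the ratio of the first order coefficients.
    """
    dx2 = step2_actual / half_range[1]   # coded step in x2
    dx1 = (b1 / b2) * dx2                # coded step in x1
    step1_actual = dx1 * half_range[0]   # actual step in factor 1
    path = [(center[0] + i * step1_actual, center[1] + i * step2_actual)
            for i in range(n_steps + 1)]
    return dx1, step1_actual, path

dx1, temp_step, path = steepest_ascent_path(
    b1=1.1625, b2=0.4875,
    center=(230.0, 65.0), half_range=(5.0, 10.0),
    step2_actual=10.0, n_steps=10)

print(round(dx1, 4), round(temp_step, 2))
for temperature, time in path:
    print(round(temperature, 1), time)
```

The script keeps the unrounded temperature step of about 11.92, so its settings drift slightly from a path built with the rounded step of 12.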
Using a step of 12 and 10 minutes, the experimenter conducts experiments until no further increase is observed in the yield. The yield values at each step are shown in the table given next.
The yield starts decreasing after the reaction temperature of 350 and the reaction time of 165 minutes, indicating that this point may lie close to the optimum region. To analyze the vicinity of this point, a $2^2$ design augmented by five center points is selected. The range of exploration is chosen to be 345 to 355 for reaction temperature and 155 to 175 minutes for reaction time. The response values recorded are shown next.
The results for this design are shown next.
In the results, Curvature is displayed as a significant factor. This indicates that the first order model is not adequate for this region of the experiment and a higher order model is required. As a result, the methodology of steepest ascent can no longer be used. The presence of curvature indicates that the experiment region may be close to the optimum. Special designs that allow the use of second order models are needed at this point.
A second order model is generally used to approximate the response once it is realized that the experiment is close to the optimum response region where a first order model is no longer adequate. The second order model is usually sufficient for the optimum region, as third order and higher effects are seldom important. The second order regression model takes the following form for $k$ factors:
$$\hat{y} = \hat{\beta}_0 + \sum_{i=1}^{k} \hat{\beta}_i x_i + \sum_{i=1}^{k} \hat{\beta}_{ii} x_i^2 + \sum_{i<j} \hat{\beta}_{ij} x_i x_j$$
The model contains $1 + 2k + k(k-1)/2$ regression parameters that include coefficients for main effects ($\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_k$), coefficients for quadratic main effects ($\hat{\beta}_{11}, \hat{\beta}_{22}, \ldots, \hat{\beta}_{kk}$) and coefficients for two factor interaction effects ($\hat{\beta}_{12}, \hat{\beta}_{13}, \ldots, \hat{\beta}_{k-1,k}$). A full factorial design with all factors at three levels would provide estimation of all the required regression parameters. However, full factorial three level designs are expensive to use as the number of runs increases rapidly with the number of factors. For example, a three factor full factorial design with each factor at three levels would require $3^3 = 27$ runs while a design with four factors would require $3^4 = 81$ runs. Additionally, these designs will estimate a number of higher order effects which are usually not of much importance to the experimenter. Therefore, for the purpose of analysis of response surfaces, special designs are used that help the experimenter fit the second order model to the response with the use of a minimum number of runs. Examples of these designs are the central composite and Box-Behnken designs.
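To make the run-count argument concrete, the sketch below compares the number of parameters in a second order model with the number of runs in a full three level factorial design. The function names are illustrative only.

```python
# Sketch: parameters of a second order model versus full 3^k factorial runs.

def second_order_parameter_count(k):
    """Intercept + k main effects + k quadratic terms + k(k-1)/2 interactions."""
    return 1 + 2 * k + k * (k - 1) // 2

def full_three_level_runs(k):
    """Runs in a full factorial design with k factors at three levels each."""
    return 3 ** k

for k in (2, 3, 4, 5):
    print(k, second_order_parameter_count(k), full_three_level_runs(k))
```

The gap between the two columns grows quickly with the number of factors, which is the motivation for designs that support a second order model with far fewer runs.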
Central Composite Designs
Central composite designs are two level full factorial ($2^k$) or fractional factorial ($2^{k-f}$) designs augmented by a number of center points and other chosen runs. These designs allow the estimation of all the regression parameters required to fit a second order model to a given response.
The simplest of the central composite designs can be used to fit a second order model to a response with two factors. The design consists of a $2^2$ full factorial design augmented by a few runs at the center point (such a design is shown in figure (a) given below). A central composite design is obtained when runs at points ($-\alpha$, $0$), ($\alpha$, $0$), ($0$, $-\alpha$) and ($0$, $\alpha$) are added to this design. These points are referred to as axial points or star points and represent runs where all but one of the factors are set at their mid-levels. The number of axial points in a central composite design having $k$ factors is $2k$. The distance of the axial points from the center point is denoted by $\alpha$ and is always specified in terms of coded values. The designs in figures (b) and (c) given below use different values of $\alpha$; for the design of figure (c), $\alpha = 1.414$.
It can be noted that when $\alpha > 1$, each factor is run at five levels ($-\alpha$, $-1$, $0$, $1$ and $\alpha$) instead of the three levels of $-1$, $0$ and $1$. The reason for running central composite designs with $\alpha > 1$ is to have a rotatable design, which is explained next.
A central composite design is said to be rotatable if the variance of any predicted value of the response, $\hat{y}$, for any level of the factors depends only on the distance of the point from the center of the design, regardless of the direction. In other words, a rotatable central composite design provides constant variance of the estimated response corresponding to all new observation points that are at the same distance from the center point of the design (in terms of the coded variables).
The variance of the predicted response at any point, $x$, is given as follows:
$$Var[\hat{y}(x)] = \sigma^2 x'(X'X)^{-1}x$$
where $X$ is the design matrix of the fitted model, $x$ is the point expanded into the same model terms, and $\sigma^2$ is the error variance.
The contours of $Var[\hat{y}(x)]$ for the central composite design in figure (c) above are shown in the figure below. The contours are concentric circles, indicating that the central composite design of figure (c) is rotatable. Rotatability is a desirable property because the experimenter does not have any prior information about the location of the optimum. Therefore, a design that provides equal precision of estimation in all directions would be preferred. Such a design assures the experimenter that no matter what direction is taken to search for the optimum, the response value can be estimated with equal precision. A central composite design is rotatable if the value of $\alpha$ for the design satisfies the following equation:
$$\alpha = \left[ \frac{2^{k-f} \, n_f}{n_s} \right]^{1/4}$$
where $n_f$ is the number of replicates of the runs in the original factorial design and $n_s$ is the number of replicates of the runs at the axial points. For example, a central composite design with two factors, having a single replicate of the original $2^2$ factorial design, and a single replicate of all the axial points, would be rotatable for the following $\alpha$ value:
$$\alpha = \left[ \frac{2^{2} (1)}{1} \right]^{1/4} = 1.414$$
Thus, a central composite design in two factors, having a single replicate of the original $2^2$ design and axial points, and with $\alpha = 1.414$, is a rotatable design. This design is shown in figure (c) above.
A central composite design is said to be spherical if all factorial and axial points are at the same distance from the center of the design. Spherical central composite designs are obtained by setting $\alpha = \sqrt{k}$. For example, the rotatable design in figure (c) above is also a spherical design because for this design $\alpha = \sqrt{2} = 1.414$.
Central composite designs in which the axial points lie at the centers of the faces of the factorial region are referred to as face-centered central composite designs. For these designs, $\alpha = 1$ and all factors are run at three levels, which are $-1$, $0$ and $1$ in terms of the coded values (see the figure below).
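The rotatability property can be checked numerically. The sketch below builds a two factor central composite design in coded units, computes the scaled prediction variance $x'(X'X)^{-1}x$ for a full second order model at several points on a circle of radius 1, and compares a design using the rotatable $\alpha$ with one using $\alpha = 1$. It assumes five center points; all function names are illustrative, and the small linear solver is included only to keep the example free of external libraries.

```python
# Sketch: numerical check of rotatability for a two factor CCD.
from itertools import product
from math import cos, sin, pi

def rotatable_alpha(k, f=0, n_f=1, n_s=1):
    """alpha = [2^(k-f) * n_f / n_s]^(1/4)."""
    return (2 ** (k - f) * n_f / n_s) ** 0.25

def central_composite_design(k, alpha, n_center):
    """Factorial, axial and center points of a full factorial CCD (coded)."""
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    return factorial + axial + [(0.0,) * k] * n_center

def model_row(x1, x2):
    """Terms of the second order model for two factors."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def scaled_prediction_variance(design, point):
    """Return x0'(X'X)^-1 x0, which is Var[y_hat(x0)] divided by sigma^2."""
    rows = [model_row(*p) for p in design]
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    x0 = model_row(*point)
    return sum(u * v for u, v in zip(x0, solve(xtx, x0)))

alpha = rotatable_alpha(2)
angles = (0.0, pi / 6, pi / 4, pi / 3)
rotatable = central_composite_design(2, alpha, n_center=5)
face_centered = central_composite_design(2, 1.0, n_center=5)
var_rot = [scaled_prediction_variance(rotatable, (cos(t), sin(t))) for t in angles]
var_fc = [scaled_prediction_variance(face_centered, (cos(t), sin(t))) for t in angles]
print(round(alpha, 4), len(rotatable))
print([round(v, 6) for v in var_rot])
print([round(v, 6) for v in var_fc])
```

For the rotatable design the values agree across directions, while for $\alpha = 1$ they change with the angle.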
In Highly Fractional Factorial Designs, highly fractional designs introduced by Plackett and Burman were discussed. Plackett-Burman designs are used to estimate main effects in the case of two level fractional factorial experiments using very few runs. [G. E. P. Box and D. W. Behnken (1960)] introduced similar designs for three level factors that are widely used in response surface methods to fit second-order models to the response. The designs are referred to as Box-Behnken designs. The designs were developed by the combination of two level factorial designs with incomplete block designs. For example, the figure below shows the Box-Behnken design for three factors. The design is obtained by the combination of a $2^2$ design with a balanced incomplete block design having three treatments and three blocks (for details see [Box 1960, Montgomery 2001]).
The advantages of Box-Behnken designs include the fact that they are all spherical designs and require factors to be run at only three levels. The designs are also rotatable or nearly rotatable. Some of these designs also provide orthogonal blocking. Thus, if there is a need to separate runs into blocks for the Box-Behnken design, then designs are available that allow blocks to be used in such a way that the estimation of the regression parameters for the factor effects is not affected by the blocks. In other words, in these designs the block effects are orthogonal to the other factor effects. Yet another advantage of these designs is that there are no runs where all factors are at either the $-1$ or $+1$ levels. For example, in the figure below the representation of the Box-Behnken design for three factors clearly shows that there are no runs at the corner points. This could be advantageous when the corner points represent runs that are expensive or inconvenient because they lie at the end of the range of the factor levels. A few of the Box-Behnken designs available in DOE++ are presented in Appendix F.
Continuing with the example in Method of Steepest Ascent, the first order model was found to be inadequate for the region near the optimum. Once the experimenter realized that the first order model was not adequate (for the region with a reaction temperature of 350 and reaction time of 165 minutes), it was decided to augment the experiment with axial runs to be able to complete a central composite design and fit a second order model to the response. Notice the advantage of using a central composite design, as the experimenter only had to add the axial runs to the $2^2$ design with center point runs, and did not have to begin a new experiment. The experimenter decided to use $\alpha = 1.414$ to get a rotatable design. The obtained response values are shown in the figure below.
Such a design can be set up in DOE++ using the properties shown in the figure below.
The resulting design is shown in the next figure.
Results from the analysis of the design are shown in the next figure.
The results in the figure above show that the main effects, $A$ and $B$, the interaction, $AB$, and the quadratic main effects, $A^2$ and $B^2$ (represented as AA and BB in the figure), are significant. The lack-of-fit test also shows that the second order model with these terms is adequate and a higher order model is not needed. Using these results, the model for the experiment in terms of the coded values has the form:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_{12} x_1 x_2 + \hat{\beta}_{11} x_1^2 + \hat{\beta}_{22} x_2^2$$
with the coefficient estimates taken from the analysis results in the figure above.
The response surface and the contour plot for this model, in terms of the actual variables, are shown in figures (a) and (b) below, respectively.
Analysis of the Second Order Model
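Once a second order model is fit to the response, the next step is to locate the point of maximum or minimum response. The second order model for $k$ factors can be written as:
$$\hat{y} = \hat{\beta}_0 + \sum_{i=1}^{k} \hat{\beta}_i x_i + \sum_{i=1}^{k} \hat{\beta}_{ii} x_i^2 + \sum_{i<j} \hat{\beta}_{ij} x_i x_j$$
The point for which the response, $\hat{y}$, is optimized is the point at which the partial derivatives, $\partial \hat{y}/\partial x_1$, $\partial \hat{y}/\partial x_2$, ..., $\partial \hat{y}/\partial x_k$, are all equal to zero. This point is called the stationary point. The stationary point may be a point of maximum response, minimum response or a saddle point. These three conditions are shown in the following figures (a), (b) and (c) respectively.
Notice that these conditions are easy to identify, in the case of two factor experiments, by the inspection of the contour plots. However, when more than two factors exist in an experiment, then the general mathematical solution for the location of the stationary point has to be used. The equation given above can be written in matrix notation as:
$$\hat{y} = \hat{\beta}_0 + x'b + x'Bx$$
where:
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}, \quad b = \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix}, \quad B = \begin{bmatrix} \hat{\beta}_{11} & \hat{\beta}_{12}/2 & \cdots & \hat{\beta}_{1k}/2 \\ \hat{\beta}_{12}/2 & \hat{\beta}_{22} & \cdots & \hat{\beta}_{2k}/2 \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\beta}_{1k}/2 & \hat{\beta}_{2k}/2 & \cdots & \hat{\beta}_{kk} \end{bmatrix}$$
Then the stationary point can be determined as follows:
$$\frac{\partial \hat{y}}{\partial x} = b + 2Bx = 0$$
Thus, the stationary point is:
$$x_s = -\frac{1}{2} B^{-1} b$$
The optimum response is the response corresponding to $x_s$. Substituting $x_s$ into the model gives:
$$\hat{y}_s = \hat{\beta}_0 + x_s'b + x_s'Bx_s = \hat{\beta}_0 + \frac{1}{2} x_s'b$$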
Once the stationary point is known, it is necessary to determine if it is a maximum or minimum or saddle point. To do this, the second order model has to be transformed to the canonical form. This is done by transforming the model to a new coordinate system such that the origin lies at the stationary point and the axes are parallel to the principal axes of the fitted response surface, shown next.
The resulting model equation then takes the following form:
$$\hat{y} = \hat{y}_s + \lambda_1 w_1^2 + \lambda_2 w_2^2 + \cdots + \lambda_k w_k^2$$
where the $w_i$s are the transformed independent variables, and the $\lambda_i$s are constants that are also the eigenvalues of the matrix $B$. The nature of the stationary point is known by looking at the signs of the $\lambda_i$s. If the $\lambda_i$s are all negative, then $x_s$ is a point of maximum response. If the $\lambda_i$s are all positive, then $x_s$ is a point of minimum response. If the $\lambda_i$s have different signs, then $x_s$ is a saddle point.
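For two factors, the stationary point and the eigenvalues can be written out in closed form. The sketch below is a minimal illustration under that assumption; the coefficient values in the usage lines are hypothetical and are not the fitted model of this example, and the function name is illustrative.

```python
# Sketch: stationary point and canonical analysis for a two factor
# second order model y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2.
from math import sqrt

def analyze_second_order_2d(b0, b1, b2, b11, b22, b12):
    # B has the quadratic coefficients on the diagonal and half of the
    # interaction coefficient off the diagonal.
    h = b12 / 2.0
    det = b11 * b22 - h * h
    if det == 0:
        raise ValueError("B is singular; no unique stationary point")
    # x_s = -0.5 * B^-1 * b, using the explicit inverse of a 2x2 matrix.
    xs1 = -0.5 * (b22 * b1 - h * b2) / det
    xs2 = -0.5 * (-h * b1 + b11 * b2) / det
    # Response at the stationary point: y_s = b0 + 0.5 * x_s' b.
    ys = b0 + 0.5 * (xs1 * b1 + xs2 * b2)
    # Eigenvalues of the symmetric 2x2 matrix B.
    mean = (b11 + b22) / 2.0
    spread = sqrt(((b11 - b22) / 2.0) ** 2 + h * h)
    eigenvalues = (mean - spread, mean + spread)
    if eigenvalues[1] < 0:
        kind = "maximum"
    elif eigenvalues[0] > 0:
        kind = "minimum"
    else:
        kind = "saddle point"
    return (xs1, xs2), ys, eigenvalues, kind

# Hypothetical coefficients, chosen so the answer is easy to verify by hand.
point, response, eigenvalues, kind = analyze_second_order_2d(
    b0=10.0, b1=4.0, b2=2.0, b11=-2.0, b22=-1.0, b12=0.0)
print(point, response, eigenvalues, kind)
```

With more than two factors the same steps apply, but the inverse and the eigenvalues would normally be computed with a numerical linear algebra routine rather than by hand.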
Then the $b$ and $B$ matrices for this model are:
The stationary point is:
Then, in terms of the actual values, the stationary point can be found as:
To find the nature of the stationary point, the eigenvalues of the matrix $B$ can be obtained by setting the determinant of the matrix $B - \lambda I$ to zero:
$$|B - \lambda I| = 0$$
This gives us:
Solving the quadratic equation in $\lambda$ returns the eigenvalues $\lambda_1$ and $\lambda_2$. Since both the eigenvalues are negative, it can be concluded that the stationary point is a point of maximum response. The predicted value of the maximum response can be obtained as:
In DOE++, the maximum response can be obtained by entering the required values as shown in the figure below. In the figure, the goal is to maximize the response and the limits of the search range for maximizing the response are entered as 90 and 100. The value of the maximum response and the corresponding values of the factors obtained are shown in the second figure following. These values match the values calculated in this example.
In many cases, the experimenter has to optimize a number of responses at the same time. For the example in Method of Steepest Ascent, assume that the experimenter has to also consider two other responses: cost of the product (which should be minimized) and the pH of the product (which should be close to 7 so that the product is neither acidic nor basic). The data is presented in the figure below.
The problem in dealing with multiple responses is that now there might be conflicting objectives because of the different requirements of each of the responses. The experimenter needs to come up with a solution that satisfies each of the requirements as much as possible without compromising too much on any of the requirements. The approach used in DOE++ to deal with optimization of multiple responses involves the use of desirability functions that are discussed next (for details see [Derringer and Suich, 1980]).
Under this approach, each $i$th response is assigned a desirability function, $d_i$, where the value of $d_i$ varies between 0 and 1. The function, $d_i$, is defined differently based on the objective of the response. If the response is to be maximized, as in the case of the previous example where the yield had to be maximized, then $d_i$ is defined as follows:
$$d_i = \begin{cases} 0 & y_i < L_i \\ \left( \dfrac{y_i - L_i}{T_i - L_i} \right)^{\omega} & L_i \le y_i \le T_i \\ 1 & y_i > T_i \end{cases}$$
where $T_i$ represents the target value of the $i$th response, $y_i$, $L_i$ represents the acceptable lower limit value for this response and $\omega$ represents the weight. When $\omega = 1$ the function $d_i$ is linear. If $\omega > 1$ then more importance is placed on achieving the target for the response, $y_i$. When $\omega < 1$, less weight is assigned to achieving the target for the response, $y_i$. A graphical representation is shown in figure (a) below.
If the response is to be minimized, as in the case when the response is cost, $d_i$ is defined as follows:
$$d_i = \begin{cases} 1 & y_i < T_i \\ \left( \dfrac{U_i - y_i}{U_i - T_i} \right)^{\omega} & T_i \le y_i \le U_i \\ 0 & y_i > U_i \end{cases}$$
Here $U_i$ represents the acceptable upper limit for the response (see figure (b) above).
There may be times when the experimenter wants the response to be neither maximized nor minimized, but instead stay as close to a specified target as possible. For example, in the case where the experimenter wants the product to be neither acidic nor basic, there is a requirement to keep the pH close to the neutral value of 7. In such cases, the desirability function is defined as follows (see figure (c) above):
$$d_i = \begin{cases} 0 & y_i < L_i \\ \left( \dfrac{y_i - L_i}{T_i - L_i} \right)^{\omega_1} & L_i \le y_i \le T_i \\ \left( \dfrac{U_i - y_i}{U_i - T_i} \right)^{\omega_2} & T_i \le y_i \le U_i \\ 0 & y_i > U_i \end{cases}$$
Once a desirability function is defined for each of the responses, assuming that there are $m$ responses, an overall desirability function is obtained as follows:
$$D = \left( d_1^{r_1} \cdot d_2^{r_2} \cdots d_m^{r_m} \right)^{1/(r_1 + r_2 + \cdots + r_m)}$$
where the $r_i$s represent the importance of each response. The greater the value of $r_i$, the more important the response with respect to the other responses. The objective is now to find the settings that return the maximum value of $D$.
To illustrate the use of desirability functions, consider the previous example with the three responses of yield, cost and pH. The response surfaces for the two additional responses of cost and pH are shown next in the figures (a) and (b), respectively.
In terms of actual variables, the models obtained for all three responses are as shown next:
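Assume that the experimenter wants to have a target yield value of 95, although any value of yield greater than 94 is acceptable. Then the desirability function for yield, $y_1$, is:
$$d_1 = \begin{cases} 0 & y_1 < 94 \\ \dfrac{y_1 - 94}{95 - 94} & 94 \le y_1 \le 95 \\ 1 & y_1 > 95 \end{cases}$$
For the cost, assume that the experimenter wants to lower the cost to 400, although any cost value below 415 is acceptable. Then the desirability function for cost, $y_2$, is:
$$d_2 = \begin{cases} 1 & y_2 < 400 \\ \dfrac{415 - y_2}{415 - 400} & 400 \le y_2 \le 415 \\ 0 & y_2 > 415 \end{cases}$$
For the pH, a target of 7 is desired but values between 6.9 and 7.1 are also acceptable. Thus, the desirability function for pH, $y_3$, is:
$$d_3 = \begin{cases} 0 & y_3 < 6.9 \\ \dfrac{y_3 - 6.9}{7 - 6.9} & 6.9 \le y_3 \le 7 \\ \dfrac{7.1 - y_3}{7.1 - 7} & 7 \le y_3 \le 7.1 \\ 0 & y_3 > 7.1 \end{cases}$$
Notice that in the previous equations all weights used ($\omega$s) are 1. Thus, all three desirability functions are linear. The overall desirability function, assuming equal importance ($r_i = 1$) for all the responses, is:
$$D = (d_1 \cdot d_2 \cdot d_3)^{1/3}$$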
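The desirability calculation can be sketched in a few lines. The functions below follow the three cases described in this section (maximize, minimize, target) and use the limits of this example (94 and 95 for yield, 400 and 415 for cost, 6.9, 7 and 7.1 for pH). The response values passed in at the end are hypothetical, chosen only to exercise the functions, and the function names are illustrative rather than part of DOE++.

```python
# Sketch: individual and overall desirability functions.

def desirability_maximize(y, lower, target, weight=1.0):
    """0 below the lower limit, 1 above the target, power curve in between."""
    if y < lower:
        return 0.0
    if y > target:
        return 1.0
    return ((y - lower) / (target - lower)) ** weight

def desirability_minimize(y, target, upper, weight=1.0):
    """1 below the target, 0 above the upper limit, power curve in between."""
    if y < target:
        return 1.0
    if y > upper:
        return 0.0
    return ((upper - y) / (upper - target)) ** weight

def desirability_target(y, lower, target, upper, weight_low=1.0, weight_high=1.0):
    """0 outside the limits, rising to 1 at the target from either side."""
    if y < lower or y > upper:
        return 0.0
    if y <= target:
        return ((y - lower) / (target - lower)) ** weight_low
    return ((upper - y) / (upper - target)) ** weight_high

def overall_desirability(values, importances=None):
    """Weighted geometric mean of the individual desirabilities."""
    if importances is None:
        importances = [1.0] * len(values)
    combined = 1.0
    for d, r in zip(values, importances):
        combined *= d ** r
    return combined ** (1.0 / sum(importances))

# Hypothetical response values (not results from this example).
d_yield = desirability_maximize(94.5, lower=94.0, target=95.0)
d_cost = desirability_minimize(406.0, target=400.0, upper=415.0)
d_ph = desirability_target(7.05, lower=6.9, target=7.0, upper=7.1)
D = overall_desirability([d_yield, d_cost, d_ph])
print(d_yield, d_cost, d_ph, D)
```

Because the overall value is a geometric mean, a single response with zero desirability drives the overall desirability to zero, no matter how good the other responses are.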
The objective of the experimenter is to find the settings of the two factors, reaction temperature and reaction time, such that the overall desirability, $D$, is maximum. In DOE++, the settings for the desirability functions for each of the three responses can be entered as shown in the next figure.
Based on these settings, DOE++ solves this optimization problem to obtain the following solution:
The overall desirability achieved with this solution can be calculated easily. The values of each of the responses for these settings are:
Based on the response values, the individual desirability functions are:
Then the overall desirability is:
This is the same as the Global Desirability displayed by DOE++ in the figure above. At times, a number of solutions may be obtained from DOE++, and it is up to the experimenter to choose the most feasible one.