The Lognormal Distribution
The lognormal distribution is commonly used to model the lives of units whose failure modes are of a fatigue-stress nature. Since this includes most, if not all, mechanical systems, the lognormal distribution can have widespread application. Consequently, the lognormal distribution is a good companion to the Weibull distribution when attempting to model these types of units. As may be surmised by the name, the lognormal distribution has certain similarities to the normal distribution. A random variable is lognormally distributed if the logarithm of the random variable is normally distributed. Because of this, there are many mathematical similarities between the two distributions. For example, the mathematical reasoning for the construction of the probability plotting scales and the bias of parameter estimators is very similar for these two distributions.
Lognormal Probability Density Function
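The lognormal distribution is a 2-parameter distribution with parameters $\mu'$ and $\sigma'$. The pdf for this distribution is given by:
$$f(t') = \frac{1}{\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t'-\mu'}{\sigma'}\right)^{2}}$$
where:
- $t' = \ln(t)$, and the $t$ values are the times-to-failure
- $\mu'$ = mean of the natural logarithms of the times-to-failure
- $\sigma'$ = standard deviation of the natural logarithms of the times-to-failure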
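The lognormal pdf can be obtained by realizing that, for equal probabilities under the normal and lognormal pdfs, incremental areas should also be equal, or:
$$f(t)\,dt = f(t')\,dt'$$
Taking the derivative of $t' = \ln(t)$ yields:
$$dt' = \frac{dt}{t}$$
Substitution yields:
$$f(t) = \frac{f(t')}{t} = \frac{1}{t\,\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(t)-\mu'}{\sigma'}\right)^{2}}$$
where $t > 0$, $-\infty < \mu' < \infty$ and $\sigma' > 0$.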
Lognormal Distribution Functions
The Mean or MTTF
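The mean of the lognormal distribution, $\mu$, is discussed in Kececioglu:
$$\mu = e^{\mu' + \frac{1}{2}\sigma'^{2}}$$
The mean of the natural logarithms of the times-to-failure, $\mu'$, in terms of $\mu$ and $\sigma$ (the mean and standard deviation of the times-to-failure) is given by:
$$\mu' = \ln(\mu) - \frac{1}{2}\ln\left(\frac{\sigma^{2}}{\mu^{2}} + 1\right)$$
The Median
The median of the lognormal distribution, $\breve{T}$, is discussed in Kececioglu:
$$\breve{T} = e^{\mu'}$$
The Mode
The mode of the lognormal distribution, $\tilde{T}$, is discussed in Kececioglu:
$$\tilde{T} = e^{\mu' - \sigma'^{2}}$$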
The Standard Deviation
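The standard deviation of the lognormal distribution, $\sigma$, is discussed in Kececioglu:
$$\sigma = \sqrt{e^{2\mu' + \sigma'^{2}}\left(e^{\sigma'^{2}} - 1\right)}$$
The standard deviation of the natural logarithms of the times-to-failure, $\sigma'$, in terms of $\mu$ and $\sigma$ (the mean and standard deviation of the times-to-failure) is given by:
$$\sigma' = \sqrt{\ln\left(\frac{\sigma^{2}}{\mu^{2}} + 1\right)}$$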
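The conversions between the logarithmic parameters and the mean, median, mode and standard deviation of the times-to-failure can be checked numerically. The following is a minimal sketch; the function names and the parameter values (5.0 and 0.8) are illustrative assumptions, not values from this article.

```python
import math

def lognormal_moments(mu_log, sigma_log):
    """Mean, median, mode and standard deviation of the times-to-failure."""
    mean = math.exp(mu_log + 0.5 * sigma_log ** 2)
    median = math.exp(mu_log)
    mode = math.exp(mu_log - sigma_log ** 2)
    std = math.sqrt(math.exp(2.0 * mu_log + sigma_log ** 2)
                    * (math.exp(sigma_log ** 2) - 1.0))
    return mean, median, mode, std

def log_parameters(mean, std):
    """Recover (mu', sigma') from the mean and standard deviation of the times."""
    ratio = (std / mean) ** 2 + 1.0
    sigma_log = math.sqrt(math.log(ratio))
    mu_log = math.log(mean) - 0.5 * math.log(ratio)
    return mu_log, sigma_log

# Illustrative (hypothetical) parameter values.
mean, median, mode, std = lognormal_moments(5.0, 0.8)
mu_back, sigma_back = log_parameters(mean, std)
print(mean, median, mode, std)
print(mu_back, sigma_back)
```

The round trip through both functions returns the starting parameters, and the ordering mode < median < mean reflects the right skew of the distribution.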
The Lognormal Reliability Function
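The reliability for a mission of time $t$, starting at age 0, for the lognormal distribution is determined by:
$$R(t) = \int_{t}^{\infty} f(x)\,dx = \int_{\ln(t)}^{\infty} \frac{1}{\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu'}{\sigma'}\right)^{2}} dx$$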
As with the normal distribution, there is no closed-form solution for the lognormal reliability function. Solutions can be obtained via the use of standard normal tables. Since the application automatically solves for the reliability we will not discuss manual solution methods. For interested readers, full explanations can be found in the references.
The Lognormal Conditional Reliability Function
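The lognormal conditional reliability function, for a mission of duration $t$ undertaken after accumulating $T$ hours of operation, is given by:
$$R(t|T) = \frac{R(T+t)}{R(T)} = \frac{\displaystyle\int_{\ln(T+t)}^{\infty} \frac{1}{\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu'}{\sigma'}\right)^{2}} dx}{\displaystyle\int_{\ln(T)}^{\infty} \frac{1}{\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu'}{\sigma'}\right)^{2}} dx}$$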
Once again, the use of standard normal tables is necessary to solve this equation, as no closed-form solution exists.
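In place of standard normal tables, the standard normal cdf can be evaluated numerically. The sketch below uses Python's `statistics.NormalDist`; the function names are hypothetical, and the parameter values are assumed to be the ML estimates for the five-unit data set (45, 60, 75, 90 and 115 hours) used in the likelihood ratio examples later in this article.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1

def reliability(t, mu_log, sigma_log):
    """R(t) = 1 - Phi((ln t - mu') / sigma')."""
    z = (math.log(t) - mu_log) / sigma_log
    return 1.0 - STD_NORMAL.cdf(z)

def conditional_reliability(t, age, mu_log, sigma_log):
    """R(t | T) = R(T + t) / R(T) for a unit that has survived to age T."""
    return (reliability(age + t, mu_log, sigma_log)
            / reliability(age, mu_log, sigma_log))

# Assumed parameter values (ML estimates for the five-unit data set).
MU_LOG, SIGMA_LOG = 4.2926, 0.32361

r_65 = reliability(65.0, MU_LOG, SIGMA_LOG)
r_cond = conditional_reliability(20.0, 45.0, MU_LOG, SIGMA_LOG)
print(r_65, r_cond)
```

With these parameters, `r_65` agrees with the 64.261% reliability at 65 hours quoted in the likelihood ratio example.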
The Lognormal Reliable Life Function
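As there is no closed-form solution for the lognormal reliability equation, no closed-form solution exists for the lognormal reliable life either. In order to determine this value, one must solve the following equation for $t_R$:
$$R(t_R) = \int_{\ln(t_R)}^{\infty} \frac{1}{\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu'}{\sigma'}\right)^{2}} dx$$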
The Lognormal Failure Rate Function
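The lognormal failure rate is given by:
$$\lambda(t) = \frac{f(t)}{R(t)} = \frac{\dfrac{1}{t\,\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(t)-\mu'}{\sigma'}\right)^{2}}}{\displaystyle\int_{\ln(t)}^{\infty} \frac{1}{\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu'}{\sigma'}\right)^{2}} dx}$$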
As with the reliability equations, standard normal tables will be required to solve for this function.
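Both the reliable life and the failure rate can be evaluated numerically once the standard normal cdf and its inverse are available. The sketch below is one way to do this with `statistics.NormalDist`; the function names are hypothetical, and the parameter values are assumed to be the ML estimates for the five-unit data set used in the likelihood ratio examples later in this article.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def reliable_life(r, mu_log, sigma_log):
    """Time t_R at which R(t_R) = r, using the inverse standard normal cdf."""
    return math.exp(mu_log + sigma_log * STD_NORMAL.inv_cdf(1.0 - r))

def lognormal_pdf(t, mu_log, sigma_log):
    """Lognormal pdf f(t)."""
    z = (math.log(t) - mu_log) / sigma_log
    return math.exp(-0.5 * z * z) / (t * sigma_log * math.sqrt(2.0 * math.pi))

def failure_rate(t, mu_log, sigma_log):
    """lambda(t) = f(t) / R(t)."""
    z = (math.log(t) - mu_log) / sigma_log
    return lognormal_pdf(t, mu_log, sigma_log) / (1.0 - STD_NORMAL.cdf(z))

# Assumed parameter values (ML estimates for the five-unit data set).
MU_LOG, SIGMA_LOG = 4.2926, 0.32361

t_80 = reliable_life(0.80, MU_LOG, SIGMA_LOG)
rate_65 = failure_rate(65.0, MU_LOG, SIGMA_LOG)
print(t_80, rate_65)
```

With these parameters, `t_80` agrees with the 55.718-hour estimate for 80% reliability quoted in the likelihood ratio example.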
Characteristics of the Lognormal Distribution
- The lognormal distribution is a distribution skewed to the right.
- The pdf starts at zero, increases to its mode, and decreases thereafter.
- The degree of skewness increases as $\sigma'$ increases, for a given $\mu'$.
- For the same $\sigma'$, the pdf's skewness increases as $\mu'$ increases.
- For $\sigma'$ values significantly greater than 1, the pdf rises very sharply in the beginning (i.e., for very small values of $t$ near zero), essentially follows the ordinate axis, peaks out early, and then decreases sharply like an exponential pdf or a Weibull pdf with $0 < \beta < 1$.
- The parameter $\mu'$, in terms of the logarithm of the $t$'s, is also the scale parameter, and not the location parameter as in the case of the normal pdf.
- The parameter $\sigma'$, or the standard deviation of the $t$'s in terms of their logarithm or of their $t'$, is also the shape parameter and not the scale parameter, as in the normal pdf, and assumes only positive values.
Lognormal Distribution Parameters in ReliaSoft's Software
In ReliaSoft's software, the parameters returned for the lognormal distribution are always logarithmic. That is, the parameter $\mu'$ represents the mean of the natural logarithms of the times-to-failure, while $\sigma'$ represents the standard deviation of these data point logarithms. Specifically, the returned $\sigma'$ is the square root of the variance of the natural logarithms of the data points. Even though the application denotes these values as mean and standard deviation, the user is reminded that these are given as the parameters of the distribution, and are thus the mean and standard deviation of the natural logarithms of the data. The mean value of the times-to-failure, not used as a parameter, as well as the standard deviation can be obtained through the QCP or the Function Wizard.
Estimation of the Parameters
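As described before, probability plotting involves plotting the failure times and associated unreliability estimates on specially constructed probability plotting paper. The form of this paper is based on a linearization of the cdf of the specific distribution. For the lognormal distribution, the cumulative distribution function can be written as:
$$F(t') = \Phi\left(\frac{t'-\mu'}{\sigma'}\right)$$
or:
$$\Phi^{-1}\left[F(t')\right] = -\frac{\mu'}{\sigma'} + \frac{1}{\sigma'}\,t'$$
where $t' = \ln(t)$ and $\Phi$ is the standard normal cdf. Now, let $y = \Phi^{-1}\left[F(t')\right]$, $a = -\mu'/\sigma'$, $b = 1/\sigma'$ and $x = t'$,
which results in the linear equation of:
$$y = a + bx$$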
The normal probability paper resulting from this linearized cdf function is shown next.
The process for reading the parameter estimate values from the lognormal probability plot is very similar to the method employed for the normal distribution (see The Normal Distribution). However, since the lognormal distribution models the natural logarithms of the times-to-failure, the values of the parameter estimates must be read and calculated based on a logarithmic scale, as opposed to the linear time scale as it was done with the normal distribution. This parameter scale appears at the top of the lognormal probability plot.
The process of lognormal probability plotting is illustrated in the following example.
8 units are put on a life test and tested to failure. The failures occurred at 45, 140, 260, 500, 850, 1400, 3000, and 9000 hours. Estimate the parameters for the lognormal distribution using probability plotting.
In order to plot the points for the probability plot, the appropriate unreliability estimate values must be obtained. These will be estimated through the use of median ranks, which can be obtained from statistical tables or the Quick Statistical Reference in Weibull++. The following table shows the times-to-failure and the appropriate median rank values for this example:
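Where rank tables are not at hand, median ranks are often approximated with Benard's formula, $(i - 0.3)/(N + 0.4)$. The sketch below applies that approximation to the eight failure times of this example; note that it is an approximation and not the exact tabulated median ranks, and the function name is hypothetical.

```python
def benard_median_ranks(n):
    """Benard's approximation to the median rank of the i-th of n ordered failures."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

times = [45, 140, 260, 500, 850, 1400, 3000, 9000]  # hours, from the example
ranks = benard_median_ranks(len(times))

for t, f in zip(times, ranks):
    print(f"{t:>6d} h   {100.0 * f:6.2f} %")
```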
These points may now be plotted on normal probability plotting paper as shown in the next figure.
Draw the best possible line through the plot points. The time values where this line intersects the 15.85% and 50% unreliability values should be projected up to the logarithmic scale, as shown in the following plot.
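The natural logarithm of the time where the fitted line intersects $Q(t) = 50\%$ is equivalent to $\hat{\mu}'$. In this case, $\hat{\mu}' = 6.45$. The value for $\hat{\sigma}'$ is equal to the difference between the natural logarithms of the times where the fitted line crosses $Q(t) = 50\%$ and $Q(t) = 15.85\%$. At $Q(t) = 15.85\%$, $\ln(t) = 4.55$. Therefore, $\hat{\sigma}' = 6.45 - 4.55 = 1.9$.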
Rank Regression on Y
Performing a rank regression on Y requires that a straight line be fitted to a set of data points such that the sum of the squares of the vertical deviations from the points to the line is minimized.
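The least squares parameter estimation method, or regression analysis, was discussed in Parameter Estimation and the following equations for regression on Y were derived, and are again applicable:
$$\hat{a} = \bar{y} - \hat{b}\,\bar{x} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} x_i}{N}$$
and:
$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \dfrac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} x_i^{2} - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^{2}}{N}}$$
In our case the equations for $y_i$ and $x_i$ are:
$$y_i = \Phi^{-1}\left[F(t'_i)\right]$$
and:
$$x_i = t'_i = \ln(t_i)$$
where the $F(t'_i)$ is estimated from the median ranks. Once $\hat{a}$ and $\hat{b}$ are obtained, $\hat{\sigma}'$ and $\hat{\mu}'$ follow from the relations $\hat{b} = 1/\hat{\sigma}'$ and $\hat{a} = -\hat{\mu}'/\hat{\sigma}'$:
$$\hat{\sigma}' = \frac{1}{\hat{b}}, \qquad \hat{\mu}' = -\frac{\hat{a}}{\hat{b}}$$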
The Correlation Coefficient
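The estimator of $\rho$ is the sample correlation coefficient, $\hat{\rho}$, given by:
$$\hat{\rho} = \frac{\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i - \bar{x})^{2} \cdot \sum_{i=1}^{N}(y_i - \bar{y})^{2}}}$$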
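The rank regression on Y calculation can be sketched in a few lines. The example below is illustrative only: it reuses the eight failure times from the probability plotting example, approximates the median ranks with Benard's formula (an assumption, since the article takes them from tables), and uses `statistics.NormalDist` for the inverse standard normal.

```python
import math
from statistics import NormalDist

times = [45, 140, 260, 500, 850, 1400, 3000, 9000]  # hours
n = len(times)

# x_i = ln(t_i); y_i = inverse standard normal of the (approximate) median rank.
x = [math.log(t) for t in times]
y = [NormalDist().inv_cdf((i - 0.3) / (n + 0.4)) for i in range(1, n + 1)]

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

# Least squares fit of y = a + b x (vertical deviations minimized).
b_hat = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
a_hat = sy / n - b_hat * sx / n

sigma_log = 1.0 / b_hat      # sigma' = 1 / b
mu_log = -a_hat / b_hat      # mu' = -a / b

# Sample correlation coefficient.
x_bar, y_bar = sx / n, sy / n
rho = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
       / math.sqrt(sum((xi - x_bar) ** 2 for xi in x)
                   * sum((yi - y_bar) ** 2 for yi in y)))

print(mu_log, sigma_log, rho)
```

The fitted values can be compared with the graphical readings from the probability plot; small differences are expected because the line on the plot is drawn by eye.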
Lognormal Distribution RRY Example
14 units were reliability tested and the following life test data were obtained:
|Life Test Data|
|Data point index||Time-to-failure|
Assuming the data follow a lognormal distribution, estimate the parameters and the correlation coefficient, $\rho$, using rank regression on Y.
Construct a table like the one shown next.
The median rank values ($F(t_i)$) can be found in rank tables or by using the Quick Statistical Reference in Weibull++.
The $y_i$ values were obtained from the standardized normal distribution's area tables by entering for $F(z)$ and getting the corresponding $z$ value ($y_i$).
Given the values in the table above, calculate $\hat{a}$ and $\hat{b}$:
The mean and the standard deviation of the lognormal distribution are obtained using equations in the Lognormal Distribution Functions section above:
The correlation coefficient can be estimated as:
The above example can be repeated in Weibull++ using RRY.
The mean can be obtained from the QCP and both the mean and the standard deviation can be obtained from the Function Wizard.
Rank Regression on X
Performing a rank regression on X requires that a straight line be fitted to a set of data points such that the sum of the squares of the horizontal deviations from the points to the line is minimized.
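Again, the first task is to bring our cdf function into a linear form. This step is exactly the same as in the regression on Y analysis, and all the equations apply in this case too. The deviation from the previous analysis begins with the least squares fit, where in this case we treat $x$ as the dependent variable and $y$ as the independent variable. The best-fitting straight line to the data, for regression on X (see Parameter Estimation), is the straight line:
$$x = \hat{a} + \hat{b}\,y$$
The corresponding equations for $\hat{a}$ and $\hat{b}$ are:
$$\hat{a} = \bar{x} - \hat{b}\,\bar{y} = \frac{\sum_{i=1}^{N} x_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} y_i}{N}$$
and:
$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \dfrac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^{2} - \dfrac{\left(\sum_{i=1}^{N} y_i\right)^{2}}{N}}$$
where $y_i = \Phi^{-1}\left[F(t'_i)\right]$ and $x_i = t'_i = \ln(t_i)$, and the $F(t'_i)$ is estimated from the median ranks. Once $\hat{a}$ and $\hat{b}$ are obtained, solve the linear equation for the unknown $y$, which corresponds to:
$$y = -\frac{\hat{a}}{\hat{b}} + \frac{1}{\hat{b}}\,x$$
Solving for the parameters by comparison with $y = -\mu'/\sigma' + (1/\sigma')\,x$, we get:
$$\hat{\sigma}' = \hat{b}, \qquad \hat{\mu}' = \hat{a}$$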
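Rank regression on X differs from rank regression on Y only in which sum of squares appears in the denominator of the slope. The sketch below is illustrative: it reuses the eight failure times from the probability plotting example and approximates the median ranks with Benard's formula (an assumption, since the article takes them from tables).

```python
import math
from statistics import NormalDist

times = [45, 140, 260, 500, 850, 1400, 3000, 9000]  # hours
n = len(times)

x = [math.log(t) for t in times]
y = [NormalDist().inv_cdf((i - 0.3) / (n + 0.4)) for i in range(1, n + 1)]

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
syy = sum(yi * yi for yi in y)

# Least squares fit of x = a + b y (horizontal deviations minimized).
b_hat = (sxy - sx * sy / n) / (syy - sy ** 2 / n)
a_hat = sx / n - b_hat * sy / n

sigma_log = b_hat   # sigma' = b
mu_log = a_hat      # mu' = a

print(mu_log, sigma_log)
```

Because the data do not lie perfectly on a line, the value of `sigma_log` obtained this way is slightly different from the one a regression on Y gives for the same points.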
The correlation coefficient is evaluated as before, using the equation given in The Correlation Coefficient section above.
Lognormal Distribution RRX Example
Using the same data set from the RRY example given above, and assuming a lognormal distribution, estimate the parameters and the correlation coefficient, $\rho$, using rank regression on X.
The table constructed for the RRY example also applies to this example. Using the values in this table we get:
Using the equations for the mean and standard deviation in the Lognormal Distribution Functions section above, we get:
The correlation coefficient is found using the equation in The Correlation Coefficient section above:
Note that the regression on Y analysis is not necessarily the same as the regression on X. The only time when the results of the two regression types are the same (i.e., will yield the same equation for a line) is when the data lie perfectly on a line.
Using Weibull++ with the Rank Regression on X option, the results are:
Maximum Likelihood Estimation
As outlined in Parameter Estimation, maximum likelihood estimation works by developing a likelihood function based on the available data and finding the values of the parameter estimates that maximize the likelihood function. This can be achieved by using iterative methods to determine the parameter estimate values that maximize the likelihood function. However, this can be rather difficult and time-consuming, particularly when dealing with the three-parameter distribution. Another method of finding the parameter estimates involves taking the partial derivatives of the likelihood equation with respect to the parameters, setting the resulting equations equal to zero, and solving simultaneously to determine the values of the parameter estimates. The log-likelihood functions and associated partial derivatives used to determine maximum likelihood estimates for the lognormal distribution are covered in Appendix D.
Note About Bias
See the discussion regarding bias with the normal distribution for information regarding parameter bias in the lognormal distribution.
Lognormal Distribution MLE Example
Using the same data set from the RRY and RRX examples given above and assuming a lognormal distribution, estimate the parameters using the MLE method.
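Solution
In this example we have only complete data. Thus, the partials reduce to:
$$\frac{\partial \Lambda}{\partial \mu'} = \frac{1}{\sigma'^{2}} \sum_{i=1}^{N} \left(\ln(t_i) - \mu'\right) = 0$$
$$\frac{\partial \Lambda}{\partial \sigma'} = \sum_{i=1}^{N} \left(-\frac{1}{\sigma'} + \frac{\left(\ln(t_i) - \mu'\right)^{2}}{\sigma'^{3}}\right) = 0$$
Substituting the values of $t_i$ and solving the above system simultaneously, we get: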
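For complete data, setting the two partials to zero gives a direct solution: $\hat{\mu}'$ is the mean of the log times and $\hat{\sigma}'$ is the square root of their mean squared deviation (dividing by $N$, not $N-1$). The sketch below illustrates this; because the 14-point data set of this example is not reproduced here, it uses the five failure times from the likelihood ratio example later in this article as a stand-in.

```python
import math

times = [45.0, 60.0, 75.0, 90.0, 115.0]  # hours, five-unit data set
logs = [math.log(t) for t in times]
n = len(logs)

# Closed-form ML estimates for complete (uncensored) data.
mu_log = sum(logs) / n
sigma_log = math.sqrt(sum((v - mu_log) ** 2 for v in logs) / n)

# Both partial derivatives of the log-likelihood should vanish at the estimates.
d_mu = sum(v - mu_log for v in logs) / sigma_log ** 2
d_sigma = sum(-1.0 / sigma_log + (v - mu_log) ** 2 / sigma_log ** 3 for v in logs)

print(mu_log, sigma_log)
print(d_mu, d_sigma)
```

With censored or interval data no such direct solution exists and the system has to be solved iteratively.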
Using the equation for mean and standard deviation in the Lognormal Distribution Functions section above, we get:
The variance/covariance matrix is given by:
The method used by the application in estimating the different types of confidence bounds for lognormally distributed data is presented in this section. Note that there are closed-form solutions for both the normal and lognormal reliability that can be obtained without the use of the Fisher information matrix. However, these closed-form solutions only apply to complete data. To achieve consistent application across all possible data types, Weibull++ always uses the Fisher matrix in computing confidence intervals. The complete derivations were presented in detail for a general function in Confidence Bounds. For a discussion on exact confidence bounds for the normal and lognormal, see The Normal Distribution.
Fisher Matrix Bounds
Bounds on the Parameters
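The lower and upper bounds on the mean, $\mu'$, are estimated from:
$$\mu'_U = \hat{\mu}' + K_{\alpha}\sqrt{Var(\hat{\mu}')} \quad \text{(upper bound)}$$
$$\mu'_L = \hat{\mu}' - K_{\alpha}\sqrt{Var(\hat{\mu}')} \quad \text{(lower bound)}$$
For the standard deviation, $\hat{\sigma}'$, $\ln(\hat{\sigma}')$ is treated as normally distributed, and the bounds are estimated from:
$$\sigma'_U = \hat{\sigma}' \cdot e^{\frac{K_{\alpha}\sqrt{Var(\hat{\sigma}')}}{\hat{\sigma}'}} \quad \text{(upper bound)}$$
$$\sigma'_L = \frac{\hat{\sigma}'}{e^{\frac{K_{\alpha}\sqrt{Var(\hat{\sigma}')}}{\hat{\sigma}'}}} \quad \text{(lower bound)}$$
where $K_{\alpha}$ is defined by:
$$\alpha = \frac{1}{\sqrt{2\pi}} \int_{K_{\alpha}}^{\infty} e^{-\frac{t^{2}}{2}} dt = 1 - \Phi(K_{\alpha})$$
If $\delta$ is the confidence level, then $\alpha = \frac{1-\delta}{2}$ for the two-sided bounds and $\alpha = 1 - \delta$ for the one-sided bounds.
The variances and covariances of $\hat{\mu}'$ and $\hat{\sigma}'$ are estimated as follows:
$$\begin{pmatrix} \widehat{Var}(\hat{\mu}') & \widehat{Cov}(\hat{\mu}', \hat{\sigma}') \\ \widehat{Cov}(\hat{\mu}', \hat{\sigma}') & \widehat{Var}(\hat{\sigma}') \end{pmatrix} = \begin{pmatrix} -\dfrac{\partial^{2}\Lambda}{\partial \mu'^{2}} & -\dfrac{\partial^{2}\Lambda}{\partial \mu' \partial \sigma'} \\ -\dfrac{\partial^{2}\Lambda}{\partial \mu' \partial \sigma'} & -\dfrac{\partial^{2}\Lambda}{\partial \sigma'^{2}} \end{pmatrix}^{-1}_{\mu' = \hat{\mu}',\ \sigma' = \hat{\sigma}'}$$
where $\Lambda$ is the log-likelihood function of the lognormal distribution.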
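Bounds on Time (Type 1)
The bounds around time for a given lognormal percentile, or unreliability, are estimated by first solving the reliability equation with respect to time, as follows:
$$t'(\hat{\mu}', \hat{\sigma}') = \hat{\mu}' + z \cdot \hat{\sigma}'$$
where $z = \Phi^{-1}\left[F(t')\right]$ and $t' = \ln(t)$.
The next step is to calculate the variance of $t'(\hat{\mu}', \hat{\sigma}')$:
$$Var(\hat{t}') = \left(\frac{\partial t'}{\partial \mu'}\right)^{2} Var(\hat{\mu}') + 2\left(\frac{\partial t'}{\partial \mu'}\right)\left(\frac{\partial t'}{\partial \sigma'}\right) Cov(\hat{\mu}', \hat{\sigma}') + \left(\frac{\partial t'}{\partial \sigma'}\right)^{2} Var(\hat{\sigma}')$$
$$Var(\hat{t}') = Var(\hat{\mu}') + 2 z\, Cov(\hat{\mu}', \hat{\sigma}') + z^{2}\, Var(\hat{\sigma}')$$
The upper and lower bounds are then found by:
$$t'_U = \hat{t}' + K_{\alpha}\sqrt{Var(\hat{t}')}$$
$$t'_L = \hat{t}' - K_{\alpha}\sqrt{Var(\hat{t}')}$$
Solving for $t_U$ and $t_L$ we get:
$$t_U = e^{t'_U} \quad \text{(upper bound)}$$
$$t_L = e^{t'_L} \quad \text{(lower bound)}$$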
Bounds on Reliability (Type 2)
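The reliability of the lognormal distribution is:
$$\hat{R}(t') = \int_{t'}^{\infty} \frac{1}{\hat{\sigma}'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \hat{\mu}'}{\hat{\sigma}'}\right)^{2}} dx$$
where $t' = \ln(t)$. Let $\hat{z} = \dfrac{t' - \hat{\mu}'}{\hat{\sigma}'}$; the above equation then becomes:
$$\hat{R}(\hat{z}) = \int_{\hat{z}}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^{2}} dz$$
The bounds on $z$ are estimated from:
$$z_U = \hat{z} + K_{\alpha}\sqrt{Var(\hat{z})}$$
$$z_L = \hat{z} - K_{\alpha}\sqrt{Var(\hat{z})}$$
where:
$$Var(\hat{z}) = \frac{1}{\hat{\sigma}'^{2}}\left[Var(\hat{\mu}') + 2\hat{z}\, Cov(\hat{\mu}', \hat{\sigma}') + \hat{z}^{2}\, Var(\hat{\sigma}')\right]$$
The upper and lower bounds on reliability are:
$$R_U = \int_{z_L}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^{2}} dz \quad \text{(upper bound)}$$
$$R_L = \int_{z_U}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^{2}} dz \quad \text{(lower bound)}$$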
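The Fisher matrix bounds can be sketched for the simplest case, complete data, where the second partials of the log-likelihood evaluated at the ML estimates give $Var(\hat{\mu}') = \hat{\sigma}'^{2}/N$, $Var(\hat{\sigma}') = \hat{\sigma}'^{2}/(2N)$ and zero covariance. The example below is illustrative and is not the application's implementation: the data are the five failure times from the likelihood ratio example, the 90% two-sided confidence level is an arbitrary choice, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

times = [45.0, 60.0, 75.0, 90.0, 115.0]  # hours, complete data
logs = [math.log(t) for t in times]
n = len(logs)
mu_log = sum(logs) / n
sigma_log = math.sqrt(sum((v - mu_log) ** 2 for v in logs) / n)

# Inverse of the observed Fisher information for complete data.
var_mu = sigma_log ** 2 / n
var_sigma = sigma_log ** 2 / (2 * n)
cov = 0.0

confidence = 0.90                                  # assumed, two-sided
k_alpha = STD_NORMAL.inv_cdf(1.0 - (1.0 - confidence) / 2.0)

# Bounds on the parameters.
mu_lower = mu_log - k_alpha * math.sqrt(var_mu)
mu_upper = mu_log + k_alpha * math.sqrt(var_mu)
factor = math.exp(k_alpha * math.sqrt(var_sigma) / sigma_log)
sigma_lower, sigma_upper = sigma_log / factor, sigma_log * factor

def time_bounds(r):
    """Type 1 bounds on the time at which reliability equals r."""
    z = STD_NORMAL.inv_cdf(1.0 - r)
    log_t = mu_log + z * sigma_log
    half = k_alpha * math.sqrt(var_mu + 2.0 * z * cov + z * z * var_sigma)
    return math.exp(log_t - half), math.exp(log_t + half)

def reliability_bounds(t):
    """Type 2 bounds (lower, upper) on the reliability at time t."""
    z = (math.log(t) - mu_log) / sigma_log
    var_z = (var_mu + 2.0 * z * cov + z * z * var_sigma) / sigma_log ** 2
    half = k_alpha * math.sqrt(var_z)
    return 1.0 - STD_NORMAL.cdf(z + half), 1.0 - STD_NORMAL.cdf(z - half)

t_low, t_high = time_bounds(0.80)
r_low, r_high = reliability_bounds(65.0)
print(mu_lower, mu_upper, sigma_lower, sigma_upper)
print(t_low, t_high, r_low, r_high)
```

With suspended or interval data the variances and covariance would instead come from numerically inverting the matrix of second partials.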
Likelihood Ratio Confidence Bounds
Bounds on Parameters
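As covered in Parameter Estimation, the likelihood confidence bounds are calculated by finding values for $\mu'$ and $\sigma'$ that satisfy:
$$-2 \cdot \ln\left(\frac{L(\mu', \sigma')}{L(\hat{\mu}', \hat{\sigma}')}\right) = \chi^{2}_{\alpha;1}$$
This equation can be rewritten as:
$$L(\mu', \sigma') = L(\hat{\mu}', \hat{\sigma}') \cdot e^{\frac{-\chi^{2}_{\alpha;1}}{2}}$$
For complete data, the likelihood formula for the lognormal distribution is given by:
$$L(\mu', \sigma') = \prod_{i=1}^{N} f(x_i; \mu', \sigma') = \prod_{i=1}^{N} \frac{1}{x_i \cdot \sigma' \cdot \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(x_i) - \mu'}{\sigma'}\right)^{2}}$$
where the $x_i$ values represent the original time-to-failure data. For a given value of $\alpha$, values for $\mu'$ and $\sigma'$ can be found which represent the maximum and minimum values that satisfy the likelihood ratio equation. These represent the confidence bounds for the parameters at a confidence level $\delta$, where $\alpha = \delta$ for two-sided bounds and $\alpha = 2\delta - 1$ for one-sided bounds.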
Example: LR Bounds on Parameters
Lognormal Distribution Likelihood Ratio Bound Example (Parameters)
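Five units are put on a reliability test and experience failures at 45, 60, 75, 90, and 115 hours. Assuming a lognormal distribution, the MLE parameter estimates are calculated to be $\hat{\mu}' = 4.2926$ and $\hat{\sigma}' = 0.32361$. Calculate the two-sided 75% confidence bounds on these parameters using the likelihood ratio method.
The first step is to calculate the likelihood function for the parameter estimates:
$$L(\hat{\mu}', \hat{\sigma}') = \prod_{i=1}^{5} \frac{1}{x_i \cdot \hat{\sigma}' \cdot \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(x_i) - \hat{\mu}'}{\hat{\sigma}'}\right)^{2}}$$
where $x_i$ are the original time-to-failure data points. We can now rearrange the likelihood ratio equation to the form:
$$L(\mu', \sigma') - L(\hat{\mu}', \hat{\sigma}') \cdot e^{\frac{-\chi^{2}_{\alpha;1}}{2}} = 0$$
Since our specified confidence level, $\delta$, is 75%, we can calculate the value of the chi-squared statistic, $\chi^{2}_{0.75;1} = 1.3233$. We can now substitute this information into the equation:
$$L(\mu', \sigma') - L(\hat{\mu}', \hat{\sigma}') \cdot e^{\frac{-1.3233}{2}} = 0$$
It now remains to find the values of $\mu'$ and $\sigma'$ which satisfy this equation. This is an iterative process that requires setting the value of $\sigma'$ and finding the appropriate values of $\mu'$, and vice versa.
The following table gives the values of $\mu'$ based on given values of $\sigma'$.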
These points are represented graphically in the following contour plot:
(Note that this plot is generated with degrees of freedom $k = 1$, as we are only determining bounds on one parameter. The contour plots generated in Weibull++ are done with degrees of freedom $k = 2$, for use in comparing both parameters simultaneously.) As can be determined from the table, the lowest calculated value for $\mu'$ is 4.1145, while the highest is 4.4708. These represent the two-sided 75% confidence limits on this parameter. Since solutions for the equation do not exist for values of $\sigma'$ below 0.24 or above 0.48, these can be considered the two-sided 75% confidence limits for this parameter. In order to obtain more accurate values for the confidence limits on $\sigma'$, we can perform the same procedure as before, but finding the two values of $\sigma'$ that correspond with a given value of $\mu'$. Using this method, we find that the 75% confidence limits on $\sigma'$ are 0.23405 and 0.48936, which are close to the initial estimates of 0.24 and 0.48.
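The iterative search can be sketched with a simple bisection. The extreme values of $\mu'$ on the contour occur where the likelihood is maximized over $\sigma'$ for that $\mu'$, and the extreme values of $\sigma'$ occur at $\mu' = \hat{\mu}'$, so each limit reduces to a one-dimensional root search. The code below is an illustrative sketch under those observations, not the procedure used by the application; the helper names and search brackets are assumptions.

```python
import math
from statistics import NormalDist

times = [45.0, 60.0, 75.0, 90.0, 115.0]
logs = [math.log(t) for t in times]
n = len(logs)
mu_hat = sum(logs) / n
sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)

def log_likelihood(mu, sigma):
    """Lognormal log-likelihood for complete data."""
    return sum(-v - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
               - 0.5 * ((v - mu) / sigma) ** 2 for v in logs)

# Chi-squared quantile with 1 degree of freedom, via the standard normal.
confidence = 0.75
chi2 = NormalDist().inv_cdf((1.0 + confidence) / 2.0) ** 2
target = log_likelihood(mu_hat, sigma_hat) - chi2 / 2.0

def bisect(func, lo, hi, iterations=200):
    """Root of func between lo and hi, assuming a sign change."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def profile_in_mu(mu):
    """Log-likelihood maximized over sigma' for this mu', minus the target."""
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return log_likelihood(mu, sigma) - target

def slice_in_sigma(sigma):
    """Log-likelihood at mu' = mu_hat for this sigma', minus the target."""
    return log_likelihood(mu_hat, sigma) - target

mu_lower = bisect(profile_in_mu, mu_hat - 2.0, mu_hat)
mu_upper = bisect(profile_in_mu, mu_hat, mu_hat + 2.0)
sigma_lower = bisect(slice_in_sigma, 0.05, sigma_hat)
sigma_upper = bisect(slice_in_sigma, sigma_hat, 3.0)

print(chi2, mu_lower, mu_upper, sigma_lower, sigma_upper)
```

The resulting limits can be compared with the values read from the table and contour plot above.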
Bounds on Time and Reliability
In order to calculate the bounds on a time estimate for a given reliability, or on a reliability estimate for a given time, the likelihood function needs to be rewritten in terms of one parameter and time/reliability, so that the maximum and minimum values of the time can be observed as the parameter is varied. This can be accomplished by substituting a form of the lognormal reliability equation into the likelihood function. The lognormal reliability equation can be written as:
$$R = 1 - \Phi\left(\frac{\ln(t) - \mu'}{\sigma'}\right)$$
This can be rearranged to the form:
$$\mu' = \ln(t) - \sigma' \cdot \Phi^{-1}(1 - R)$$
where $\Phi^{-1}$ is the inverse standard normal. This equation can now be substituted into the likelihood function to produce a likelihood equation in terms of $\sigma'$, $t$ and $R$:
$$L(\sigma', t, R) = \prod_{i=1}^{N} \frac{1}{x_i \cdot \sigma' \cdot \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln(x_i) - \left(\ln(t) - \sigma' \cdot \Phi^{-1}(1 - R)\right)}{\sigma'}\right)^{2}}$$
The unknown variable depends on what type of bounds are being determined. If one is trying to determine the bounds on time for a given reliability, then $R$ is a known constant and $t$ is the unknown variable. Conversely, if one is trying to determine the bounds on reliability for a given time, then $t$ is a known constant and $R$ is the unknown variable. Either way, the above equation can be used to solve the likelihood ratio equation for the values of interest.
Example: LR Bounds on Time
Lognormal Distribution Likelihood Ratio Bound Example (Time)
For the same data set given for the parameter bounds example, determine the two-sided 75% confidence bounds on the time estimate for a reliability of 80%. The ML estimate for the time at $R(t) = 80\%$ is 55.718.
In this example, we are trying to determine the two-sided 75% confidence bounds on the time estimate of 55.718. This is accomplished by substituting $R = 0.80$ and $\alpha = 0.75$ into the likelihood function, and varying $\sigma'$ until the maximum and minimum values of $t$ are found. The following table gives the values of $t$ based on given values of $\sigma'$.
This data set is represented graphically in the following contour plot:
As can be determined from the table, the lowest calculated value for $t$ is 43.634, while the highest is 66.085. These represent the two-sided 75% confidence limits on the time at which reliability is equal to 80%.
Example: LR Bounds on Reliability
Lognormal Distribution Likelihood Ratio Bound Example (Reliability)
For the same data set given above for the parameter bounds example, determine the two-sided 75% confidence bounds on the reliability estimate for $t = 65$. The ML estimate for the reliability at $t = 65$ is 64.261%.
In this example, we are trying to determine the two-sided 75% confidence bounds on the reliability estimate of 64.261%. This is accomplished by substituting $t = 65$ and $\alpha = 0.75$ into the likelihood function, and varying $\sigma'$ until the maximum and minimum values of $R$ are found. The following table gives the values of $R$ based on given values of $\sigma'$.
This data set is represented graphically in the following contour plot:
As can be determined from the table, the lowest calculated value for $R$ is 43.444%, while the highest is 81.508%. These represent the two-sided 75% confidence limits on the reliability at $t = 65$.
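Both the time and the reliability examples can be reproduced numerically with the substitution $\mu' = \ln(t) - \sigma' \Phi^{-1}(1 - R)$. The sketch below is illustrative and not the application's algorithm: for each trial value of the unknown it maximizes the log-likelihood over $\sigma'$ (the stationarity condition is a quadratic in $\sigma'$ for complete data) and then uses bisection to find where that maximum drops to the likelihood ratio threshold. Helper names and search brackets are assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
times = [45.0, 60.0, 75.0, 90.0, 115.0]
logs = [math.log(t) for t in times]
n = len(logs)
mu_hat = sum(logs) / n
sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)

def log_likelihood(mu, sigma):
    return sum(-v - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
               - 0.5 * ((v - mu) / sigma) ** 2 for v in logs)

chi2 = STD_NORMAL.inv_cdf((1.0 + 0.75) / 2.0) ** 2   # 75% two-sided, 1 dof
target = log_likelihood(mu_hat, sigma_hat) - chi2 / 2.0

def profile(log_t, w):
    """Max over sigma' of the log-likelihood with mu' = log_t - sigma' * w,
    minus the target; w stands for the inverse normal of (1 - R)."""
    d = [v - log_t for v in logs]
    sum_d, sum_d2 = sum(d), sum(di * di for di in d)
    # Positive root of n*s^2 - w*sum_d*s - sum_d2 = 0 maximizes over sigma'.
    sigma = (w * sum_d + math.sqrt((w * sum_d) ** 2 + 4.0 * n * sum_d2)) / (2.0 * n)
    return log_likelihood(log_t - sigma * w, sigma) - target

def bisect(func, lo, hi, iterations=200):
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Bounds on time for R = 80%: R is fixed, ln(t) is the unknown.
w80 = STD_NORMAL.inv_cdf(1.0 - 0.80)
log_t_hat = mu_hat + sigma_hat * w80
t_lower = math.exp(bisect(lambda c: profile(c, w80), log_t_hat - 2.0, log_t_hat))
t_upper = math.exp(bisect(lambda c: profile(c, w80), log_t_hat, log_t_hat + 2.0))

# Bounds on reliability at t = 65: t is fixed, w (hence R) is the unknown.
log_65 = math.log(65.0)
w_hat = (log_65 - mu_hat) / sigma_hat
w_low = bisect(lambda w: profile(log_65, w), w_hat - 5.0, w_hat)
w_high = bisect(lambda w: profile(log_65, w), w_hat, w_hat + 5.0)
r_lower = 1.0 - STD_NORMAL.cdf(w_high)
r_upper = 1.0 - STD_NORMAL.cdf(w_low)

print(t_lower, t_upper, r_lower, r_upper)
```

The printed limits can be compared with the values quoted in the two examples above.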
Bayesian Confidence Bounds
Bounds on Parameters
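From Parameter Estimation, we know that the marginal distribution of parameter $\mu'$ is:
$$f(\mu'|Data) = \int_{0}^{\infty} f(\mu', \sigma'|Data)\, d\sigma' = \frac{\displaystyle\int_{0}^{\infty} L(Data|\mu', \sigma')\,\varphi(\mu')\,\varphi(\sigma')\, d\sigma'}{\displaystyle\int_{0}^{\infty}\int_{-\infty}^{\infty} L(Data|\mu', \sigma')\,\varphi(\mu')\,\varphi(\sigma')\, d\mu'\, d\sigma'}$$
where:
- $\varphi(\sigma')$ is $\frac{1}{\sigma'}$, the non-informative prior of $\sigma'$.
- $\varphi(\mu')$ is a uniform distribution from $-\infty$ to $+\infty$, the non-informative prior of $\mu'$.
With the above prior distributions, $f(\mu'|Data)$ can be rewritten as:
$$f(\mu'|Data) = \frac{\displaystyle\int_{0}^{\infty} L(Data|\mu', \sigma')\,\frac{1}{\sigma'}\, d\sigma'}{\displaystyle\int_{0}^{\infty}\int_{-\infty}^{\infty} L(Data|\mu', \sigma')\,\frac{1}{\sigma'}\, d\mu'\, d\sigma'}$$
The one-sided upper bound of $\mu'$ is:
$$CL = P(\mu' \leq \mu'_U) = \int_{-\infty}^{\mu'_U} f(\mu'|Data)\, d\mu'$$
The one-sided lower bound of $\mu'$ is:
$$1 - CL = P(\mu' \leq \mu'_L) = \int_{-\infty}^{\mu'_L} f(\mu'|Data)\, d\mu'$$
The two-sided bounds of $\mu'$ are:
$$CL = P(\mu'_L \leq \mu' \leq \mu'_U) = \int_{\mu'_L}^{\mu'_U} f(\mu'|Data)\, d\mu'$$
The same method can be used to obtain the bounds of $\sigma'$.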
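The integrals above can be approximated on a grid. The sketch below is a rough illustration and not the application's algorithm: it uses the five failure times from the likelihood ratio example as stand-in complete data, a $1/\sigma'$ prior on $\sigma'$ and a flat prior on $\mu'$, and simple rectangle and trapezoid rules. The grid ranges and step sizes are assumptions chosen for this data set, and the bounds are taken with equal tail areas.

```python
import math

times = [45.0, 60.0, 75.0, 90.0, 115.0]
logs = [math.log(t) for t in times]
n = len(logs)
log_mean = sum(logs) / n
sum_sq = sum((v - log_mean) ** 2 for v in logs)

def weighted_likelihood(mu, sigma):
    """Likelihood (up to a constant factor) times the 1/sigma' prior."""
    a = sum_sq + n * (mu - log_mean) ** 2
    return sigma ** (-n - 1) * math.exp(-a / (2.0 * sigma * sigma))

# Integration grids (illustrative ranges, wide enough for this data set).
sigma_grid = [0.01 + 0.005 * j for j in range(1000)]        # about 0.01 to 5.0
mu_grid = [log_mean - 2.5 + 0.01 * k for k in range(501)]   # log_mean +/- 2.5

# Unnormalized marginal posterior of mu': integrate out sigma'.
marginal = [sum(weighted_likelihood(mu, s) for s in sigma_grid) for mu in mu_grid]

# Cumulative posterior by the trapezoidal rule, normalized to end at 1.
cdf = [0.0]
for k in range(1, len(mu_grid)):
    cdf.append(cdf[-1] + 0.5 * (marginal[k - 1] + marginal[k]))
cdf = [c / cdf[-1] for c in cdf]

def posterior_quantile(p):
    """Value of mu' at which the posterior cdf reaches p (linear interpolation)."""
    for k in range(1, len(cdf)):
        if cdf[k] >= p:
            frac = (p - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
            return mu_grid[k - 1] + frac * (mu_grid[k] - mu_grid[k - 1])
    return mu_grid[-1]

confidence = 0.90  # two-sided, equal tail areas (assumed)
mu_lower = posterior_quantile((1.0 - confidence) / 2.0)
mu_upper = posterior_quantile(1.0 - (1.0 - confidence) / 2.0)
print(mu_lower, mu_upper)
```

For complete data with these priors the marginal posterior of $\mu'$ is a scaled Student's t distribution with $N - 1$ degrees of freedom, which gives an independent check on the grid result.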
Bounds on Time (Type 1)
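The reliable life of the lognormal distribution is:
$$\ln(t_R) = \mu' + \sigma' \Phi^{-1}(1 - R)$$
The one-sided upper bound on time is given by:
$$CL = P(\ln(t_R) \leq \ln(t_U)) = P\left(\mu' + \sigma' \Phi^{-1}(1 - R) \leq \ln(t_U)\right)$$
The above equation can be rewritten in terms of $\mu'$ as:
$$CL = P\left(\mu' \leq \ln(t_U) - \sigma' \Phi^{-1}(1 - R)\right)$$
From the posterior distribution of $\mu'$ we get:
$$CL = \frac{\displaystyle\int_{0}^{\infty}\int_{-\infty}^{\ln(t_U) - \sigma' \Phi^{-1}(1 - R)} L(\sigma', \mu')\,\frac{1}{\sigma'}\, d\mu'\, d\sigma'}{\displaystyle\int_{0}^{\infty}\int_{-\infty}^{\infty} L(\sigma', \mu')\,\frac{1}{\sigma'}\, d\mu'\, d\sigma'}$$
The above equation is solved with respect to $t_U$. The same method can be applied for one-sided lower bounds and two-sided bounds on time.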
Bounds on Reliability (Type 2)
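The one-sided upper bound on reliability is given by:
$$CL = P(R \leq R_U) = P\left(\mu' \leq \ln(t) - \sigma' \Phi^{-1}(1 - R_U)\right)$$
From the posterior distribution of $\mu'$ we get:
$$CL = \frac{\displaystyle\int_{0}^{\infty}\int_{-\infty}^{\ln(t) - \sigma' \Phi^{-1}(1 - R_U)} L(\sigma', \mu')\,\frac{1}{\sigma'}\, d\mu'\, d\sigma'}{\displaystyle\int_{0}^{\infty}\int_{-\infty}^{\infty} L(\sigma', \mu')\,\frac{1}{\sigma'}\, d\mu'\, d\sigma'}$$
The above equation is solved with respect to $R_U$. The same method is used to calculate the one-sided lower bounds and two-sided bounds on reliability.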
Example: Bayesian Bounds
Lognormal Distribution Bayesian Bound Example (Parameters)
Determine the two-sided 90% Bayesian confidence bounds on the lognormal parameter estimates for the data given next:
The data points are entered into a times-to-failure data sheet. The lognormal distribution is selected under Distributions. The Bayesian confidence bounds method only applies for the MLE analysis method, therefore, Maximum Likelihood (MLE) is selected under Analysis Method and Use Bayesian is selected under the Confidence Bounds Method in the Analysis tab.
The two-sided 90% Bayesian confidence bounds on the lognormal parameter are obtained using the QCP and clicking on the Calculate Bounds button in the Parameter Bounds tab as follows:
Lognormal Distribution Examples
Complete Data Example
Determine the lognormal parameter estimates for the data given in the following table.
|Non-Grouped Times-to-Failure Data|
|Data point index||State F or S||State End Time|
Using Weibull++, the computed parameters for maximum likelihood are:
For rank regression on
For rank regression on
Complete Data RRX Example
From Kececioglu [20, p. 347]. 15 identical units were tested to failure and following is a table of their failure times:
Published results (using probability plotting):
Weibull++ computed parameters for rank regression on X are:
The small differences are due to the precision errors when fitting a line manually, whereas in Weibull++ the line was fitted mathematically.
Complete Data Unbiased MLE Example
From Kececioglu [19, p. 406]. 9 identical units are tested continuously to failure and failure times were recorded at 30.4, 36.7, 53.3, 58.5, 74.0, 99.3, 114.3, 140.1 and 257.9 hours.
The results published were obtained by using the unbiased model. Published Results (using MLE):
This same data set can be entered into Weibull++ by creating a data sheet capable of handling non-grouped time-to-failure data. Since the results shown above are unbiased, the Use Unbiased Std on Normal Data option in the User Setup must be selected in order to duplicate these results. Weibull++ computed parameters for maximum likelihood are:
Suspension Data Example
From Nelson [30, p. 324]. 96 locomotive controls were tested, 37 failed and 59 were suspended after running for 135,000 miles. The table below shows the failure and suspension times.
|Nelson's Locomotive Data|
|Number in State||F or S||Time|
The distribution used in the publication was the base-10 lognormal. Published results (using MLE):
Published 95% confidence limits on the parameters:
Published variance/covariance matrix:
To replicate the published results (since Weibull++ uses a lognormal to the base $e$), take the base-10 logarithm of the data and estimate the parameters using the normal distribution and MLE.
- Weibull++ computed parameters for maximum likelihood are:
- Weibull++ computed 95% confidence limits on the parameters:
- Weibull++ computed variance/covariance matrix:
Interval Data Example
Determine the lognormal parameter estimates for the data given in the table below.
|Non-Grouped Data Times-to-Failure with Intervals|
|Data point index||Last Inspected||State End Time|
This is a sequence of interval times-to-failure where the intervals vary substantially in length. Using Weibull++, the computed parameters for maximum likelihood are calculated to be:
For rank regression on :
For rank regression on :