Turnbull Reliability

User Notes for a Modified Turnbull Reliability Estimate

Oct. 1, 2001, revision number 4

April 2, 2008, revision number 5

Feb. 24, 2016, DevnTurn.xlsx

March 11, 2016 corrected errors

March 12, 2016 added EM-algorithm

Larry George, Ph.D., ASQ CRE and Fellow, pstlarry@cal.berkeley.edu

Introduction to left-and-right-censored life data

The Excel workbooks, Turn.xls and Turn1.xlsm, compute the nonparametric maximum likelihood estimates (npmle) of the reliability and return rate functions from ships and left and right “censored,” grouped ages at returns and survivors’ ages. The workbooks deal with missing ships and returns data. Turn.xls and Turn1.xlsm contain spreadsheets: data, Query2, npmle, Matrix, annual cohorts, and Summary, for entering ships returns data, for analyses, and for summarizing results. Ships are units sold, installed, or started, and returns are complaints, failures, repairs, or even spares sales counts, grouped by accounting interval.

“Censored” is a statistical term for unobserved information about ages at returns (complaints, failures, repairs, or even spares sales), before or after known times or ages. Left censored means ages at returns are unknown if they are less than censoring ages, and right censored means ages at returns are unknown if they are greater than censoring ages. Left censoring occurs when returns data collection begins some time after the product is shipped. Right censoring typically occurs at the ages of the oldest units in the field. Grouped ships and returns data are usually displayed in a “Nevada” table, table 1, so called, because the full table looks like Nevada on its side. Returns counts began in January 2001 but didn’t include 2001 ships or returns.

Table 1. Known ships and censored returns counts. Known returns are Jan-June 2001

The estimation problem is that, not only are some ages at returns unknown, the returns counts in censored age intervals are unknown. The workbook applies the method in [Turnbull, Giolo] modified to estimate unknown numbers of censored returns. Turnbull’s method requires total left-and-right-censored return counts even though their failure times are unknown. The method also works when totals are unknown. Turn.xls also estimates some unknown, early ships counts. 

In Turn.xls and Turn1.xlsm, the censored totals are unknown and estimated. This is implemented by recursive iteration; i.e., Excel formulas set up circular computations. Estimated missing returns counts from table 1 are used to estimate reliability, and the reliability estimate is used to estimate missing returns counts, etc.

This modification of Turnbull’s method also yields the maximum likelihood estimator by the Estimation-Maximization (EM) algorithm [Dempster et al 1977], because the log likelihood,

Sum [r(j;left)*ln F(t(j)) + r(j;observed)*ln f(t(j)) + r(j;right)*ln (1-F(t(j))],

except for the combinatorial constant, is linear in the unknown, censored returns counts, r(j;left) and r(j;right) for the j-th censoring interval. Here 1-F(.) is the unknown reliability function to be estimated, f(.) is the corresponding probability density function, and the t(j) are the censoring ages or observed ages at returns. The first sum is over left censored intervals, the second sum is over all observed return times, and the third sum is over all right censored intervals.

Data

The Turn.xls:data spreadsheet contains the ships data and estimates of missing ships data. I assumed ships ramped from zero to the average in the first six months. You may wish to change the ships data. Enter old ships data in the data spreadsheet, and enter new ships data to the right of the old data. Revised ships data will be used automatically in other spreadsheet calculations.

The Turn.xls:Query2 spreadsheet contains year 2001 RMA returns data. It estimates the month of manufacture from the date code and the year of manufacture. Then it interpolates the ages at return, for the observed year 2001 returns. These returns were entered manually into table 1 of the Turn.xls:Matrix spreadsheet and shown in table 1 above.

Enter new RMA data for Jul-01 through Dec-01 into the Turn.xls:Matrix or Turn1.xlsm:Matrix spreadsheet. Enter returns counts into the cell in the row and column that correspond to the month of manufacture and the month of return, respectively. Data already in the matrix may be changed.

Computation

The Turn1.xlsm workbook contains an "Auto_Open" macro that initializes the iterative computations of the EM-algorithm, if you allow macros to run. It also runs a Trace() VBA function to sum the diagonal of a square matrix.  

The Matrix spreadsheet, table 1, estimates missing returns from the reliability estimate in table 2, using the Kaplan-Meier reliability npmle [Kaplan and Meier]. The data in table 1 are used to estimate the age-specific reliability and return rate functions in columns two through seven of table 2. Table 1 in the workbooks estimate expected values of missing returns, and table 2 implements the Kaplan-Meier reliability npmle.

If you change data, then you will need to recalculate estimates by hitting F9. Repeat hitting F9 until the reliability estimates in table 2 of the Matrix spreadsheet stop changing. If nothing happens, please check the Files menu, Options command, Formula option and make sure that iterations are enabled. I allow 100 iterations. If you have less data, simply make cells not in your data-range equal to zero. If you add data, insert rows and columns and copy formulas very carefully, following the formula patterns in existing cells. Send your workbook to me for checking, pstlarry@cal.berkeley.edu.  

Columns two through seven of the Matrix spreadsheet, table 2, estimate the reliability and monthly return rate functions from all data, including estimates. The pdf (discrete probability density function) in column seven of table 2 is used in table 1 to estimate the expected values of missing returns. The remaining columns of table 2 estimate the reliability and monthly return rate functions from parts made each year.

The npmle spreadsheet of Turn1.xlsm computes the reliability npmle by optimization of the log-likelihood function, using Solver, not the EM-algorithm recursive iteration. Estimates agree closely.

The cohort spreadsheets of Turn1.xlsm estimate the reliability of units shipped in 2011, 2012, 2013, and 2014. These estimates differ from cohort reliability estimates in the Matrix spreadsheet, because the cohort spreadsheets use only ships and returns from each annual cohort; the Matrix spreadsheet uses estimated returns from all ships for reliability estimates for 2011, 2012, 2013, and 2014.

What if you don't have life data?

If returns counts are not identifiable corresponding to ships as in table 1, then you have data similar to table 2. If total returns counts since ships began are known as in table 2, then the npmle is different [George and Agrawal, http://sites.google.com/site/fieldreliability/home/ProdSpec]. If some returns counts are left-censored, the EM-algorithm may be used to find the reliability npmle, but it is not as simple as estimating unknown returns recursively, because the log likelihood function is not linear in the missing returns counts. Nevertheless it has been done; MGInfiM.nb and MarEMR1.xls. The latter workbook simultaneously estimates reliability in terms of age at failure and the distribution of sell-through time.

Table 2. Ships and returns counts when returns are not associated with ships

Summary

The Summary spreadsheets graph the reliability, broom charts, and monthly return rate estimates, by cohort. Modify the graph titles, data ranges, and scales to suit your needs.

The Turn.xls:Summary spreadsheet also estimates the empirical standard deviations of the reliability and monthly return rate estimates. It uses these standard deviations to estimate empirical confidence limits. (These are not confidence limits on entire functions; they are confidence limits on individual values.) The npmle reliability and return rate should be approximately asymptotically normally distributed, so the normal distribution is used for confidence limits. You can change the confidence level, called alpha, by changing the value in the cell in the third row, column eight. Hit F9 to re-compute the confidence limits.

Note on using the workbooks

If you are accustomed to the A1 style of spreadsheet cell notation, then select the Tools menu, Options command. Excel will display a tab dialog box. Select the General tab. Click on the R1C1 Reference Style check box to remove the check mark.

The Turn.xls:Matrix spreadsheet makes estimates from each year of ships data to see whether there have been changes or improvements in reliability. There doesn’t appear to be any change in reliability from year to year. The Turn1.xlsm Summary shows annual cohorts differ in reliability.

Contact me if you have questions or would like additional functions such as forecasts and early warning control charts. Turn.xls and Turn1.xlsm could be used to estimate the standard deviations and confidence limits on returns forecasts, for early warning and for spares stock level recommendations. I would be happy to revise the workbooks and the user guide so that they will satisfy your needs.

References

Dempster A.P., N.M. Laird, D.B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. (series B), Vol.  39, pp 1-38, 1977

George, L. L. and A. Agrawal, “Estimation of a Hidden Service Distribution of an M/G/Infinity Service System,” Naval Research Logistics Quarterly, pp. 549-555, September 1973, Vol. 20, No. 3

Giolo, Suely Ruiz, “Turnbull’s Nonparametric Estimator for Interval-Censored Data,” Tech. Report, Univ. of Paraná, 2004

Kaplan, E. L. and P. Meier, “Nonparametric estimation from incomplete observations,” J. Amer. Statist. Assn., Vol. 53, pp 457-481, June 1958

Turnbull, Bruce W., “Nonparametric estimation of a survivorship function with doubly censored data,” J. Amer. Statist. Assn., Vol. 69, No. 345, pp 169-174, March 1974