A Tukey’s g-and-h distribution based approach with PSO for degradation reliability modeling

Jinbao Zhang (Harbin Institute of Technology, Harbin, China)
Yongqiang Zhao (School of Mechatronic Engineering, Harbin Institute of Technology, Harbin, China)
Ming Liu (Harbin Institute of Technology, Harbin, China)
Lingxian Kong (Harbin Institute of Technology, Harbin, China)

Engineering Computations

ISSN: 0264-4401

Article publication date: 7 August 2019

Issue publication date: 15 August 2019

2399

Abstract

Purpose

A generalized distribution with wide range of skewness and elongation will be suitable for the data mining and compatible for the misspecification of the distribution. Hence, the purpose of this paper is to present a distribution-based approach for estimating degradation reliability considering these conditions.

Design/methodology/approach

Tukey’s g-and-h distribution with the quantile expression is introduced to fit the degradation paths of the population over time. The Newton–Raphson algorithm is used to approximately evaluate the reliability. Simulation verification for parameter estimation with particle swarm optimization (PSO) is carried out. The effectiveness and validity of the proposed approach for degradation reliability is verified by the two-stage verification and the comparison with others’ work.

Findings

Simulation studies have proved the effectiveness of PSO in the parameter estimation. Two degradation datasets of GaAs laser devices and crack growth are performed by the proposed approach. The results show that it can well match the initial failure time and be more compatible than the normal distribution and the Weibull distribution.

Originality/value

Tukey’s g-and-h distribution is first proposed to investigate the influence of the tail and the skewness on the degradation reliability. In addition, the parameters of the Tukey’s g-and-h distribution is estimated by PSO with root-mean-square error as the object function.

Keywords

Citation

Zhang, J., Zhao, Y., Liu, M. and Kong, L. (2019), "A Tukey’s g-and-h distribution based approach with PSO for degradation reliability modeling", Engineering Computations, Vol. 36 No. 5, pp. 1699-1715. https://doi.org/10.1108/EC-11-2017-0428

Publisher

:

Emerald Publishing Limited

Copyright © 2019, Jinbao Zhang, Yongqiang Zhao, Ming Liu and Lingxian Kong.

License

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction

Reliability is a significant indicator to the performance estimation and the quality control of products. For highly reliably products, failure data are uneasy to get even in the accelerated test. In this case, degradation data will be an efficient alternative for reliability estimation. The existing models for degradation reliability in detail have been summarized (Gorjian et al., 2009; Ye and Xie, 2015). A common approach is to extrapolate the degradation paths to the failure threshold and obtain the pseudo-lifetime data for reliability estimation (Lu and Meeker, 1993). The other way to estimate reliability is based on the relationship between the degradation data and the failure threshold (Zuo et al., 1999). The relationship between models based on the degradation data and models based on the pseudo-lifetime data are stated by Meeker and Escobar (1998). However, the present models for degradation reliability are based on normal distribution, Weibull distribution and so on, and misspecification of distributions may exist. Hence, a generalized distribution is required.

The g-and-h distribution generated by the transformation of the standard normal distribution, is first proposed by Tukey (1977). Basic properties are studied (Jorge and Boris, 1984; Hoaglin et al., 1985; Macgillivray, 1992) and the modifications are extensively investigated (Morgenthaler and Tukey, 2000; Fischer, 2010). It has extensive range of skewness and elongation and corresponding control parameters, which could depict the properties of data in detail. Comparison of goodness-of-fit has been performed among gh-transformed distributions and other distributions, and the results proved that in most cases the original Tukey’s g-and-h distribution would provide a better fit (Klein and Mittnik, 2002). In practice, Tukey’s g-and-h distribution has been widely used in various fields, such as common stock return (Badrinath and Chatterjee, 1991), modeling extreme wind speeds (Dupuis and Field, 2004), operational risk (Degen et al., 2007), value-at-risk (Jiménez and Arunachalam, 2011), and simulation studies (He and Raghunathan, 2006). Nevertheless, it has been not investigated in reliability yet.

The advantages of Tukey’s g-and-h distribution for fitting degradation paths are summarized as follows:

  • It has wide range of skewness and kurtosis which could be robust with extreme values and outliers, and can judge whether the distribution is heavy tail.

  • It is expressed by a quantile regression and can be easily introduced into the location-scale model, which will well deal with heteroscedasticity over time.

  • It can accommodate a rich class of common univariate distributions, such as normal, lognormal, Weibull and exponential (Morgenthaler and Tukey, 2000; Chaudhuri and Ghosh, 2016).

One prominent problem lies in the parameter estimation of Tukey’s g-and-h distribution, especially for the degradation situation considering time. There are mainly three methods for parameter estimation, which are quantile method (Hoaglin et al., 1985), method of moments (Headrick et al., 2008), and maximum likelihood method (Majumder and Ali, 2008), respectively. Maximum likelihood estimation (MLE) is the most popular approach; nevertheless, its likelihood function cannot be written in closed form and finally induce the intractable computation. Therefore, numerical methods are introduced to solve the MLE by approximation, which may be complicated (Rayner and Macgillivray, 2002a; Xu and Genton, 2015; Bee and Trapin, 2016). Method of moments is another approach for parameter estimation, but it may be not appropriate for the situation varying with time.

Quantile-based estimation method would be a natural alternative for Tukey’s g-and-h distribution. A weighted quantile-based estimation shows better effects in a number of ways than the numerical MLE for smaller and moderately-sized samples (Rayner and Macgillivray, 2002b). In addition, a rapidly convergent quantile based least squares estimation method (Xu et al., 2014) has been developed to fit the g-and-h distributional family parameters. However, when modeling the degradation path, more parameters including g and h should be estimated. In order to overcome this difficulty, particle swarm optimization (PSO) (Kennedy and Eberhart, 1995) as a heuristic optimization method is brought in for quantile-based estimation. PSO has many advantages including simplicity of implementation, robust, and has been successfully used in data-driven fuzzy clustering (Chen and Zhao, 2009), solving the likelihood objective function of 3-p Weibull distribution for parameter estimation (Örkcü et al., 2015).

In this paper, emphasize is put on the reliability estimation by Tukey’s g-and-h distribution for the degradation data. The rest of the paper is organized as follows. In Section 2, Tukey’s g-and-h distribution and its properties are presented, the influence of parameters on the distributions are investigated as well. In addition, the quantile-based estimation is introduced in detail. Section 3 employs an intelligent optimization method called PSO for the parameter estimation. The practical applications for degradation reliability estimation concerning GaAs laser device and crack growth are performed in Section 4. Conclusion is given in Section 5.

2. Tukey’s g-and-h distribution

Definition Let Z be a random variable with the standard normal distribution, then a random variable U defined by:

(1) U=qg,h(Z)=g1[exp(gZ)1]exp(hZ2/2)
is said to have Tukey’s g-and-h distribution. qg,h(z) should be a strictly increasing monotonic function of z, that is, qg,h(z)>0 with parameters gR and h 0. When h <0, the transformation is no longer strictly monotone increasing and induces a finite support of data (Klein and Mittnik, 2002), so in this paper, the parameter h will be constrained in the interval [0, ∞).

Two subclasses of distributions based on equation (1) are the g and the h classes which are defined as follows:

(2) U=qg,0(Z)=limh0qg,h(Z)=g1{exp(gZ)1}
(3) U=q0,h(Z)=limg0qg,h(Z)=zexp(hZ2/2)

When g = h =0, U = Z, so U has a standard normal distribution. Furthermore, the standard normal distribution could be replaced by other symmetric distributions, such as Laplace distribution, generalized error distribution and so on (Morgenthaler and Tukey, 2000; Klein and Mittnik, 2002).

2.1 Properties

Some properties are introduced for the following verification. The probabilistic density function (pdf) of standard normal distribution is:

(4) φZ(z)=12πexp(z22)
and its cumulative density function (cdf) is:
(5) ΦZ(z)=Pr(Z<z)=12πzexp(w22)dw,<z<+.

The formulae for k-th moment of the g-and-h distribution is:

(6) E[q(z)k]=+q(z)kfZ(z)dz

Then the mean of Tukey’s g-and-h distribution is:

(7) μg,h=E[qg,h(z)]={1g1h[exp(12g21h)1],g0,0h<10,g=0
and the variance:
(8) σg,h2=D[qg,h(z)]=E[qg,h(z)2]E2[qg,h(z)]
where
(9) E[qg,h(z)2]=12exp{g2/(24h)}+exp{2g2/(12h)}g2(12h)1/2

2.2 Relationship with reliability

Since the function

(10) u=qg,h(z)=g1(egz1)ehz2/2
for h >0 is strictly increasing, the cdf of a g-and-h random variable U can be written as:
(11) F(u)=F(z)=Φ[qg,h1(u)]
where z=qg,h1(u), which can be solved with the Newton-Raphson method, that is:
(12) z(n0+1)=z(n0)f(z(n0))/f(z(n0))
in which n0 is the iterative number, and by transformation of equation (10), we can obtain the function
(13) f(z)=exp(hz2/2+gz)exp(hz2/2)gu
and its first derivative
(14) f(z)=(hz+g)exp(hz2/2+gz)hzexp(hz2/2)

The corresponding reliability function is:

(15) R(u)=1F(u)=1Φ(z)

The pdf of U is expressed as:

(16) f(u)=φZ[qg,h1(u)]/qg,h[qg,h1(u)]
and the log-type (Xu and Genton, 2015) is:
(17) ϕ(z)=logf(u)=logφ(z)logqg,h(z)=(1+h)z2/2log{exp(gz)+g1[exp(gz)1]hz}log(2π)/2

The influence of g and h on the pdf and the reliability are illustrated in Figures 1(a) and 1(b). As g increases, the distribution will become more skewed. When g =0, U has a symmetric distribution. As h increases, the tails of the distribution get heavier. Moreover, all curves of reliability meet at the same point FU(0) = Φ(0) = 0.5, which is the median value.

2.3 Quantile-based estimation

Introducing the parameters of location μR and scale σ > 0, the regression model is obtained as follows:

(18) y=μ+σqg,h(z)=μ+σg1[exp(gz)1]exp(hz2/2)

Quantile-based parameter estimation approach was proposed by Tukey (1977), which used a letter value based sequence of quantiles to estimate the parameters sequentially. In this work, a brief description of the letter value based method is carried out with following steps (Hoaglin et al., 1985; Rayner and Macgillivray, 2002b; Bee and Trapin, 2016):

  • Estimate parameter μ by the median of the samples;

  • For zp = –z1–p, then by

    (19) y1py0.5y0.5yp={μ+g1σ[exp(gzp)1]exp(hzp2/2)}μμ{g1σ[exp(gzp)1]exp(hzp2/2)+μ}=exp(gzp)

we can get:

(20) gp=1zplog(y1py0.5y0.5yp)

Calculate gpi=log[(y1piy0.5)/(y0.5ypi)]/zpi with median rank estimator pi of samples, where ypi is the corresponding i-th sample quantile. Finally, the value g will be obtained by the mean of all gpi. In sequence, by

(21) y1py0.5=g1σ[exp(gzp)1]exp(hzp2/2)
the relationship is induced as:
(22) UHSp*=g(y1py0.5)[exp(gzp)1]1=σexp(hzp2/2)

A plot of log(UHSp*) against zp2/2 would have intercept log(σ) and slope h.

  • Regress log [(y1–py0.5)/(exp(–gzp)–1)] or log [(y0.5y1–p)/(1 – exp(–gzp))] on zp2/2 depending on g positive or negative to estimate σ and h. By adjusting quantile selections, a quantile estimation method that works well for g and h estimation is demonstrated.

3. Parameter estimation of the g-and-h distribution

3.1 Degradation data description

Degradation data are generally obtained from repeated measurements with designed time points and certain number of units. The measurements are stopped until the degradation measurements reach the pre-specified failure threshold or the terminated time. The assumptions about the manner which the test and measurement should be conducted (Lu and Meeker, 1993) are as follows:

  • Sample units are randomly selected from a population or production process and random measurement errors are independent across time.

  • Sample units are tested in a particular homogenous environment such as the same constant temperature.

  • Measurement (or inspection) times are pre-specified, the same across all the test assets, and may or may not be equally spaced in time.

  • The lifetimes of the sample units are assumed to be independent and identically distributed random variables.

In Figure 2, the degradation data may be censored at the threshold, however, they are of great significance on the regression of the mean with time-dependent (Zhou et al., 2012). So the rank of the censored data should be retained, and the degradation data should be ordered at every time point to make sure the rank. Consider that n units are tested, and the degradation measurements of each unit are available at time t1, t2, ... , tmi. Given the degradation data (yij, tj), where yij is the measured deterioration for the i-th unit at time tj, and i =1, 2, ⋯, n, j =1, 2, ⋯, mi. The sample path with equation (18) considering time can be rewritten as:

(23) y^ij=μ(tj)+σ(tj)upi
where
(24) upi=qg,h(zpi)=g1[exp(gzpi)1]exp(hzpi2/2)
is the quantile expression of equation (1), and zpi = Φ−1(pi), with the median rank estimator:
(25) pi=(i0.5)/n

Finally, with equations (8) and (9), the mean and the standard deviation (Std.) of Y can be obtained as:

(26) μY=μ(t)+σ(t)μg,h
(27) σY=σ(t)σg,h

The location μ(t) and the scale σ(t) can be selected as follows:

Linear regression:

(28) a0+a1t

Polynomial regression:

(29) a0+a1t+a2t2

Exponential regression:

(30) a0exp(a1t+a2t2)

The reliability function evaluated at the time t is just the probability that the failure lifetime T beyond the time t:

(31) R(t)=P{T>t}=1FT(t)
where F(·) denotes the cdf of the lifetime. For the increasing degradation path, the relationship between them is defined as:
(32) P{T>t}=P{y<D}
where D is the failure threshold. Then the reliability can be obtained by equation (15):
(33) R(t)=F(D)=1Φ[qg,h1(Dμ(t)σ(t))]
and solution of the reliability with equation (12).

According to the section 2.3, the quantile-based estimator is performed without considering the influence of time, which will lead the estimator to be complicated or inappropriate for the degradation modeling. Inspired by the quantile estimator (Lu and Meeker, 1993; Xu et al., 2014; Takeuchi et al., 2003), the unknown parameters will be estimated by minimizing the root-mean-square error (RMSE) on the degradation path of each unit as illustrated in Figure 2 using the equation (34):

(34) RMSE=1ni=1n[1mij=1mi(yijyij)2]

The rank of the censored data should be considered for computation. For convenient solution of the RMSE with many parameters, PSO may be a good choice for its good ability of global optimization and easy implement. It can perform parallel computation and simultaneously estimate all the parameters with the object function.

3.2 Particle swarm optimization

PSO (Kennedy and Eberhart, 1995) is an evolution technology based on the swarm intelligence, rooted from the research of bird predatory behavior. In the target searching space of N dimensions, there will be a group composed with M particles. The position vector of particle Pi in the t-th iteration is Xi(t) = (xi1, xi2, ⋯ xiN)T and the velocity vector Vi(t) = (vi1, vi2, ⋯ viN)T. The individual extreme point pbesti = (pbesti1, pbesti2, ⋯ pbestiN) represents the influence of the memory of particles themselves and makes the particles have the ability of global searching, meanwhile avoids the local minimum. The global extreme point gbesti = (gbesti1, gbesti2, ⋯ gbestiN) represents the influence of the group information, and reflects the information sharing as well as cooperation among the particles.

In the (t +1)-th iteration, the position and velocity of the particle Pi can be updated by the following equations:

(35) vin(t+1)=ωvin(t)+c1rand()(pbestinxin(t))+c2rand()(gbestnxin(t))
(36) xin(t+1)=xin(t)+vin(t+1),i=1,2,,M
where c1 and c2 are learning factors, and chosen to be 2 generally; rand() is the random number between 0 and 1; ω is the inertia weight, which makes the method have better global search ability when big while tend to local search when small. In the computation, the expression of ω is:
(37) ω=ωmax(ωmaxωmin)×Iter/Itermax
where ωmax = 0.9, ωmin = 0.4 (Shi and Eberhart, 1997); Iter is the iterative number and Itermax the maximum iterative number. To prevent the particles leaving the searching space, every dimensional velocity vn of particles is confined in [–vnmax, vnmax]. That is to say, if vn > vnmax, then vn = vnmax; while vn = vnmax, then vn = –vnmax.

3.3 Procedure for parameter estimation with particle swarm optimization

  1. Set the population size M, and specify the convergence precision and the maximum iterative steps;

  2. Initialize the velocity and the position of the particles; and

  3. Set the learning factors c1 and c2, as well as the inertia weight ω;

  4. Give the fitness function namely the RMSE in this paper and the dimensions of the optimized parameters including g, h and the parameters in the location μ(t) and the scale σ(t);

  5. Minimum the fitness function and compute the fitness values of all particles;

  6. Obtain the global extreme value and the individual extreme value, and update the velocity and the position of the particles;

  7. Check whether the terminal condition is satisfied, if not, continue steps (5) to (6); if is, stop computation and output the optimum values.

3.4 Simulation for verification of the estimation algorithm

To verify the effectiveness of PSO in estimating parameters of the proposed approach based on Turkey’s g-and-h distribution, a simulation is performed. Degradation measures are generated with the following at m equispaced time points for n units under the following model:

(38) yij=(a0+a1tj)+(b0+b1tj)g1[exp(gzi)1]exp(hzi2/2)
where i =1, 2, ⋯, n; j =1, 2, ⋯, m, z = ξ + ε, ξN(0, 1), εN(0, 0.05). Different values of concerning parameters are listed in Table I. Considering the influence of the sample number and verification of different parameters, four groups of simulations with ten times are carried out. To control the big fluctuation at different time points, only one group of simulation data obeying the standard normal distribution is generated at every time, but a small disturb belonging to the standard normal distribution with mean 0, variance 0.05 is added at fifteen time points as illustration in Figure 2. In PSO, 1000 particles are set for computation. Mean and variance of the estimated parameters are computed, as well the bias between the true value and the mean. The results have proved that the approach could satisfy requirement of the estimation accuracy for the following study in degradation reliability investigation.

4. Applications

4.1 Degradation description

4.1.1 GaAs laser device.

In this section, a practical case related to the degradation dataset of GaAs laser devices is presented to demonstrate the validity of the proposed estimation method (Meeker and Escobar, 1998). The operation current is measured as the degradation characteristic at 80°C, which increases over time. The device fails when the per cent increase in operating current exceeds a pre-specified threshold D =10. Degradation data from 15 testing samples are collected, with observations made at assigned time point and plotted in Figure 3(a). It has the degradation trend of linearity, and expressions of the location and the scale by equation (28) respectively are:

(39) μ(t)=a0+a1t
(40) σ(t)=b0+b1t

4.1.2 Crack growth.

Sets of fatigue crack growth data versus time (in million cycles) are plotted in Figure 3(b). These data sets were collected on twenty-one test specimens. All specimens have an initial crack length of 0.90 inch. The failure threshold is set as a critical crack length of 1.6 inches. It is noticed that about half of the units do not fail by the end of the test. Obviously, a polynomial function by equation (29) will be necessary to fit the location and the scale with time:

(41) μ(t)=a0+a1t+a2t2
(42) σ(t)=b0+b1t+a2t2

Minimizing the RMSE by PSO, the parameters could be estimated. First, the two important variables mean and Std. are computed by equations (26) and (27), and verified by comparison with the experiment data. Second, the proposed approach is applied in the lower thresholds where no failure occurs. The pseudo-lifetime data obtained by the interpolation in the lower thresholds are used to verify the reliability estimation. Then the proposed approach is used for the practical threshold and compared with others’ results. The two-stage verification can guarantee the validity of the approach, and the whole procedure is illustrated in Figure 4.

4.2 Parameter estimation

The particle number in PSO is set 300, convergence precision 1-e20. For GaAs laser device, the time is divided 250 for scale match, and the iteration is stopped until 1593 steps. For the crack growth, the iteration is stopped until 1740 steps. Parameters of Tukey’s g-and-h distribution estimated with PSO are listed in Table II and the RMSE versus iterative numbers is plotted in Figure 5. The mean and the variance at every test time are computed by equations (26) and (27) and compared with the experiment data in Figure 6, while the relative errors of the mean and the Std. are computed by equation (43) expressed as:

(43) δ=100%×|ζζ^|/ζ
where ζ represents the experiment and ζ^ the simulation, and the results show that they both match well.

4.3 Reliability

KS-test is carried out for normal, Weibull and the g-h distribution. The g-h distribution fails to pass at time 250 for GaAs laser device, as well as at 0.01, 0.02 and 0.03 millions of cycles for crack growth, as shown in Figure 7. The reason may be that the data at the beginning are too concentrated. However, the g-h distribution shows better fitness than the other two distributions at the rest time for the two practical applications with the correlation coefficients in Figure 8. In further, the Q-Q plots at the time t =4000 h of the GaAs laser device and at 0.09 million cycles of the crack growth are respectively extracted as shown in Figure 9, which shows the advantage of the g-h distribution in fitting data distributions. Figure 10 gives the pdf of the degradation data varying with time for GaAs laser device and crack growth respectively. By equation (33), the degradation reliability is computed.

To verify the effectiveness of the approach for degradation reliability, the pseudo-lifetime data are obtained by interpolation at three different thresholds D = 4, D = 5, D = 6 of the GaAs laser device and D = 1.2, D = 1.3, D = 1.4 of the crack growth without censored data in Figure 11. Finally, the practical reliability for the GaAs laser device at failure threshold D = 10 and the crack at failure threshold D = 1.6 are computed respectively and compared with others’ research results as shown in Figure 12, where the pseudo lifetime are obtained by extrapolating the censored data to the failure threshold, especially the first two lifetime data of the GaAs laser device and the first 12 lifetime data of the crack growth are practically failed.

4.4 Results

Regression of the degradation paths with Tukey’s g-and-h distribution shows good fitness with the experiment data. Comparisons of the mean and the Std. between the simulation and the experiment data prove that PSO with RMSE for parameter estimation is effective. The approach with Tukey’s g-and-h distribution for reliability estimation can well match the practical lifetime point at the beginning of failure, which is concerned with the tail the skewness of the distribution. Compared with other methods, the Tukey’s g-and-h distribution for degradation reliability could have a good fit of the pseudo-lifetime at the descent stage of the curve as well.

From the above analysis, the skewness and the tail of the distribution should be considered to obtain reasonable results when modeling the degradation processes, because they can exactly predict the beginning time of the failure. Although the proposed approach may introduce more parameters, they make the model more flexible and reliable results can be governed. However, an important parameter h has to be restricted in the interval [0, ∞] when h <0, in order to approximate the distribution of the degradation data.

5. Conclusions

In this paper, an approach based on Tukey’s g-and-h distribution for degradation reliability is proposed. PSO as an intelligent optimization algorithm is investigated for the parameter estimation and verified with the simulation study. Aim at no explicit expression of Tukey’s g-and-h distribution, the Newton-Raphson algorithm is adopted for solution of reliability. The effectiveness of the proposed approach for the reliability estimation have been demonstrated through two practical degradation data. Clearly, the approach can well estimate the reliability by the two-stage verification. In further, it can well predict the reliability with censored data compared with others’ results.

Figures

Properties of Tukey’s g-and-h distribution

Figure 1.

Properties of Tukey’s g-and-h distribution

Description of degradation data

Figure 2.

Description of degradation data

Laser degradation data

Figure 3.

Laser degradation data

Illustration of the procedure for reliability estimation

Figure 4.

Illustration of the procedure for reliability estimation

RMSE with iterative numbers by PSO

Figure 5.

RMSE with iterative numbers by PSO

Simulation of mean and std. with relative error

Figure 6.

Simulation of mean and std. with relative error

K-S test of normal, Weibull and the g-h distribution

Figure 7.

K-S test of normal, Weibull and the g-h distribution

Correlation coefficients of normal, Weibull and the g-h distribution

Figure 8.

Correlation coefficients of normal, Weibull and the g-h distribution

Q-Q plot of normal, Weibull and the g-h distribution

Figure 9.

Q-Q plot of normal, Weibull and the g-h distribution

Pdf over time

Figure 10.

Pdf over time

Verification by lower failure threshold

Figure 11.

Verification by lower failure threshold

Comparison with others’ work

Figure 12.

Comparison with others’ work

Parameter estimation

Group Sample number Time points Statistics a0 a1 b0 b1 g h
1 15 15 Mean 0.101 0.401 0.104 0.100 0.404 0.141
Variance 0.002 0.002 0.001 0.001 0.053 0.027
True values 0.1 0.4 0.1 0.1 0.4 0.1
Bias 0.001 0.001 0.004 0.000 0.004 0.041
2 20 15 Mean 0.108 0.407 0.106 0.107 0.433 0.095
Variance 0.001 0.001 0.001 0.001 0.055 0.012
True values 0.1 0.4 0.1 0.1 0.4 0.1
Bias 0.008 0.007 0.006 0.007 0.033 −0.005
3 15 15 Mean 0.202 0.599 0.046 0.291 0.283 0.180
Variance 0.000 0.010 0.000 0.003 0.111 0.057
True values 0.2 0.6 0.05 0.3 0.3 0.2
Bias 0.002 −0.001 −0.004 −0.009 −0.017 −0.02
4 15 15 Mean 0.202 0.603 0.041 0.247 0.249 0.095
Variance 0.000 0.008 0.000 0.005 0.073 0.021
True values 0.2 0.6 0.05 0.3 0.2 0.1
Bias 0.002 0.003 −0.009 −0.053 0.049 −0.005

Parameter estimation

Parameter a0 a1 a2 b0 b1 b2 g h RMSE
GaAs laser 0.0051 0.4886 0.035 0.1006 0.4568 0.0 0.2411
Crack 0.9128 1.5147 33.7608 0.0050 0.0 15.5443 0.1723 0.0 0.0222

References

Badrinath, S.G. and Chatterjee, S. (1991), “A data-analytic look at skewness and elongation in common-stock-return distributions”, Journal of Business and Economic Statistics, Vol. 9 No. 2, pp. 223-233.

Bee, M. and Trapin, L. (2016), “A simple approach to the estimation of Tukey's gh distribution”, Journal of Statistical Computation and Simulation, Vol. 86 No. 16, pp. 3287-3302.

Chaudhuri, A. and Ghosh, S.K. (2016), Quantitative Modeling of Operational Risk in Finance and Banking Using Possibility Theory, Springer, Basel.

Chen, D. and Zhao, C. (2009), “Data-driven fuzzy clustering based on maximum entropy principle and PSO”, Expert Systems with Applications, Vol. 36 No. 1, pp. 625-633.

Degen, M., Embrechts, P. and Lambrigger, D.D. (2007), “The quantitative modeling of operational risk: between g-and-h and EVT”, ASTIN Bulletin, Vol. 37 No. 2, pp. 265-291.

Dupuis, D.J. and Field, C.A. (2004), “Large wind speeds: modeling and outlier detection”, Journal of Agricultural, Biological, and Environmental Statistics, Vol. 9 No. 1, pp. 105-121.

Fischer, M. (2010), “Generalized Tukey-type distributions with application to financial and teletraffic data”, Statistical Papers, Vol. 51 No. 1, pp. 41-56.

Gorjian, N., Ma, L., Mittinty, M., Yarlagadda, P. and Sun, Y. (2009), “A review on degradation models in reliability analysis”, Proceedings of the 4th World Conference on Engineering Asset Management, pp. 369-384.

He, Y. and Raghunathan, T.E. (2006), “Tukey's gh distribution for multiple imputation”, The American Statistician, Vol. 60 No. 3, pp. 251-256.

Headrick, T.C., Kowalchuk, R.K. and Sheng, Y. (2008), “Parametric probability densities and distribution functions for Tukey g-and-h transformations and their use for fitting data”, Applied Mathematical Sciences, Vol. 2 No. 9, pp. 449-462.

Hoaglin, D.C., Mosteller, F. and Tukey, J.W. (1985), Exploring Data Tables, Trends, and Shapes, Wiley, New York, NY.

Jiménez, J.A. and Arunachalam, V. (2011), “Using Tukey’s g and h family of distributions to calculate value-at-risk and conditional value-at-risk”, The Journal of Risk, Vol. 13 No. 4, pp. 95-116.

Jorge, M. and Boris, I. (1984), “Some properties of the Tukey g and h family of distributions”, Communication in Statistics-Theory and Methods, Vol. 13 No. 3, pp. 353-369.

Kennedy, J. and Eberhart, R. (1995), “Particle swarm optimization”, Proceedings of IEEE International Conference on Neural Network, Vol. 4, pp. 1942-1948.

Klein, I. and Mittnik, S. (2002), Contributions to Modern Econometrics from Data Analysis to Economic Policy, Springer Science+Business Media, New York, NY.

Lu, J.C. and Meeker, W.Q. (1993), “Using degradation measures to estimate a time-to-failure distribution”, Technometrics, Vol. 35 No. 2, pp. 161-174.

Macgillivray, H.L. (1992), “Shape properties of the g-and-h and Johnson families”, Communications in Statistics-Theory and Methods, Vol. 21 No. 5, pp. 1233-1250.

Majumder, M.M.A. and Ali, M.M. (2008), “A comparison of methods of estimation of parameters of Tukey's gh family of distributions”, Pakistan Journal of Statistics, Vol. 24 No. 2, pp. 135-144.

Morgenthaler, S. and Tukey, J.W. (2000), “Fitting quantiles: doubling, HR, HQ, and HHH distributions”, Journal of Computational and Graphical Statistics, Vol. 9 No. 1, pp. 180-195.

Meeker, W.Q. and Escobar, L.A. (1998), Statistical Methods for Reliability Data, John Wiley and Sons, New York, NY.

Örkcü, H.H., Özsoy, V.S., Aksoy, E. and Dogan, M.I. (2015), “Estimating the parameters of 3-p, Weibull distribution using particle swarm optimization: a comprehensive experimental comparison”, Applied Mathematics and Computation, Vol. 268 No. 9, pp. 201-226.

Rayner, G.D. and Macgillivray, H.L. (2002a), “Numerical maximum likelihood estimation for the g-and-k and generalized g-and-h distributions”, Statistics and Computing, Vol. 12 No. 1, pp. 57-75.

Rayner, G.D. and Macgillivray, H.L. (2002b), “Weighted quantile-based estimation for a class of transformation distributions”, Computational Statistics and Data Analysis, Vol. 39 No. 4, pp. 401-433.

Shi, Y. and Eberhart, R. (1997), “A modified particle swarm optimizer”, IEEE International Conference on Evolutionary Computation, Vol. 6, pp. 69-73.

Takeuchi, I., Yamanaka, N. and Furuhashi, T. (2003), “Robust regression under asymmetric or/and non-constant variance error by simultaneously training conditional quantiles”, Proceedings of the International Joint Conference on Neural Networks, Vol. 3, pp. 1729-1734.

Tukey, J.W. (1977), “Modern techniques in data analysis”, NSF-Sponsored Regional Research Conference.

Xu, G. and Genton, M.G. (2015), “Efficient maximum approximated likelihood inference for Tukey’s g-and-h distribution”, Computational Statistics and Data Analysis, Vol. 91, pp. 78-91.

Xu, Y., Iglewicz, B. and Chervoneva, I. (2014), “Robust estimation of the parameters of g-and-h distributions, with applications to outlier detection”, Computational Statistics and Data Analysis, Vol. 75 No. 7, pp. 66-80.

Ye, Z.S. and Xie, M. (2015), “Stochastic modelling and analysis of degradation for highly reliable products”, Applied Stochastic Models in Business and Industry, Vol. 31 No. 1, pp. 16-32.

Zhou, R., Gebraeel, N. and Serban, N. (2012), “Degradation modeling and monitoring of truncated degradation signals”, IIE Transactions, Vol. 44 No. 9, pp. 793-803.

Zuo, M.J., Jiang, R. and Yam, R.C.M. (1999), “Approaches for reliability modeling of continuous-state devices”, IEEE Transactions on Reliability, Vol. 48 No. 1, pp. 9-18.

Further reading

Chen, Z. and Zheng, S. (2005), “Lifetime distribution based degradation analysis”, IEEE Transactions on Reliability, Vol. 54 No. 1, pp. 3-10.

Peng, C.Y. and Tseng, S.T. (2009), “Mis-specification analysis of linear degradation model”, IEEE Transactions on Reliability, Vol. 58 No. 3, pp. 444-455.

Peng, C.Y. and Tseng, S.T. (2013), “Statistical lifetime inference with skew-Wiener linear degradation models”, IEEE Transactions on Reliability, Vol. 62 No. 2, pp. 338-350.

Acknowledgements

The authors would like to thank the editors and the anonymous reviewers for their useful comments which have greatly improved the paper. This work is supported by the National Natural Science Foundation of China (No. 51505100).

Corresponding author

Yongqiang Zhao can be contacted at: 13B308008@hit.edu.cn

Related articles