Statistical distributions play a prominent role in the applied sciences, particularly in the biomedical sciences. Medical data sets are generally skewed to the right, and skewed distributions can model such data quite effectively. In the present study, therefore, we propose a new family of distributions suitable for modeling right-skewed medical data sets. The proposed family may be called a new generalized-

In many medical settings, for example neck, bladder, stomach, and breast cancer, the hazard rate has been shown to have a unimodal or modified unimodal shape. The hazard rate for recurrence of neck, bladder, and breast cancer after surgical removal has been observed to be unimodal: it starts at a low level, increases gradually over a finite period after surgery until it reaches a peak, and then decreases. Another example of the unimodal shape is the hazard of infection with some new viruses, which increases from a low level in the early stages until it reaches a peak and then decreases; see Liao et al. [

Parametric models such as the exponential, Rayleigh, Weibull, lognormal, and gamma distributions have been used extensively to fit biomedical data; see Zhu et al. [

The parametric models stated above are undoubtedly used frequently in survival analysis. Unfortunately, these models still suffer from certain deficiencies; see Ahmad et al. [

As mentioned earlier, the exponential, Rayleigh, and Weibull distributions are the most frequently used parametric models. These distributions, however, are not flexible enough to accommodate complex forms of data. For example, the exponential distribution can only model data with a constant hazard rate function (hrf). The hrf of the exponential distribution is given by

which is constant.
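The constant hazard can be checked numerically from the definition h(x) = f(x)/S(x), where f is the density and S the survival function. The sketch below is illustrative only: it assumes the standard Weibull parameterisation with a shape and a scale parameter (not necessarily the one used in this paper), of which the exponential distribution is the special case with shape 1. The function names are hypothetical.

```python
import math

def weibull_pdf(x, shape, scale=1.0):
    """Density of the Weibull distribution (assumed shape/scale parameterisation)."""
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-z ** shape)

def weibull_sf(x, shape, scale=1.0):
    """Survival function S(x) = 1 - F(x)."""
    return math.exp(-(x / scale) ** shape)

def hazard(x, shape, scale=1.0):
    """Hazard rate function h(x) = f(x) / S(x)."""
    return weibull_pdf(x, shape, scale) / weibull_sf(x, shape, scale)

# shape < 1: decreasing hazard; shape = 1: constant (exponential);
# shape = 2: linearly increasing (Rayleigh-type).
for shape in (0.5, 1.0, 2.0):
    print(shape, [round(hazard(x, shape), 3) for x in (0.5, 1.0, 2.0)])
```

In every case the hazard is monotone, so none of these shapes can reproduce a unimodal hazard.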

On the other hand, the Rayleigh distribution can only model data with an increasing hrf. Let

From

Among the parametric models, the Weibull distribution is one of the most commonly used families for modeling such data. It offers the characteristics of both the exponential and Rayleigh distributions, and is given by

From

In the available literature, the frequently used Kaplan–Meier product-limit estimator is one of the flexible methods for modeling survival data. But, as observed in Miller [

Motivated by these considerations, in this article we propose a new family of distributions intended to provide the best fit to data in the medical sciences and other related fields.

The paper is outlined as follows: the proposed method is presented in Section 3. In Section 4, we define a special sub-model of the proposed family. The maximum likelihood estimation of the model parameters is addressed in Section 5. The source and nature of the data are discussed in Section 6. Model selection criteria are presented in Section 7. In Section 8, we provide a real-life application from medical sciences to illustrate the importance of the new family. Section 9 is devoted to the Bayesian analysis of the data. Finally, some concluding remarks are presented in Section 10.

Let

The cdf of the T-

where,

Using the T-

The density function corresponding to

If

The density function corresponding to

The key motivations for using the NG-

A very simple and convenient method to modify the existing distributions.

To improve the characteristics and flexibility of the existing distributions.

To introduce an extended version of the baseline distribution with a closed-form distribution function.

To provide the best fit to data in the medical sciences and other related fields.

Another important motivation of the proposed approach is to introduce new distributions by adding only one additional parameter, rather than two or more.

In this section, we introduce a special sub-model of the proposed family, called a new generalized Weibull (NG-W) distribution. Let

The pdf and hrf of the NG-W model are given, respectively, by

and

For different values of the model parameters, plots of the density function of the NG-W distribution are sketched in

The plots for the hrf of the NG-W distribution are presented in

Here, we obtain the maximum likelihood estimators (MLEs) of the model parameters of the

The log-likelihood function can be maximized either directly or by solving the nonlinear likelihood equations obtained by differentiating

and

Setting

To show that the likelihood equations have a unique solution in the parameters, we sketched the profile log-likelihood functions of the parameters of the NG-W distribution for the stomach cancer data.
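The idea of a profile log-likelihood can be illustrated with a short sketch. Because the NG-W density is not reproduced in this text, the example uses the baseline two-parameter Weibull distribution (standard shape/scale parameterisation) as a stand-in; this is an assumption, not the authors' procedure. The data values and function names are hypothetical. For a fixed shape, the Weibull scale has a closed-form maximiser, so the log-likelihood can be profiled over the shape alone and maximized with a one-dimensional search.

```python
import math

def profile_loglik(shape, data):
    """Weibull log-likelihood with the scale replaced by its MLE for this shape."""
    n = len(data)
    mean_pow = sum(x ** shape for x in data) / n   # equals scale_hat ** shape
    sum_log = sum(math.log(x) for x in data)
    return n * math.log(shape) - n * math.log(mean_pow) + (shape - 1) * sum_log - n

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximise a unimodal function f on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
        if f(c) > f(d):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
    return (a + b) / 2

def weibull_mle(data, lo=0.05, hi=20.0):
    """Return (shape_hat, scale_hat) by maximising the profile log-likelihood."""
    shape = golden_section_max(lambda k: profile_loglik(k, data), lo, hi)
    scale = (sum(x ** shape for x in data) / len(data)) ** (1.0 / shape)
    return shape, scale

# Hypothetical remission-like times, NOT the stomach cancer data from the paper.
times = [0.8, 1.5, 2.3, 3.1, 4.7, 6.2, 9.0, 14.5]
shape_hat, scale_hat = weibull_mle(times)
print(shape_hat, scale_hat)
```

Plotting `profile_loglik` over a grid of shape values gives the kind of single-peaked curve that is used to argue for a unique solution of the likelihood equations.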

The data set used in this study represents the remission times of stomach cancer patients, released by the Cancer Research Foundation. These remission times are used here only for illustrative purposes. The descriptive measures of the data are presented in

Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
---|---|---|---|---|---|
0.100 | 3.952 | 7.190 | 9.422 | 11.400 | 60.000 |

The Kaplan–Meier survival plot of the data is sketched in

From

Constant, if the TTT plot is graphically presented as a straight diagonal.

Increasing, if the TTT plot is concave.

Decreasing, if the TTT plot is convex.

U-shaped, if the TTT plot is convex and then concave.

Unimodal, if the TTT plot is concave and then convex.
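The shape rules above are applied to the scaled total time on test (TTT) transform of the sample. A minimal sketch of how the plotted points can be computed is given below, assuming the usual empirical definition T(r/n) = [x_(1) + ... + x_(r) + (n - r) x_(r)] / (x_(1) + ... + x_(n)), where x_(i) are the ordered observations; the function name is hypothetical.

```python
def scaled_ttt(sample):
    """Return the points (r/n, T(r/n)) of the scaled empirical TTT transform."""
    xs = sorted(sample)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0  # sum of the r smallest observations
    for r, x in enumerate(xs, start=1):
        running += x
        points.append((r / n, (running + (n - r) * x) / total))
    return points

# Points lying above the diagonal (a concave curve) suggest an increasing hazard.
for u, t in scaled_ttt([3.0, 1.0, 2.0]):
    print(round(u, 3), round(t, 3))
```

Plotting these points against the diagonal from (0, 0) to (1, 1) gives the TTT plot whose curvature is read off as described in the list.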

For further detail, we refer the interested readers to Aarset [

Model selection, the task of choosing a statistical model from a group of candidate models, is one of the fundamental tasks of scientific inquiry. A number of statistical procedures are available to assess goodness of fit among competing distributions. The most commonly used criteria are the (i) Akaike information criterion (AIC), (ii) Bayesian information criterion (BIC), (iii) Anderson–Darling (AD) test statistic, and (iv) Kolmogorov–Smirnov (KS) test statistic with the corresponding

In this section, we analyze the stomach cancer patients' data to illustrate the NG-W model. We fit the proposed model to these data and compare it with the Weibull, Kumaraswamy–Weibull (Ku-W), and exponentiated Weibull (EW) models.

For the stomach cancer data, the MLEs with standard errors of the competing models are provided in

Dist. | |||||
---|---|---|---|---|---|
NG-W | 0.699 | 2.219 | 2.163 | ||
Weibull | 1.085 (0.086) | 1.028 (0.117) | |||
EW | 0.826 (0.768) | 1.556 (0.445) | 1.757 (0.184) | ||
Ku-W | 0.562 (0.577) | 0.764 (0.735) | 2.624 (3.186) | 5.051 (15.86) |

Dist. | AIC | BIC | AD | KS | P-value |
---|---|---|---|---|---|
NG-W | 160.751 | 168.043 | 0.794 | 0.089 | 0.517 |
Weibull | 163.020 | 174.882 | 0.898 | 0.093 | 0.455 |
EW | 162.909 | 173.401 | 0.834 | 0.093 | 0.468 |
Ku-W | 162.761 | 172.485 | 0.803 | 0.091 | 0.485 |
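Measures of this kind can be computed from a fitted model as sketched below. The formulas are the standard definitions, AIC = 2k - 2ℓ and BIC = k ln(n) - 2ℓ, with k the number of parameters, n the sample size, and ℓ the maximized log-likelihood, together with the one-sample KS distance between the empirical and fitted distribution functions. They are assumed here because the paper's own expressions are not reproduced in this text; the function names are hypothetical, and the AD statistic is omitted for brevity.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

def ks_statistic(sample, cdf):
    """Largest distance between the empirical cdf and the fitted cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # the empirical cdf jumps from (i - 1)/n to i/n at x
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Toy check against a uniform(0, 1) cdf; the inputs are illustrative only.
print(aic(-100.0, 3), bic(-100.0, 3, 100))
print(ks_statistic([0.75, 0.25, 0.5], lambda x: x))
```

Smaller values of AIC, BIC, and the KS distance, and a larger p-value, indicate a better fit.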

The plot of the distribution function of the NG-W distribution is displayed in

Bayesian inference procedures have been considered by many statistical researchers, especially in the fields of survival analysis and reliability engineering. In this section, a complete data sample is analyzed from the Bayesian point of view. We assume that the parameters α, γ and θ of the NG-W distribution have independent prior distributions as

where

In Bayesian estimation, an estimator is judged by the loss incurred when it is used in place of the actual value of the parameter. This loss is measured by a function of the parameter and the corresponding estimator. Five well-known loss functions, with the associated Bayes estimators and posterior risks, are presented in

Loss function | Bayes estimator | Posterior risk |
---|---|---|

Next, we provide the posterior probability distribution for a complete data set. We define the function

The joint posterior distribution in terms of a given likelihood function L(data) and joint prior distribution

Hence, we get the joint posterior density of parameters

where K is given as

It is clear from

Bayes | ||||||
---|---|---|---|---|---|---|
Loss functions | Estimate | Risk | Estimate | Risk | Estimate | Risk |
SELF | 0.7546 | 0.0027 | 2.1639 | 0.0769 | 2.0502 | 0.0288 |
WSELF | 0.7510 | 0.0036 | 2.1284 | 0.0355 | 2.0361 | 0.0141 |
MSELF | 0.7475 | 0.0047 | 2.0929 | 0.0167 | 2.0220 | 0.0069 |
PLF | 0.7564 | 0.0035 | 2.1816 | 0.0167 | 2.0572 | 0.0140 |
KLF | 0.7528 | 0.0047 | 2.1461 | 0.0166 | 2.0432 | 0.0069 |
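Estimates and risks of this kind can be approximated from posterior draws of a parameter. The sketch below assumes the usual textbook forms of the five loss functions (squared error, weighted squared error, modified squared error, precautionary, and K-loss); the paper's own table of loss functions did not survive in this text, so these forms are an assumption, although they appear consistent with the tabulated values (for instance, each WSELF risk equals the SELF estimate minus the WSELF estimate). The function name and the input draws are hypothetical.

```python
import math

def bayes_estimates(draws):
    """Return {loss: (estimate, posterior risk)} from positive posterior draws."""
    n = len(draws)
    m1 = sum(draws) / n                          # E[t]
    m2 = sum(d * d for d in draws) / n           # E[t^2]
    i1 = sum(1.0 / d for d in draws) / n         # E[1/t]
    i2 = sum(1.0 / (d * d) for d in draws) / n   # E[1/t^2]
    return {
        "SELF": (m1, m2 - m1 ** 2),
        "WSELF": (1.0 / i1, m1 - 1.0 / i1),
        "MSELF": (i1 / i2, 1.0 - i1 ** 2 / i2),
        "PLF": (math.sqrt(m2), 2.0 * (math.sqrt(m2) - m1)),
        "KLF": (math.sqrt(m1 / i1), 2.0 * (math.sqrt(m1 * i1) - 1.0)),
    }

# Illustrative draws only; in practice these would come from an MCMC sampler.
for loss, (est, risk) in bayes_estimates([1.0, 2.0, 4.0]).items():
    print(loss, round(est, 4), round(risk, 4))
```

Under these forms the estimates are always ordered MSELF ≤ WSELF ≤ KLF ≤ SELF ≤ PLF, which is the ordering seen in each column of the table.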

Parameters | Credible interval | HPD interval |
---|---|---|
α | (0.720, 0.789) | (0.659, 0.860) |
γ | (1.973, 2.335) | (1.614, 2.699) |
θ | (1.935, 2.160) | (1.707, 2.364) |
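Both kinds of interval can be approximated from posterior draws. The sketch below is a generic illustration, not the authors' computation: the equal-tailed credible interval takes the lower and upper sample quantiles, while the highest posterior density (HPD) interval is taken as the shortest window of ordered draws that contains the requested posterior mass, which is appropriate for a unimodal posterior. The function names, the 95% level, and the toy draws are assumptions.

```python
import math

def equal_tail_interval(draws, mass=0.95):
    """Equal-tailed credible interval from sample quantiles (linear interpolation)."""
    xs = sorted(draws)

    def quantile(p):
        pos = p * (len(xs) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1.0 - mass) / 2.0
    return quantile(tail), quantile(1.0 - tail)

def hpd_interval(draws, mass=0.95):
    """Shortest interval of ordered draws containing at least `mass` of them."""
    xs = sorted(draws)
    n = len(xs)
    m = max(1, math.ceil(mass * n))  # number of draws inside the interval
    best = min(range(n - m + 1), key=lambda i: xs[i + m - 1] - xs[i])
    return xs[best], xs[best + m - 1]

# Right-skewed toy draws; real draws would come from the posterior sampler.
draws = [(i / 100.0) ** 3 for i in range(1, 1001)]
print(equal_tail_interval(draws))
print(hpd_interval(draws))
```

For a skewed posterior the HPD interval computed this way is shorter than the equal-tailed interval with the same coverage, since it shifts towards the region of highest density.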

In this article, we have introduced a new extension of the Weibull distribution, called a new generalized Weibull distribution. The classical two-parameter Weibull model produced simple monotone hazard shapes, as expected, and did not reflect the unimodal hazard pattern that is very important in biomedical research. The new extension of the Weibull model, on the other hand, is capable of capturing the unimodal hazard pattern. The proposed model, along with the two-parameter Weibull, three-parameter exponentiated Weibull, and four-parameter Kumaraswamy Weibull models, was applied to the remission times of stomach cancer patients. We observe that the NG-W model compares favorably with the competing models in terms of the model adequacy measures, suggesting that it is a good candidate for modeling the stomach cancer data.