Characterizing Log-Logistic (LL) Distributions through Methods of Percentiles and L-Moments

The main purpose of this paper is to characterize the log-logistic (LL) distributions through the methods of percentiles and L-moments and to contrast these with the method of (product) moments. The method of (product) moments (MoM) has certain limitations when compared with the method of percentiles (MoP) and the method of L-moments (MoLM) in the context of fitting empirical and theoretical distributions and estimating parameters, especially when distributions with greater departures from normality are involved. Systems of equations based on MoP and MoLM are derived. A methodology to simulate univariate LL distributions based on each of the two methods (MoP and MoLM) is developed and contrasted with MoM in terms of fitting distributions and estimating parameters. Monte Carlo simulation results indicate that the MoP- and MoLM-based LL distributions are superior to their MoM-based counterparts in the context of fitting distributions and estimating parameters.

Mathematics Subject Classification: 62G30, 62H12, 62H20, 65C05, 65C10, 65C60, 78M05


Introduction
The two-parameter log-logistic (LL) distribution considered herein was derived by Tadikamalla and Johnson [1] by transforming Johnson's [2] $S_L$ system through a logistic variable. The LL distribution is a continuous distribution with probability density function (pdf) and cumulative distribution function (cdf) expressed, respectively, as:

$$f(x) = \frac{\delta e^{-\gamma} x^{-(\delta+1)}}{\left(1 + e^{-\gamma} x^{-\delta}\right)^{2}} \qquad (1)$$

$$F(x) = \frac{1}{1 + e^{-\gamma} x^{-\delta}} \qquad (2)$$

where x ≥ 0 and δ > 0. The pdf in (1) has a single mode, which is at x = 0 for 0 < δ ≤ 1 and at $x = e^{-\gamma/\delta}\left((\delta - 1)/(\delta + 1)\right)^{1/\delta}$ for δ > 1. When 0 < δ ≤ 1, the pdf in (1) has a reverse-J shape. For the pdf in (1), the rth moment exists only if δ > r.
Variants of the log-logistic distribution have been applied in a variety of research contexts such as hydrology [3], estimation of the scale parameter [4], MCMC simulation for survival analysis [5], and Bayesian analysis [6]. The quantile function of the LL distribution with cdf in (2) is given as:

$$q(u) = e^{-\gamma/\delta} \left(\frac{u}{1 - u}\right)^{1/\delta} \qquad (3)$$

where u ∼ uniform(0, 1) is substituted for the cdf in (2). The method of (product) moments (MoM)-based procedure used in fitting theoretical and empirical distributions involves matching MoM-based indices (e.g., skew and kurtosis) computed from empirical and theoretical distributions [7]. In the context of LL distributions, the MoM-based procedure has certain limitations. One limitation is that the parameters of skew and kurtosis are defined for LL distributions only if δ > 3 and δ > 4, respectively. This implies that the MoM-based procedure involving skew and kurtosis cannot be applied to LL distributions with δ ≤ 3.
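The role of the quantile function in simulation can be illustrated with a minimal Python sketch. It assumes the parameterization F(x) = 1 / (1 + e^{-γ} x^{-δ}), which is consistent with the mode expression given above; the function names (`ll_pdf`, `ll_cdf`, `ll_quantile`, `ll_sample`) are hypothetical and not taken from the paper.

```python
import math
import random

def ll_pdf(x, gamma, delta):
    """pdf of the LL distribution under the assumed parameterization."""
    if x <= 0:
        return 0.0
    t = math.exp(-gamma) * x ** (-delta)
    return delta * t / (x * (1.0 + t) ** 2)

def ll_cdf(x, gamma, delta):
    """cdf F(x) = 1 / (1 + exp(-gamma) * x**(-delta)) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-gamma) * x ** (-delta))

def ll_quantile(u, gamma, delta):
    """Inverse of ll_cdf for 0 <= u < 1."""
    return math.exp(-gamma / delta) * (u / (1.0 - u)) ** (1.0 / delta)

def ll_sample(n, gamma, delta, rng):
    """Inverse-transform sampling: uniform(0, 1) draws pushed through q(u)."""
    return [ll_quantile(rng.random(), gamma, delta) for _ in range(n)]

# Illustrative (hypothetical) parameter values.
gamma, delta = 0.5, 4.0
draws = ll_sample(1000, gamma, delta, random.Random(0))
mode = math.exp(-gamma / delta) * ((delta - 1) / (delta + 1)) ** (1 / delta)
```

Because the median of the LL distribution is q(0.5) = e^{-γ/δ}, the sample median of `draws` should fall near that value for moderately large samples.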
Another limitation associated with the MoM-based application of LL distributions is that the estimators of skew ($\hat{\alpha}_3$) and kurtosis ($\hat{\alpha}_4$) computed from sample data are algebraically bounded by the sample size (n) as $|\hat{\alpha}_3| \le \sqrt{n}$ and $\hat{\alpha}_4 \le n$ [8]. This implies that when simulating LL distributions with kurtosis $\alpha_4 = 48.6541$ (as given in Figure 3C in Section 3.2) from samples of size n = 25, the largest possible value of the sample estimator $\hat{\alpha}_4$ is only 25, which is 51.38% of the parameter value.
To obviate these limitations, this study proposes to characterize the LL distributions through the methods of percentiles and L-moments. The method of percentiles (MoP) introduced by Karian and Dudewicz [14] and the method of L-moments (MoLM) introduced by Hosking [9] are attractive alternatives to the traditional method of (product) moments (MoM) in the context of fitting theoretical and empirical distributions and in estimating parameters. In particular, the advantages of the MoP-based procedure over the MoM-based procedure are that (a) the MoP-based procedure can estimate parameters and obtain fits even when the MoM-based parameters do not exist, (b) the MoP-based estimators have relatively smaller variability than those obtained using the MoM-based procedure, and (c) solving the MoP-based system of equations is far more efficient than solving the MoM-based system of equations [14][15][16][17]. Likewise, some of the advantages that MoLM-based estimators of L-skew and L-kurtosis have over MoM-based estimators of skew and kurtosis are that they (a) exist whenever the mean of the distribution exists, (b) are nearly unbiased for all sample sizes and distributions, and (c) are more robust in the presence of outliers [8][9][10][11][12][13][18][19][20][21][22].
The rest of the paper is organized as follows. In Section 2, definitions of the method of percentiles (MoP) and the method of L-moments (MoLM) are provided, and the systems of equations associated with the MoP- and MoLM-based procedures are derived. Section 2 also provides the boundary graphs associated with these procedures and the steps for implementing the MoP-, MoLM-, and MoM-based procedures for fitting LL distributions to empirical and theoretical distributions. In Section 3, the MoP-, MoLM-, and MoM-based procedures are compared in the context of fitting LL distributions to empirical and theoretical distributions and in the context of estimating parameters using a Monte Carlo simulation example. In Section 4, the results are discussed and concluding remarks are provided.

Method of Percentiles
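Let X be a continuous random variable with quantile function q(u) as in (3). The method of percentiles (MoP)-based analogs of location, scale, skew function, and kurtosis function associated with X are, respectively, the median ($\rho_1$), inter-decile range ($\rho_2$), left-right tail-weight ratio ($\rho_3$, a skew function), and tail-weight factor ($\rho_4$, a kurtosis function), given as [14, pp. 154-155]:

$$\rho_1 = q(u)_{u=0.50} \qquad (4)$$

$$\rho_2 = q(u)_{u=0.90} - q(u)_{u=0.10} \qquad (5)$$

$$\rho_3 = \frac{q(u)_{u=0.50} - q(u)_{u=0.10}}{q(u)_{u=0.90} - q(u)_{u=0.50}} \qquad (6)$$

$$\rho_4 = \frac{q(u)_{u=0.75} - q(u)_{u=0.25}}{q(u)_{u=0.90} - q(u)_{u=0.10}} \qquad (7)$$

where $q(u)_{u=p}$ in (4)-(7) is the (100 × p)th percentile with p ∈ (0, 1). Substituting the appropriate values of u into the quantile (percentile) function q(u) in (3) and simplifying (4)-(7) yields the following MoP-based system of equations associated with LL distributions:

$$\rho_1 = e^{-\gamma/\delta} \qquad (8)$$

$$\rho_2 = e^{-\gamma/\delta}\left(9^{1/\delta} - 9^{-1/\delta}\right) \qquad (9)$$

$$\rho_3 = 9^{-1/\delta} \qquad (10)$$

$$\rho_4 = \frac{3^{1/\delta}}{1 + 9^{1/\delta}} \qquad (11)$$

The parameters of median ($\rho_1$), inter-decile range ($\rho_2$), left-right tail-weight ratio ($\rho_3$), and tail-weight factor ($\rho_4$) for the LL distribution are bounded as:

$$0 < \rho_1 < \infty, \quad 0 < \rho_2 < \infty, \quad 0 < \rho_3 < 1, \quad 0 < \rho_4 < 1/2 \qquad (12)$$

where $\rho_3 = 1$ and $\rho_4 = 1/2$ are the limiting values when δ → ∞.

For a sample $(X_1, X_2, \cdots, X_n)$ of size n, let $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$ denote the order statistics. Let $q(u)_{u=p}$ be the (100 × p)th percentile from this sample, where p ∈ (0, 1). Then, $q(u)_{u=p}$ can be expressed as [14, p. 154]:

$$q(u)_{u=p} = Y_{(i)} + (a/b)\left(Y_{(i+1)} - Y_{(i)}\right) \qquad (13)$$

where i is a positive integer and a/b is a proper fraction such that (n + 1)p = i + (a/b). For a sample of data with size n, the MoP-based estimators $\hat{\rho}_1$-$\hat{\rho}_4$ of $\rho_1$-$\rho_4$ can be obtained in two steps: (a) use (13) to compute the values of the 10th, 25th, 50th, 75th, and 90th percentiles and (b) substitute these percentiles into (4)-(7) to obtain the sample estimators $\hat{\rho}_1$-$\hat{\rho}_4$ of $\rho_1$-$\rho_4$. See Section 3 for an example that demonstrates this methodology. Figure 1 (panel A) displays the region of possible combinations of $\rho_3$ and $\rho_4$ for the MoP-based LL distributions.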
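The two-step estimation procedure can be sketched in Python as follows. The percentile rule follows the interpolation described for (13); the closed-form inversion for γ and δ assumes the LL quantile function q(u) = e^{-γ/δ} (u / (1 - u))^{1/δ}, under which the tail-weight ratio reduces to 9^{-1/δ}. All function names are hypothetical.

```python
import math

def percentile(sorted_xs, p):
    """(100*p)th percentile Y_(i) + (a/b) * (Y_(i+1) - Y_(i)), where (n + 1) * p = i + a/b."""
    n = len(sorted_xs)
    pos = (n + 1) * p
    i = int(math.floor(pos))
    frac = pos - i
    if i < 1 or i > n or (i == n and frac > 0):
        raise ValueError("sample too small for this percentile")
    lower = sorted_xs[i - 1]          # Y_(i), using 1-based order statistics
    if frac == 0:
        return lower
    return lower + frac * (sorted_xs[i] - lower)

def mop_estimates(xs):
    """Step (a): five percentiles; step (b): plug them into the rho definitions."""
    ys = sorted(xs)
    p10, p25, p50, p75, p90 = (percentile(ys, p) for p in (0.10, 0.25, 0.50, 0.75, 0.90))
    rho1 = p50                            # median
    rho2 = p90 - p10                      # inter-decile range
    rho3 = (p50 - p10) / (p90 - p50)      # left-right tail-weight ratio
    rho4 = (p75 - p25) / rho2             # tail-weight factor
    return rho1, rho2, rho3, rho4

def ll_params_from_mop(rho2, rho3):
    """Solve for (gamma, delta), assuming rho3 = 9**(-1/delta) and
    rho2 = exp(-gamma/delta) * (9**(1/delta) - 9**(-1/delta))."""
    if not 0.0 < rho3 < 1.0:
        raise ValueError("rho3 must lie in (0, 1) for an LL distribution")
    delta = -math.log(9.0) / math.log(rho3)
    a = 9.0 ** (1.0 / delta)
    gamma = -delta * math.log(rho2 / (a - 1.0 / a))
    return gamma, delta
```

Under this assumed form, no iterative solver is needed: δ follows directly from the tail-weight ratio and γ from the inter-decile range.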

Fitting Empirical Distributions
Figure 2 and Table 1 provide an example that demonstrates the advantages of the MoP-based fit of LL distributions over the MoLM- and MoM-based fits in the context of fitting empirical distributions (i.e., real-world data). Specifically, Fig. 2 displays the MoP-, MoLM-, and MoM-based pdfs of LL distributions superimposed on the histogram of total hospital charges (in US dollars) for 12,145 heart attack patients discharged from all hospitals in the state of New York in 1993. These data were also used in [17] and can be accessed from the website http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_AMI_NY_1993_HeartAttacks. The estimates ($\hat{\rho}_1$-$\hat{\rho}_4$) of median, inter-decile range, left-right tail-weight ratio, and tail-weight factor ($\rho_1$-$\rho_4$) were computed from the total hospital charges data in two steps: (a) obtain the values of the 10th, 25th, 50th, 75th, and 90th percentiles using (13) and (b) substitute these percentiles into (4)-(7) to compute the estimates $\hat{\rho}_1$-$\hat{\rho}_4$. The parameter values of γ and δ associated with the MoP-based LL distribution were determined by solving (9) and (10) after substituting the estimates $\hat{\rho}_2$ and $\hat{\rho}_3$ into the right-hand sides of (9) and (10). The solved values of γ and δ can be used in (8) and (11), respectively, to compute the parameter values of median ($\rho_1$) and tail-weight factor ($\rho_4$). The MoP-based fit was obtained by using a linear transformation of the form $x = q(u) + (\hat{\rho}_1 - \rho_1)$.

Fitting Theoretical Distributions
Figure 3 provides an example that demonstrates the advantages of the MoP-based fit of LL distributions over the MoLM- and MoM-based fits in the context of fitting a Dagum distribution with shape parameters p = 2 and a = 5 and scale parameter b = 4. See [12] for a comparison of MoLM- and MoM-based fits of Dagum distributions.
The values of $\rho_1$-$\rho_4$ associated with the Dagum distribution were computed using (4)-(7), where the quantile function q(u) of the Dagum distribution was used. The parameter values of γ and δ associated with the MoP-based LL distribution were determined by solving (9) and (10) after substituting the values of $\rho_2$ and $\rho_3$ of the Dagum distribution into the right-hand sides of (9) and (10). These values of γ and δ can be used in (8) and (11), respectively, to compute the parameter values of $\rho_1$ and $\rho_4$ associated with the LL distribution. The MoP-based fit was obtained by using a linear transformation $x = q(u) + (\hat{\rho}_1 - \rho_1)$, where $\hat{\rho}_1$ is the median of the Dagum distribution.
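The MoP-based fitting steps for the Dagum example can be sketched in Python. The sketch assumes the common Dagum cdf F(x) = (1 + (x/b)^{-a})^{-p}, which may not match the paper's exact parameterization, and the LL quantile function q(u) = e^{-γ/δ} (u / (1 - u))^{1/δ}; all names are hypothetical.

```python
import math

def dagum_quantile(u, p=2.0, a=5.0, b=4.0):
    """Quantile of the assumed Dagum cdf F(x) = (1 + (x/b)**(-a))**(-p)."""
    return b * (u ** (-1.0 / p) - 1.0) ** (-1.0 / a)

def ll_quantile(u, gamma, delta):
    """Assumed LL quantile function."""
    return math.exp(-gamma / delta) * (u / (1.0 - u)) ** (1.0 / delta)

def mop_parameters(q):
    """rho1..rho4 computed from any quantile function q(u)."""
    p10, p25, p50, p75, p90 = (q(u) for u in (0.10, 0.25, 0.50, 0.75, 0.90))
    rho2 = p90 - p10
    return p50, rho2, (p50 - p10) / (p90 - p50), (p75 - p25) / rho2

# Target values from the Dagum distribution (p = 2, a = 5, b = 4).
d_rho1, d_rho2, d_rho3, d_rho4 = mop_parameters(dagum_quantile)

# Match rho2 and rho3, assuming rho3 = 9**(-1/delta) for the LL distribution.
delta = -math.log(9.0) / math.log(d_rho3)
spread = 9.0 ** (1.0 / delta) - 9.0 ** (-1.0 / delta)
gamma = -delta * math.log(d_rho2 / spread)

# Shift the LL quantile function so that its median equals the Dagum median.
ll_rho1 = math.exp(-gamma / delta)
shift = d_rho1 - ll_rho1

def fitted_quantile(u):
    return ll_quantile(u, gamma, delta) + shift

f_rho1, f_rho2, f_rho3, f_rho4 = mop_parameters(fitted_quantile)
```

The shift leaves the inter-decile range and the tail-weight ratio unchanged, so the fitted distribution matches the target on the median, $\rho_2$, and $\rho_3$; only $\rho_4$ is left free, which is why it serves as a check on the quality of the fit.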
The values of $\lambda_1$, $\lambda_2$, $\tau_3$, and $\tau_4$ associated with the Dagum distribution were computed using (18) and (14)-(17) together with the formulae for $\tau_3$ and $\tau_4$ from Section 2.2. The parameter values of γ and δ associated with the MoLM-based LL distribution were determined by solving (24) and (25) after substituting the values of $\lambda_2$ and $\tau_3$ of the Dagum distribution into the right-hand sides of (24) and (25). These values of γ and δ can be used in (23) and (26), respectively, to compute the parameter values of $\lambda_1$ and $\tau_4$ associated with the LL distribution. The MoLM-based fit was obtained by using a linear transformation $x = q(u) + (\hat{\lambda}_1 - \lambda_1)$, where $\hat{\lambda}_1$ is the L-mean of the Dagum distribution.
The values of µ, σ, α 3 and α 4 associated with the Dagum distribution were computed using (28), formulae of mean and standard deviation and (30)

Discussion and Conclusion
One of the advantages of the MoP- and MoLM-based procedures over the traditional MoM-based procedure is that the distributions characterized through the former procedures can provide better fits to real-world data and some theoretical distributions [8][9][10][11][12][13][14][15][16][17]. In the case of LL distributions, inspection of Figures 2 and 3 indicates that the MoP- and MoLM-based procedures provide better fits than the MoM-based procedure when fitting both real-world data and theoretical distributions. Furthermore, the Euclidean distances associated with the MoP- and MoLM-based fits in Tables 1 and 2 are substantially smaller than those associated with the MoM-based fits. For example, inspection of Table 1 indicates that d = 0.0151 associated with the MoP-based fit of the LL distribution is approximately one-fifth of d = 0.0735 associated with the MoM-based fit of the LL distribution to the total hospital charges data in Fig. 2. Similarly, d = 0.0239 associated with the MoLM-based fit is approximately one-third of d = 0.0735 associated with the MoM-based fit.
The MoP-based estimators can be far less biased and less dispersed than the MoM-based estimators when distributions with larger departures from normality are involved [14][15][16][17]. The MoLM-based estimators can also be far less biased and less dispersed than the MoM-based estimators when sampling is from distributions with more severe departures from normality [8][9][10][11][12][13][18][19][20][21][22]. Inspection of the simulation results in Tables 3-5 indicates that, in the context of LL distributions, the MoP- and MoLM-based estimators of third- and fourth-order parameters are superior to their MoM-based counterparts. That is, the MoP-based estimators of left-right tail-weight ratio ($\rho_3$) and tail-weight factor ($\rho_4$) and the MoLM-based estimators of L-skew ($\tau_3$) and L-kurtosis ($\tau_4$) clearly outperform the corresponding MoM-based estimators of skew ($\alpha_3$) and kurtosis ($\alpha_4$). For example, with samples of size n = 25, the estimates of $\alpha_3$ and $\alpha_4$ for the LL distribution in Fig. 3C were, on average, only 36.63% and 3.66% of their respective parameters, whereas the estimates of $\rho_3$ and $\rho_4$ for the LL distribution in Fig. 3A were, on average, 105.07% and 90.60% of their respective parameters, and the estimates of L-skew and L-kurtosis for the LL distribution in Fig. 3B were, on average, 91% and 93.73% of their respective parameters. From inspection of Tables 3-5, it is also evident that the MoP-based estimators of $\rho_3$ and $\rho_4$ and the MoLM-based estimators of $\tau_3$ and $\tau_4$ are more efficient, as their relative standard errors, RSE = {(SE/Estimate) × 100}, are considerably smaller than those associated with the MoM-based estimators of $\alpha_3$ and $\alpha_4$. For example, inspection of Tables 3-5 for n = 500 indicates RSE($\hat{\rho}_3$) = 0.07% and RSE($\hat{\rho}_4$) = 0.04% for the LL distribution in Fig. 3A, compared with RSE($\hat{\tau}_3$) = 0.09% and RSE($\hat{\tau}_4$) = 0.08% for the LL distribution in Fig. 3B and RSE($\hat{\alpha}_3$) = 0.36% and RSE($\hat{\alpha}_4$) = 1.03% for the LL distribution in Fig. 3C. Thus, the MoP-based estimators of $\rho_3$ and $\rho_4$ have about the same degree of precision as the MoLM-based estimators of $\tau_3$ and $\tau_4$, whereas both the MoP- and MoLM-based estimators have substantially higher precision than the MoM-based estimators of $\alpha_3$ and $\alpha_4$.
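The RSE measure can be illustrated with a small Monte Carlo sketch in Python. The parameter values (γ = 0, δ = 5), the number of replications, and the choice of SE as the standard deviation of the replicate estimates divided by the square root of the number of replications are all assumptions made for illustration; they are not the settings behind Tables 3-5.

```python
import math
import random
import statistics

def ll_quantile(u, gamma, delta):
    """Assumed LL quantile function q(u) = exp(-gamma/delta) * (u/(1-u))**(1/delta)."""
    return math.exp(-gamma / delta) * (u / (1.0 - u)) ** (1.0 / delta)

def percentile(ys, p):
    """Interpolated percentile with (n + 1) * p = i + a/b; assumes n is large enough."""
    pos = (len(ys) + 1) * p
    i = int(pos)
    frac = pos - i
    lower = ys[i - 1]
    return lower if frac == 0 else lower + frac * (ys[i] - lower)

def rho3_rho4(xs):
    """Sample tail-weight ratio and tail-weight factor."""
    ys = sorted(xs)
    p10, p25, p50, p75, p90 = (percentile(ys, p) for p in (0.10, 0.25, 0.50, 0.75, 0.90))
    return (p50 - p10) / (p90 - p50), (p75 - p25) / (p90 - p10)

def rse(estimate, se):
    """Relative standard error: (SE / Estimate) * 100."""
    return se / estimate * 100.0

def simulate(n, reps, gamma, delta, seed):
    """Return {name: (mean estimate, SE, RSE)} over `reps` samples of size n."""
    rng = random.Random(seed)
    draws = {"rho3": [], "rho4": []}
    for _ in range(reps):
        xs = [ll_quantile(rng.random(), gamma, delta) for _ in range(n)]
        r3, r4 = rho3_rho4(xs)
        draws["rho3"].append(r3)
        draws["rho4"].append(r4)
    out = {}
    for name, values in draws.items():
        est = statistics.mean(values)
        se = statistics.stdev(values) / math.sqrt(reps)  # assumed SE definition
        out[name] = (est, se, rse(est, se))
    return out

results = simulate(n=500, reps=400, gamma=0.0, delta=5.0, seed=1)
```

Replacing `rho3_rho4` with sample skew and kurtosis in the same loop would give the corresponding MoM-based RSE values for comparison.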
In the context of LL distributions, the MoM-based procedure involves solving (33) and (34) for the parameters γ and δ after given values (or estimates) of standard deviation (σ) and skew ($\alpha_3$) are substituted into the right-hand sides of (33) and (34). The solved values of γ and δ can be substituted into (32) and (35), respectively, to compute the values of mean and kurtosis.
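The MoM-based solve can be sketched in Python as follows. The sketch does not use the paper's equations (33) and (34); instead it assumes the standard log-logistic raw moments E[X^r] = e^{-rγ/δ} (rπ/δ) / sin(rπ/δ), valid for δ > r, and a bisection search that relies on skew decreasing in δ over the chosen bracket. Function names and the bracket limits are hypothetical.

```python
import math

def unit_raw_moment(r, delta):
    """E[X**r] for gamma = 0; defined only when delta > r."""
    if delta <= r:
        raise ValueError("the rth moment exists only if delta > r")
    t = r * math.pi / delta
    return t / math.sin(t)

def unit_sd_and_skew(delta):
    """Standard deviation and skew for gamma = 0; requires delta > 3."""
    m1, m2, m3 = (unit_raw_moment(r, delta) for r in (1, 2, 3))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3   # third central moment
    return math.sqrt(var), mu3 / var ** 1.5

def solve_mom(sd, skew, lo=3.0 + 1e-9, hi=200.0):
    """Find (gamma, delta) matching a given standard deviation and skew."""
    if not unit_sd_and_skew(hi)[1] < skew < unit_sd_and_skew(lo)[1]:
        raise ValueError("skew is outside the range reachable on this bracket")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if unit_sd_and_skew(mid)[1] > skew:
            lo = mid      # skew still too large, so delta must be larger
        else:
            hi = mid
    delta = 0.5 * (lo + hi)
    # Skew does not depend on gamma; sd scales by exp(-gamma/delta).
    gamma = -delta * math.log(sd / unit_sd_and_skew(delta)[0])
    return gamma, delta
```

The guard in `unit_raw_moment` makes the restriction explicit: a target skew can only be matched when δ > 3, which is the limitation of the MoM-based procedure noted in the Introduction.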