Showing 1 to 10 of 201 matching Articles
By
Seidel, Wilfried; Mosler, Karl; Alker, Manfred
39 Citations
We show that iterative methods for maximizing the likelihood in a mixture of exponentials model depend strongly on their particular implementation. Different starting strategies and stopping rules yield completely different estimators of the parameters. This is demonstrated for the likelihood ratio test of homogeneity against two-component exponential mixtures, when the test statistic is calculated by the EM algorithm.
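The dependence on starting values and stopping rules described in this abstract can be tried out with a minimal sketch of EM for a two-component exponential mixture. The function name `em_exp_mixture`, the simulated data, the starting values, and the tolerances are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def em_exp_mixture(x, p, rate1, rate2, tol=1e-8, max_iter=1000):
    """EM for the density p*Exp(rate1) + (1-p)*Exp(rate2).

    Returns the final (p, rate1, rate2) and the log-likelihood trace.
    Each trace entry is evaluated at the parameters held before that
    iteration's M-step.
    """
    x = np.asarray(x, dtype=float)
    loglik = []
    for _ in range(max_iter):
        # E-step: posterior probability that each point came from component 1
        d1 = p * rate1 * np.exp(-rate1 * x)
        d2 = (1 - p) * rate2 * np.exp(-rate2 * x)
        total = d1 + d2
        loglik.append(np.log(total).sum())
        r = d1 / total
        # M-step: weighted maximum-likelihood updates
        p = r.mean()
        rate1 = r.sum() / (r * x).sum()
        rate2 = (1 - r).sum() / ((1 - r) * x).sum()
        # Stopping rule (an assumption): small gain in log-likelihood
        if len(loglik) > 1 and loglik[-1] - loglik[-2] < tol:
            break
    return p, rate1, rate2, loglik

# Hypothetical data: 300 draws with mean 1 and 200 draws with mean 5
rng = np.random.default_rng(0)
x = np.concatenate([rng.exponential(1.0, 300), rng.exponential(5.0, 200)])

# Same data, two different starting strategies and stopping tolerances
fit_a = em_exp_mixture(x, p=0.5, rate1=2.0, rate2=0.1, tol=1e-8)
fit_b = em_exp_mixture(x, p=0.9, rate1=1.0, rate2=0.9, tol=1e-3)
print(fit_a[:3])
print(fit_b[:3])
```

Comparing the two printed parameter triples gives a feel for how far the estimates can move when only the start and the stopping rule change.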
By
Lee, Sharon X.; McLachlan, Geoffrey J.
36 Citations
Non-normal mixture distributions have received increasing attention in recent years. Finite mixtures of multivariate skew-symmetric distributions, in particular, the skew normal and skew $t$ mixture models, are emerging as promising extensions to the traditional normal and $t$ mixture models. Most of these parametric families of skew distributions are closely related, and can be classified into four forms under a recently proposed scheme, namely, the restricted, unrestricted, extended, and generalised forms. In this paper, we consider some of these existing proposals of multivariate non-normal mixture models and illustrate their practical use in several real applications. We first discuss the characterizations along with a brief account of some distributions belonging to the above classification scheme, then references for software implementation of EM-type algorithms for the estimation of the model parameters are given. We then compare the relative performance of restricted and unrestricted skew mixture models in clustering, discriminant analysis, and density estimation on six real datasets from flow cytometry, finance, and image analysis. We also compare the performance of mixtures of skew normal and $t$ component distributions with other non-normal component distributions, including mixtures with multivariate normal-inverse-Gaussian distributions, shifted asymmetric Laplace distributions and generalized hyperbolic distributions.
By
Balakrishnan, N.; Koutras, M. V.; Milienos, F. S.; Pal, S.
6 Citations
Cure rate models offer a convenient way to model time-to-event data by allowing a proportion of individuals in the population to be completely cured so that they never face the event of interest (say, death). The most studied cure rate models can be defined through a competing cause scenario in which the random variables corresponding to the time-to-event for each competing cause are conditionally independent and identically distributed, while the actual number of competing causes is a latent discrete random variable. The main interest is then in the estimation of the cured proportion as well as in developing inference about failure times of the susceptibles. The existing literature consists of parametric and non/semiparametric approaches, while the expectation maximization (EM) algorithm offers an efficient tool for the estimation of the model parameters due to the presence of right censoring in the data. In this paper, we study the cases wherein the number of competing causes is either a binary or Poisson random variable and a piecewise linear function is used for modeling the hazard function of the time-to-event. Exact likelihood inference is then developed based on the EM algorithm and the inverse of the observed information matrix is used for developing asymptotic confidence intervals. The Monte Carlo simulation study demonstrates the accuracy of the proposed nonparametric approach compared to the results attained from the true parametric model. The proposed model and the inferential method are finally illustrated with a data set on cutaneous melanoma.
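As a worked illustration of the competing cause construction in this abstract, the standard population survival functions for the two cases can be written out. The symbols below are assumptions introduced here rather than the paper's notation: $M$ is the latent number of competing causes, $S(t)$ is the common survival function of each cause's time-to-event, and $p$ and $\theta$ are the binary success probability and the Poisson mean.

$$S_{\mathrm{pop}}(t) = P(M = 0) + \sum_{m \ge 1} P(M = m)\, S(t)^m$$

$$M \sim \mathrm{Bernoulli}(p): \quad S_{\mathrm{pop}}(t) = (1 - p) + p\, S(t), \qquad \text{cured proportion } 1 - p$$

$$M \sim \mathrm{Poisson}(\theta): \quad S_{\mathrm{pop}}(t) = \exp\{-\theta\,[1 - S(t)]\}, \qquad \text{cured proportion } e^{-\theta}$$

In both cases the cured proportion is the limit of $S_{\mathrm{pop}}(t)$ as $t \to \infty$, that is, $P(M = 0)$.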
By
Cappé, Olivier; Douc, Randal; Guillin, Arnaud; Marin, Jean-Michel; Robert, Christian P.
85 Citations
In this paper, we propose an adaptive algorithm that iteratively updates both the weights and component parameters of a mixture importance sampling density so as to optimise the performance of importance sampling, as measured by an entropy criterion. The method, called M-PMC, is shown to be applicable to a wide class of importance sampling densities, which includes in particular mixtures of multivariate Student t distributions. The performance of the proposed scheme is studied on both artificial and real examples, highlighting in particular the benefit of a novel Rao-Blackwellisation device which can be easily incorporated in the updating scheme.
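The idea of iteratively updating the weights of a mixture importance sampling density can be sketched in a much reduced form. The example below adapts only the mixture weights of two fixed Gaussian components, using normalised importance weights combined with component responsibilities; it does not update component parameters, does not use Student t components, and should not be read as the authors' algorithm. The target density, the component means and standard deviations, the sample size, and the number of iterations are all assumptions chosen for illustration.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def target_density(x):
    """Hypothetical target: a normal density with mean 3 and sd 1."""
    return normal_pdf(x, 3.0, 1.0)

# Fixed proposal components (assumed); only the weights alpha are adapted
means = np.array([-2.0, 3.0])
sds = np.array([1.0, 1.5])
alpha = np.array([0.5, 0.5])

rng = np.random.default_rng(1)
for _ in range(5):
    # Draw from the current mixture proposal
    comp = rng.choice(2, size=2000, p=alpha)
    x = rng.normal(means[comp], sds[comp])
    # Weighted component densities and the mixture density at each draw
    comp_dens = alpha * normal_pdf(x[:, None], means, sds)  # shape (2000, 2)
    q = comp_dens.sum(axis=1)
    # Normalised importance weights
    w = target_density(x) / q
    w_norm = w / w.sum()
    # Responsibility of each component for each draw
    rho = comp_dens / q[:, None]
    # New weights: importance-weighted average responsibility
    alpha = (w_norm[:, None] * rho).sum(axis=0)
    alpha = alpha / alpha.sum()  # guard against rounding drift

print(alpha)
```

In this toy setting the weight should shift towards the component that overlaps the target, which is the behaviour an adaptive weight update is meant to produce.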
By
Xiang, Sijia; Yao, Weixin
In this article, we propose and study a new class of semiparametric mixture of regression models, where the mixing proportions and variances are constants, but the component regression functions are smooth functions of a covariate. A one-step backfitting estimate and two EM-type algorithms have been proposed to achieve the optimal convergence rate for both the global parameters and the nonparametric regression functions. We derive the asymptotic property of the proposed estimates and show that both the proposed EM-type algorithms preserve the asymptotic ascent property. A generalized likelihood ratio test is proposed for semiparametric inferences. We prove that the test follows an asymptotic $\chi^2$ distribution under the null hypothesis, which is independent of the nuisance parameters. A simulation study and two real data examples have been conducted to demonstrate the finite sample performance of the proposed model.
By
Melnykov, Volodymyr; Zhu, Xuwen
Studying crime trends and tendencies is an important problem that helps to identify socioeconomic patterns and relationships of crucial significance. Finite mixture models are famous for their flexibility in modeling heterogeneity in data. A novel approach designed to account for skewness in the distributions of matrix observations is proposed and applied to United States crime data collected between 2000 and 2012. Then, the model is further extended by incorporating explanatory variables. A step-by-step model development demonstrates differences and improvements associated with every stage of the process. Results obtained by the final model are illustrated and thoroughly discussed. Multiple interesting conclusions have been drawn based on the developed model and the obtained model-based clustering partition.
By
Lee, Sharon X.; McLachlan, Geoffrey; Pyne, Saumyadipta
Mixture distributions are commonly applied for modelling and for discriminant and cluster analyses in a wide variety of situations. We first consider normal and t-mixture models. As they are highly parameterized, we review methods to enable them to be fitted to large datasets involving many observations and variables. Attention is then given to extensions of these mixture models to mixtures with skew normal and skew t-distributions for the segmentation of data into clusters of nonelliptical shape. The focus is then on the latter models in conjunction with the JCM (joint clustering and matching) procedure for an automated approach to the clustering of cells in a sample in flow cytometry where a large number of cells and their associated markers have been measured. For a class of multiple samples, we consider the use of JCM for matching the sample-specific clusters across the samples in the class and for improving the clustering of each individual sample. The supervised classification of a sample is also considered in the case where there are different classes of samples corresponding, for example, to different outcomes or treatment strategies for patients undergoing medical screening or treatment.
By
Adhya, Sumanta; Banerjee, Tathagata; Chattopadhyay, Gaurangadeb
Suppose that a finite population consists of $N$ distinct units. Associated with the $i$th unit is a polychotomous response vector, $d_i$, and a vector of auxiliary variables, $x_i$. The $x_i$ are known for the entire population, but the $d_i$ are known only for the units selected in the sample. The problem is to estimate the finite population proportion vector $P$. One of the fundamental questions in finite population sampling is how to make use of the complete auxiliary information effectively at the estimation stage. In this article a predictive estimator is proposed which incorporates the auxiliary information at the estimation stage by invoking a superpopulation model. However, the use of such estimators is often criticized since the working superpopulation model may not be correct. To protect the predictive estimator from possible model failure, a nonparametric regression model is considered in the superpopulation. The asymptotic properties of the proposed estimator are derived, and a bootstrap-based hybrid resampling method for estimating the variance of the proposed estimator is developed. Results of a simulation study are reported on the performance of the predictive estimator and its resampling-based variance estimator from the model-based viewpoint. Finally, data from a survey of the opinions of 686 individuals on the cause of addiction are used in an empirical study to investigate the performance of the nonparametric predictive estimator from the design-based viewpoint.
By
Saidane, Mohamed; Lavergne, Christian
The deficiencies of stationary models applied to financial time series are well documented. A special form of nonstationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We use dynamic switching (modelled by a hidden Markov model) combined with a linear conditionally heteroskedastic latent factor model in a hybrid mixed-state latent factor model (MSFM) and discuss the practical details of training such models with a new approximated version of the Viterbi algorithm in conjunction with the expectation-maximization (EM) algorithm to iteratively estimate the model parameters in a maximum-likelihood sense. The performance of the MSFM is evaluated on both simulated and financial data sets. On the basis of out-of-sample forecast encompassing tests as well as other measures of forecasting accuracy, our results indicate that the use of this new method yields overall better forecasts than those generated by competing models.
By
Koley, Tamalika; Dewanji, Anup
Reparametrization is often done to make a constrained optimization problem an unconstrained one. This paper focuses on the nonparametric maximum likelihood estimation of the subdistribution functions for current status data with competing risks. Our main aim is to propose a method using reparametrization, which is simpler and easier to handle than the constrained maximization methods discussed in Jewell and Kalbfleisch (Biostatistics. 5, 291–306, 2004) and Maathuis (2006), when both the monitoring times and the number of individuals observed at these times are fixed. Then the Expectation-Maximization (EM) algorithm is used for estimating the unknown parameters. We have also established some asymptotic results for these maximum likelihood estimators. Finite sample properties of these estimators are investigated through an extensive simulation study. Some generalizations have been discussed.
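The opening remark of this abstract, that reparametrization can turn a constrained problem into an unconstrained one, can be illustrated with a generic sketch. The example below is not the paper's reparametrization for subdistribution functions; it is the familiar softmax map, which lets a probability vector (nonnegative entries summing to one) be estimated with an unconstrained optimizer. The counts are made-up data, and the choice of BFGS is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def softmax(theta):
    """Map an unconstrained vector to a probability vector."""
    z = np.exp(theta - theta.max())  # subtract the max for numerical stability
    return z / z.sum()

# Hypothetical category counts (illustrative only)
counts = np.array([12.0, 30.0, 58.0])

def neg_loglik(theta):
    """Negative multinomial log-likelihood in the unconstrained parameter."""
    return -(counts * np.log(softmax(theta))).sum()

# No simplex constraints are passed to the optimizer
result = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
p_hat = softmax(result.x)
print(p_hat)
```

Because the multinomial maximum likelihood estimate is known in closed form (the observed proportions), the output can be checked directly against `counts / counts.sum()`.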
