By
Hornung, Roman
The ordinal forest method is a random forest–based prediction method for ordinal response variables. Ordinal forests allow prediction using both low-dimensional and high-dimensional covariate data and can additionally be used to rank covariates with respect to their importance for prediction. An extensive comparison study reveals that ordinal forests tend to outperform competitors in terms of prediction performance. Moreover, it is seen that the covariate importance measure currently used by ordinal forest discriminates influential covariates from noise covariates at least similarly well as the measures used by competitors. Several further important properties of the ordinal forest algorithm are studied in additional investigations. The rationale underlying ordinal forests of using optimized score values in place of the class values of the ordinal response variable is in principle applicable to any regression method beyond random forests for continuous outcome that is considered in the ordinal forest method.
By
Saboor, Abdus; Khan, Muhammad Nauman; Cordeiro, Gauss M.; Pascoa, Marcelino A. R.; Bortolini, Juliano; Mubeen, Shahid
We introduce a flexible modified beta modified-Weibull model, which can accommodate both monotonic and non-monotonic hazard rates such as a useful long bathtub-shaped hazard rate in the middle. Several distributions can be obtained as special cases of the new model. We demonstrate that the new density function is a linear combination of modified-Weibull densities. We obtain the ordinary and central moments, generating function, conditional moments and mean deviations, residual life functions, reliability measures and mean and variance (reversed) residual life. The method of maximum likelihood and a Bayesian procedure are used for estimating the model parameters. We compare the fits of the new distribution and other competitive models to two real data sets. We prove empirically that the new distribution gives the best fit among these distributions based on several goodness-of-fit statistics.
By
Chacko, Manoj; Mohan, Rakhi
In medical studies or reliability analysis, the failure of individuals or items may be due to more than one cause or factor. These risk factors in some sense compete for the failure of the experimental units. The analysis of data in these circumstances is called competing risks analysis. In this paper, we consider the analysis of competing risks data under progressive type-II censoring by assuming that the number of units removed at each stage is random and follows a binomial distribution. Bayes estimators are obtained by assuming that the population under consideration follows a Weibull distribution. A simulation study is carried out to study the performance of the different estimators derived in this paper. A real data set is also used for illustration.
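The sampling scheme described in this abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code. It assumes independent latent Weibull lifetimes, one per cause, with a common shape parameter. The function names and parameters (`binomial_removals`, `simulate_sample`, `scales`) are hypothetical.

```python
import random


def binomial_removals(n, m, p, rng):
    """Draw removal counts R_1..R_m for progressive type-II censoring.

    After the i-th failure (i < m), R_i ~ Binomial(n - m - earlier removals, p)
    surviving units are withdrawn; after the m-th failure all remaining
    units are withdrawn.
    """
    removals = []
    available = n - m  # units that may still be withdrawn before the end
    for _ in range(m - 1):
        r = sum(rng.random() < p for _ in range(available))
        removals.append(r)
        available -= r
    removals.append(available)
    return removals


def simulate_sample(n, m, p, shape, scales, seed=0):
    """Simulate (failure time, cause) pairs and the removal counts.

    Assumption: each unit has one latent Weibull lifetime per cause
    (common shape, cause-specific scale); the smallest one is observed.
    """
    rng = random.Random(seed)
    removals = binomial_removals(n, m, p, rng)
    alive = [
        min((rng.weibullvariate(scale, shape), cause)
            for cause, scale in enumerate(scales, start=1))
        for _ in range(n)
    ]
    failures = []
    for r in removals:
        alive.sort()
        failures.append(alive.pop(0))  # next observed failure and its cause
        rng.shuffle(alive)
        del alive[:r]                  # withdraw r randomly chosen survivors
    return failures, removals
```

A call such as `simulate_sample(20, 8, 0.3, 1.5, [2.0, 3.0])` returns eight ordered failure times with their causes, plus removal counts that sum to the twelve withdrawn units.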
By
Guessoum, Zohra; Tatachak, Abdelkader
Left-truncation and right-censoring arise frequently when considering lifetime data. When both incompleteness conditions occur, a product-limit estimator was proposed and investigated in the independent case by Tsai et al. (Biometrika 74, 883–886, 1987). In the presence of covariates, the conditional version was studied in the α-mixing setting by Liang et al. (Test 21, 790–810, 2012). Our objective in the present paper is to derive strong uniform consistency rates for the cumulative hazard and the product-limit estimates when the lifetime observations form an associated sequence. Then, as an application, we derive a strong uniform consistency rate for the kernel estimator of the hazard rate function considered by Uzunoḡullari and Wang (Biometrika 79, 297–310, 1992) in the iid case.
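For orientation, a product-limit estimator for left-truncated and right-censored data can be sketched as below. This is a minimal illustration of the standard risk-set form, not code from the paper: a subject counts as at risk at time t when its truncation time is at most t and its observed time is at least t. The function name `ltrc_product_limit` and its argument names are hypothetical.

```python
def ltrc_product_limit(entry, time, event):
    """Product-limit survival estimate under left truncation and right censoring.

    entry[i]: left-truncation (entry) time of subject i
    time[i]:  observed time, the smaller of lifetime and censoring time
    event[i]: True if the failure was observed, False if censored
    Assumes entry[i] <= time[i] for every subject.
    Returns a list of (t, S(t)) at each distinct observed failure time.
    """
    event_times = sorted({t for t, d in zip(time, event) if d})
    surv = 1.0
    curve = []
    for t in event_times:
        # Risk set: subjects already entered and not yet failed or censored.
        at_risk = sum(1 for e, y in zip(entry, time) if e <= t <= y)
        deaths = sum(1 for y, d in zip(time, event) if d and y == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve
```

With all entry times equal to zero and no censoring, the estimate reduces to the empirical survival function, which gives a quick sanity check.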
By
Afanasyeva, L. G.
This paper focuses on the stability conditions for a multiserver queueing system with heterogeneous servers and a regenerative input flow X. The main idea is to construct an auxiliary service process Y, which is also a regenerative flow, and to define common points of regeneration for both processes X and Y. The traffic rate is then defined in terms of the mean of the increments of these processes on a common regeneration period. This makes it possible to use well-known results from renewal theory to find the instability and stability conditions. The possibilities of the proposed approach are demonstrated by examples. We also present applications to transport system capacity analysis.
By
Fujikoshi, Yasunori; Sakurai, Tetsuro
This paper is concerned with the selection of variables in two-group discriminant analysis with the same covariance matrix. We propose a test-based method (TM) drawing on the significance of each variable. Sufficient conditions for the test-based method to be consistent are provided when the dimension and the sample size are large. For the case in which the dimension is larger than the sample size, a ridge-type method is proposed. Our results and tendencies therein are explored numerically through a Monte Carlo simulation. It is pointed out that our selection method can be applied to high-dimensional data.
By
Al-Sharadqah, Ali; Mojirsheibani, Majid
A longstanding problem in the construction of asymptotically correct confidence bands for a regression function $$m(x)=E[Y\mid X=x]$$, where Y is the response variable influenced by the covariate X, involves the situation where Y values may be missing at random, and where the selection probability, the density function f(x) of X, and the conditional variance of Y given X are all completely unknown. This can be particularly complicated in nonparametric situations. In this paper, we propose a new kernel-type regression estimator and study the limiting distribution of the properly normalized versions of the maximal deviation of the proposed estimator from the true regression curve. The resulting limiting distribution will be used to construct uniform confidence bands for the underlying regression curve with asymptotically correct coverages. The focus of the current paper is on the case where $$X\in \mathbb {R}$$. We also perform numerical studies to assess the finite-sample performance of the proposed method. In this paper, both the mechanics and the theoretical validity of our methods are discussed.
By
Maurer, Karsten; Osthus, Dave; Loy, Adam
Can data help us explore and expose the soul of the community? This was the challenge posed by the 2013 Data Exposition. The Knight Foundation, in cooperation with Gallup, furnished data from 43,000 people over 3 years (2008–2010) in 26 communities, which we explored in an effort to discover variables associated with community attachment. Our analysis focused on four cities that stood out after our initial exploration of the data set: State College, PA; Detroit, MI; Milledgeville, GA; and Biloxi, MS. We present our use of survey-weighted binned scatterplots to graphically explore the association between an individual’s community attachment and perceived economic outlook. Additionally, we present a few other analyses we found interesting during our initial exploration, which we view as a collection of “short stories”.
By
Li, Lingge; Holbrook, Andrew; Shahbaba, Babak; Baldi, Pierre
Hamiltonian Monte Carlo is a widely used algorithm for sampling from posterior distributions of complex Bayesian models. It can efficiently explore high-dimensional parameter spaces guided by simulated Hamiltonian flows. However, the algorithm requires repeated gradient calculations, and these computations become increasingly burdensome as data sets scale. We present a method to substantially reduce the computational burden by using a neural network to approximate the gradient. First, we prove that the proposed method still maintains convergence to the true distribution even though the approximated gradient no longer comes from a Hamiltonian system. Second, we conduct experiments on synthetic examples and real data to validate the proposed method.
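The idea of driving the leapfrog integrator with an approximate gradient can be sketched in one dimension. This is an illustrative sketch under stated assumptions, not the authors' method: the surrogate is passed in as an arbitrary function rather than a trained neural network, and the accept/reject step is assumed to use the exact potential, which is the usual way to keep the target distribution invariant. The names `leapfrog` and `hmc_sample` are hypothetical.

```python
import math
import random


def leapfrog(q, p, grad, step, n_steps):
    """Leapfrog integration driven by `grad`, which may be an approximation."""
    p = p - 0.5 * step * grad(q)
    for i in range(n_steps):
        q = q + step * p
        if i < n_steps - 1:
            p = p - step * grad(q)
    p = p - 0.5 * step * grad(q)
    return q, p


def hmc_sample(potential, grad, n_samples, step=0.2, n_steps=10, seed=0):
    """One-dimensional HMC with a Metropolis correction on the exact potential.

    `potential` is the exact negative log density (up to a constant);
    `grad` is whatever gradient is used to simulate the trajectory.
    """
    rng = random.Random(seed)
    q = 0.0
    draws = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)
        q_new, p_new = leapfrog(q, p0, grad, step, n_steps)
        h_old = potential(q) + 0.5 * p0 * p0
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        draws.append(q)
    return draws


# Standard normal target; the surrogate gradient is a hypothetical
# perturbation of the exact gradient q, standing in for a learned model.
draws = hmc_sample(lambda q: 0.5 * q * q,
                   lambda q: q + 0.1 * math.sin(3.0 * q),
                   n_samples=5000)
```

Because each leapfrog update is a shear in position or momentum, the map stays reversible and volume-preserving for any gradient function, so an inaccurate surrogate lowers the acceptance rate without biasing the draws in this setup.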
By
Lee, Chi Hyun; Ning, Jing; Shen, Yu
Length-biased data are frequently encountered in prevalent cohort studies. Many statistical methods have been developed to estimate the covariate effects on the survival outcomes arising from such data while properly adjusting for length-biased sampling. Among them, regression methods based on the proportional hazards model have been widely adopted. However, little work has focused on checking the proportional hazards model assumptions with length-biased data, which is essential to ensure the validity of inference. In this article, we propose a statistical tool for testing the assumed functional form of covariates and the proportional hazards assumption graphically and analytically under the setting of length-biased sampling, through a general class of multiparameter stochastic processes. The finite sample performance is examined through simulation studies, and the proposed methods are illustrated with the data from a cohort study of dementia in Canada.
