By
Vaisman, Radislav; Botev, Zdravko I.; Ridder, Ad
2 Citations
In this paper we describe a sequential importance sampling (SIS) procedure for counting the number of vertex covers in general graphs. The optimal SIS proposal distribution is the uniform distribution over a suitably restricted set, but it is not implementable. We consider two proposal distributions as approximations to the optimal one. Both proposals are based on randomization techniques. The first randomization is the classic probability model of random graphs, and the resulting SIS algorithm shows polynomial complexity for random graphs. The second randomization introduces a probabilistic relaxation technique that uses dynamic programming. The numerical experiments show that the resulting SIS algorithm enjoys excellent practical performance in comparison with existing methods. In particular, the method is compared with cachet, an exact model counter, and with the state-of-the-art SampleSearch, which is based on belief networks and importance sampling.
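The general SIS idea in this abstract can be illustrated with a minimal sketch. It does not use either of the paper's two proposal distributions. Instead it assumes a naive proposal: vertices are visited in index order, a vertex is forced into the cover if an earlier neighbour was left out, and otherwise it is included with probability 1/2. The function names `sis_count_vertex_covers` and `brute_force_count` are hypothetical and chosen only for this illustration.

```python
import itertools
import random


def sis_count_vertex_covers(n, edges, num_samples=10000, seed=0):
    """Estimate the number of vertex covers of a graph on vertices 0..n-1.

    Assumption: a naive proposal (not the paper's). Each sample is a valid
    cover x drawn with probability q(x); its weight is 1 / q(x), so the
    average weight is an unbiased estimate of the number of covers.
    """
    rng = random.Random(seed)
    # earlier[v] lists the neighbours of v that are visited before v.
    earlier = [[] for _ in range(n)]
    for u, v in edges:
        lo, hi = min(u, v), max(u, v)
        earlier[hi].append(lo)

    total = 0.0
    for _ in range(num_samples):
        in_cover = [False] * n
        weight = 1.0
        for v in range(n):
            # v must join the cover if some earlier neighbour was excluded.
            forced = any(not in_cover[u] for u in earlier[v])
            if forced:
                in_cover[v] = True          # chosen with probability 1
            else:
                in_cover[v] = rng.random() < 0.5
                weight *= 2.0               # 1 / 0.5 for a free choice
        total += weight
    return total / num_samples


def brute_force_count(n, edges):
    """Exact count by enumeration, usable only for very small graphs."""
    return sum(
        all(mask[u] or mask[v] for u, v in edges)
        for mask in itertools.product([False, True], repeat=n)
    )


if __name__ == "__main__":
    path = [(0, 1), (1, 2)]
    print(brute_force_count(3, path))
    print(sis_count_vertex_covers(3, path, num_samples=20000))
```

The weights of this naive proposal can vary widely on larger graphs, which is the kind of inefficiency that better approximations to the optimal proposal aim to reduce.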
By
Brockwell, Peter J.; Davis, Richard A.
1 Citation
Many time series arising in practice are best considered as components of some vector valued (multivariate) time series {X_{t}} having not only serial dependence within each component series {X_{ti}} but also interdependence between the different component series {X_{ti}} and {X_{tj}}, i ≠ j. Much of the theory of univariate time series extends in a natural way to the multivariate case; however, new problems arise.
By
Pitou, Cynthia; Diatta, Jean
Textual information extraction is a challenging issue in Information Retrieval. Two main approaches are commonly distinguished: texture-based and region-based. In this paper, we propose a method guided by the quadtree decomposition. The principle of the method is to recursively decompose regions of a document image into four equal regions, starting from the image of the whole document. At each step of the decomposition process, an OCR engine is used to retrieve a given piece of textual information from the obtained regions. Experiments on real invoice data provide promising results.
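The recursive decomposition described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the callback `contains_target` is a hypothetical stand-in for running an OCR engine on a region and checking whether the sought text was found, and the stopping rule based on `min_size` is also an assumption.

```python
def split_into_quadrants(region):
    """Split a region (x, y, width, height) into four quadrants."""
    x, y, w, h = region
    hw, hh = w // 2, h // 2
    return [
        (x, y, hw, hh),                    # top-left
        (x + hw, y, w - hw, hh),           # top-right
        (x, y + hh, hw, h - hh),           # bottom-left
        (x + hw, y + hh, w - hw, h - hh),  # bottom-right
    ]


def quadtree_search(region, contains_target, min_size=32):
    """Return the smallest visited region that still contains the target.

    Assumes the caller already knows that `region` contains the target.
    `contains_target(region)` is a hypothetical callback; in practice it
    would wrap an OCR call on the cropped image.
    """
    _, _, w, h = region
    # Stop when a further split would produce quadrants below min_size.
    if w < 2 * min_size or h < 2 * min_size:
        return region
    for quadrant in split_into_quadrants(region):
        if contains_target(quadrant):
            return quadtree_search(quadrant, contains_target, min_size)
    # The target straddles a quadrant boundary: keep the current region.
    return region


def make_box_oracle(box):
    """Toy oracle for testing: the target is a known bounding box."""
    bx, by, bw, bh = box

    def contains_target(region):
        x, y, w, h = region
        return (x <= bx and bx + bw <= x + w
                and y <= by and by + bh <= y + h)

    return contains_target


if __name__ == "__main__":
    oracle = make_box_oracle((70, 10, 20, 10))
    print(quadtree_search((0, 0, 256, 256), oracle))
```

The toy oracle only checks geometric containment, so it exercises the recursion without needing an image or an OCR library.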
By
Meintanis, Simos G.; Allison, James; Santana, Leonard
5 Citations
We investigate the finite-sample properties of certain procedures which employ the novel notion of the probability weighted empirical characteristic function. The procedures considered are: (1) testing for symmetry in regression, (2) testing for multivariate normality with independent observations, and (3) testing for multivariate normality of random effects in mixed models. Along with the new tests, alternative methods based on the ordinary empirical characteristic function, as well as other better-known procedures, are implemented for the purpose of comparison.
By
Qin, Guoyou; Zhu, Zhongyi; Fung, Wing K.
3 Citations
In this paper, we study the robust estimation of generalized partially linear models (GPLMs) for longitudinal data with dropouts. We aim at achieving robustness against outliers. To this end, a weighted likelihood method is first proposed to obtain robust estimates of the parameters of the dropout model that describes the missing-data process. Then, a robust inverse probability-weighted generalized estimating equation is developed to achieve robust estimation of the mean model. To approximate the nonparametric function in the GPLM, a regression spline smoothing method is adopted, which linearizes the nonparametric function so that statistical inference can be conducted as if a generalized linear model were used. The asymptotic properties of the proposed estimator are established under some regularity conditions, and simulation studies show the robustness of the proposed estimator. Finally, the proposed method is applied to analyze a real data set.
By
Bianchi, Annamaria
Economic indicators need to be estimated at the regional level. Small area estimation based on M-quantile regression was introduced by Chambers and Tzavidis (Biometrika 93:255–268, 2006) and has proved to be a valid alternative to traditional methods. Thus far, this method has only been applied to cross-sectional data. However, it is well known that the use of panel data may provide significant gains in terms of efficiency of the estimators. This paper explores possible extensions of M-quantile-based small area estimators to the panel data context. A model-based simulation study is presented.
By
Wollschläger, Daniel
Abstract
R not only offers tools for numerical and graphical data analysis, but is also a programming language that uses the same syntax as statistical analyses. The very extensive topic of programming with R is outlined in the following sections to the extent that useful language constructs such as conditional statements and loops can be used and simple functions can be written. The chapter closes with ways to increase the efficiency of analyses.
By
Li, Zhaoyuan; Yao, Jianfeng
1 Citation
In this paper, we generalize two criteria, the determinant-based and trace-based criteria proposed by Saranadasa (J Multivar Anal 46:154–174, 1993), to general populations for high-dimensional classification. These two criteria compare distances between a new observation and several different known groups. The determinant-based criterion performs well for correlated variables by integrating the covariance structure and is competitive with many other existing rules. The criterion, however, requires the measurement dimension to be smaller than the sample size. The trace-based criterion, in contrast, is an independence rule and is effective in the “large dimension, small sample size” scenario. An appealing property of these two criteria is that their implementation is straightforward and there is no need for preliminary variable selection or use of tuning parameters. Their asymptotic misclassification probabilities are derived using the theory of large-dimensional random matrices. Their competitive performance is illustrated by intensive Monte Carlo experiments and a real data analysis.
