Showing papers in "Computational Statistics & Data Analysis in 2005"

Journal Article•DOI•

PLS path modeling

Michel Tenenhaus¹, Vincenzo Esposito Vinzi¹,², Yves-Marie Chatelin, Carlo Lauro²•Institutions (2)

HEC Paris¹, University of Naples Federico II²

01 Jan 2005-Computational Statistics & Data Analysis

TL;DR: PLS path modeling can be used to analyze multiple tables, which relates it to more classical data analysis methods used in this field, and some new improvements are proposed.

4,839 citations

Journal Article•DOI•

How many principal components? stopping rules for determining the number of non-trivial axes revisited

Pedro R. Peres-Neto¹, Donald A. Jackson¹, Keith M. Somers¹•Institutions (1)

University of Toronto¹

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: Bartlett's test is used to test the significance of the first principal component, indicating whether at least two variables share common variation in the entire data set, and a two-step approach appears to be highly effective.

738 citations

Journal Article•DOI•

Maximum likelihood estimation in nonlinear mixed effects models

E. Kuhn¹, Marc Lavielle¹•Institutions (1)

University of Paris-Sud¹

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: A stochastic approximation version of EM for maximum likelihood estimation of a wide class of nonlinear mixed effects models is proposed, able to provide an estimator close to the MLE in very few iterations.

452 citations

Journal Article•DOI•

Statistical analysis of financial networks

Vladimir Boginski¹, Sergiy Butenko², Panos M. Pardalos¹•Institutions (2)

University of Florida¹, Texas A&M University²

01 Feb 2005-Computational Statistics & Data Analysis

TL;DR: This work conducts the statistical analysis of this graph and shows that it follows the power-law model, and detects cliques and independent sets in this graph, which allows one to apply a new data mining technique of classifying financial instruments based on stock prices data, which provides a deeper insight into the internal structure of the stock market.

359 citations

Journal Article•DOI•

Generalized Rayleigh distribution

Debasis Kundu¹, Mohammad Z. Raqab²•Institutions (2)

Indian Institute of Technology Kanpur¹, University of Jordan²

15 Apr 2005-Computational Statistics & Data Analysis

TL;DR: Different estimation procedures have been used to estimate the unknown parameter(s) and their performances are compared using Monte Carlo simulations, and it is observed that this particular skewed distribution can be used quite effectively in analyzing lifetime data.

302 citations

Journal Article•DOI•

PLS generalised linear regression

Philippe Bastien¹, Vincenzo Esposito Vinzi², Michel Tenenhaus²•Institutions (2)

L'Oréal¹, HEC Paris²

01 Jan 2005-Computational Statistics & Data Analysis

TL;DR: The approach proposed for PLS generalised linear regression is simple and easy to implement and can be easily generalised to any model that is linear at the level of the explanatory variables.

252 citations

Journal Article•DOI•

Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap

Emmanuel Flachaire¹•Institutions (1)

University of Paris¹

01 Apr 2005-Computational Statistics & Data Analysis

TL;DR: In regression models, appropriate bootstrap methods for inference robust to heteroskedasticity of unknown form are the wild bootstrap and the pairs bootstrap, and simulation results suggest that one specific version of the wild bootstrap outperforms the other versions of the wild bootstrap.

192 citations

Journal Article•DOI•

PCA and PLS with very large data sets

Nouna Kettaneh, Anders Berglund¹, Svante Wold¹•Institutions (1)

Umeå University¹

01 Jan 2005-Computational Statistics & Data Analysis

TL;DR: A multivariate approach based on projections—PCA and PLS—was introduced to cope with the rapidly increasing volumes of data produced in chemical laboratories and showed promising results.

189 citations

Journal Article•DOI•

A mixture model for preferences data analysis

Angela D'Elia, Domenico Piccolo

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: A mixture model for preferences data, which adequately represents the composite nature of the elicitation mechanism in ranking processes, is proposed and empirical evidence from different data sets confirming the goodness of fit of the proposed model to many real preferences data is shown.

177 citations

Journal Article•DOI•

Improved biclustering of microarray data demonstrated through systematic performance tests

Heather Turner¹, Trevor C. Bailey¹, Wojtek J. Krzanowski¹•Institutions (1)

University of Exeter¹

01 Feb 2005-Computational Statistics & Data Analysis

TL;DR: A new algorithm is presented for fitting the plaid model, a biclustering method developed for clustering gene expression data, and a benchmark for future evaluation of biclustering methods is established.

170 citations

Journal Article•DOI•

PLS regression on a stochastic process

Cristian Preda¹, Gilbert Saporta²•Institutions (2)

Lille University of Science and Technology¹, Conservatoire national des arts et métiers²

01 Jan 2005-Computational Statistics & Data Analysis

TL;DR: The existence of the PLS components as eigenvectors of some operator and convergence properties of the PLS approximation are proved, and the results of an application to stock-exchange data are compared with those obtained by other methods.

Journal Article•DOI•

Bundling classifiers by bagging trees

Torsten Hothorn¹, Berthold Lausen¹•Institutions (1)

University of Erlangen-Nuremberg¹

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: In this article, classification trees are employed to bundle their predictions for the bootstrap sample, and a combined classifier is developed, which is superior to any of the single classifiers in many applications.

Journal Article•DOI•

Discrimination on latent components with respect to patterns. Application to multicollinear data

Hicham Nocairi¹, El Mostafa Qannari¹, Evelyne Vigneau¹, Dominique Bertrand¹•Institutions (1)

Institut national de la recherche agronomique¹

01 Jan 2005-Computational Statistics & Data Analysis

TL;DR: A new presentation of discriminant analysis consists in setting up patterns associated with the various groups and deriving latent variables in such a way that scores in each group are as highly clustered about their pattern as possible.

Journal Article•DOI•

Regression of a data matrix on descriptors of both its rows and of its columns via latent variables: L-PLSR

Harald Martens¹,², Endre Anderssen², Arnar Flatberg², Lars Gidskehaug², Martin Høy², Frank Westad¹, Anette Kistrup Thybo, Magni Martens•Institutions (2)

Norwegian Food Research Institute¹, Norwegian University of Science and Technology²

01 Jan 2005-Computational Statistics & Data Analysis

TL;DR: The L-PLSR is applied to the analysis of consumer liking data Y of six products assessed by 125 persons, in light of 10 other product descriptors X and 15 other person descriptors Z.

Journal Article•DOI•

Penalized spline smoothing in multivariable survival models with varying coefficients

Göran Kauermann¹•Institutions (1)

Bielefeld University¹

15 Apr 2005-Computational Statistics & Data Analysis

TL;DR: A hybrid routine is suggested which combines the mixed model idea with a classical Akaike information criterion and is evaluated with simulations and applied to data on the success and failure of newly founded companies.

Journal Article•DOI•

Pairwise likelihood inference in spatial generalized linear mixed models

Cristiano Varin¹, Gudmund Høst², Øivind Skare³•Institutions (3)

University of Padua¹, Research Council of Norway², University of Oslo³

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: In order to maximize the pairwise likelihood, a new expectation-maximization-type algorithm which uses numerical quadrature is introduced and is found to give reasonable parameter estimates and to be computationally efficient.

Journal Article•DOI•

Clusterwise PLS regression on a stochastic process

Cristian Preda¹, Gilbert Saporta²•Institutions (2)

Lille University of Science and Technology¹, Conservatoire national des arts et métiers²

15 Apr 2005-Computational Statistics & Data Analysis

TL;DR: The clusterwise linear regression is studied when the set of predictor variables forms an L2-continuous stochastic process; the number of clusters is treated as unknown, and the convergence of the clusterwise algorithm is discussed.

Journal Article•DOI•

Hidden hybrid Markov/semi-Markov chains

Yann Guédon¹•Institutions (1)

University of Montpellier¹

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: The forward-backward algorithm, which in particular enables an efficient implementation of the E-step of the EM algorithm, and the Viterbi algorithm for the restoration of the most likely state sequence are derived.

Journal Article•DOI•

Likelihood-ratio tests for normality

Jin Zhang¹, Yuehua Wu²•Institutions (2)

University of Manitoba¹, Keele University²

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: Powerful omnibus tests of normality based on the likelihood ratio are proposed, which outperform the best tests in the literature, including the Shapiro-Wilk and Anderson-Darling tests.

Journal Article•DOI•

Fast and robust bootstrap for LTS

Gert Willems¹, Stefan Van Aelst²•Institutions (2)

University of Antwerp¹, Ghent University²

01 Apr 2005-Computational Statistics & Data Analysis

TL;DR: An alternative bootstrap method is proposed which is both computationally simple and robust and a simulation study shows that this method performs well, particularly regarding confidence intervals for the regression parameters.

Journal Article•DOI•

Distribution of Mutual Information from Complete and Incomplete Data

Marcus Hutter¹, Marco Zaffalon¹•Institutions (1)

Dalle Molle Institute for Artificial Intelligence Research¹

01 Mar 2005-Computational Statistics & Data Analysis

TL;DR: In this paper, the posterior distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution, is analyzed and the exact analytical expression for the mean, and analytical approximations for the variance, skewness and kurtosis are derived.

Journal Article•DOI•

Estimation of parameters for exponentiated-Weibull family under type-II censoring scheme

Umesh Singh¹, Pramod K. Gupta¹, Satyanshu K. Upadhyay¹•Institutions (1)

Banaras Hindu University¹

01 Mar 2005-Computational Statistics & Data Analysis

TL;DR: Bayes and classical estimators have been obtained for the two-parameter exponentiated-Weibull distribution when the sample is available from a type-II censoring scheme; the estimators are not available in closed form, although they can be easily evaluated for a given sample by using suitable numerical methods.

Journal Article•DOI•

Latent class models for mixed variables with applications in Archaeometry

Irini Moustaki¹, Ioulia Papageorgiou¹•Institutions (1)

Athens University of Economics and Business¹

01 Mar 2005-Computational Statistics & Data Analysis

TL;DR: The latent class model for mixed binary and metric variables is extended to accommodate any type of data (including ordinal and nominal) and its use in Archaeometry for classifying archaeological findings/objects into groups is discussed.

Journal Article•DOI•

Relevance measures for subset variable selection in regression problems based on k-additive mutual information

Ivan Kojadinovic¹•Institutions (1)

École polytechnique de l'université de Nantes¹

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: Results on the estimation of this index of stochastic dependence in a continuous setting are presented and computationally more efficient approximations of the mutual information based on the notion of k-additive truncation are proposed.

Journal Article•DOI•

Extension to the product partition model: computing the probability of a change

Rosangela H. Loschi¹, Frederico R. B. Cruz¹•Institutions (1)

Universidade Federal de Minas Gerais¹

01 Feb 2005-Computational Statistics & Data Analysis

TL;DR: The well-known product partition model (PPM) is considered for the identification of multiple change points in the means and variances of normal data sequences and the posterior distributions of the partitions and the number of change points are extended.

Journal Article•DOI•

A normal approximation for the chi-square distribution

Luisa Canal¹•Institutions (1)

University of Trento¹

01 Apr 2005-Computational Statistics & Data Analysis

TL;DR: Numerical results show that the maximum absolute error associated with the new transformation is substantially lower than that found for other power transformations of a chi-square random variable for all the degrees of freedom considered.

Journal Article•DOI•

The wild bootstrap and heteroskedasticity-robust tests for serial correlation in dynamic regression models

Leslie G. Godfrey¹, A. R. Tremayne²•Institutions (2)

University of York¹, University of Sydney²

01 Apr 2005-Computational Statistics & Data Analysis

TL;DR: Monte Carlo evidence reported in this paper indicates that asymptotic critical values fail to give good control of finite sample significance levels of heteroskedasticity-robust versions of the standard Lagrange multiplier test, a Hausman-type check, and a new procedure.

Journal Article•DOI•

Computational algorithms for double bootstrap confidence intervals

John C. Nankervis¹•Institutions (1)

University of Essex¹

01 Apr 2005-Computational Statistics & Data Analysis

TL;DR: Double bootstrap confidence intervals can be estimated using computational algorithms incorporating simple deterministic stopping rules that avoid unnecessary computations, and efficiency gains are examined by means of a Monte Carlo study for examples of confidence intervals for a mean and for the cumulative impulse response in a second order autoregressive model.

Journal Article•DOI•

Short communication: Optimising k-means clustering results with standard software packages

David J. Hand¹, Wojtek J. Krzanowski²•Institutions (2)

Imperial College London¹, University of Exeter²

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: An iterative scheme that generally improves on the default solution is suggested, and this scheme is compared with the "best of 20 random starts" method favoured by many users.

Journal Article•DOI•

On generalized multivariate decision tree by using GEE

Seong Keon Lee¹•Institutions (1)

Chuo University¹

01 Jun 2005-Computational Statistics & Data Analysis

TL;DR: This paper modifies the tree procedure for a univariate response and suggests a new tree-based method that can analyze any type of multiple responses by using generalized estimating equations techniques.
