
Statistique exploratoire multidimensionnelle

TL;DR: This textbook presents the singular value decomposition as the common foundation of the factorial methods (principal component and correspondence analysis), pairs them with clustering methods, and discusses the validity and scope of the results.
Abstract: A textbook intended for second-cycle (graduate) students. Contents: FACTORIAL METHODS: General analysis, singular value decomposition; Principal Component Analysis; Correspondence Analysis; Multiple Correspondence Analysis. SOME CLUSTERING METHODS: Aggregation around moving centers; Hierarchical clustering; Mixed clustering and statistical description of the classes; Complementarity between factorial analysis and clustering. LINKS WITH THE USUAL EXPLANATORY METHODS, DERIVED METHODS: Canonical analysis; Multiple regression, the linear model; Discriminant factor analysis; Log-linear models; Segmentation; Graph structures, local analyses; Multiple tables, groups of variables. VALIDITY AND SCOPE OF THE RESULTS: Meaning of the eigenvalues and inertia rates; Stability of the axes, shapes, and classes.
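The factorial methods listed in the contents all rest on the singular value decomposition. As a minimal illustrative sketch (not from the book; the data matrix is invented), here is a principal component analysis built directly on the SVD in Python/NumPy:

```python
import numpy as np

# Small illustrative data matrix: 5 individuals x 3 variables (hypothetical values).
X = np.array([[2.0, 4.0, 1.0],
              [3.0, 5.0, 0.0],
              [4.0, 4.0, 2.0],
              [5.0, 6.0, 1.0],
              [6.0, 7.0, 3.0]])

# Center the columns, then take the singular value decomposition X_c = U S V^T.
# The principal coordinates are X_c @ V, and the eigenvalues of the
# covariance matrix are S**2 / (n - 1).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T                       # coordinates of individuals on the axes
eigvals = S**2 / (len(X) - 1)            # variance carried by each axis
inertia_rates = eigvals / eigvals.sum()  # "taux d'inertie" per axis
```

The same decomposition, applied to suitably transformed tables, also underlies correspondence analysis and its multiple variant.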


Citations
Journal ArticleDOI
TL;DR: FactoMineR, an R package dedicated to multivariate data analysis, can take into account different types of variables (quantitative or categorical), different kinds of structure on the data, and supplementary information (supplementary individuals and variables).
Abstract: In this article, we present FactoMineR, an R package dedicated to multivariate data analysis. The main feature of this package is the possibility to take into account different types of variables (quantitative or categorical), different types of structure on the data (a partition on the variables, a hierarchy on the variables, a partition on the individuals), and finally supplementary information (supplementary individuals and variables). Moreover, the dimensions issued from the different exploratory data analyses can be automatically described by quantitative and/or categorical variables. Numerous graphics are also available with various options. Finally, a graphical user interface is implemented within the Rcmdr environment in order to propose a user-friendly package.
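FactoMineR itself is an R package, but the "supplementary individuals" idea it implements can be illustrated with a hedged Python/NumPy sketch: a supplementary row receives coordinates on the axes without influencing their construction (all data values below are invented):

```python
import numpy as np

# Active individuals (hypothetical): 4 rows x 2 quantitative variables.
active = np.array([[1.0, 2.0],
                   [2.0, 1.0],
                   [3.0, 4.0],
                   [4.0, 3.0]])
# A supplementary individual: projected onto the axes, but the axes
# are built from the active rows only.
supp = np.array([[2.5, 2.5]])

mean = active.mean(axis=0)
_, _, Vt = np.linalg.svd(active - mean, full_matrices=False)

active_coords = (active - mean) @ Vt.T
supp_coords = (supp - mean) @ Vt.T   # projection onto the existing axes
```

Supplementary variables are handled symmetrically: they are correlated with the axes after the fact rather than entering the decomposition.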

6,472 citations

Journal ArticleDOI
TL;DR: The R package NbClust provides 30 indices for determining the number of clusters in a data set and also proposes to the user the best clustering scheme among the different results.
Abstract: Clustering is the partitioning of a set of objects into groups (clusters) so that objects within a group are more similar to each other than objects in different groups. Most clustering algorithms depend on some assumptions in order to define the subgroups present in a data set. As a consequence, the resulting clustering scheme requires some sort of evaluation as regards its validity. The evaluation procedure has to tackle difficult problems such as the quality of clusters, the degree to which a clustering scheme fits a specific data set, and the optimal number of clusters in a partitioning. In the literature, a wide variety of indices have been proposed to find the optimal number of clusters in a partitioning of a data set during the clustering process. However, for most of the indices proposed in the literature, programs are unavailable to test these indices and compare them. The R package NbClust has been developed for that purpose. It provides 30 indices for determining the number of clusters in a data set and also proposes to the user the best clustering scheme among the different results. In addition, it provides a function to perform k-means and hierarchical clustering with different distance measures and aggregation methods. Any combination of validation indices and clustering methods can be requested in a single function call. This enables the user to simultaneously evaluate several clustering schemes while varying the number of clusters, to help determine the most appropriate number of clusters for the data set of interest.
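Several of the indices NbClust implements, including the Dindex cited below, reason about the drop in within-cluster inertia as the number of clusters grows. A hedged Python sketch of that elbow-style reasoning (NbClust itself is R code; the data and candidate partitions here are invented):

```python
import numpy as np

# Hypothetical 1-D data with two well-separated groups.
x = np.array([1.0, 1.2, 0.8, 8.0, 8.3, 7.9])

def within_inertia(data, labels):
    """Sum of squared distances of points to their cluster mean."""
    return sum(((data[labels == c] - data[labels == c].mean()) ** 2).sum()
               for c in np.unique(labels))

# Candidate partitions for k = 1, 2, 3 (chosen by hand for illustration).
partitions = {
    1: np.array([0, 0, 0, 0, 0, 0]),
    2: np.array([0, 0, 0, 1, 1, 1]),
    3: np.array([0, 0, 0, 1, 1, 2]),
}
inertias = {k: within_inertia(x, lab) for k, lab in partitions.items()}
# Inertia always decreases with k; the large drop from k=1 to k=2 followed
# by a negligible drop from k=2 to k=3 is the elbow argument for k = 2.
```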

1,912 citations


Cites background from "Statistique exploratoire multidimen..."

  • ...Dindex The Dindex (Lebart et al. 2000) is based on clustering gain on intra-cluster inertia....

    [...]

  • ...Lebart, Morineau, and Piron (2000) proposed a criterion based on the first and second derivatives and Halkidi, Vazirgiannis, and Batistakis (2000) and Halkidi and Vazirgiannis (2001) proposed two indices: SD index which is based on the concepts of average scattering for clusters and total…...

    [...]

Posted Content
TL;DR: In this article, the authors consider why institutional forms of modern capitalist economies differ internationally, and propose a typology of capitalism based on the theory of institutional complementarity, which is the outcome of socio-political compromises.
Abstract: This book considers why institutional forms of modern capitalist economies differ internationally, and proposes a typology of capitalism based on the theory of institutional complementarity. Different economic models are not simply characterized by different institutional forms, but also by particular patterns of interaction between complementary institutions which are the core characteristics of these models. Institutions are not simply devices which would be chosen by 'social engineers' in order to perform a function as efficiently as possible; they are the outcome of a political economy process. Therefore, institutional change should be envisaged not as a move towards a hypothetical 'one best way', but as a result of socio-political compromises. Based on a theory of institutions and comparative capitalism, the book proposes an analysis of the diversity of modern economies, from America to Korea, and identifies five different models: the market-based Anglo-Saxon model; Asian capitalism; the Continental European model; the social democratic economies; and the Mediterranean model. Each of these types of capitalism is characterized by specific institutional complementarities. The question of the stability of the Continental European model of capitalism has been open since the beginning of the 1990s: inferior macroeconomic performance compared to Anglo-Saxon economies, alleged unsustainability of its welfare systems, too rigid markets, etc. The book examines the institutional transformations that have taken place within Continental European economies and analyses the political project behind the attempts at transforming the Continental model. It argues that Continental European economies will most likely stay very different from the market-based economies, and that political strategies promoting institutional change aiming at convergence with the Anglo-Saxon model are bound to meet considerable opposition.

1,611 citations

Journal ArticleDOI
TL;DR: This study closely examines functional diversity indices to clarify their accuracy, consistency, and independence, and recommends using the new functional richness indices that consider intraspecific variability and thus empty space in the functional niche space.
Abstract: Functional diversity is the diversity of species traits in ecosystems. This concept is increasingly used in ecological research, yet its formal definition and measurements are currently under discussion. As the overall behavior and consistency of functional diversity indices have not been described so far, the novice user risks choosing an inaccurate index or a set of redundant indices to represent functional diversity. In our study we closely examine functional diversity indices to clarify their accuracy, consistency, and independence. Following current theory, we categorize them into functional richness, evenness, or divergence indices. We considered existing indices as well as new indices developed in this study. The new indices aimed at remedying the weaknesses of currently used indices (e.g., by taking into account intraspecific variability). Using virtual data sets, we test (1) whether indices respond to community changes as expected from their category and (2) whether the indices within each category are consistent and independent of indices from other categories. We also test the accuracy of methods proposed for the use of categorical traits. Most classical functional richness indices either failed to describe functional richness or were correlated with functional divergence indices. We therefore recommend using the new functional richness indices that consider intraspecific variability and thus empty space in the functional niche space. In contrast, most functional evenness and divergence indices performed well with respect to all proposed tests. For categorical variables, we do not recommend blending discrete and real-valued traits (except for indices based on distance measures) since functional evenness and divergence have no transposable meaning for discrete traits. Nonetheless, species diversity indices can be applied to categorical traits (using trait levels instead of species) in order to describe functional richness and equitability.
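As a concrete illustration of a distance-based functional diversity measure (Rao's quadratic entropy is a widely used divergence-type index; it is offered here only as an example of the family of indices discussed, with an invented community):

```python
import numpy as np

# Hypothetical community: 3 species with relative abundances p
# and a single quantitative trait value per species.
p = np.array([0.5, 0.3, 0.2])
traits = np.array([1.0, 2.0, 4.0])

# Pairwise trait distances between species.
d = np.abs(traits[:, None] - traits[None, :])

# Rao's quadratic entropy: Q = sum_i sum_j d_ij * p_i * p_j,
# the expected trait distance between two randomly drawn individuals.
Q = float(p @ d @ p)
```

Because such indices need only a distance matrix, they also accept distances computed from categorical traits, which is why the abstract exempts distance-based indices from its warning about blending discrete and real-valued traits.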

581 citations


Cites background from "Statistique exploratoire multidimen..."

  • ..., Principal Component Analysis), which are known to perform well when summarizing complex data (Lebart et al. 2000)....

    [...]

Journal ArticleDOI
TL;DR: The findings strongly support that computer simulations may be used as an alternative instructional tool, in order to help students confront their cognitive constraints and develop functional understanding of physics.
Abstract: A major research domain in physics education is focused on the study of the effects of various types of teaching interventions aimed at helping transform students' alternative conceptions. Computer simulations are applications of special interest in physics teaching because they can support powerful modeling environments involving physics concepts and processes. In this study, two groups (control and experimental) of 15-16-year-old students were studied to determine the role of computer simulations in the development of functional understanding of the concepts of velocity and acceleration in projectile motions. Both groups received traditional classroom instruction on these topics; the experimental group also used computer simulations. The results presented here show that students working with simulations exhibited significantly higher scores in the research tasks. Our findings strongly support that computer simulations may be used as an alternative instructional tool, in order to help students confront their cognitive constraints and develop functional understanding of physics.

332 citations

References
01 Jan 1967
TL;DR: The k-means algorithm as mentioned in this paper partitions an N-dimensional population into k sets on the basis of a sample, which is a generalization of the ordinary sample mean, and it is shown to give partitions which are reasonably efficient in the sense of within-class variance.
Abstract: The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give partitions which are reasonably efficient in the sense of within-class variance. That is, if p is the probability mass function for the population, S = {S1, S2, ..., Sk} is a partition of E^N, and ui, i = 1, 2, ..., k, is the conditional mean of p over the set Si, then W^2(S) = sum_{i=1..k} ∫_{Si} |z - ui|^2 dp(z) tends to be low for the partitions S generated by the method. We say 'tends to be low,' primarily because of intuitive considerations, corroborated to some extent by mathematical analysis and practical computational experience. Also, the k-means procedure is easily programmed and is computationally economical, so that it is feasible to process very large samples on a digital computer. Possible applications include methods for similarity grouping, nonlinear prediction, approximating multivariate distributions, and nonparametric tests for independence among several variables. In addition to suggesting practical classification methods, the study of k-means has proved to be theoretically interesting. The k-means concept represents a generalization of the ordinary sample mean, and one is naturally led to study the pertinent asymptotic behavior, the object being to establish some sort of law of large numbers for the k-means. This problem is sufficiently interesting, in fact, for us to devote a good portion of this paper to it. The k-means are defined in section 2.1, and the main results which have been obtained on the asymptotic behavior are given there. The rest of section 2 is devoted to the proofs of these results. Section 3 describes several specific possible applications, and reports some preliminary results from computer experiments conducted to explore the possibilities inherent in the k-means idea. The extension to general metric spaces is indicated briefly in section 4.
The original point of departure for the work described here was a series of problems in optimal classification (MacQueen [9]) which represented special
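The process described in the abstract can be sketched in a few lines of Python (an illustrative Lloyd-style implementation on invented data, not MacQueen's original sequential procedure):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: alternate nearest-center assignment and mean update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Move each center to the conditional mean of its set S_i.
        for i in range(k):
            if np.any(labels == i):
                centers[i] = X[labels == i].mean(axis=0)
    return labels, centers

# Two well-separated groups of points (hypothetical data).
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
labels, centers = kmeans(X, k=2)
```

Each update step can only decrease the within-class variance W^2(S) from the abstract, which is why the procedure tends to produce efficient partitions.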

24,320 citations


"Statistique exploratoire multidimen..." refers background in this paper

  • ...see also the general textbooks of Chandon and Pinson (1981), Jambu and Lebeaux (1978), Murtagh (1985), Roux (1985), Kaufman and Rousseeuw (1990)....

    [...]

  • ...see also the earlier and independent work of Beltrami (1873) and Jordan (1874). Cf. also Gower (1966), Gabriel (1971). The problem one then sets out to solve is a purely numerical reduction problem, in other words, a data-compression problem....

    [...]

  • ...The clustering of the elements of a contingency table, based on the grouping of homogeneous categories, has been addressed by Benzécri (1973), Jambu and Lebeaux (1978), Govaert (1984), Cazes (1986), Gilula (1986), Escoufier (1988), Greenacre (1988)....

    [...]

Journal ArticleDOI
TL;DR: In this paper, the problems of estimating a probability density function and of determining the mode of a probability function are discussed. Only estimates which are consistent and asymptotically normal are constructed.
Abstract: Given a sequence of independent identically distributed random variables with a common probability density function, the problems of estimating the probability density function and of determining the mode of the probability function are discussed. Only estimates which are consistent and asymptotically normal are constructed.
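The estimator discussed here is the classical Parzen-window density estimate. A minimal Python sketch with a Gaussian kernel and a naive grid search for the mode (the sample and bandwidth are invented for illustration):

```python
import math

def parzen_density(x, sample, h):
    """Parzen-window estimate with a Gaussian kernel of bandwidth h:
    f_n(x) = (1 / (n * h)) * sum_i K((x - x_i) / h)."""
    n = len(sample)
    k = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return sum(k((x - xi) / h) for xi in sample) / (n * h)

sample = [0.0, 0.1, -0.1, 2.0, 2.1]
# A crude mode estimate: the grid point where the estimated density peaks.
grid = [i / 10 for i in range(-20, 41)]
mode = max(grid, key=lambda x: parzen_density(x, sample, h=0.3))
```

Consistency of the mode estimate requires the bandwidth h to shrink with n at a suitable rate, which is the asymptotic question the paper studies.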

10,114 citations


"Statistique exploratoire multidimen..." refers background in this paper

  • ...C'est celui qui avait été adopté par Pearson (1901). Bien entendu, il ne s'agissait pas de l'analyse en composantes principales telle que nous la présentons, mais les idées essentielles de la méthode étaient déjà entrevues par cet auteur....

    [...]

Journal ArticleDOI
TL;DR: In this article, a generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription, and examples used to illustrate the application are drawn from the problem areas of univariate estimation, linear regression and analysis of variance.
Abstract: SUMMARY A generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription. The examples used to illustrate the application are drawn from the problem areas of univariate estimation, linear regression and analysis of variance.
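The use of cross-validation to choose between prediction prescriptions can be illustrated with a small leave-one-out example in Python (data invented; this shows the general idea rather than the paper's generalized criterion):

```python
# Hypothetical data: choose between predicting with the sample mean
# and with a least-squares line, using leave-one-out cross-validation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]   # nearly linear in x

def loo_error(predict):
    """Average squared error when each point is predicted from the others."""
    total = 0.0
    for i in range(len(xs)):
        tx = [x for j, x in enumerate(xs) if j != i]
        ty = [y for j, y in enumerate(ys) if j != i]
        total += (predict(tx, ty, xs[i]) - ys[i]) ** 2
    return total / len(xs)

def mean_rule(tx, ty, x):
    return sum(ty) / len(ty)

def line_rule(tx, ty, x):
    mx, my = sum(tx) / len(tx), sum(ty) / len(ty)
    b = (sum((a - mx) * (c - my) for a, c in zip(tx, ty))
         / sum((a - mx) ** 2 for a in tx))
    return my + b * (x - mx)

# The linear rule should cross-validate better on near-linear data.
err_mean, err_line = loo_error(mean_rule), loo_error(line_rule)
```

Because every point is scored by a model that never saw it, the criterion assesses predictive value rather than in-sample fit.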

7,385 citations

Journal ArticleDOI
TL;DR: In this article, a text designed to make multivariate techniques available to behavioural, social, biological and medical students is presented, which includes an approach to multivariate inference based on the union-intersection and generalized likelihood ratio principles.
Abstract: A text designed to make multivariate techniques available to behavioural, social, biological and medical students. Special features include an approach to multivariate inference based on the union-intersection and generalized likelihood ratio principles.

6,488 citations