Author

Edwin Diday

Bio: Edwin Diday is an academic researcher from Paris Dauphine University. The author has contributed to research in topics: Symbolic data analysis & Cluster analysis. The author has an h-index of 31 and has co-authored 133 publications receiving 5240 citations. Previous affiliations of Edwin Diday include CEREMADE & University of Paris.


Papers
Book
03 Feb 2000
TL;DR: This edited volume covers Symbolic Data Analysis and the SODAS Project (purpose, history, perspective), with chapters by E. Diday, H.H. Bock, and others on symbolic data, symbolic objects, and methods for analysing them.
Abstract: E. Diday: Symbolic Data Analysis and the SODAS Project: Purpose, History, Perspective.- H.H. Bock: The Classical Data Situation.- H.H. Bock: Symbolic Data.- H.H. Bock, E. Diday: Symbolic Objects.- V. Stephan, G. Hebrail, Y. Lechevallier: Generation of Symbolic Objects from Relational Databases.- P. Bertrand, F. Goupil: Descriptive Statistics for Symbolic Data.- M. Noirhomme-Fraiture, M. Rouard: Visualizing and Editing Symbolic Objects.- Similarity and Dissimilarity: F. Esposito, D. Malerba, V. Tamma, H.H. Bock: Classical Resemblance Measures.- H.H. Bock: Dissimilarity Measures for Probability Distributions.- F. Esposito, D. Malerba, V. Tamma: Dissimilarity Measures for Symbolic Objects.- F. Esposito, D. Malerba, F. Lisi: Matching Symbolic Objects.- Symbolic Factor Analysis: H.H. Bock: Classical Principal Component Analysis.- A. Chouakria, P. Cazes, E. Diday: Symbolic Principal Component Analysis.- N.C. Lauro, F. Palumbo, R. Verde: Factorial Discriminant Analysis on Symbolic Objects.- Discrimination: Assigning Symbolic Objects to Classes: J. Rasson, S. Lissoir: Classical Methods of Discrimination.- J. Rasson, S. Lissoir: Symbolic Kernel Discriminant Analysis.- E. Perinel, Y. Lechevallier: Symbolic Discrimination Rules.- M. Bravo Llatas, J. Garcia-Santesmases: Segmentation Trees for Stratified Data.- Clustering Methods for Symbolic Objects: M. Chavent, H.H. Bock: Clustering Problem, Clustering Methods for Classical Data.- M. Chavent: Criterion-Based Divisive Clustering for Symbolic Data.- P. Brito: Hierarchical and Pyramidal Clustering with Complete Symbolic Objects.- G. Polaillon: Pyramidal Classification for Interval Data Using Galois Lattice Reduction.- M. Gettler-Summa, C. Pardoux: Symbolic Approaches for Three-way Data.- Illustrative Benchmark Analysis: R. Bisdorff: Introduction.- R. Bisdorff: Professional Careers of Retired Working Persons.- A. Iztueta, P. Calvo: Labour Force Survey.- F. Goupil, M. Touati, E. Diday, R. Moult: Census Data from the Office for National Statistics.- A. Morineau: The SODAS Software Package.

605 citations

BookDOI
01 Jan 2000

441 citations

Book ChapterDOI
31 Dec 2006
TL;DR: This book presents a unified account of symbolic data analysis methods in a consistent statistical framework, with topics including Descriptive Statistics: Two or More Variates, Hierarchy-Divisive Clustering, and Cluster Analysis.
Abstract: The first book to present a unified account of symbolic data analysis methods in a consistent statistical framework, Symbolic Data Analysis features a substantial number of examples from a range of application areas, including health, the social sciences, economics, and computer science. It includes implementation of the methods described using SODAS software, which has been developed by a team led by Edwin Diday and is freely available on the Web, with an additional chapter that provides a basic guide to the software. It also features exercises at the end of each chapter to help the reader develop their understanding of the methodology, and to enable use of the book as a course text. The book is supported by a website featuring a link to download SODAS software, datasets, solutions to exercises, and additional teaching material.

361 citations

Journal ArticleDOI
TL;DR: A new dissimilarity measure, based on the “position”, “span” and “content” of symbolic objects, is proposed for symbolic clustering. The results of applying the algorithm to numeric data with a known number of classes are described first to show the efficacy of the method.

325 citations

Journal ArticleDOI
TL;DR: This article attempts to review the methods currently available to analyze symbolic data, and it quickly becomes clear that the range of methodologies available draws analogies with developments before 1900 that formed a foundation for the inferential statistics of the 1900s.
Abstract: Increasingly, datasets are so large they must be summarized in some fashion so that the resulting summary dataset is of a more manageable size, while still retaining as much knowledge inherent to the entire dataset as possible. One consequence of this situation is that the data may no longer be formatted as single values such as is the case for classical data, but rather may be represented by lists, intervals, distributions, and the like. These summarized data are examples of symbolic data. This article looks at the concept of symbolic data in general, and then attempts to review the methods currently available to analyze such data. It quickly becomes clear that the range of methodologies available draws analogies with developments before 1900 that formed a foundation for the inferential statistics of the 1900s, methods largely limited to small (by comparison) datasets and classical data formats. The scarcity of available methodologies for symbolic data also becomes clear and so draws attention to an enor...
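As an illustration of the summarization step this abstract describes, the sketch below collapses classical single-valued records into interval-valued symbolic data, one interval per group. The function name `to_intervals`, the field names, and the sample records are hypothetical and chosen only for this example; they do not come from the article or from the SODAS software.

```python
from collections import defaultdict


def to_intervals(records, group_key, value_key):
    """Collapse classical rows into one (min, max) interval per group.

    Hypothetical helper for illustration: each group of single-valued
    observations becomes one interval-valued symbolic description.
    """
    grouped = defaultdict(list)
    for row in records:
        grouped[row[group_key]].append(row[value_key])
    return {group: (min(values), max(values)) for group, values in grouped.items()}


# Made-up classical data: one single value per individual.
records = [
    {"region": "north", "age": 34},
    {"region": "north", "age": 29},
    {"region": "north", "age": 41},
    {"region": "south", "age": 52},
    {"region": "south", "age": 60},
]

# Summarized symbolic data: one age interval per region.
intervals = to_intervals(records, "region", "age")
```

Intervals are only one of the formats the abstract mentions; lists and distributions would need a different aggregation in place of `min` and `max`.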

307 citations


Cited by
Journal ArticleDOI
TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.
Abstract: Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.

14,054 citations

Posted Content
TL;DR: In Out of the Crisis, originally published in 1982, Deming presents a theory of management based on his 14 Points for Management, explaining the principles of management transformation and how to apply them.
Abstract: According to W. Edwards Deming, American companies require nothing less than a transformation of management style and of governmental relations with industry. In Out of the Crisis, originally published in 1982, Deming offers a theory of management based on his famous 14 Points for Management. Management's failure to plan for the future, he claims, brings about loss of market, which brings about loss of jobs. Management must be judged not only by the quarterly dividend, but by innovative plans to stay in business, protect investment, ensure future dividends, and provide more jobs through improved product and service. In simple, direct language, he explains the principles of management transformation and how to apply them.

9,241 citations

Journal ArticleDOI
TL;DR: The basic ideas of PCA are introduced, including what it can and cannot do, and some variants of the technique tailored to different data types and structures are described.
Abstract: Large datasets are increasingly common and are often difficult to interpret. Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability but at the same time minimizing information loss. It does so by creating new uncorrelated variables that successively maximize variance. Finding such new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem, and the new variables are defined by the dataset at hand, not a priori, hence making PCA an adaptive data analysis technique. It is adaptive in another sense too, since variants of the technique have been developed that are tailored to various different data types and structures. This article will begin by introducing the basic ideas of PCA, discussing what it can and cannot do. It will then describe some variants of PCA and their application.
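The eigenvalue/eigenvector formulation mentioned in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the article's own procedure: it builds the sample covariance matrix and uses power iteration (a technique swapped in here for simplicity) to approximate the first principal component, the direction of maximum variance. The function names and the sample data are assumptions made for this sketch.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of rows (a list of equal-length tuples)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_eigenpair(matrix, iterations=200):
    """Approximate the largest eigenvalue and its unit eigenvector
    by power iteration (assumes a dominant eigenvalue exists)."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


# Made-up two-variable data lying close to a straight line.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
cov = covariance_matrix(data)

# variance: variance captured by the first principal component;
# direction: unit vector defining that component.
variance, direction = leading_eigenpair(cov)
```

In practice a full eigendecomposition or singular value decomposition from a numerical library would return all components at once; power iteration is used here only to keep the sketch dependency-free.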

4,289 citations

Journal ArticleDOI
01 Jun 1991
TL;DR: The subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed, and the relation between decision trees and neural networks (NN) is also discussed.
Abstract: A survey is presented of current methods for decision tree classifier (DTC) designs and the various existing issues. After considering potential advantages of DTCs over single-stage classifiers, the subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed. The relation between decision trees and neural networks (NN) is also discussed.

3,176 citations