Topic

Decision tree model

About: Decision tree model is a research topic. Over its lifetime, 2,256 publications have been published within this topic, receiving 38,142 citations.


Papers
Journal ArticleDOI
01 Jan 2015
TL;DR: A genetic algorithm determines the required number of branches at each decision node and the number of hidden and output nodes in each neural node; combining the two yields a general neural decision tree in which every internal node is a neural decision node.
Abstract: Decision trees are machine-learning techniques used for classification and image recognition. Neural trees combine decision trees with neural networks: a tree node containing a neural network is called a neural node. In the proposed study, a genetic algorithm-based decision tree is formed to extract the required number of branches for each decision node. In continuation, a genetic algorithm-based neural network is formed to extract the required number of output and hidden nodes in the neural network. Finally, by combining the general decision tree and the general neural tree, a general neural decision tree is formed in which a neural decision node is designed for each internal node. In the experimental phase, the general neural decision tree outperformed both the decision tree and the neural tree for image classification.

5 citations
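As a rough illustration of the idea in the abstract above, the sketch below (Python/NumPy) evolves the branch count and hidden width of a single neural decision node with a toy genetic search. The gradient-trained routing net, the fitness definition, and all names are illustrative assumptions, not the paper's implementation.

```python
# Toy sketch: evolve one neural decision node with a tiny genetic search.
import numpy as np

rng = np.random.default_rng(0)

def neural_node_fitness(X, y, n_branches, n_hidden, steps=200, lr=0.5):
    """Train a one-hidden-layer softmax routing net by gradient descent;
    fitness is the fraction of samples routed to a toy target branch."""
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    W2 = rng.normal(0.0, 0.5, (n_hidden, n_branches))
    t = y % n_branches                      # toy target branch per sample
    for _ in range(steps):
        H = np.tanh(X @ W1)
        logits = H @ W2
        P = np.exp(logits - logits.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        G = P.copy()
        G[np.arange(len(y)), t] -= 1.0      # softmax cross-entropy gradient
        dW2 = H.T @ G / len(y)
        dW1 = X.T @ ((G @ W2.T) * (1.0 - H**2)) / len(y)
        W1 -= lr * dW1
        W2 -= lr * dW2
    return ((np.tanh(X @ W1) @ W2).argmax(axis=1) == t).mean()

def genetic_search(X, y, pop=8, gens=5):
    """Evolve (n_branches, n_hidden) genomes: keep the fitter half,
    refill with mutated copies, return the best genome found."""
    genomes = [(int(rng.integers(2, 5)), int(rng.integers(2, 10)))
               for _ in range(pop)]
    for _ in range(gens):
        genomes.sort(key=lambda g: -neural_node_fitness(X, y, *g))
        elite = genomes[: pop // 2]
        genomes = elite + [(max(2, b + int(rng.integers(-1, 2))),
                            max(2, h + int(rng.integers(-1, 2))))
                           for b, h in elite]
    return genomes[0]
```

A full system in the paper's spirit would grow one such node per internal tree node and split the training data along the learned branches.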

Proceedings ArticleDOI
07 May 1996
TL;DR: A new statistical model for acoustic observations in speech recognition that represents intra-utterance phone correlation and can be viewed as a probabilistic model of an utterance or a speaker.
Abstract: This paper introduces a new statistical model for acoustic observations in speech recognition. The model represents intra-utterance phone correlation and can be viewed as a probabilistic model of an utterance or a speaker. The phone correlation is modeled by a dependence tree, using the mutual information between pairs of phones as a measure of correlation. The experiments presented focus on robust tree topology design and on robust estimation for a given topology. With appropriate design algorithms, the dependence trees are shown to provide a better model for independent test data than an independent phone model.

5 citations
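The dependence-tree construction described above, with pairwise mutual information as the correlation measure, is closely related to the classic Chow-Liu procedure: estimate the mutual information between every pair of discrete variables and keep a maximum-weight spanning tree. A minimal sketch, with illustrative names and no claim to match the paper's robust design algorithms:

```python
# Sketch of a Chow-Liu-style dependence tree over discrete variables.
import numpy as np
from itertools import combinations

def mutual_information(xi, xj):
    """Empirical MI between two integer-coded numpy arrays."""
    joint = np.zeros((xi.max() + 1, xj.max() + 1))
    np.add.at(joint, (xi, xj), 1.0)         # joint co-occurrence counts
    joint /= joint.sum()
    pi, pj = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / np.outer(pi, pj)[nz])).sum())

def dependence_tree(X):
    """X: (n_samples, n_vars) integer codes. Kruskal's algorithm on pairwise
    MI weights yields the maximum-weight spanning tree (the Chow-Liu tree)."""
    n = X.shape[1]
    edges = sorted(((mutual_information(X[:, i], X[:, j]), i, j)
                    for i, j in combinations(range(n), 2)), reverse=True)
    parent = list(range(n))
    def find(u):                            # union-find with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                        # keep edge if it joins components
            parent[ri] = rj
            tree.append((i, j, w))
    return tree
```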

Journal ArticleDOI
TL;DR: The soft decision tree is proposed, which is a binary decision tree with soft decisions at the internal nodes that improves model generalization and provides a superior function approximator because it is able to assign each context to several overlapped leaves.
Abstract: This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional ‘hard’ decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in synthesizing natural-sounding high-quality speech. Conventionally, hard decision tree-clustered hidden Markov models (HMMs) are used, in which each model parameter is assigned to a single leaf node. However, this ‘divide-and-conquer’ approach leads to data sparsity, with the consequence that it suffers from poor generalization, meaning that it is unable to accurately predict parameters for models of unseen contexts: the hard decision tree is a weak function approximator. To alleviate this, we propose the soft decision tree, which is a binary decision tree with soft decisions at the internal nodes. In this soft clustering method, internal nodes select both their children with certain membership degrees; therefore, each node can be viewed as a fuzzy set with a context-dependent membership function. The soft decision tree improves model generalization and provides a superior function approximator because it is able to assign each context to several overlapped leaves. In order to use such a soft decision tree to predict the parameters of the HMM output probability distribution, we derive the smoothest (maximum entropy) distribution which captures all partial first-order moments and a global second-order moment of the training samples. Employing such a soft decision tree architecture with maximum entropy distributions, a novel speech synthesis system is trained using maximum likelihood (ML) parameter re-estimation and synthesis is achieved via maximum output probability parameter generation. In addition, a soft decision tree construction algorithm optimizing a log-likelihood measure is developed. Both subjective and objective evaluations were conducted and indicate a considerable improvement over the conventional method.

5 citations
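The soft routing at internal nodes can be sketched as sigmoid gates whose products along each root-to-leaf path give the leaf its membership degree. The gate form (a sigmoid of a linear function of the context) and all names below are assumptions for illustration; this toy version mixes scalar leaf parameters and omits the paper's maximum-entropy distributions and ML re-estimation.

```python
# Sketch: sigmoid gates give each leaf a fuzzy membership degree.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SoftTree:
    """Complete binary tree; internal nodes are stored breadth-first,
    so level d occupies indices 2**d - 1 .. 2**(d+1) - 2."""
    def __init__(self, depth, n_features, rng):
        self.depth = depth
        n_internal = 2 ** depth - 1
        self.W = rng.normal(0.0, 0.1, (n_internal, n_features))  # gate weights
        self.b = np.zeros(n_internal)                            # gate biases
        self.leaf_params = rng.normal(0.0, 1.0, 2 ** depth)      # toy leaf values

    def leaf_memberships(self, x):
        """Membership of context x in every leaf: the product of gate
        values along each root-to-leaf path; memberships sum to 1."""
        mu = np.ones(1)
        for d in range(self.depth):
            lo = 2 ** d - 1                         # first node on level d
            g = sigmoid(self.W[lo:lo + 2 ** d] @ x + self.b[lo:lo + 2 ** d])
            mu = np.stack([mu * g, mu * (1.0 - g)], axis=1).ravel()
        return mu

    def predict(self, x):
        """Soft prediction: membership-weighted mix over all leaves."""
        return float(self.leaf_memberships(x) @ self.leaf_params)

tree = SoftTree(depth=3, n_features=5, rng=np.random.default_rng(0))
print(tree.predict(np.zeros(5)))   # every context touches several leaves
```

Because every context reaches several overlapped leaves with nonzero membership, the model pools data across leaves, which is the generalization benefit the abstract describes.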

Journal ArticleDOI
TL;DR: This paper aims to create the best model for predicting breast cancer through preprocessing, feature extraction, data visualization and prediction on breast cancer data, using random forests, decision trees with single and multiple predictors, logistic regression, and 5-fold cross-validation.
Abstract: Breast cancer is one of the leading causes of death in women worldwide. Around one in 30 women is affected by breast cancer. Mammography has helped detect breast cancer in its early stages, which has reduced mortality. The diagnosis of breast cancer depends on a variety of parameters. In this paper, we aim to create the best model for predicting breast cancer through preprocessing, feature extraction, data visualization and prediction on breast cancer data. Visualization techniques such as violin plots, grid plots, swarm plots and heat plots were used for feature extraction, which improved the accuracy of our results. For prediction, we used the random forest, decision trees with single and multiple predictors, and the commonly used statistical model, logistic regression. We also relied on 5-fold cross-validation to measure how unbiased the prediction models are. The models were analyzed and the best one was selected on the basis of accuracy. The results showed that the random forest model achieved an accuracy of 94.724% with a decent 5-fold cross-validation score, followed by the decision tree model with an accuracy of 100% but a poor 5-fold cross-validation score, and the logistic regression model with an accuracy of 88.442% and a low 5-fold cross-validation score.

5 citations
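A hedged sketch of the evaluation protocol the abstract describes, using scikit-learn's bundled breast-cancer dataset as a stand-in; the paper's exact data, preprocessing, and hyperparameters are not specified here, so defaults are assumed.

```python
# Compare the three model families from the abstract under 5-fold CV.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
models = {
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=5000),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)      # 5-fold CV accuracy
    print(f"{name}: mean={scores.mean():.3f} +/- {scores.std():.3f}")
```

The spread of the fold scores, not just the mean, is what distinguishes a model that generalizes from one (like the 100%-accuracy decision tree above) that overfits its training split.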

Journal Article
TL;DR: When machine learning (ML) and data mining (DM) methods construct models in complex domains, models can contain less-credible parts, which are statistically significant, but meaningless to the human analyst.
Abstract: When machine learning (ML) and data mining (DM) methods construct models in complex domains, the models can contain less-credible parts [2] that are statistically significant but meaningless to the human analyst. For example, let us consider a decision tree model presented in Figure 1. The tree is constructed with the J48 algorithm in Weka [8] for a complex domain, indicating which segments of the research and development (R&D) sector have the highest impact on the economic welfare of a country. Nodes in the tree represent segments of the R&D sector. Leaves represent the economic welfare (low, middle or high) of the majority of countries that reached that leaf. In each leaf, the first number in brackets is the number of countries that reached the leaf; the second is the number of those countries whose level of welfare differs from the one the leaf represents. The quantities are expressed in decimals to account for countries with missing values for segments appearing in the tree. Note that the left subtree is omitted to simplify the example.

5 citations
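The leaf annotation described above follows Weka's J48 output convention, where a leaf is printed as `class (total/misclassified)` and the counts can be fractional because instances with missing attribute values are distributed across branches. A small illustrative helper (not part of Weka; the function name is an assumption) that decodes such a label:

```python
# Decode a J48-style leaf label into (class, total, misclassified).
import re

def parse_j48_leaf(label):
    """E.g. 'high (12.3/2.1)' -> ('high', 12.3, 2.1); the '/E' part is
    omitted by J48 when no training instance at the leaf is misclassified."""
    m = re.fullmatch(r"(\w+) \(([\d.]+)(?:/([\d.]+))?\)", label)
    return m.group(1), float(m.group(2)), float(m.group(3) or 0)

print(parse_j48_leaf("high (12.3/2.1)"))   # ('high', 12.3, 2.1)
print(parse_j48_leaf("low (4.0)"))         # ('low', 4.0, 0.0)
```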


Network Information
Related Topics (5)

Topic                        Papers    Citations    Relatedness
Cluster analysis             146.5K    2.9M         80%
Artificial neural network    207K      4.5M         78%
Fuzzy logic                  151.2K    2.3M         77%
The Internet                 213.2K    3.8M         77%
Deep learning                79.8K     2.1M         77%
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2023    10
2022    24
2021    101
2020    163
2019    158
2018    121