Topic

Decision tree model

About: Decision tree model is a research topic. Over its lifetime, 2,256 publications have appeared within this topic, receiving 38,142 citations.

Papers

Journal Article•DOI•

Gradient Boosting Survival Tree with Applications in Credit Scoring

Miaojun Bai, Yan Zheng, Yun Shen

09 Aug 2019 - arXiv: Learning

TL;DR: A nonparametric ensemble tree model called gradient boosting survival tree (GBST) is proposed that extends the survival tree models with a gradient boosting algorithm and outperforms the existing survival models measured by the concordance index, Kolmogorov–Smirnov index, and the area under the receiver operating characteristic curve of each time period.

Abstract: Credit scoring plays a vital role in the field of consumer finance. Survival analysis provides an advanced solution to the credit-scoring problem by quantifying the probability of survival time. In order to deal with highly heterogeneous industrial data collected in the Chinese consumer finance market, we propose a nonparametric ensemble tree model called gradient boosting survival tree (GBST) that extends the survival tree models with a gradient boosting algorithm. The survival tree ensemble is learned by minimizing the negative log-likelihood in an additive manner. The proposed model optimizes the survival probability simultaneously for each time period, which can reduce the overall error significantly. Finally, as a test of applicability, we apply the GBST model to quantify credit risk on large-scale real market datasets. The results show that the GBST model outperforms the existing survival models as measured by the concordance index (C-index), the Kolmogorov-Smirnov (KS) index, and the area under the receiver operating characteristic curve (AUC) of each time period.
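The concordance index named in this abstract can be illustrated with a short sketch. This is a generic, pairwise version of Harrell's C-index and is not taken from the paper; the function name, the argument names, and the choice to skip pairs with tied times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Pairwise C-index sketch for right-censored data (hypothetical helper).

    times:       observed time (event or censoring) for each subject
    events:      1 if the event (e.g. default) was observed, 0 if censored
    risk_scores: higher score means the model expects an earlier event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: a pair is comparable only when the times differ and
        # the shorter time is an observed event, not a censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranks the earlier event as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random ranking.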

7 citations

Journal Article•DOI•

A novel multi-class ensemble model based on feature selection using Hadoop framework for classifying imbalanced biomedical data

Thulasi Bikku, N. Sambasiva Rao, Ananda Rao Akepogu¹

Jawaharlal Nehru Technological University, Anantapur¹

18 Dec 2018 - International Journal of Business Intelligence and Data Mining

TL;DR: The experimental results on the complex biomedical datasets show that the performance of the proposed Hadoop based multi-class ensemble model significantly outperforms state-of-the-art baselines.

Abstract: Due to the exponential growth of biomedical repositories such as PubMed and Medline, an accurate predictive model is essential for knowledge discovery in a Hadoop environment. Traditional decision tree models such as the multivariate Bernoulli model, random forest and the multinomial naive Bayesian tree use attribute selection measures to decide the best split at each node of the decision tree. Also, the efficiency of document analysis in the Hadoop framework is limited mainly by the class imbalance problem and large candidate sets. In this paper, we propose a two-phase map-reduce framework with a text preprocessor and a classification model. In the first phase, a mapper-based preprocessing method was designed to eliminate irrelevant features, missing values and outliers from the biomedical data. In the second phase, a map-reduce based multi-class ensemble decision tree model was designed and implemented on the preprocessed mapper data to improve the true positive rate and computational time. The experimental results on the complex biomedical datasets show that our proposed Hadoop-based multi-class ensemble model significantly outperforms state-of-the-art baselines.

7 citations

Proceedings Article•DOI•

Unbiased Decision Tree Model for User's QoE in Imbalanced Dataset

Lei Wang¹, Jiefeng Jin¹, Ruochen Huang¹, Xin Wei¹, Jianxin Chen¹

Nanjing University¹

04 May 2016

TL;DR: This paper discusses the relationship between the status of the IPTV set-top box and the user's QoE, and proposes the unbiased decision tree model to deal with the imbalanced dataset.

Abstract: Nowadays, Internet Protocol Television (IPTV) is gradually replacing traditional TV, and IPTV users demand a better experience. Therefore, media providers are interested in finding the key factors that influence the Quality of Experience (QoE), and a model is needed to predict the QoE. In this paper, we discuss the relationship between the status of the IPTV set-top box and the user's QoE. There is no uniform standard for measuring or improving the user's QoE in IPTV, so we combine the status data from the IPTV set-top box with user complaints, select an appropriate model, and use it to predict the user's QoE. Because the data from the IPTV set-top box is imbalanced, the traditional algorithm does not perform well at predicting the user's QoE. To solve this problem, we propose the unbiased decision tree model to deal with the imbalanced dataset. First, we clean the dataset. Then, we select the important features influencing QoE using feature selection. Finally, we compare the CART model and the unbiased decision tree model. We demonstrate that the unbiased decision tree model performs well on the imbalanced dataset and achieves high accuracy.

7 citations

Proceedings Article•DOI•

Decision trees with AND, OR queries

Yosi Ben-Asher¹, Ilan Newman¹

University of Haifa¹

19 Jun 1995

TL;DR: A tight lower bound of Θ(k log(n/k)) is proved for the required depth of a decision tree for the threshold-k function, and a tight lower bound is also proved for the "direct sum" problem of computing k copies of threshold-2 simultaneously.

Abstract: We investigate decision trees in which one is allowed to query threshold functions of subsets of variables. We are mainly interested in the case where only AND and OR queries are allowed. This model is a generalization of the classical decision tree model. Its complexity (depth) is related to the parallel time that is required to compute Boolean functions in certain CRCW PRAM machines with only one cell of constant size. It is also related to computation using the Ethernet channel. We prove a tight lower bound of Θ(k log(n/k)) for the required depth of a decision tree for the threshold-k function. As a corollary of the method, we also prove a tight lower bound for the "direct sum" problem of computing simultaneously k copies of threshold-2 in this model. Next, the size complexity is considered. A relation to depth-three circuits is established and a lower bound is proven. Finally, the relation between randomization, nondeterminism and determinism is also investigated; we show separation results between these models.
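To make the query model concrete, here is a minimal sketch that decides threshold-2 (are at least two inputs equal to 1?) using only OR queries on subsets of variables. It illustrates the kind of adaptive strategy the model permits and is not the construction from the paper; the function name and the query counter are hypothetical. The worst-case number of queries corresponds to the depth of the decision tree, and here it is about log2(n) + 2, which agrees with the Θ(k log(n/k)) bound for k = 2.

```python
def threshold2_with_or_queries(bits):
    """Decide whether at least two of `bits` are 1 using only OR queries.

    Returns (answer, number_of_queries). Hypothetical illustration only.
    """
    queries = 0

    def or_query(indices):
        # One node of the decision tree: the OR of a subset of variables.
        nonlocal queries
        queries += 1
        return any(bits[i] for i in indices)

    candidates = list(range(len(bits)))
    if not candidates or not or_query(candidates):
        return False, queries  # no 1 at all

    # Binary search for one position holding a 1, always keeping a
    # half that is known to contain a 1.
    while len(candidates) > 1:
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        candidates = left if or_query(left) else right

    found = candidates[0]
    rest = [i for i in range(len(bits)) if i != found]
    # A second 1 exists exactly when the OR of all other variables is 1.
    answer = bool(rest) and or_query(rest)
    return answer, queries
```

For example, `threshold2_with_or_queries([0, 0, 1, 0, 0, 1, 0, 0])` asks one query on all variables, three queries during the binary search, and one final query on the remaining variables.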

7 citations

Patent•

System and method for building decision tree classifiers using bitmap techniques

Shiby Thomas¹, Wei Li¹, Joseph Yarmus¹, Mahesh Jagannath¹, Ari W. Mozes¹

Business International Corporation¹

01 Feb 2006

TL;DR: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables more quickly and efficiently than previous techniques.

Abstract: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables more quickly and efficiently than previous techniques. A method of counting predictor-target pairs for a decision tree model, the decision tree model based on data stored in a database, the data comprising a plurality of rows of data, at least one predictor and at least one target, comprises generating a bitmap for each split node of data stored in a database system by intersecting a parent node bitmap and a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting bits of each intersected bitmap to generate a count of predictor-target pairs.
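The bitmap steps described in this abstract can be sketched roughly as follows, using Python integers as bitmaps with one bit per row. This is a simplified illustration under assumed names and toy data, not the patented implementation: `build_bitmap`, `count_pairs`, the column names, and the example rows are all hypothetical.

```python
def build_bitmap(rows, column, value):
    """Return an int whose bit r is set when rows[r][column] == value."""
    bitmap = 0
    for r, row in enumerate(rows):
        if row[column] == value:
            bitmap |= 1 << r
    return bitmap


def count_pairs(node_bitmap, predictor_bitmaps, target_bitmaps):
    """Count the node's rows for every (predictor value, target value) pair."""
    counts = {}
    for p_value, p_bitmap in predictor_bitmaps.items():
        for t_value, t_bitmap in target_bitmaps.items():
            # Intersect node, predictor and target bitmaps, then count bits.
            intersected = node_bitmap & p_bitmap & t_bitmap
            counts[(p_value, t_value)] = bin(intersected).count("1")
    return counts


# Toy data (hypothetical): one predictor used for the split, one more
# predictor to count, and one target.
rows = [
    {"income": "high", "owns_home": "yes", "default": "no"},
    {"income": "high", "owns_home": "no", "default": "no"},
    {"income": "low", "owns_home": "no", "default": "yes"},
    {"income": "low", "owns_home": "yes", "default": "no"},
    {"income": "low", "owns_home": "no", "default": "yes"},
]

parent = (1 << len(rows)) - 1  # root node: every row belongs to it
# Split node bitmap = parent bitmap AND bitmap of the split condition.
node = parent & build_bitmap(rows, "income", "low")

predictor_bitmaps = {v: build_bitmap(rows, "owns_home", v) for v in ("yes", "no")}
target_bitmaps = {v: build_bitmap(rows, "default", v) for v in ("yes", "no")}
counts = count_pairs(node, predictor_bitmaps, target_bitmaps)
```

Each count comes from bitwise ANDs and a bit count, so the rows themselves are scanned only once, when the bitmaps are first built.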

7 citations

Performance Metrics

Papers: 2,288

Citations: 43,502

No. of papers in the topic in previous years
Year	Papers
2023	10
2022	24
2021	101
2020	163
2019	158
2018	121
