
Showing papers on "Decision tree model" published in 2021


Journal ArticleDOI
TL;DR: This paper applies the decision tree model to intrusion detection systems (IDS) to enhance trust management; the resulting trees are easy to read and resemble a human approach to decision making, splitting a choice into many small subchoices.
Abstract: Despite the growing popularity of machine learning models in cyber-security applications (e.g., an intrusion detection system (IDS)), most of these models are perceived as black boxes. eXplainable Artificial Intelligence (XAI) has become increasingly important for interpreting machine learning models and enhancing trust management by allowing human experts to understand the underlying data evidence and causal reasoning. In IDS, the critical role of trust management is to understand the impact of malicious data in order to detect any intrusion in the system. Previous studies focused more on the accuracy of the various classification algorithms for trust in IDS, and they often do not provide insight into the behavior and reasoning of these sophisticated algorithms. Therefore, in this paper, we address the XAI concept to enhance trust management by exploring the decision tree model in the area of IDS. We use simple decision tree algorithms that can be easily read and even resemble a human approach to decision-making by splitting the choice into many small subchoices for IDS. We experimented with this approach by extracting rules from a widely used KDD benchmark dataset. We also compared the accuracy of the decision tree approach with other state-of-the-art algorithms.

89 citations
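
The rule-extraction idea in the abstract above can be illustrated with a minimal sketch. The tree below is hypothetical: the feature names (`failed_logins`, `bytes_sent`), thresholds, and labels are invented for illustration and are not rules reported by the paper or taken from the KDD dataset. The sketch only shows how each root-to-leaf path of a decision tree reads as one IF/THEN rule.

```python
# Hypothetical toy tree: features, thresholds, and labels are illustrative
# assumptions, not rules from the paper or the KDD benchmark.
TREE = {
    "feature": "failed_logins", "threshold": 3,
    "left": {"label": "normal"},
    "right": {
        "feature": "bytes_sent", "threshold": 1000,
        "left": {"label": "normal"},
        "right": {"label": "attack"},
    },
}


def extract_rules(node, path=()):
    """Return one human-readable IF/THEN rule per leaf (left branch first)."""
    if "label" in node:
        condition = " AND ".join(path) if path else "TRUE"
        return [f"IF {condition} THEN {node['label']}"]
    name, t = node["feature"], node["threshold"]
    return (extract_rules(node["left"], path + (f"{name} <= {t}",))
            + extract_rules(node["right"], path + (f"{name} > {t}",)))


def predict(node, sample):
    """Walk from the root to a leaf, taking the left branch when value <= threshold."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


rules = extract_rules(TREE)
for rule in rules:
    print(rule)
```

Each printed rule corresponds to one leaf, so a three-leaf tree yields three rules that an analyst can read without inspecting the model internals.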


Journal ArticleDOI
TL;DR: A comparative study of machine learning models for pothole detection found the Random Forest Tree and KNN showed the best performance on the Test dataset with a similar accuracy, and the model performance increased when random search hyperparameter tuning was applied to optimise the Random forest Tree model's hyperparameters.
Abstract: Potholes are symptoms of a poorly maintained road, pointing to an underlying structural issue. A vehicle's impact with a pothole not only makes for an uncomfortable journey, but it can also cause damage to the vehicle's wheels, tyres and suspension system resulting in high repair bills. This study presents a comparative study of machine learning models for pothole detection. The data was collected from multiple android devices/routes/cars and pre-processed using a 2-second non-overlapping moving window to extract relevant statistical features for training a binary classifier. The Test dataset was isolated entirely from the Training and Validation datasets, and a stratified K-fold cross-validation was applied to the Training dataset. The Random Forest Tree and KNN showed the best performance on the Test dataset with a similar accuracy of 0.8889. The model performance increased when random search hyperparameter tuning was applied to optimise the Random Forest Tree model's hyperparameters. The Random Forest Tree model's performance after hyperparameter tuning is 0.9444, 1.0000, 0.8889 and 0.9412 for accuracy, precision, recall, and F-score, respectively.

23 citations
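
As a worked check of how the four reported metrics relate to one another, the sketch below computes accuracy, precision, recall, and F-score from a confusion matrix. The counts are hypothetical: the abstract does not state the test-set size, and these values were chosen only because they reproduce the reported 0.9444, 1.0000, 0.8889, and 0.9412.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score


# Hypothetical counts (an assumption, not from the paper): 18 test windows,
# 9 potholes of which 8 are detected, and no false alarms.
metrics = classification_metrics(tp=8, fp=0, fn=1, tn=9)
rounded = tuple(round(m, 4) for m in metrics)
print(rounded)
```

With no false positives, precision is exactly 1, and the F-score reduces to 2 * recall / (1 + recall), which is why it sits between recall and precision.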


Journal ArticleDOI
TL;DR: It is verified that the performance of the decision tree model is improved under the boosting algorithm, and it could be applied to the prediction model for customers’ decisions on subscription to the fixed deposit business.
Abstract: A personal credit evaluation algorithm is proposed by the design of a decision tree with a boosting algorithm, and the classification is carried out. By comparison with the conventional decision tree algorithm, it is shown that the boosting algorithm acts to speed up the processing time. The Classification and Regression Tree (CART) algorithm with the boosting algorithm showed 90.95% accuracy, slightly higher than without boosting, 90.31%. To avoid overfitting of the model on the training set due to unreasonable data set division, we consider cross-validation and illustrate the results with simulation; hyperparameters of the model have been applied and the model fitting effect is verified. The proposed decision tree model is fitted optimally with the help of a confusion matrix. In this paper, relevant evaluation indicators are also introduced to evaluate the performance of the proposed model. For the comparison with the conventional methods, accuracy rate, error rate, precision, recall, etc. are also illustrated; we comprehensively evaluate the model performance based on the model accuracy after the 10-fold cross-validation. The results show that the boosting algorithm improves the performance of the model in accuracy and precision when CART is applied, but the model fitting time takes much longer, around 2 min. With the obtained result, it is verified that the performance of the decision tree model is improved under the boosting algorithm. At the same time, we test the performance of the proposed verification model with model fitting, and it could be applied to the prediction model for customers’ decisions on subscription to the fixed deposit business.

19 citations
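
To illustrate why boosting can improve on a single tree, here is a minimal sketch of AdaBoost with one-split decision stumps. This is an assumption for illustration: the abstract does not name the boosting algorithm used with CART, and the data below are a hypothetical one-dimensional toy set that a single stump cannot fit.

```python
import math


def stump_predict(x, threshold, polarity):
    """A stump predicts `polarity` when x <= threshold and `-polarity` otherwise."""
    return polarity if x <= threshold else -polarity


def best_stump(xs, ys, weights):
    """Pick the (threshold, polarity) pair with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` weighted stumps; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def boosted_predict(ensemble, x):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


def accuracy(ensemble, xs, ys):
    return sum(boosted_predict(ensemble, x) == y for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the negative class sits in the middle of the range.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
print(accuracy(adaboost(xs, ys, 1), xs, ys))  # one stump
print(accuracy(adaboost(xs, ys, 3), xs, ys))  # three boosted stumps
```

On this toy set a single stump misclassifies two of the six points, while three boosted stumps classify all six correctly. The extra rounds also mean extra fitting work, which is consistent with the longer fitting time the abstract reports.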


Journal ArticleDOI
TL;DR: In this article, the authors used Bayesian hyperparameter optimization to tune random forest and extreme gradient boosting decision tree models for landslide susceptibility mapping, and the two optimized models are compared using the receiver operating characteristic curve and confusion matrix.
Abstract: Landslides are widely distributed worldwide and often result in tremendous casualties and economic losses, especially in the Loess Plateau of China. Taking Wuqi County in the hinterland of the Loess Plateau as the research area, this study uses Bayesian hyperparameter optimization to tune random forest and extreme gradient boosting decision tree models for landslide susceptibility mapping, and the two optimized models are compared. In addition, 14 landslide influencing factors are selected, and 734 landslides are obtained from field investigation and reports in the literature. The landslides were randomly divided into training data (70%) and validation data (30%). The hyperparameters of the random forest and extreme gradient boosting decision tree models were optimized using a Bayesian algorithm, and then the optimal hyperparameters are selected for landslide susceptibility mapping. Both models were evaluated and compared using the receiver operating characteristic curve and confusion matrix. The results show that the validation-data AUC values of the Bayesian optimized random forest and extreme gradient boosting decision tree models are 0.88 and 0.86, respectively, an improvement of 4% and 3%, indicating that the prediction performance of both models has improved. However, the random forest model has a higher predictive ability than the extreme gradient boosting decision tree model. Thus, hyperparameter optimization is of great significance in improving the prediction accuracy of the model. Therefore, the optimized model can generate a high-quality landslide susceptibility map.

16 citations


Journal ArticleDOI
TL;DR: Results show that the proposed precoding technique in conjunction with the machine learning algorithm based on a decision tree model that uses empirical data collected from the network can identify the status of cells and suitable self-healing procedures can be triggered to recover the cell accordingly.

15 citations


Journal ArticleDOI
TL;DR: This paper proposes the first secure protocol for collaborative evaluation of random forests contributed by multiple owners that outsource evaluation tasks to a third-party evaluator and is based on the new secure comparison protocol, secure counting protocol, and a multi-key somewhat homomorphic encryption on top of symmetric-key encryption.
Abstract: Decision trees and their generalization, random forests, are simple yet powerful machine learning models for many classification and regression problems. Recent works propose how to privately evaluate a decision tree in a two-party setting where the feature vector of the client or the decision tree model (such as the threshold values of its nodes) is kept secret from another party. However, these works cannot be extended trivially to support the outsourcing setting, where a third party should not have access to the model or the query. Furthermore, their use of an interactive comparison protocol does not support branching programs and hence requires interactions with the client to determine the comparison result before resuming the evaluation task. In this paper, we propose the first secure protocol for collaborative evaluation of random forests contributed by multiple owners. They outsource evaluation tasks to a third-party evaluator. Upon receiving the client's encrypted inputs, the cloud evaluates obliviously on individually encrypted random forest models and calculates the aggregated result. The system is based on our new secure comparison protocol, secure counting protocol, and a multi-key somewhat homomorphic encryption on top of symmetric-key encryption. This allows us to reduce communication overheads while achieving round complexity lower than existing work.

14 citations


Journal ArticleDOI
TL;DR: The proposed ensemble of decision trees for supervised learning model provided a reliable tool for forensic study and showed the specificity and sensitivity of the proposed model were comparable to other models.
Abstract: Since its inception in 2009, Bitcoin has been mired in controversy for providing a haven for illegal activities. Several types of illicit users hide behind the blanket of anonymity. Uncovering these entities is key for forensic investigations. Current methods utilize machine learning for identifying these illicit entities. However, the existing approaches only focus on a limited category of illicit users. The current paper proposes to address the issue by implementing an ensemble of decision trees for supervised learning. More parameters allow the ensemble model to learn discriminating features that can categorize multiple groups of illicit users from licit users. To evaluate the model, a dataset of 1216 real-life entities on Bitcoin was extracted from the Blockchain. Nine features were engineered to train the model for segregating 16 different licit-illicit categories of users. The proposed model provided a reliable tool for forensic study. Empirical evaluation of the proposed model vis-a-vis three existing benchmark models was performed to highlight its efficacy. Experiments showed that the specificity and sensitivity of the proposed model were comparable to other models. Due to the larger number of parameters of the ensemble tree model, the classification accuracy was 0.91, with a 95% CI of 0.8727 to 0.9477. This was better than SVM and Logistic Regression, the two popular models in the literature, and comparable to the Random Forest and XGBOOST models. CPU and RAM utilization were also monitored to demonstrate the usefulness of the proposed work for real-world deployment. RAM utilization for the proposed model was higher by 30-45% compared to the other three models. Hence, the proposed model is resource-intensive, as it has more parameters than the other three models. More parameters also result in higher accuracy of predictions.

13 citations


Journal ArticleDOI
TL;DR: This paper presents a differentially private latent tree (DPLT) approach, which is, to the best of the knowledge, the first approach to solving this challenging problem of publishing vertically partitioned data under differential privacy.
Abstract: In this paper, we study the problem of publishing vertically partitioned data under differential privacy, where different attributes of the same set of individuals are held by multiple parties. In this setting, with the assistance of a semi-trusted curator, the involved parties aim to collectively generate an integrated dataset while satisfying differential privacy for each local dataset. Based on the latent tree model (LTM), we present a differentially private latent tree (DPLT) approach, which is, to the best of our knowledge, the first approach to solving this challenging problem. In DPLT, the parties and the curator collaboratively identify the latent tree that best approximates the joint distribution of the integrated dataset, from which a synthetic dataset can be generated. The fundamental advantage of adopting LTM is that we can use the connections between a small number of latent attributes derived from each local dataset to capture the cross-dataset dependencies of the observed attributes in all local datasets such that the joint distribution of the integrated dataset can be learned with little injected noise and low computation and communication costs. DPLT is backed up by a series of novel techniques, including two-phase latent attribute generation (TLAG), tree index based correlation quantification (TICQ) and distributed Laplace perturbation protocol (DLPP). Extensive experiments on real datasets demonstrate that DPLT offers desirable data utility with low computation and communication costs.

12 citations
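
The abstract above names a distributed Laplace perturbation protocol (DLPP) but does not describe it. As background only, the sketch below shows the standard single-party Laplace mechanism that such protocols build on: a count query with sensitivity 1 is released with Laplace noise of scale sensitivity/epsilon. This is not the paper's DLPP; the function names and the example count are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count under epsilon-differential privacy (Laplace mechanism)."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy = private_count(100, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller epsilon gives stronger privacy and a larger noise scale, which is the utility cost that approaches such as DPLT try to keep low by injecting little noise.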


Journal ArticleDOI
TL;DR: A novel hybrid decision tree method applied with administrative data in a health care setting to predict the severity of AOD occurring within 1–5 h in an EMS system and indicates that the hybrid algorithm shows improvements in performance in the classification of the real-world problem.
Abstract: Ambulance offload delay (AOD) is a growing health care concern in Canada. It refers to the delay in transferring an ambulance patient to a hospital emergency department (ED) due to ED congestion. It can negatively affect the ability of the ambulance service to respond to future calls and reduce the efficiency of the system when the delay is significant. Using integrated historical data from a partnering hospital and an Emergency Medical Services (EMS) provider, we developed a decision-support tool using a hybrid decision tree model to predict the severity of AOD occurring within 1–5 h in an EMS system. The primary objective of this study is to provide an AOD prediction model based on the current system status, hour of the day, and day of the week. With this information, decision-makers can be proactive with efforts to mitigate AOD. Various prediction models are developed with different focuses and forecast periods. This research demonstrates a novel hybrid decision tree method applied with administrative data in a health care setting. A naive Bayes classifier is first used to remove noisy training observations before decision tree induction. This hybrid decision tree algorithm was tested against the basic classification and regression tree (CART) algorithm, using classification accuracy, precision, sensitivity, and specificity analysis. The results indicate that the hybrid algorithm shows improvements in performance in the classification of the real-world problem. It is anticipated that the prediction model for AOD produced from this study will be directly transferable. It can be generalized to other EMS systems, where predicting AOD is important for efficient operations.

11 citations
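
The filtering step described above (a naive Bayes classifier removes noisy training observations before tree induction) can be sketched as follows. The abstract does not say how an observation is judged noisy; this sketch assumes a row is noisy when a naive Bayes model trained on the full training set misclassifies it. The features (`hour_band`, `ed_load`) and labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Collect the counts needed for categorical naive Bayes."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def nb_predict(model, row):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            score += math.log((count + 1) / (n + len(values_seen[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_noisy(rows, labels):
    """Keep only the rows whose label agrees with the naive Bayes prediction."""
    model = train_naive_bayes(rows, labels)
    kept = [(r, y) for r, y in zip(rows, labels) if nb_predict(model, r) == y]
    return [r for r, _ in kept], [y for _, y in kept]


# Hypothetical training data: (hour_band, ed_load) -> offload delay class.
rows = [("night", "low")] * 3 + [("day", "high")] * 4
labels = ["short", "short", "short", "long", "long", "long", "short"]
clean_rows, clean_labels = filter_noisy(rows, labels)
print(len(rows), "->", len(clean_rows))
```

The last row carries a label that conflicts with three otherwise identical rows, so the filter drops it and the tree learner would see six consistent observations instead of seven.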


Journal ArticleDOI
TL;DR: Implemented in an easily installed package with a detailed vignette, treeheatr can be a useful teaching tool to enhance students’ understanding of a simple decision tree model before diving into more complex tree-based machine learning methods.
Abstract: Summary: treeheatr is an R package for creating interpretable decision tree visualizations with the data represented as a heatmap at the tree's leaf nodes. The integrated presentation of the tree structure along with an overview of the data efficiently illustrates how the tree nodes split up the feature space and how well the tree model performs. This visualization can also be examined in depth to uncover the correlation structure in the data and importance of each feature in predicting the outcome. Implemented in an easily installed package with a detailed vignette, treeheatr can be a useful teaching tool to enhance students' understanding of a simple decision tree model before diving into more complex tree-based machine learning methods. Availability and implementation: The treeheatr package is freely available under the permissive MIT license at https://trang1618.github.io/treeheatr and https://cran.r-project.org/package=treeheatr. It comes with a detailed vignette that is automatically built with GitHub Actions continuous integration.

11 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: The author(s) conclude that decision tree regression is best for calculating the amount of ingredients required with R squared values close to 0.8 for most of the models.
Abstract: The objective of this paper is to find an alternative to the conventional method of concrete mix design. To this end, four machine learning algorithms, namely multi-variable linear regression, Support Vector Regression, Decision Tree Regression and Artificial Neural Network, are used to design concrete mixes with desired properties. The multi-variable linear regression model is a simple baseline; the Support Vector Regression and Artificial Neural Network models were built because past researchers worked heavily on them; and the Decision Tree model was built on the authors' own intuition. Their results have been compared to find the best algorithm. Finally, we check whether the best-performing algorithm is accurate enough to replace the conventional method. For this, we utilize concrete mix designs done in the lab for various on-site designs. The models have been designed for both mix types, with plasticizer and without plasticizer. The paper presents a detailed comparison of the four models. Based on the results obtained, the best one has been selected for high accuracy and least computational cost. Each sample initially had 24 features, of which the most significant for predicting each variable were chosen using f regression and p values, and the models were trained on those selected features. Based on the R squared value, the best-fitting models were selected among the four algorithms used. The author(s) conclude that decision tree regression is best for calculating the amount of ingredients required, with R squared values close to 0.8 for most of the models. The DTR model is also computationally cheaper than ANN, and future work with DTR in mix design is highly recommended in this paper.
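
Since the models above are ranked by R squared, a short sketch of that statistic may help. It computes R squared as one minus the ratio of the residual sum of squares to the total sum of squares. The ingredient quantities below are hypothetical and are not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical cement quantities (kg per cubic metre): lab design vs. model output.
lab_design = [300, 350, 400, 450]
model_output = [310, 340, 410, 440]
print(r_squared(lab_design, model_output))
```

A value of 1 means the predictions match the lab designs exactly, while a model that always predicts the mean scores 0, so the reported values close to 0.8 indicate that most, but not all, of the variation in the ingredient amounts is explained.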

Journal ArticleDOI
TL;DR: In this article, the authors proposed a method to distinguish between chronic and non-chronic kidney disease, identify its crucial features without reducing the accuracy of prediction, and a prediction algorithm to eliminate the possibility of under or overfitting.
Abstract: Prediction of diseases is sensitive, as any error can result in treating the wrong person or not treating the right patient. Besides, some features distinguish a disease from curable to fatal or curable to chronic disease. Data mining techniques have been widely used in health-related research. The researchers, so far, could attain around 97 percent accuracy using several methods. Some researchers have demonstrated that the selection of correct features increases the prediction accuracy. This research work proposes a method to distinguish between chronic and non-chronic kidney disease, identify its crucial features without reducing the accuracy of prediction, and a prediction algorithm to eliminate the possibility of under or overfitting. This study uses the recursive feature elimination (RFE) method, which selects an optimal subset of features, and an ensemble algorithm, the enhanced decision tree (EDT), to predict the disease. The results obtained in this paper show that the accuracy level of EDT is not changed by the removal of less significant features, thus enabling the decision-makers to concentrate on a few features to reduce time and error of treatment. EDT establishes substantially high consistency in predicting the disease, with or without feature selection.

Journal ArticleDOI
24 Jun 2021
TL;DR: It is indicated that entrepreneurial intentions can be partially predicted using the dataset in this current study, and the findings expand the current literature and invite future research.
Abstract: Youth unemployment rates present an issue in both developing and developed countries. The importance of analyzing entrepreneurial activities comes from their significant role in economic development and economic growth. This study draws on research conducted over 10 years. The dataset included 5670 participants, all students from Serbia. The main goal of the study is to attempt to predict entrepreneurial intentions among the Serbian youth by analyzing demographic characteristics, close social environment, attitudes, awareness of incentive means, and environment assessment as potential influencing factors. The data analysis included Chi-square, Welch’s t-test, z-test, linear regression, binary logistic regression, ARIMA (Autoregressive Integrated Moving Average) regression, and a QUEST (Quick, Unbiased, Efficient, Statistical Tree) classification tree algorithm. The results are interesting and indicate that entrepreneurial intentions can be partially predicted using the dataset in this current study. Further, most likely due to the robust dataset, the results are not complementary with similar studies in this domain; therefore, these findings expand the current literature and invite future research.

Journal ArticleDOI
01 Jan 2021
TL;DR: To facilitate the task of building an academic prediction model, a historical student academic dataset is used, and the main aim is to build the prediction model using different families of Machine Learning techniques on the selected dataset.
Abstract: Data mining is a field in which hidden information is extracted from a large database by applying algorithms. These algorithms are divided into categories such as classification, clustering, and association rule mining, according to the information to be extracted. Data mining is widely applied in areas such as telecommunication, marketing, operations, hospitals, the hotel industry, and education. Predicting the academic performance and progress of students has attracted the attention of young researchers. To facilitate the task of building an academic prediction model, a historical student academic dataset is used. In this paper, the contributions are twofold. First, the main aim is to build the prediction model using different families of Machine Learning techniques on the selected dataset. Second, implementations of different ensemble meta-based models are presented by combining them with different Machine Learning classification algorithms. The ensemble meta-based models considered are Bagging, AdaBoostM1, and RandomSubSpace. The implementation results demonstrate that the ensemble meta-based technique AdaBoostM1 achieved superior accuracy when combined with the MultilayerPerceptron technique, reaching up to 80.33%.

Proceedings ArticleDOI
01 Mar 2021
TL;DR: In this article, the authors explore the use of mid-air finger 3D sketching in VR for tree modeling and demonstrate the ease-of-use, efficiency, and flexibility in tree modelling and overall shape control.
Abstract: 2D sketch-based tree modeling cannot guarantee to generate plausible depth values and full 3D tree shapes. With the advent of virtual reality (VR) technologies, 3D sketching enables a new form for 3D tree modeling. However, it is labor-intensive and difficult to create realistically-looking 3D trees with complicated geometry and lots of detailed twigs with a reasonable amount of effort. In this paper, we explore the use of mid-air finger 3D sketching in VR for tree modeling. We present a hybrid approach that integrates freehand 3D sketches with an automatic population of branch geometries. The user only needs to draw a few 3D strokes in mid-air to define the envelope of the foliage (denoted as lobes) and main branches. Our algorithm then automatically generates a full 3D tree model based on these stroke inputs. Additionally, the shape of the 3D tree model can be modified by freely dragging, squeezing, or moving lobes in mid-air. We demonstrate the ease-of-use, efficiency, and flexibility in tree modeling and overall shape control. We perform user studies and show a variety of realistic tree models generated instantaneously from 3D finger sketching.

Journal ArticleDOI
01 Feb 2021
TL;DR: A new coevolutionary fuzzy attribute order reduction algorithm (CFAOR) based on a complete attribute-value space tree model of decision table that can achieve the higher average computational efficiency and classification accuracy, compared with the state-of-the-art methods.
Abstract: Since big data sets are structurally complex, high-dimensional, and their attributes exhibit some redundant and irrelevant information, the selection, evaluation, and combination of those large-scale attributes pose huge challenges to traditional methods. Fuzzy rough sets have emerged as a powerful vehicle to deal with uncertain and fuzzy attributes in big data problems that involve a very large number of variables to be analyzed in a very short time. In order to further overcome the inefficiency of traditional algorithms in the uncertain and fuzzy big data, in this paper we present a new coevolutionary fuzzy attribute order reduction algorithm (CFAOR) based on a complete attribute-value space tree. A complete attribute-value space tree model of decision table is designed in the attribute space to adaptively prune and optimize the attribute order tree. The fuzzy similarity of multimodality attributes can be extracted to satisfy the needs of users with the better convergence speed and classification performance. Then, the decision rule sets generate a series of rule chains to form an efficient cascade attribute order reduction and classification with a rough entropy threshold. Finally, the performance of CFAOR is assessed with a set of benchmark problems that contain complex high dimensional datasets with noise. The experimental results demonstrate that CFAOR can achieve the higher average computational efficiency and classification accuracy, compared with the state-of-the-art methods. Furthermore, CFAOR is applied to extract different tissues surfaces of dynamical changing infant cerebral cortex and it achieves a satisfying consistency with those of medical experts, which shows its potential significance for the disorder prediction of infant cerebrum.

Journal ArticleDOI
TL;DR: This study generated the best spatial decision trees for each study area using spatial decision tree algorithm and found that on Magetan dataset, the best model has 33 rules with 94.34% accuracy and relief variable as the root node, whereas on Solok dataset, it has 66 rules with 60.29% accuracy.
Abstract: Predicting land and weather characteristics as indicators of land suitability is very important in increasing effectiveness in food production. This study aims to evaluate the suitability of garlic land using a spatial decision tree algorithm. The algorithm is an improvement of the conventional decision tree algorithm in which a spatial join relation is included to grow the spatial decision tree. The spatial dataset consists of a target layer that represents garlic land suitability and ten explanatory layers that represent land and weather characteristics in the study areas of Magetan and Solok district, Indonesia. This study generated the best spatial decision trees for each study area. On the Magetan dataset, the best model has 33 rules with 94.34% accuracy and the relief variable as the root node, whereas on the Solok dataset, the best model has 66 rules with 60.29% accuracy and the soil texture variable as the root node.

Journal ArticleDOI
14 Jan 2021
TL;DR: The performance of the Decision Tree is lower than that of the Adaboost algorithm, and the college can minimize the number of resignations if the selection phase of new students is done accurately.
Abstract: Every year, colleges hold new student enrollment, which is needed to start a new academic year. Unfortunately, the number of students who resigned is considerably high, reaching 837 students and causing 324 empty seats. The college’s stakeholders can minimize the number of resignations if the selection phase of new students is done accurately. A machine learning-based model can be the answer. The model helps predict which candidates are likely to complete the enrollment process; knowing this early helps the management in the selection process. The prediction is based on historical data. Data is processed and used to train the model using the Adaboost algorithm. A performance comparison between the Adaboost and Decision Tree models is performed to find the best model. To achieve the maximum performance of the model, feature selection is performed using chi-square calculation. The results of this research show that the performance of Decision Tree is lower than the performance of the Adaboost algorithm. The Adaboost model has an f-measure score of 90.9%, precision of 83.7%, and recall of 99.5%. The data distribution of prospective new students was also analyzed. The results show that prospective students who tended to finish the enrollment process had the following characteristics: graduated from an Islamic school, 19-21 years old, parents' income of IDR 1,000,000 to IDR 5,000,000, and admitted through the SBMPTN program.
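
The chi-square feature selection mentioned above can be sketched as a Pearson chi-square statistic computed per feature from a contingency table of feature value against outcome, with features then ranked by score. The feature names and counts below are hypothetical and are not taken from the paper's enrollment data.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical tables: rows are feature values, columns are (enrolled, resigned).
tables = {
    "admission_path": [[30, 10], [10, 30]],  # strongly associated with the outcome
    "gender": [[20, 20], [20, 20]],          # independent of the outcome
}
scores = {name: chi_square(table) for name, table in tables.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A feature whose table matches the counts expected under independence scores 0 and would be a candidate for removal, while a higher score indicates a stronger association with the enrollment outcome.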

Journal ArticleDOI
TL;DR: A classification and regression tree (CART) model based on particle swarm optimisation to help patients choose between immunotherapy and cryotherapy that can accurately predict the response of patients to the two methods is established.
Abstract: Warts are a disease caused by human papillomavirus, with common and plantar warts as the general forms. Commonly used methods to treat warts are immunotherapy and cryotherapy. The selection of proper treatment is vital to cure warts. This paper establishes a classification and regression tree (CART) model based on particle swarm optimisation to help patients choose between immunotherapy and cryotherapy. The proposed model can accurately predict the response of patients to the two methods. Using an improved particle swarm optimisation (PSO) algorithm to optimise the parameters of the model instead of the traditional pruning algorithm, a more concise and more accurate model is obtained. Two experiments are conducted to verify the feasibility of the proposed model. On the one hand, five benchmarks are used to verify the performance of the improved PSO algorithm. On the other hand, an experiment on two wart datasets is conducted. Results show that the proposed model is effective. The proposed method classifies better than k-nearest neighbour, C4.5 and logistic regression. It also performs better than the conventional optimisation method for the CART algorithm. Moreover, the decision tree model established in this study is interpretable and understandable. Therefore, the proposed model can help patients and doctors reduce the medical cost and improve the quality of healing operation.

Journal ArticleDOI
21 May 2021
TL;DR: Two widely applied tree ensemble methods, i.e., random forest and gradient boosting, were investigated to predict resilient modulus, using routinely collected soil properties, revealing that a single tree model generally suffers from high variance, while providing a similar performance to the traditional multiple linear regression model.
Abstract: Modern machine learning methods, such as tree ensembles, have recently become extremely popular due to their versatility and scalability in handling heterogeneous data and have been successfully applied across a wide range of domains. In this study, two widely applied tree ensemble methods, i.e., random forest (parallel ensemble) and gradient boosting (sequential ensemble), were investigated to predict resilient modulus, using routinely collected soil properties. Laboratory test data on sandy soils from nine borrow pits in Georgia were used for model training and testing. For comparison purposes, the two tree ensemble methods were evaluated against a regression tree model and a multiple linear regression model, demonstrating their superior performance. The results revealed that a single tree model generally suffers from high variance, while providing a similar performance to the traditional multiple linear regression model. By leveraging a collection of trees, both tree ensemble methods, Random Forest and eXtreme Gradient Boosting, significantly reduced variance and improved prediction accuracy, with eXtreme Gradient Boosting being the best model, with an R² of 0.95 on the test dataset.

Journal ArticleDOI
TL;DR: A conditional generative adversarial network (cGAN) is adopted to infer the 3D silhouette and skeleton of a tree respectively from edges extracted from the image and simple 2D strokes drawn by the user.
Abstract: Realistic 3D tree reconstruction is still a tedious and time-consuming task in the graphics community. In this paper, we propose a simple and efficient method for reconstructing 3D tree models with high fidelity from a single image. The key to single image-based tree reconstruction is to recover 3D shape information of trees via a deep neural network learned from a set of synthetic tree models. We adopted a conditional generative adversarial network (cGAN) to infer the 3D silhouette and skeleton of a tree respectively from edges extracted from the image and simple 2D strokes drawn by the user. Based on the predicted 3D silhouette and skeleton, a realistic tree model that inherits the tree shape in the input image can be generated using a procedural modeling technique. Experiments on varieties of tree examples demonstrate the efficiency and effectiveness of the proposed method in reconstructing realistic 3D tree models from a single image.

DOI
01 Jan 2021
TL;DR: This chapter discovers the Decision Tree model, which is one of the simplest nonlinear machine learning models.
Abstract: As you’ve discovered in the previous chapter, there is a distinction in supervised machine learning models between linear and nonlinear models. In this chapter, you will discover the Decision Tree model. It is one of the simplest nonlinear machine learning models.
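The chapter text itself is not reproduced here, so the following is only a minimal sketch of one common way a decision tree picks a split: it tries thresholds on a feature and keeps the one with the lowest weighted Gini impurity, as CART-style trees typically do. The function names `gini` and `best_threshold` and the toy data are assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the best split `value <= threshold`.

    If no split improves on the unsplit node, the threshold is None.
    """
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Toy data, made up for illustration: one feature, two classes.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = ["a", "a", "a", "b", "b", "b"]
print(best_threshold(x, y))
```

Applying the same search recursively to each side of the split yields a piecewise-constant predictor, which is what makes the model nonlinear.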

Journal ArticleDOI
TL;DR: In this paper, a gate recurrent unit (GRU) and decision tree fusion model, referred to as (T-GRU), was designed to explore the problem of arrhythmia recognition and to improve the credibility of deep learning methods.
Abstract: In recent years, deep learning (DNN) based methods have made major breakthroughs in detecting cardiac arrhythmias, as the cost-effectiveness of computing power and the size of available data have passed a tipping point. However, the inability of these methods to provide a basis for their decisions limits clinicians' confidence in them. In this paper, a Gate Recurrent Unit (GRU) and decision tree fusion model, referred to as T-GRU, was designed to explore the problem of arrhythmia recognition and to improve the credibility of deep learning methods. The fusion model processes time-frequency domain features along multiple pathways and introduces decision tree probability analysis of frequency domain features, regularization of GRU model parameters, and weight control to improve the decision tree model output weights. The MIT-BIH arrhythmia database was used for validation. Results showed that the low-frequency band features dominated the model prediction. The fusion model had an accuracy of 98.31%, sensitivity of 96.85%, specificity of 98.81%, and precision of 96.73%, indicating its high reliability and clinical significance.
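For readers unfamiliar with the four measures reported above, here is a minimal sketch of how accuracy, sensitivity, specificity and precision follow from confusion-matrix counts. It assumes a binary (positive/negative) setting, and the counts are hypothetical; they are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute four standard measures from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of actual positives that are found
        "specificity": tn / (tn + fp),    # share of actual negatives that are found
        "precision": tp / (tp + fp),      # share of predicted positives that are right
    }

# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```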

Journal ArticleDOI
TL;DR: In this paper, the authors proposed an adaptive reduced step size gradient boosting regression tree algorithm for bank performance evaluation, which overcomes the shortcomings of low accuracy and poor generalization ability of the existing regression decision tree model.
Abstract: In current work on the performance evaluation of commercial banks, most studies focus only on the relationship between a single characteristic and performance and lack a comprehensive analysis of characteristics. In addition, they mainly focus on causal inference and lack systematic quantitative conclusions from the perspective of prediction. This paper is the first to comprehensively investigate the predictability of multidimensional features on commercial bank performance using boosting regression trees. The dimensionality in finance-related fields is relatively high. There are not only observable price data, financial fundamentals data, etc., but also many unobservable undisclosed data and undisclosed events; more sources of income cannot be explained by existing models. Aiming at the characteristics of commercial bank data, this paper proposes an adaptively reduced step size gradient boosting regression tree algorithm for bank performance evaluation. In this method, a random subsample is drawn before training each regression tree. The adaptively reduced step size replaces the reduction step size setting of the original algorithm, which overcomes the shortcomings of low accuracy and poor generalization ability of the existing regression decision tree model. Compared to the BIRCH algorithm for classification of existing data, our proposed gradient boosting regression tree algorithm with adaptively reduced step size obtains better classification results. This paper empirically uses data from rural banks in 30 provinces in China to classify the different characteristics of rural banks’ performance in order to better evaluate their performance.
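The abstract names two ingredients: a random subsample drawn before each tree, and a step size that shrinks during training. The sketch below illustrates both with depth-1 regression trees on one feature. The abstract does not give the paper's adaptation rule, so the geometric decay used here (`step *= decay`) is an assumption, as are all function names, parameters and the toy data.

```python
import random

def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x identical: fall back to a constant prediction
        m = sum(residuals) / len(residuals)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def boost(xs, ys, n_trees=30, step0=0.5, decay=0.95, subsample=0.7, seed=0):
    """Gradient boosting for squared error with a shrinking step size."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    model, step = [], step0
    for _ in range(n_trees):
        # Draw a random subsample before fitting each tree.
        idx = rng.sample(range(len(xs)), max(2, int(subsample * len(xs))))
        stump = fit_stump([xs[i] for i in idx], [ys[i] - preds[i] for i in idx])
        model.append((step, stump))
        preds = [p + step * predict_stump(stump, x) for p, x in zip(preds, xs)]
        step *= decay  # assumed schedule: geometric decay, not the paper's rule
    return base, model

def predict(base, model, x):
    return base + sum(step * predict_stump(s, x) for step, s in model)

def mse(base, model, xs, ys):
    return sum((predict(base, model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data, made up for illustration: a step function on 20 points.
xs = list(range(20))
ys = [0.0 if x < 10 else 1.0 for x in xs]
base, model = boost(xs, ys)
print("training MSE:", mse(base, model, xs, ys))
```

Early trees take large steps to fit the coarse structure, and later trees take smaller steps, which limits how much any late tree can overfit the residual noise.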

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors used the bidirectional encoder representations from transformers (BERT) model and a boosted tree model to compare the effects of text matching in Chinese medical Q&A data.

Journal ArticleDOI
TL;DR: In this article, a framework of early warning modeling of financial crisis based on decision tree is proposed, where the decision tree model is constructed on several training subsets as the base learner, so that the base learners can learn the characteristics of the healthy sample and crisis sample roughly equally.
Abstract: At present, domestic and foreign research on financial crisis early-warning models treats prediction accuracy as the only standard of success for an early-warning model, ignoring an important problem, namely, will the financial crisis early-warning model for normal business, compared with the normal enterprise, forecast the financial crisis? This paper reviews the research situation at home and abroad from the perspective of the definition of the enterprise financial crisis, the form of expression, and so on. From the theoretical level, the relationship between the cause of the financial crisis and the change of financial indicators is established by explaining the early-warning theory, early-warning theory of financial crisis, and cost-sensitive learning theory, and the framework of early-warning modeling of financial crisis based on decision tree is put forward. The decision tree model is constructed on several training subsets as the base learner so that the decision tree base learner can learn the characteristics of the healthy sample and crisis sample roughly equally. Taking the bond-issuing enterprises of the manufacturing industry as samples, the empirical comparison shows that the financial warning model based on decision tree integration is more accurate, which indicates that the model can improve the correct identification rate of financial crisis enterprises under the premise of higher overall warning accuracy.
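One plausible reading of "several training subsets" on which base learners see healthy and crisis samples "roughly equally" is class-balanced resampling. The sketch below draws such subsets and combines base-learner outputs by majority vote. The abstract does not specify the sampling scheme, so this is an assumption, and the names `balanced_subsets` and `majority_vote` are hypothetical.

```python
import random

def balanced_subsets(healthy, crisis, n_subsets, seed=0):
    """Build training subsets with k samples per class, k = smaller class size."""
    rng = random.Random(seed)
    k = min(len(healthy), len(crisis))
    subsets = []
    for _ in range(n_subsets):
        # Each subset sees both classes in equal numbers.
        subsets.append(rng.sample(healthy, k) + rng.sample(crisis, k))
    return subsets

def majority_vote(votes):
    """Combine base-learner predictions (1 = crisis, 0 = healthy); ties go to 0."""
    return 1 if sum(votes) * 2 > len(votes) else 0

# Toy data, made up for illustration: 20 healthy firms, 4 crisis firms.
healthy = [("healthy", i) for i in range(20)]
crisis = [("crisis", i) for i in range(4)]
subsets = balanced_subsets(healthy, crisis, n_subsets=5)
print([len(s) for s in subsets])
```

A decision tree would then be trained on each subset, and `majority_vote` applied to the trees' predictions for a new firm.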

Journal ArticleDOI
13 Apr 2021
TL;DR: It is highlighted that the annual energy yield is more with the solar PV tree model than with a land-mounted SPV system, and the cost savings and greenhouse gas reduction are also higher with the proposed oak tree-based solar PV tree in urban areas than in rural areas, recommending it for practical applications.
Abstract: In this paper, the performance and the cost-effectiveness of a solar PV tree for supplying the energy demand of a flood lighting system at a basketball court in the School of Engineering and Technology, Christ (Deemed to be University) at Bangalore, India, are analyzed. Also, the energy demand of a flood lighting system for year 2017 is estimated (16 kWh/day), and the design of 4 individual trees of 1 kWp each is proposed, which saves around 40 sq.m area of land near to the basketball court. The experimental data was collected from June 1st, 2018 to May 31st, 2019, using a data acquisition system and processed to calculate the monthly cost of energy produced by each tree. In order to reduce the complexity in design and allow it to be shade-free, all the panels of a tree were oriented at the same azimuth angle. Based on technical and economical assessments with respect to rooftop systems, the solar PV tree presented reasonable results and could be a future adoptable technology for high population density areas, as well as for remote applications. Later, the adoptability of the proposed solar PV tree was simulated for 2 kWp, considering the climatic conditions of 2020, for different rural and urban locations of India. From the techno-economic-environmental analysis, it is highlighted that the annual energy yield is more with the solar PV tree model than with a land-mounted SPV system. The cost savings and greenhouse gas (GHG) reduction are also higher with the proposed oak tree-based solar PV tree in urban areas than in rural areas recommending it for practical applications.

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors used decision tree analyses to screen individuals for early lung cancer with CT, and the decision tree structure showed that lobulation was the most important feature, followed by spiculation, vessel convergence sign, nodule type, satellite nodule, nodule size and age of patient.
Abstract: Background: Considering the high morbidity and mortality of lung cancer and the high incidence of pulmonary nodules, clearly distinguishing benign from malignant lung nodules at an early stage is of great significance. However, determining the kind of lung nodule which is more prone to lung cancer remains a problem worldwide. Methods: A total of 480 patients with pulmonary nodule data were collected from Shandong, China. We assessed the clinical characteristics and computed tomography (CT) imaging features among pulmonary nodules in patients who had undergone video-assisted thoracoscopic surgery (VATS) lobectomy from 2013 to 2018. Preliminary selection of features was based on a statistical analysis using SPSS. We used WEKA to assess the machine learning models using its multiple algorithms and selected the best decision tree model using its optimization algorithm. Results: The combination of decision tree and logistic regression optimized the decision tree without affecting its AUC. The decision tree structure showed that lobulation was the most important feature, followed by spiculation, vessel convergence sign, nodule type, satellite nodule, nodule size and age of patient. Conclusions: Our study shows that decision tree analyses can be applied to screen individuals for early lung cancer with CT. Our decision tree provides a new way to help clinicians establish a logical diagnosis by a stepwise progression method, but still needs to be validated in prospective trials in a larger patient population.

Proceedings ArticleDOI
TL;DR: In this article, a tree skeleton model based on a pre-segmented photogrammetric 3D point cloud is used to automatically determine possible pruning points for stand-alone trees within meadows.
Abstract: The cultivation of orchard meadows provides an ecological benefit for biodiversity, which is significantly higher than in intensively cultivated orchards. The goal of this research is to create a tree model to automatically determine possible pruning points for stand-alone trees within meadows. The algorithm which is presented here is capable of building a skeleton model based on a pre-segmented photogrammetric 3D point cloud. Good results were achieved in assigning the points to their leading branches and building a virtual tree model, reaching an overall accuracy of 95.19 %. This model provided the necessary information about the geometry of the tree for automated pruning.

Journal ArticleDOI
03 Mar 2021-PLOS ONE
TL;DR: In this paper, a decision tree-based logistic regression model has been proposed which analyses the significance of demographic and clinical variables in the probability of having a positive PCR in a sample of 7,314 individuals treated in the Primary Care service of the public health system of Catalonia.
Abstract: BACKGROUND: Primary care is the major point of access in most health systems in developed countries and therefore for the detection of coronavirus disease 2019 (COVID-19) cases. The quality of its IT systems, together with access to the results of mass screening with polymerase chain reaction (PCR) tests, makes it possible to analyse the impact of various concurrent factors on the likelihood of contracting the disease. METHODS AND FINDINGS: Through data mining techniques with the sociodemographic and clinical variables recorded in patients' medical histories, a decision tree-based logistic regression model has been proposed which analyses the significance of demographic and clinical variables in the probability of having a positive PCR in a sample of 7,314 individuals treated in the Primary Care service of the public health system of Catalonia. The statistical approach to decision tree modelling allows 66.2% of diagnoses of infection by COVID-19 to be classified with a sensitivity of 64.3% and a specificity of 62.5%, with prior contact with a positive case being the primary predictor variable. CONCLUSIONS: The use of a classification tree model may be useful in screening for COVID-19 infection. Contact detection is the most reliable variable for detecting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cases. The model would support that, beyond a symptomatic diagnosis, the best way to detect cases would be to engage in contact tracing.