Showing papers on "Decision tree model published in 2016"

PDF

Open Access

Proceedings Article•

Evasion and hardening of tree ensemble classifiers

[...]

Alex Kantchelian¹, J. D. Tygar¹, Anthony D. Joseph¹•Institutions (1)

19 Jun 2016

TL;DR: In this paper, the authors present two algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests, and demonstrate that both gradient boosted trees are extremely susceptible to evasions.

...read moreread less

Abstract: Classifier evasion consists in finding for a given instance x the "nearest" instance x′ such that the classifier predictions of x and x′ are different. We present two novel algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests. Our first algorithm uses a Mixed Integer Linear Program solver and finds the optimal evading instance under an expressive set of constraints. Our second algorithm trades off optimality for speed by using symbolic prediction, a novel algorithm for fast finite differences on tree ensembles. On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting.

...read moreread less

125 citations

Journal Article•DOI•

GEFCom2014: Probabilistic solar and wind power forecasting using a generalized additive tree ensemble approach

[...]

Gabor I. Nagy¹, Gergő Barta¹, Sandor Kazi¹, Gyula Borbely², Gabor Simon - Show less +1 more•Institutions (2)

Budapest University of Technology and Economics¹, Ericsson²

01 Jul 2016-International Journal of Forecasting

TL;DR: A voted ensemble of a quantile regression forest model and a stacked random forest–gradient boosting decision tree model is used to predict the probability distribution of solar and wind power generation in connection with the Global Energy Forecasting Competition 2014.

...read moreread less

97 citations

Journal Article•DOI•

A generalized item response tree model for psychological assessments.

[...]

Minjeong Jeon¹, Paul De Boeck¹•Institutions (1)

Ohio State University¹

01 Sep 2016-Behavior Research Methods

TL;DR: A generalized item response tree model with a flexible parametric form, dimensionality, and choice of covariates for modeling item response processes with a tree structure is presented.

...read moreread less

Abstract: A new item response theory (IRT) model with a tree structure has been introduced for modeling item response processes with a tree structure In this paper, we present a generalized item response tree model with a flexible parametric form, dimensionality, and choice of covariates The utilities of the model are demonstrated with two applications in psychological assessments for investigating Likert scale item responses and for modeling omitted item responses The proposed model is estimated with the freely available R package flirt (Jeon et al, 2014b)

...read moreread less

96 citations

Journal Article•DOI•

A combined M5P tree and hazard-based duration model for predicting urban freeway traffic accident durations.

[...]

Lei Lin¹, Qian Wang¹, Adel W. Sadek¹•Institutions (1)

State University of New York System¹

01 Jun 2016-Accident Analysis & Prevention

TL;DR: A novel approach for accident duration prediction is proposed, which improves on the original M5P tree algorithm through the construction of a M5p-HBDM model, in which the leaves of the M5 P tree model are HBDMs instead of linear regression models.

...read moreread less

77 citations

Journal Article•DOI•

Mode Choice Analysis Using Random Forrest Decision Trees

[...]

Ch. Ravi Sekhar¹, Minal², Errampalli Madhu¹•Institutions (2)

Council of Scientific and Industrial Research¹, Academy of Scientific and Innovative Research²

01 Jan 2016-Transportation research procedia

TL;DR: In this paper, a study aimed at modelling the mode choice behavior of commuters in Delhi by considering Random Forrest (RF) Decision Tree (DT) method, which is one of the most efficient DT methods for solving classification problems.

...read moreread less

Abstract: Mode choice analysis forms an integral part of transportation planning process as it gives a complete insight to the mode choice preferences of the commuters and is also used as an instrument for evaluation of introduction of new transport systems. Mode choice analysis involves the procedure to study the factors in decision making process of the commuter while choosing the mode that renders highest utility to them. This study aims at modelling the mode choice behaviour of commuters in Delhi by considering Random Forrest (RF) Decision Tree (DT) method. The random forest model is one of the most efficient DT methods for solving classification problems. For the purpose of model development, about 5000 stratified household samples were collected in Delhi through household interview survey. A comparative evaluation has been carried out between traditional Multinomial logit (MNL) model and Decision tree model to demonstrate the suitableness of RF models in mode choice modelling. From the result, it was observed that model developed by Random Forrest based DT model is the superior one with higher prediction accuracy (98.96%) than the Logit model prediction accuracy (77.31%).

...read moreread less

61 citations

Journal Article•DOI•

A hybrid classification algorithm by subspace partitioning through semi-supervised decision tree

[...]

Kyoungok Kim¹•Institutions (1)

Seoul National University of Science and Technology¹

01 Dec 2016-Pattern Recognition

TL;DR: The proposed semi-supervised decision tree splits internal nodes by utilizing both labels and the structural characteristics of data for subspace partitioning, to improve the accuracy of classifiers applied to terminal nodes in the hybrid models.

...read moreread less

60 citations

Journal Article•DOI•

Research of Decision Tree Classification Algorithm in Data Mining

[...]

Qing-yun Dai, Chun-ping Zhang, Hao Wu

31 May 2016-International journal of database theory and application

TL;DR: This paper presents decision tree classifier, a flowchart like tree structure, where each intenal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class.

...read moreread less

Abstract: Decision tree algorithm is one of the most important classification measures in data mining. Decision tree classifier as one type of classifier is a flowchart like tree structure, where each intenal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class. The method that a decision tree model is used to classify a record is to find a path that from root to leaf by measuring the attributes test, and the attribute on the leaf is classification result.

...read moreread less

44 citations

Journal Article•DOI•

Individual Tree Crown Modeling and Change Detection From Airborne Lidar Data

[...]

Wen Xiao¹, Sudan Xu², Sander Oude Elberink², George Vosselman²•Institutions (2)

Newcastle University¹, University of Twente²

22 Apr 2016-IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

TL;DR: The model-based change detection is more robust to data noise and to the segmentation of multi-tree components compared to the point-based method and shows the potential of the method to monitor the growth of urban trees.

...read moreread less

Abstract: Light detection and ranging (lidar) provides a promising way of detecting changes of trees in three-dimensional (3-D) because laser beams can penetrate through the foliage and therefore provide full coverage of trees. The aim is to detect changes in trees in urban areas using multitemporal airborne lidar point clouds. Three datasets covering a part of Rotterdam, The Netherlands, have been classified into several classes including trees. A connected components algorithm is applied first to cluster the tree points. However, closely located and intersected trees are clustered together as multi-tree components. A tree-shaped model-based continuously adaptive mean shift (CamShift) algorithm is implemented to further segment these components into individual trees. Then, the tree parameters are derived in two independent methods: a point-based method using the convex hull and a model-based method which fits a tree-shaped model to the lidar points. At last, changes are detected by comparing the parameters of corresponding tree models which are matched by a tree-to-tree matching algorithm using overlapping bounding boxes and point-to-point distances. The results are visualized and statistically analyzed. The CamShift using a tree model kernel yields high segmentation accuracies. The model-based change detection is consistent with the point-based method according to the small differences between the parameters of single trees. The highlight is that it is more robust to data noise and to the segmentation of multi-tree components compared to the point-based method. The detected changes show the potential of the method to monitor the growth of urban trees.

...read moreread less

38 citations

Proceedings Article•DOI•

Exponential separation of communication and external information

[...]

Anat Ganor¹, Gillat Kol², Ran Raz¹•Institutions (2)

Weizmann Institute of Science¹, Princeton University²

19 Jun 2016

TL;DR: An explicit example of a search problem with external information complexity ≤ O(k), withrespect to any input distribution, and distributional communication complexity ≥ 2k, with respect to some input distribution is obtained.

...read moreread less

Abstract: We show an exponential gap between communication complexity and external information complexity, by analyzing a communication task suggested as a candidate by Braverman. Previously, only a separation of communication complexity and internal information complexity was known. More precisely, we obtain an explicit example of a search problem with external information complexity ≤ O(k), with respect to any input distribution, and distributional communication complexity ≥ 2k, with respect to some input distribution. In particular, this shows that a communication protocol cannot always be compressed to its external information. By a result of Braverman, our gap is the largest possible. Moreover, since the upper bound of O(k) on the external information complexity of the problem is obtained with respect to any input distribution, our result implies an exponential gap between communication complexity and information complexity (both internal and external) in the non-distributional setting of Braverman. In this setting, no gap was previously known, even for internal information complexity.

...read moreread less

38 citations

Journal Article•DOI•

Empirical Validation of Structural Complexity Metric and Complexity Management for Engineering Systems

[...]

Kaushik Sinha¹, Olivier L. Weck¹•Institutions (1)

Massachusetts Institute of Technology¹

01 May 2016-Systems Engineering

TL;DR: A quantitative measure for structural complexity is described, an empirical validation study of the structural complexity metric is conducted, and the notion of system value based on performance‐complexity trade space is introduced and introduced.

...read moreread less

Abstract: Quantitative assessment of structural complexity is essential for characterization of engineered complex systems. In this paper, we describe a quantitative measure for structural complexity, conduct an empirical validation study of the structural complexity metric, and introduce a complexity management framework for engineering system development. We perform empirical validation of the proposed complexity metric using simple experiments using ball and stick models and show that the development effort increases superlinearly with increasing structural complexity. The standard deviation of the build time for ball and stick models is observed to vary superlinearly with structural complexity. We also describe a generic statistical procedure for building such cost estimation relationships with structural complexity as the independent variable. We distinguish the notion of perception of complexity as an observer-dependent property and contrast that with complexity, which is a property of the system architecture. Finally, we introduce the notion of system value based on performance-complexity trade space and introduce a complexity management framework for system development.

...read moreread less

33 citations

Journal Article•DOI•

Decision Tree Approach to Accident Prediction for Highway–Rail Grade Crossings: Empirical Analysis:

[...]

Zijian Zheng¹, Pan Lu¹, Denver Tolliver¹•Institutions (1)

North Dakota State University¹

07 Jun 2016-Transportation Research Record

TL;DR: In this article, a decision tree approach is explored to evaluate highway-rail grade crossings (HRGCs) crashes, which is more advanced in its ability to handle large data sets, deal with missing values and not require predefined underlying relationships between target variables and predictors.

...read moreread less

Abstract: Highway–rail grade crossings (HRGCs) are critical spatial locations that are of utmost importance for transportation safety because traffic crashes at these locations are often catastrophic. Compared with traditional regression models, the decision tree is more advanced in its ability to handle large data sets, deal with missing values, and not require predefined underlying relationships between target variables and predictors. Thus the decision tree approach is explored in this study, which evaluates HRGC crashes. Because crashes at HRGCs are rare, the majority of data will have a zero-crash classification. A traditional decision tree method will have a bias toward the majority classification, which will result in a good prediction for the majority class but a relatively poor prediction for rare events. To improve model accuracy with the decision tree model, especially for forecasting rare events, previous probability and decision profit values are adjusted. Historical crash data of North Dakota State fr...

...read moreread less

Journal Article•DOI•

Decision tree induction with a constrained number of leaf nodes

[...]

Chia-Chi Wu, Yen-Liang Chen¹, Yi Hung Liu², Xiang-Yu Yang¹•Institutions (2)

National Central University¹, Shantou University²

01 Oct 2016-Applied Intelligence

TL;DR: A new algorithm, the Size Constrained Decision Tree (SCDT), is proposed with which to construct a decision tree, paying close attention on how to efficiently use the limited number of leaf nodes.

...read moreread less

Abstract: With the advantages of being easy to understand and efficient to compute, the decision tree method has long been one of the most popular classifiers. Decision trees constructed with existing approaches, however, tend to be huge and complex, and consequently are difficult to use in practical applications. In this study, we deal with the problem of tree complexity by allowing users to specify the number of leaf nodes, and then construct a decision tree that allows maximum classification accuracy with the given number of leaf nodes. A new algorithm, the Size Constrained Decision Tree (SCDT), is proposed with which to construct a decision tree, paying close attention on how to efficiently use the limited number of leaf nodes. Experimental results show that the SCDT method can successfully generate a simpler decision tree and offers better accuracy.

...read moreread less

Journal Article•DOI•

Comparison of artificial neural network and decision tree models in estimating spatial distribution of snow depth in a semi-arid region of Iran

[...]

Samaneh Gharaei-Manesh¹, Ali Fathzadeh, Ruhollah Taghizadeh-Mehrjardi•Institutions (1)

Yazd University¹

01 Feb 2016-Cold Regions Science and Technology

TL;DR: In this article, the authors used Artificial Neural Networks (ANNs) and M5 algorithm of decision tree to estimate the snow depth from terrain parameters in the Sakhvid Basin, Iran using artificial neural networks and decision tree algorithm.

...read moreread less

Journal Article•DOI•

Weighted Hybrid Decision Tree Model for Random Forest Classifier

[...]

Vrushali Kulkarni¹, Pradeep K. Sinha², Manisha C. Petare³•Institutions (3)

Maharashtra Institute of Technology¹, Centre for Development of Advanced Computing², College of Engineering, Pune³

01 Jun 2016-Journal of The Institution of Engineers : Series B

TL;DR: A new approach of hybrid decision tree model for random forest classifier is proposed, which is augmented by weighted voting based on the strength of individual tree and has shown notable increase in the accuracy of random forest.

...read moreread less

Abstract: Random Forest is an ensemble, supervised machine learning algorithm. An ensemble generates many classifiers and combines their results by majority voting. Random forest uses decision tree as base classifier. In decision tree induction, an attribute split/evaluation measure is used to decide the best split at each node of the decision tree. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation among them. The work presented in this paper is related to attribute split measures and is a two step process: first theoretical study of the five selected split measures is done and a comparison matrix is generated to understand pros and cons of each measure. These theoretical results are verified by performing empirical analysis. For empirical analysis, random forest is generated using each of the five selected split measures, chosen one at a time. i.e. random forest using information gain, random forest using gain ratio, etc. The next step is, based on this theoretical and empirical analysis, a new approach of hybrid decision tree model for random forest classifier is proposed. In this model, individual decision tree in Random Forest is generated using different split measures. This model is augmented by weighted voting based on the strength of individual tree. The new approach has shown notable increase in the accuracy of random forest.

...read moreread less

Proceedings Article•DOI•

The analysis of cases based on decision tree

[...]

Yurong Zhong

01 Aug 2016

TL;DR: A simplified algorithm of entropy right based on the ID3 algorithm that can make the attribute selection become more reasonable, and avoid compatibility with real attribute, and be able to improve the algorithm running efficiency.

...read moreread less

Abstract: Data mining is an data intelligent analysis technology in the late 20th century, it can automatically extract or find useful model knowledge from large amounts of data in databases, data warehouses or other databases. In this process, the classification of data is an important research topic in the field of data mining. Currently there are different methods for classification, the classification algorithm of decision tree is clear, easy to understand and easy to convert into certain classification rules, so this classification algorithm is widely studied and applied. Based on the background of “data platform for public petition”, it aims to study how data mining system combined with the existing database, extracting useful information from the mass characteristics hidden in the data, and provide comprehensive analysis for system managers and decision makers. This paper focus on the study of basic principle of data mining and basic algorithms. The classification of the cases, this module was developed based on decision tree algorithm. Based on improved ID3 decision tree algorithm, according to the case information of the library and the client information of the other library, decision tree model can be built, to give certain case an assessment of the comprehensive analysis. This paper presents a simplified algorithm of entropy right based on the ID3 algorithm. The main idea of this algorithm is to combine the principle of Taylor formula with the attribute selection of the ID3 algorithm—entropy solution firstly, to simplify the entropy solution of the ID3 algorithm, to change the standard of attribute selection of the ID3 algorithm, to reduce the calculation complex degree of the algorithm, and to improve the algorithm running efficiency; And then give simplified entropy of every attribute a right N, this N is depend to every number of the attribute's value, to balance uncertainty of each attribute on the data set. It can make the attribute selection become more reasonable, and avoid compatibility with real attribute.

...read moreread less

Book Chapter•DOI•

Functional Bid Landscape Forecasting for Display Advertising

[...]

Yuchen Wang¹, Kan Ren¹, Weinan Zhang¹, Jun Wang², Yong Yu¹ - Show less +1 more•Institutions (2)

Shanghai Jiao Tong University¹, University College London²

19 Sep 2016

TL;DR: Zhou et al. as discussed by the authors proposed a functional bid landscape forecasting method to automatically learn the function mapping from each ad auction features to the market price distribution without any assumption about the functional form.

...read moreread less

Abstract: Real-time auction has become an important online advertising trading mechanism. A crucial issue for advertisers is to model the market competition, i.e., bid landscape forecasting. It is formulated as predicting the market price distribution for each ad auction provided by its side information. Existing solutions mainly focus on parameterized heuristic forms of the market price distribution and learn the parameters to fit the data. In this paper, we present a functional bid landscape forecasting method to automatically learn the function mapping from each ad auction features to the market price distribution without any assumption about the functional form. Specifically, to deal with the categorical feature input, we propose a novel decision tree model with a node splitting scheme by attribute value clustering. Furthermore, to deal with the problem of right-censored market price observations, we propose to incorporate a survival model into tree learning and prediction, which largely reduces the model bias. The experiments on real-world data demonstrate that our models achieve substantial performance gains over previous work in various metrics. The software related to this paper is available at https://github.com/zeromike/bid-lands.

...read moreread less

Journal Article•DOI•

Determining an optimal hierarchical forecasting model based on the characteristics of the data set: Technical note

[...]

Zlatana D. Nenova¹, Jerrold H. May¹•Institutions (1)

University of Pittsburgh¹

01 May 2016-Journal of Operations Management

TL;DR: A new, two-level oblique linear discriminant tree model is presented, which identifies the optimal hierarchical forecast technique for a given hierarchical database in a very time-efficient manner and develops an analytical model for personalized forecast aggregation decisions, based on characteristics of a hierarchical dataset.

...read moreread less

Proceedings Article•DOI•

Structure of Protocols for XOR Functions

[...]

Hamed Hatami¹, Kaave Hosseini¹, Shachar Lovett¹•Institutions (1)

University of California, San Diego¹

01 Oct 2016

TL;DR: The deterministic communication complexity of F is equal to the parity decision tree complexity of f, which relies on a novel technique of entropy reduction for protocols, combined with existing techniques in Fourier analysis and additive combinatorics.

...read moreread less

Abstract: Let f be a boolean function on n variables. Its associated XOR function is the two-party function F(x, y) = f(x xor y). We show that, up to polynomial factors, the deterministic communication complexity of F is equal to the parity decision tree complexity of f. This relies on a novel technique of entropy reduction for protocols, combined with existing techniques in Fourier analysis and additive combinatorics.

...read moreread less

Patent•

Advertisement click rate estimation method based on decision tree, application recommendation method and device

[...]

Zhou Nan, Chang Fuyang, Yue Huadong

01 Jun 2016

TL;DR: In this article, a click rate estimation method based on a decision tree and a device is proposed. But the method is not suitable for the estimation of the click rate of a specific historical advertisement in a predetermined historical time period.

...read moreread less

Abstract: The present invention provides an advertisement click rate estimation method based on a decision tree and a device. The method comprises a step of obtaining the related characteristic information of a specific historical advertisement in a predetermined historical time period, a step of obtaining the personalized characteristic information of a target user, a step of carrying out decision tree model calculation based on the obtained target user personalized characteristic information and the related characteristic information of the specific historical advertisement so as to determine a cross characteristic vector for estimating the click rate of the specific historical advertisement, wherein each leaf node of the decision tree model represents one cross characteristic, a step of calculating based on the cross characteristic vector and the model training parameter obtained from training in advance so as to estimate the click rate of the specific historical advertisement. The invention also provides an application software recommendation method and a device. According to the method or the device, through the decision tree model, the cross characteristic with a strong classification characteristic is generated, and thus the advertisement click rate is accurately and rapidly estimated.

...read moreread less

Journal Article•DOI•

Decision tree modeling using R.

[...]

Zhongheng Zhang¹•Institutions (1)

Zhejiang University¹

01 Aug 2016-Annals of Translational Medicine

TL;DR: This work focuses on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures, and incorporates recursive partitioning into conventional parametric model building.

...read moreread less

Abstract: In machine learning field, decision tree learner is powerful and easy to interpret. It employs recursive binary partitioning algorithm that splits the sample in partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. While growing a single tree is subject to small changes in the training data, random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and restricted set of input variables to be selected. Finally, I introduce R functions to perform model based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.

...read moreread less

Journal Article•DOI•

Time-constrained cost-sensitive decision tree induction

[...]

Yen-Liang Chen¹, Chia-Chi Wu², Kwei Tang³•Institutions (3)

National Central University¹, Institute for Information Industry², National Chengchi University³

01 Aug 2016-Information Sciences

TL;DR: An algorithm to generate a time-constrained minimal-cost tree that select the attribute that brings the maximal benefit when time is sufficient, and to select the most time-efficient attribute whenTime is limited is developed.

...read moreread less

Journal Article•DOI•

An Entropy Based Measure for Comparing Distributions of Complexity

[...]

Rajeev Rajaram¹, Brian Castellani¹•Institutions (1)

Kent State University¹

01 Jul 2016-Physica A-statistical Mechanics and Its Applications

TL;DR: A new complexity measure called case-based entropyCc is introduced — a modification of the Shannon–Wiener entropy measure H.c, which can be used to empirically identify and measure the distribution of the diversity of complexity within and across multiple natural and human-made systems.

...read moreread less

Abstract: This paper is part of a series addressing the empirical/statistical distribution of the diversity of complexity within and amongst complex systems. Here, we consider the problem of measuring the diversity of complexity in a system, given its ordered range of complexity types i and their probability of occurrence p i , with the understanding that larger values of i mean a higher degree of complexity. To address this problem, we introduce a new complexity measure called case-based entropy C c — a modification of the Shannon–Wiener entropy measure H . The utility of this measure is that, unlike current complexity measures–which focus on the macroscopic complexity of a single system– C c can be used to empirically identify and measure the distribution of the diversity of complexity within and across multiple natural and human-made systems, as well as the diversity contribution of complexity of any part of a system, relative to the total range of ordered complexity types.

...read moreread less

Journal Article•DOI•

Prediction of Frost Occurrences Using Statistical Modeling Approaches

[...]

Hyojin Lee, Jong A. Chun, Hyun-Hee Han, Sung Kim

17 Apr 2016-Advances in Meteorology

TL;DR: In this paper, the authors developed the frost prediction models in spring in Korea using logistic regression and decision tree techniques, and compared the performance of the two models and concluded that the decision tree model can be more useful for the timely warning system.

...read moreread less

Abstract: We developed the frost prediction models in spring in Korea using logistic regression and decision tree techniques. Hit Rate (HR), Probability of Detection (POD), and False Alarm Rate (FAR) from both models were calculated and compared. Threshold values for the logistic regression models were selected to maximize HR and POD and minimize FAR for each station, and the split for the decision tree models was stopped when change in entropy was relatively small. Average HR values were 0.92 and 0.91 for logistic regression and decision tree techniques, respectively, average POD values were 0.78 and 0.80 for logistic regression and decision tree techniques, respectively, and average FAR values were 0.22 and 0.28 for logistic regression and decision tree techniques, respectively. The average numbers of selected explanatory variables were 5.7 and 2.3 for logistic regression and decision tree techniques, respectively. Fewer explanatory variables can be more appropriate for operational activities to provide a timely warning for the prevention of the frost damages to agricultural crops. We concluded that the decision tree model can be more useful for the timely warning system. It is recommended that the models should be improved to reflect local topological features.

...read moreread less

Patent•

Attitude and orbit control data analysis method based on decision tree

[...]

Fan Huifang, Yang Tianshe, Liu Yang, Zhang Guoyong, Zhou Jun, Wang Sen - Show less +2 more

11 May 2016

TL;DR: In this article, a decision tree C5.0 algorithm model is proposed for attitude and orbit control complex data analysis and has a certain guidance effect on satellite fault identification, diagnosis and anticipation.

...read moreread less

Abstract: The invention provides an attitude and orbit control data analysis method based on a decision tree. The method comprises the following steps: attitude and orbit control data preprocessing, i.e., finishing remote measurement data deduplicating, remote measurement data sequencing, remote measurement data extraction and remote measurement data outlier rejection through the data preprocessing; hierarchical modeling of an attitude and orbit control, i.e., establishing an information and control flow chart of the attitude and orbit control, determining remote measurement variables related to current faults of the attitude and orbit control, and taking the variables as input variables for decision tree analysis; establishing a flow chart of the decision tree analysis; and decision tree modeling, i.e., creating a decision tree C5.0 algorithm model, defining a model name in the model, boosting algorithm test frequency, and trimming attributes and a minimum recording number of each sub branch. The method provided by the invention solves the problem of difficulties in satellite attitude and orbit control complex data analysis and has a certain guidance effect on satellite fault identification, diagnosis and anticipation.

...read moreread less

Patent•

Method and system for processing call based on decision tree model

[...]

Wang Xin

08 Jun 2016

TL;DR: In this article, the authors proposed a decision tree model for processing a call based on a caller's basic information and associated information of a user according to an incoming call number, combining the basic information with associated information as an input and matched first pushing information as output to construct the decision tree, and then pushing the second pushing information to a customer service.

...read moreread less

Abstract: The invention provides a method and a system for processing a call based on a decision tree model; the method comprises the following steps: obtaining basic information and associated information of a user according to an incoming call number; combining the basic information and the associated information as an input and matched first pushing information as an output to construct the decision tree model; inputting the basic information and the associated information of the specific user to the trained decision tree model and calculating to obtain second pushing information of the specific user; and pushing the second pushing information to a customer service. According to the method and the system for processing the call based on the decision tree model provided by the invention, when the incoming call user dials a customer service hotline, the user does not need to input a digital incoming line again according to a voice prompt, and the call dialing is accurately matched to a customer service group, so that the artificial customer service, when the user incomes the line, can know the question that the user wants to consult simultaneously; and therefore, the operation of the user and the time of listening to a prompt tone for operation are effectively reduced, the user satisfaction of a customer is effectively improved, and the working efficiency of the customer service is improved.

...read moreread less

Patent•

Decision tree model-based photovoltaic assembly fault diagnosis method

[...]

Wu Chunhua, Xu Lijuan, Wang Yuanzhang

10 Aug 2016

TL;DR: In this article, a decision tree model-based photovoltaic assembly fault diagnosis method is proposed for power generation technical field, which includes the following steps: data are acquired, and data processing is carried out; obtained data are introduced into a decision-tree-based model, the modeling steps of the model mainly comprise selection and processing of training and test sample data, tree establishment and tree pruning, decision trees model establishment and decision tree accuracy verification; and the fault type of a photovoration assembly is judged through a decision module.

...read moreread less

Abstract: The present invention belongs to the photovoltaic power generation technical field and provides a decision tree model-based photovoltaic assembly fault diagnosis method The method includes the following steps that: photovoltaic assembly data are acquired, and data processing is carried out; obtained data are introduced into a decision tree-based photovoltaic assembly fault diagnosis model, the modeling steps of the model mainly comprise selection and processing of training and test sample data, tree establishment and tree pruning, decision tree model establishment and decision tree accuracy verification; and the fault type of a photovoltaic assembly is judged through a decision module With the method of the invention adopted, manpower and resource waste caused by fault judgment errors can be avoided, the accuracy and reliability of fault judgment are improved, serious consequences to the photovoltaic assembly, caused by a fault, can be avoided, and the service life of the photovoltaic assembly can be prolonged

...read moreread less

Journal Article•DOI•

Exponential Separation of Information and Communication for Boolean Functions

[...]

Anat Ganor¹, Gillat Kol, Ran Raz¹•Institutions (1)

Weizmann Institute of Science¹

15 Nov 2016-Journal of the ACM

TL;DR: The relative discrepancy method is presented, a new rectangle-based method for proving communication complexity lower bounds for boolean functions, powerful enough to separate information complexity and communication complexity.

...read moreread less

Abstract: We show an exponential gap between communication complexity and information complexity by giving an explicit example of a partial boolean function with information complexity ≤ O(k), and distributional communication complexity ≥ 2k. This shows that a communication protocol cannot always be compressed to its internal information. By a result of Braverman [2015], our gap is the largest possible. By a result of Braverman and Rao [2014], our example shows a gap between communication complexity and amortized communication complexity, implying that a tight direct sum result for distributional communication complexity cannot hold, answering a long-standing open problem.Another (conceptual) contribution of our work is the relative discrepancy method, a new rectangle-based method for proving communication complexity lower bounds for boolean functions, powerful enough to separate information complexity and communication complexity.

...read moreread less

Journal Article•DOI•

Improved bounds for the randomized decision tree Complexity of recursive majority

[...]

Frédéric Magniez¹, Ashwin Nayak², Miklos Santha¹, Miklos Santha³, Jonah Sherman⁴, Gábor Tardos, David Xiao¹ - Show less +3 more•Institutions (4)

Paris Diderot University¹, University of Waterloo², National University of Singapore³, University of California, Berkeley⁴

01 May 2016-Random Structures and Algorithms

TL;DR: A lower bound of (1/2−δ)·2.57143h is proved for the two‐sided‐error randomized decision tree complexity of evaluating height h formulae with error δ∈[0, 1/2) .

...read moreread less

Abstract: We consider the randomized decision tree complexity of the recursive 3-majority function. We prove a lower bound of for the two-sided-error randomized decision tree complexity of evaluating height h formulae with error . This improves the lower bound of given by Jayram, Kumar, and Sivakumar (STOC'03), and the one of given by Leonardos (ICALP'13). Second, we improve the upper bound by giving a new zero-error randomized decision tree algorithm that has complexity at most . The previous best known algorithm achieved complexity . The new lower bound follows from a better analysis of the base case of the recursion of Jayram et al. The new algorithm uses a novel “interleaving” of two recursive algorithms. © 2015 Wiley Periodicals, Inc. Random Struct. Alg., 2015

...read moreread less

Proceedings Article•DOI•

Separations in communication complexity using cheat sheets and information complexity

[...]

Anurag Anshu¹, Aleksandrs Belovs, Shalev Ben-David², Mika Göös³, Rahul Jain¹, Robin Kothari², Troy Lee, Miklos Santha⁴ - Show less +4 more•Institutions (4)

National University of Singapore¹, Massachusetts Institute of Technology², University of Toronto³, Paris Diderot University⁴

04 May 2016-arXiv: Quantum Physics

TL;DR: Aaronson, Ben-David, and Kothari as discussed by the authors gave the first super-quadratic separation between quantum and randomized communication complexity for a total function, giving an example exhibiting a power 2.5 gap.

...read moreread less

Abstract: While exponential separations are known between quantum and randomized communication complexity for partial functions (Raz, STOC 1999), the best known separation between these measures for a total function is quadratic, witnessed by the disjointness function. We give the first super-quadratic separation between quantum and randomized communication complexity for a total function, giving an example exhibiting a power 2.5 gap. We further present a 1.5 power separation between exact quantum and randomized communication complexity, improving on the previous ~1.15 separation by Ambainis (STOC 2013). Finally, we present a nearly optimal quadratic separation between randomized communication complexity and the logarithm of the partition number, improving upon the previous best power 1.5 separation due to Goos, Jayram, Pitassi, and Watson. Our results are the communication analogues of separations in query complexity proved using the recent cheat sheet framework of Aaronson, Ben-David, and Kothari (STOC 2016). Our main technical results are randomized communication and information complexity lower bounds for a family of functions, called lookup functions, that generalize and port the cheat sheet framework to communication complexity.

...read moreread less

Patent•

Prediction model demonstration method and device and prediction model adjustment method and device

[...]

Bai Yang¹, Chen Yuqiang, Dai Wenyuan•Institutions (1)

Paradigm¹

07 Sep 2016

TL;DR: In this article, a prediction model demonstration method is proposed to enable the prediction model hard to understand to be approximated to the decision-making tree model, so that a user can better understand the predicted model based on the demonstrated decision making tree model.

...read moreread less

Abstract: The invention discloses a prediction model demonstration method, a prediction model demonstration device, a prediction model adjustment method and a prediction model adjustment device. The predictionmodel demonstration method comprises the steps of: acquiring at least one prediction result of a prediction model for at least one prediction sample; acquiring at least one decision-making tree training sample for training a decision-making tree model based on the at least one prediction sample and the at least one prediction result, wherein the decision-making tree model is used for fitting the prediction model; training the decision-making tree model by utilizing at least one decision-making tree training sample; and demonstrating the trained decision-making tree model visually. The prediction model demonstration method can enable the prediction model hard to understand to be approximated to the decision-making tree model, and demonstrate the approximated decision-making tree model, so that a user can better understand the prediction model based on the demonstrated decision-making tree model.

...read moreread less