
Showing papers in "Machine Learning in 2006"


Journal ArticleDOI
TL;DR: A new tree-based ensemble method for supervised classification and regression problems that strongly randomizes both attribute and cut-point choice when splitting a tree node and, in the extreme case, builds totally randomized trees whose structures are independent of the output values of the learning sample.
Abstract: This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.

5,246 citations
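
The split procedure at the heart of the method is compact enough to sketch. The following Python fragment is a minimal illustration of extremely randomized splitting (our reconstruction, not the authors' code; the helper names and the Gini-based score are our choices): each of K randomly chosen attributes gets a single uniformly drawn cut-point, and the best-scoring pair is kept. With K = 1 the score is never compared, which yields the totally randomized trees mentioned in the abstract.

```python
import random

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_score(left, right):
    # negative weighted Gini impurity: higher is better
    n = len(left) + len(right)
    return -(len(left) * gini(left) + len(right) * gini(right)) / n

def pick_random_split(X, y, K):
    """Extra-Trees-style node splitting: ONE uniform random cut-point
    per candidate attribute, instead of an exhaustive search."""
    best = None
    for a in random.sample(range(len(X[0])), K):
        values = [row[a] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:                       # constant attribute: cannot split
            continue
        t = random.uniform(lo, hi)         # the single random cut-point
        left = [yi for row, yi in zip(X, y) if row[a] < t]
        right = [yi for row, yi in zip(X, y) if row[a] >= t]
        if left and right:
            s = split_score(left, right)
            if best is None or s > best[0]:
                best = (s, a, t)
    return best                            # (score, attribute, threshold)

X = [[0.1, 3.0], [0.9, 2.0], [0.4, 1.0], [0.7, 4.0]]
y = [0, 1, 0, 1]
print(pick_random_split(X, y, K=2))
```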


Journal ArticleDOI
TL;DR: Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach to combining first-order logic and probabilistic graphical models in a single representation.
Abstract: We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.

2,916 citations
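
The semantics is concrete enough to demonstrate in a few lines. Below is a toy, brute-force rendering of a ground Markov network in Python (our illustration only: one hand-coded clause, two constants, and exhaustive enumeration in place of the paper's MCMC inference and learned weights). The probability of a world is proportional to the exponentiated weighted count of true groundings.

```python
import itertools, math

# One weighted clause: Smokes(x) => Cancer(x), with an arbitrary weight.
constants = ["Anna", "Bob"]
w = 1.5

atoms = [("Smokes", c) for c in constants] + [("Cancer", c) for c in constants]

def n_true_groundings(world):
    # world: dict atom -> bool; count constants x for which the clause holds
    return sum(1 for c in constants
               if (not world[("Smokes", c)]) or world[("Cancer", c)])

worlds = [dict(zip(atoms, bits))
          for bits in itertools.product([False, True], repeat=len(atoms))]
weights = [math.exp(w * n_true_groundings(wld)) for wld in worlds]

# Conditional query by enumeration (the paper uses MCMC instead):
num = sum(wt for wld, wt in zip(worlds, weights)
          if wld[("Smokes", "Anna")] and wld[("Cancer", "Anna")])
den = sum(wt for wld, wt in zip(worlds, weights) if wld[("Smokes", "Anna")])
print("P(Cancer(Anna) | Smokes(Anna)) =", num / den)
```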


Journal ArticleDOI
TL;DR: The first empirical results simultaneously comparing most of the major Bayesian network learning algorithms (PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search) against each other are presented.
Abstract: We present a new algorithm for Bayesian network structure learning, called Max-Min Hill-Climbing (MMHC). The algorithm combines ideas from local learning, constraint-based, and search-and-score techniques in a principled and effective way. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. In our extensive empirical evaluation MMHC outperforms on average and in terms of various metrics several prototypical and state-of-the-art algorithms, namely the PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search. These are the first empirical results simultaneously comparing most of the major Bayesian network algorithms against each other. MMHC offers certain theoretical advantages, specifically over the Sparse Candidate algorithm, corroborated by our experiments. MMHC and detailed results of our study are publicly available at http://www.dsl-lab.org/supplements/mmhc_paper/mmhc_index.html.

1,682 citations


Journal ArticleDOI
TL;DR: A theoretical analysis of six existing diversity measures is presented, showing underlying relationships between them and relating them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms.
Abstract: Diversity among the base classifiers is deemed to be important when constructing a classifier ensemble. Numerous algorithms have been proposed to construct a good classifier ensemble by seeking both the accuracy of the base classifiers and the diversity among them. However, there is no generally accepted definition of diversity, and measuring the diversity explicitly is very difficult. Although researchers have designed several experimental studies to compare different diversity measures, usually confusing results were observed. In this paper, we present a theoretical analysis on six existing diversity measures (namely disagreement measure, double fault measure, KW variance, inter-rater agreement, generalized diversity and measure of difficulty), show underlying relationships between them, and relate them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms. We illustrate why confusing experimental results were observed and show that the discussed diversity measures are naturally ineffective. Our analysis provides a deeper understanding of the concept of diversity, and hence can help design better ensemble learning algorithms.

380 citations
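
Two of the analyzed measures are easy to state concretely. The sketch below (our own minimal Python, using the standard "oracle output" convention of recording only whether each classifier is correct) computes the disagreement and double-fault measures for one pair of base classifiers.

```python
def pairwise_diversity(correct_i, correct_j):
    """Disagreement and double-fault for one pair of base classifiers.
    Inputs are 'oracle' outputs: True where the classifier is correct."""
    pairs = list(zip(correct_i, correct_j))
    n = len(pairs)
    n00 = sum(1 for a, b in pairs if not a and not b)   # both wrong
    n10 = sum(1 for a, b in pairs if a and not b)
    n01 = sum(1 for a, b in pairs if not a and b)
    disagreement = (n10 + n01) / n    # higher = more diverse
    double_fault = n00 / n            # lower  = more diverse
    return disagreement, double_fault

print(pairwise_diversity([True, True, False, True],
                         [True, False, False, True]))   # (0.25, 0.25)
```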


Journal ArticleDOI
TL;DR: This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs.
Abstract: This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing classifier performance for most purposes. This is because they visually support several crucial types of performance assessment that cannot be done easily with ROC curves, such as showing confidence intervals on a classifier's performance, and visualizing the statistical significance of the difference in performance of two classifiers. A software tool supporting all the cost curve analysis described in this paper is available from the authors.

313 citations
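
The construction itself is simple: under equal misclassification costs, a classifier with false positive rate FPR and false negative rate FNR = 1 - TPR appears in cost space as the line FNR·p + FPR·(1 - p) over the class prior p, and the lower envelope of a set of such lines shows the best classifier at each operating condition. A minimal numpy sketch (ours, not the authors' tool, which also handles unequal costs via a normalized probability-cost axis):

```python
import numpy as np

def cost_line(fpr, fnr, pc):
    """Expected error of one classifier as a function of pc = P(+),
    assuming equal misclassification costs. Each classifier is a
    straight line in cost space."""
    return fnr * pc + fpr * (1.0 - pc)

pc = np.linspace(0.0, 1.0, 101)
clf_a = cost_line(fpr=0.10, fnr=0.40, pc=pc)
clf_b = cost_line(fpr=0.30, fnr=0.15, pc=pc)
envelope = np.minimum(clf_a, clf_b)   # pointwise best of the two classifiers
print("crossover near pc =", pc[np.argmin(np.abs(clf_a - clf_b))])
```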


Journal ArticleDOI
TL;DR: An algorithm that predicts musical genre and artist from an audio waveform using the ensemble learner ADABOOST, together with evidence collected from a variety of popular features and classifiers that classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.
Abstract: We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner ADABOOST to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.

296 citations
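
The key design decision reported here, aggregating short-timescale features over segments before classification, is easy to sketch. The following Python uses scikit-learn's AdaBoostClassifier as a stand-in for the authors' AdaBoost variant and random noise as a stand-in for real audio features such as MFCCs; only the aggregate-then-classify-then-vote shape of the pipeline is faithful to the description above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def aggregate_segments(frames, seg_len=128):
    """Mean+variance of frame-level features over fixed-length segments.
    frames: (n_frames, n_features); returns one aggregated row per segment."""
    segs = []
    for start in range(0, len(frames) - seg_len + 1, seg_len):
        chunk = frames[start:start + seg_len]
        segs.append(np.concatenate([chunk.mean(axis=0), chunk.var(axis=0)]))
    return np.array(segs)

# toy demo: two "songs" with different feature statistics
rng = np.random.default_rng(0)
song_a = aggregate_segments(rng.normal(0.0, 1.0, (1024, 13)))
song_b = aggregate_segments(rng.normal(0.5, 1.5, (1024, 13)))
X = np.vstack([song_a, song_b])
y = np.array([0] * len(song_a) + [1] * len(song_b))

clf = AdaBoostClassifier(n_estimators=50).fit(X, y)
# song-level decision: majority vote over the song's segments
print("song A vote:", np.bincount(clf.predict(song_a)).argmax())
```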


Journal ArticleDOI
TL;DR: By implementing the technique in the game Neverwinter Nights, it is shown that dynamic scripting can be successfully applied to the online adaptation of game AI in commercial computer games.
Abstract: Online learning in commercial computer games allows computer-controlled opponents to adapt to the way the game is being played. As such it provides a mechanism to deal with weaknesses in the game AI, and to respond to changes in human player tactics. We argue that online learning of game AI should meet four computational and four functional requirements. The computational requirements are speed, effectiveness, robustness and efficiency. The functional requirements are clarity, variety, consistency and scalability. This paper investigates a novel online learning technique for game AI called "dynamic scripting", which uses an adaptive rulebase for the generation of game AI on the fly. The performance of dynamic scripting is evaluated in experiments in which adaptive agents are pitted against a collection of manually-designed tactics in a simulated computer role-playing game. Experimental results indicate that dynamic scripting succeeds in endowing computer-controlled opponents with adaptive performance. To further improve the dynamic-scripting technique, an enhancement is investigated that allows scaling of the difficulty level of the game AI to the human player's skill level. With the enhancement, dynamic scripting meets all computational and functional requirements. The applicability of dynamic scripting in state-of-the-art commercial games is demonstrated by implementing the technique in the game Neverwinter Nights. We conclude that dynamic scripting can be successfully applied to the online adaptation of game AI in commercial computer games.

274 citations
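
A minimal reconstruction of the dynamic-scripting loop (ours, with made-up parameter values; the paper's weight-update function and clamping rules are more elaborate): rules carry weights, a script is sampled with weight-proportional probability, and after each encounter the weights of the used rules are adjusted by a fitness signal while the change is redistributed over the unused rules.

```python
import random

class Rulebase:
    """Minimal dynamic-scripting sketch: rules carry weights, scripts
    are sampled by weight, and weights adapt after each encounter."""

    def __init__(self, rules, w0=100, w_min=5, w_max=400):
        self.weights = {r: w0 for r in rules}
        self.w_min, self.w_max = w_min, w_max

    def _clamp(self, w):
        return max(self.w_min, min(self.w_max, w))

    def generate_script(self, size):
        pool = list(self.weights)
        script = []
        while len(script) < size and pool:
            pick = random.choices(pool, [self.weights[r] for r in pool])[0]
            script.append(pick)
            pool.remove(pick)              # no duplicate rules in one script
        return script

    def update(self, script, fitness, max_adj=30):
        """fitness in [0, 1]; 0.5 is break-even. Used rules are rewarded
        or punished; the total change is spread over unused rules so the
        overall weight mass stays roughly constant."""
        adj = int(max_adj * (fitness - 0.5) * 2)
        delta = 0
        for r in script:
            new = self._clamp(self.weights[r] + adj)
            delta += new - self.weights[r]
            self.weights[r] = new
        rest = [r for r in self.weights if r not in script]
        for r in rest:
            self.weights[r] = self._clamp(self.weights[r] - delta // len(rest))

rb = Rulebase(["charge", "heal", "fireball", "flee", "defend"])
for _ in range(20):                        # pretend 'charge' keeps winning
    s = rb.generate_script(3)
    rb.update(s, fitness=1.0 if "charge" in s else 0.2)
print(sorted(rb.weights.items(), key=lambda kv: -kv[1]))
```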


Journal ArticleDOI
TL;DR: This paper reviews the literature on deterministic and stochastic stepsize rules, derives formulas for optimal stepsizes that minimize estimation error, and proposes an approximation for the case where the parameters are unknown.
Abstract: We address the problem of determining optimal stepsizes for estimating parameters in the context of approximate dynamic programming. The sufficient conditions for convergence of the stepsize rules have been known for 50 years, but practical computational work tends to use formulas with parameters that have to be tuned for specific applications. The problem is that in most applications in dynamic programming, observations for estimating a value function typically come from a data series that can be initially highly transient. The degree of transience affects the choice of stepsize parameters that produce the fastest convergence. In addition, the degree of initial transience can vary widely among the value function parameters for the same dynamic program. This paper reviews the literature on deterministic and stochastic stepsize rules, and derives formulas for optimal stepsizes for minimizing estimation error. This formula assumes certain parameters are known, and an approximation is proposed for the case where the parameters are unknown. Experimental work shows that the approximation provides faster convergence than other popular formulas.

201 citations
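
Two deterministic rules of the kind reviewed here can be stated in a few lines. The sketch below (ours; parameter values are arbitrary) contrasts a generalized harmonic stepsize a/(a + n - 1), which declines more slowly for larger a and so copes better with an initial transient, with McClain's rule, which behaves like 1/n early on but levels off at a target stepsize so the estimate keeps adapting.

```python
def harmonic(a):
    # alpha_n = a / (a + n - 1): larger a keeps the stepsize big longer,
    # which helps when the early observations are transient.
    def rule(n):
        return a / (a + n - 1)
    return rule

def mcclain(target):
    # McClain's rule: alpha_n = alpha_{n-1} / (1 + alpha_{n-1} - target);
    # decays like 1/n at first but levels off at `target`, so the
    # estimate keeps adapting in nonstationary settings.
    state = {"alpha": 1.0}
    def rule(n):
        if n > 1:
            a = state["alpha"]
            state["alpha"] = a / (1.0 + a - target)
        return state["alpha"]
    return rule

def smooth(observations, rule):
    est = 0.0
    for n, obs in enumerate(observations, start=1):
        alpha = rule(n)
        est = (1 - alpha) * est + alpha * obs   # exponential smoothing step
    return est

data = [10, 9, 7, 5, 5, 5, 5, 5]   # initially transient series
print(smooth(data, harmonic(a=10)), smooth(data, mcclain(target=0.1)))
```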


Journal ArticleDOI
Marc Boullé
TL;DR: A new discretization method, MODL, is proposed, founded on a Bayesian approach, which results in the definition of a Bayes optimal evaluation criterion of discretizations, together with a new super-linear optimization algorithm that finds near-optimal discretizations.
Abstract: While real data often comes in mixed format, discrete and continuous, many supervised induction algorithms require discrete data. Efficient discretization of continuous attributes is an important problem that has effects on speed, accuracy and understandability of the induction models. In this paper, we propose a new discretization method MODL, founded on a Bayesian approach. We introduce a space of discretization models and a prior distribution defined on this model space. This results in the definition of a Bayes optimal evaluation criterion of discretizations. We then propose a new super-linear optimization algorithm that manages to find near-optimal discretizations. Extensive comparative experiments both on real and synthetic data demonstrate the high inductive performances obtained by the new discretization method.

175 citations


Journal ArticleDOI
TL;DR: The question of whether one can efficiently produce low-dimensional mappings, using only black-box access to a kernel function, is explored, and it is found that if the data was linearly separable with margin γ under the kernel, then it is approximately separable in this new feature space.
Abstract: Kernel functions are typically viewed as providing an implicit mapping of points into a high-dimensional space, with the ability to gain much of the power of that space without incurring a high cost if the result is linearly-separable by a large margin γ. However, the Johnson-Lindenstrauss lemma suggests that in the presence of a large margin, a kernel function can also be viewed as a mapping to a low-dimensional space, one of dimension only $$\tilde{O}(1/\gamma^2)$$. In this paper, we explore the question of whether one can efficiently produce such low-dimensional mappings, using only black-box access to a kernel function. That is, given just a program that computes K(x,y) on inputs x,y of our choosing, can we efficiently construct an explicit (small) set of features that effectively capture the power of the implicit high-dimensional space? We answer this question in the affirmative if our method is also allowed black-box access to the underlying data distribution (i.e., unlabeled examples). We also give a lower bound, showing that if we do not have access to the distribution, then this is not possible for an arbitrary black-box kernel function; we leave as an open problem, however, whether this can be done for standard kernel functions such as the polynomial kernel. Our positive result can be viewed as saying that designing a good kernel function is much like designing a good feature space. Given a kernel, by running it in a black-box manner on random unlabeled examples, we can efficiently generate an explicit set of $$\tilde{O}(1/\gamma^2)$$ features, such that if the data was linearly separable with margin γ under the kernel, then it is approximately separable in this new feature space.

155 citations
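
The flavor of the construction can be shown concretely. The sketch below (our simplification: it builds only the raw kernel-evaluation map and omits the linear transformation the paper applies on top of it to preserve the margin) maps each point to its kernel values against d unlabeled draws from the distribution, using nothing but black-box access to K.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_feature_map(unlabeled, kernel):
    """Map x -> (K(x, x_1), ..., K(x, x_d)) for d random unlabeled
    draws x_1..x_d: an explicit low-dimensional feature space built
    from black-box kernel access plus unlabeled data."""
    def phi(x):
        return np.array([kernel(x, u) for u in unlabeled])
    return phi

rng = np.random.default_rng(1)
unlabeled = rng.normal(size=(50, 20))   # d = 50 draws from the distribution
phi = kernel_feature_map(unlabeled, rbf)
x = rng.normal(size=20)
print(phi(x).shape)                     # (50,) explicit features
```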


Journal ArticleDOI
TL;DR: This paper provides extensive empirical evidence that the distribution-based aggregators indeed do facilitate modeling with high-dimensional categorical attributes, and conjectures special properties of identifier attributes, e.g., that they proxy for unobserved attributes and for information deeper in the relationship network.
Abstract: Identifier attributes--very high-dimensional categorical attributes such as particular product ids or people's names--rarely are incorporated in statistical modeling. However, they can play an important role in relational modeling: it may be informative to have communicated with a particular set of people or to have purchased a particular set of products. A key limitation of existing relational modeling techniques is how they aggregate bags (multisets) of values from related entities. The aggregations used by existing methods are simple summaries of the distributions of features of related entities: e.g., MEAN, MODE, SUM, or COUNT. This paper's main contribution is the introduction of aggregation operators that capture more information about the value distributions, by storing meta-data about value distributions and referencing this meta-data when aggregating--for example by computing class-conditional distributional distances. Such aggregations are particularly important for aggregating values from high-dimensional categorical attributes, for which the simple aggregates provide little information. In the first half of the paper we provide general guidelines for designing aggregation operators, introduce the new aggregators in the context of the relational learning system ACORA (Automated Construction of Relational Attributes), and provide theoretical justification. We also conjecture special properties of identifier attributes, e.g., they proxy for unobserved attributes and for information deeper in the relationship network. In the second half of the paper we provide extensive empirical evidence that the distribution-based aggregators indeed do facilitate modeling with high-dimensional categorical attributes, and in support of the aforementioned conjectures.
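
A minimal version of a distribution-based aggregate (our illustration; ACORA supports several distances and meta-data schemes): store class-conditional value distributions as meta-data, then replace each bag of identifier values with its distances to those reference distributions.

```python
from collections import Counter
import math

def distribution(bag):
    n = len(bag)
    return {v: k / n for v, k in Counter(bag).items()}

def cosine_distance(p, q):
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)

# class-conditional reference distributions built from training bags
pos_ref = distribution(["p1", "p2", "p2", "p7"])   # values seen with class +
neg_ref = distribution(["p3", "p3", "p9"])         # values seen with class -

def aggregate(bag):
    """Replace a high-dimensional bag of ids by two numeric features:
    its distributional distance to each class-conditional reference."""
    d = distribution(bag)
    return [cosine_distance(d, pos_ref), cosine_distance(d, neg_ref)]

print(aggregate(["p2", "p7", "p9"]))
```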

Journal ArticleDOI
TL;DR: This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction, which was successfully applied to standard ILP problems and real-life problems.
Abstract: Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can be adapted also to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame and mutagenicity prediction) and two real-life problems (analysis of telephone calls and traffic accident analysis).

Proceedings ArticleDOI
TL;DR: This talk will describe MLton's approach to whole-program compilation, covering the optimizations and the intermediate languages, as well as some of the engineering challenges that were overcome to make it feasible to use MLton on programs with over one hundred thousand lines.
Abstract: MLton is a stable, robust, widely ported, Standard ML (SML) compiler that generates efficient executables. Whole-program compilation is the key to MLton's success, significantly improving both correctness and efficiency. Whole-program compilation makes possible a number of optimizations that reduce or eliminate the cost of SML's powerful abstraction mechanisms, such as parametric modules, polymorphism, and higher-order functions. It also allows MLton to use a simply-typed, first-order, intermediate language. By structuring the bulk of MLton's optimizer as small passes on whole programs in this simple intermediate language, it is easy to implement and debug new optimizations. This intermediate language uses a variant of standard control-flow graphs and static single assignment form, which makes it easy to implement traditional local optimizations as well. Having the whole program also enables standard data representations such as unboxed integers and arrays, as well as efficient representations for user-defined data structures. This talk will describe MLton's approach to whole-program compilation, covering the optimizations and the intermediate languages, as well as some of the engineering challenges that were overcome to make it feasible to use MLton on programs with over one hundred thousand lines. It will also cover the history of the MLton project from its inception in 1997 until now, and give some lessons learned and thoughts on the future of MLton.

Proceedings ArticleDOI
TL;DR: This paper introduces an OCaml library that encapsulates hash-consed terms in an abstract datatype, thus safely ensuring maximal sharing; the library is also parameterized by an equality that allows the user to identify terms according to an arbitrary equivalence relation.
Abstract: Hash-consing is a technique to share values that are structurally equal. Beyond the obvious advantage of saving memory blocks, hash-consing may also be used to speed up fundamental operations and data structures by several orders of magnitude when sharing is maximal. This paper introduces an OCaml hash-consing library that encapsulates hash-consed terms in an abstract datatype, thus safely ensuring maximal sharing. This library is also parameterized by an equality that allows the user to identify terms according to an arbitrary equivalence relation.
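
The paper's library is in OCaml, but the underlying idea fits in a few lines of Python (a sketch under simplifying assumptions: a strong rather than weak hash table, and terms built only from a head symbol and already-interned children): structurally equal terms are interned in a table, so maximal sharing is guaranteed and deep equality collapses to pointer identity.

```python
class Term:
    """Hash-consed terms: structurally equal terms share one physical
    node, so deep equality collapses to pointer identity (`is`)."""
    _table = {}                      # strong table; the real library is weak
    __slots__ = ("head", "args")

    def __new__(cls, head, *args):
        # children are themselves hash-consed, so their ids are canonical
        key = (head, tuple(id(a) for a in args))
        term = cls._table.get(key)
        if term is None:
            term = object.__new__(cls)
            term.head, term.args = head, args
            cls._table[key] = term
        return term

t1 = Term("plus", Term("var"), Term("one"))
t2 = Term("plus", Term("var"), Term("one"))
print(t1 is t2)        # True: maximal sharing, O(1) equality test
```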

Proceedings ArticleDOI
Don Syme
TL;DR: This paper explores the use of a modest meta-programming extension to F# to access and leverage the functionality of LINQ and other components, and demonstrates an implementation of language integrated SQL queries using the LINQ/SQLMetal libraries.
Abstract: Language-integrated meta-programming and extensible compilation have been recurring themes of programming languages since the invention of LISP. A recent real-world application of these techniques is the use of small meta-programs to specify database queries, as used in the Microsoft LINQ extensions for .NET. It is important that .NET languages such as F# are able to leverage the functionality provided by LINQ and related components for heterogeneous execution, both for pragmatic reasons and as a first step toward applying more disciplined, formal approaches to these problems. This paper explores the use of a modest meta-programming extension to F# to access and leverage the functionality of LINQ and other components. We do this by demonstrating an implementation of language integrated SQL queries using the LINQ/SQLMetal libraries. We also sketch two other applications: the execution of data-parallel quoted F# programs on a GPU via the Accelerator libraries, and dynamic native-code compilation via LINQ.

Journal ArticleDOI
TL;DR: In this paper, Hierarchical Naive Bayes models are extended with latent variables to relax the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance.
Abstract: Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naive Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance. When this assumption is violated (which is often the case in practice) it can reduce classification accuracy due to "information double-counting" and interaction omission. In this paper we focus on a relatively new set of models, termed Hierarchical Naive Bayes models. Hierarchical Naive Bayes models extend the modeling flexibility of Naive Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naive Bayes models in the context of classification. Experimental results show that the learned models can significantly improve classification accuracy as compared to other frameworks.

Journal ArticleDOI
TL;DR: Deterministic annealing can often significantly improve the performance of semi-supervised clustering; the constrained approach is best when available labels are complete, whereas the feedback-based approach excels when available labels are incomplete.
Abstract: Semi-supervised learning has become an attractive methodology for improving classification models and is often viewed as using unlabeled data to aid supervised learning. However, it can also be viewed as using labeled data to help clustering, namely, semi-supervised clustering. Viewing semi-supervised learning from a clustering angle is useful in practical situations when the set of labels available in labeled data are not complete, i.e., unlabeled data contain new classes that are not present in labeled data. This paper analyzes several multinomial model-based semi-supervised document clustering methods under a principled model-based clustering framework. The framework naturally leads to a deterministic annealing extension of existing semi-supervised clustering approaches. We compare three (slightly) different semi-supervised approaches for clustering documents: Seeded damnl, Constrained damnl, and Feedback-based damnl, where damnl stands for multinomial model-based deterministic annealing algorithm. The first two are extensions of the seeded k-means and constrained k-means algorithms studied by Basu et al. (2002); the last one is motivated by Cohn et al. (2003). Through empirical experiments on text datasets, we show that: (a) deterministic annealing can often significantly improve the performance of semi-supervised clustering; (b) the constrained approach is the best when available labels are complete whereas the feedback-based approach excels when available labels are incomplete.
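
The seeded family of methods is the easiest to sketch. The Python below implements plain seeded k-means in the sense of Basu et al., the starting point that the paper's Seeded damnl extends with multinomial models and deterministic annealing (so this is the baseline idea, not damnl itself): cluster centers are initialized from the labeled documents of each class, then refined on all data.

```python
import numpy as np

def seeded_kmeans(X, seeds, n_iter=20):
    """Seeded k-means sketch.
    X:     (n, d) data matrix
    seeds: dict class_label -> array of labeled row indices; centers
           are initialized from the labeled seed points."""
    centers = np.stack([X[idx].mean(axis=0) for idx in seeds.values()])
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for k in range(len(centers)):
            members = X[assign == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return assign, centers

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
assign, _ = seeded_kmeans(X, {"c0": np.arange(3), "c1": np.arange(30, 33)})
print(assign[:5], assign[30:35])
```

The constrained variant discussed above would additionally clamp the labeled points to their known clusters on every iteration rather than letting them be reassigned.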

Journal ArticleDOI
TL;DR: This paper discusses the problem of rule-based classification of XML data by using frequent discriminatory substructures within XML documents and shows the effectiveness of the method with respect to other classifiers.
Abstract: XML documents have recently become ubiquitous because of their varied applicability in a number of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of information hidden inside the documents. In this paper we discuss the problem of rule-based classification of XML data by using frequent discriminatory substructures within XML documents. Such a technique is more capable of finding the classification characteristics of documents. In addition, the technique can also be extended to cost sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data.

Journal ArticleDOI
TL;DR: AI research on computer games began to follow developments in the games industry early on, and since John Laird's keynote address at the AAAI 2000 conference, numerous workshops, conferences, and special issues of journals have demonstrated the growing importance of game-playing applications for Artificial Intelligence.
Abstract: The history of the interaction of machine learning and computer game-playing goes back to the earliest days of Artificial Intelligence, when Arthur Samuel worked on his famous checker-playing program, pioneering many machine-learning and game-playing techniques (Samuel, 1959, 1967). Since then, both fields have advanced considerably, and research in the intersection of the two can be found regularly in conferences in their respective fields and in general AI conferences. For surveys of the field we refer to Ginsberg (1998), Schaeffer (2000), Furnkranz (2001); edited volumes have been compiled by Schaeffer and van den Herik (2002) and by Furnkranz and Kubat (2001). In recent years, the computer games industry has discovered AI as a necessary ingredient to make games more entertaining and challenging and, vice versa, AI has discovered computer games as an interesting and rewarding application area. The industry’s perspective is witnessed by a plethora of recent books on gentle introductions to AI techniques for game programmers (Collins, 2002; Champanard, 2003; Bourg & Seemann, 2004; Schwab, 2004) or a series of edited collections of articles (Rabin, 2002, 2003, 2006). AI research on computer games began to follow developments in the games industry early on, but since John Laird’s keynote address at the AAAI 2000 conference, in which he advocated Interactive Computer Games as a challenging and rewarding application area for AI (Laird & van Lent, 2001), numerous workshops (Fu & Orkin, 2004; Aha et al., 2005), conferences, and special issues of journals (Forbus & Laird, 2002) demonstrate the growing importance of game-playing applications for Artificial Intelligence.

Journal ArticleDOI
TL;DR: This work presents a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data, and shows that a simple frame-level note classifier, temporally smoothed by post processing with a hidden Markov model, produces results comparable to state of the art model-based transcription systems.
Abstract: The melody of a musical piece--informally, the part you would hum along with--is a useful and compact summary of a full audio recording. The extraction of melodic content has practical applications ranging from content-based audio retrieval to the analysis of musical structure. Whereas previous systems generate transcriptions based on a model of the harmonic (or periodic) structure of musical pitches, we present a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data. We evaluate the success of our algorithm by predicting the melody of the ADC 2004 Melody Competition evaluation set, and we show that a simple frame-level note classifier, temporally smoothed by post processing with a hidden Markov model, produces results comparable to state of the art model-based transcription systems.
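
The post-processing stage is the most reusable piece of this design: per-frame classifier posteriors are decoded through an HMM so that note labels cannot flicker frame to frame. A numpy Viterbi sketch (ours; the self-transition probability and the uniform switching model are guesses, not the paper's trained transition matrix):

```python
import numpy as np

def viterbi_smooth(frame_probs, stay=0.9):
    """Temporal smoothing of per-frame classifier posteriors with a
    simple HMM: probability `stay` of keeping the current note,
    uniform probability of switching otherwise.

    frame_probs: (T, N) posteriors over N note classes per frame.
    Returns the Viterbi path of note labels."""
    T, N = frame_probs.shape
    log_a = np.full((N, N), np.log((1 - stay) / (N - 1)))
    np.fill_diagonal(log_a, np.log(stay))
    log_p = np.log(frame_probs + 1e-12)

    delta = log_p[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_a          # (prev state, current state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_p[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.45, 0.55], [0.9, 0.1]])
print(viterbi_smooth(probs))   # isolated flickers get smoothed away
```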

Journal ArticleDOI
TL;DR: The results show that the character-level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy when compared with the currently popular methods, such as naive Bayes.
Abstract: We present an approach to email filtering based on the suffix tree data structure. A method for the scoring of emails using the suffix tree is developed and a number of scoring and score normalisation functions are tested. Our results show that the character-level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy when compared with the currently popular methods, such as naive Bayes. We believe the method can be extended to the classification of documents in other domains.

Journal ArticleDOI
TL;DR: A unified view of binary data clustering is presented by examining the connections among various clustering criteria, and experimental studies are conducted to empirically verify the relationships.
Abstract: Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This paper studies the problem of clustering binary data. Binary data have been occupying a special place in the domain of data analysis. A unified view of binary data clustering is presented by examining the connections among various clustering criteria. Experimental studies are conducted to empirically verify the relationships.

Journal ArticleDOI
TL;DR: A new method for establishing an alignment between a polyphonic musical score and a corresponding sampled audio performance using a graphical model containing both latent discrete variables, corresponding to score position, as well as a latent continuous tempo process.
Abstract: We present a new method for establishing an alignment between a polyphonic musical score and a corresponding sampled audio performance. The method uses a graphical model containing both latent discrete variables, corresponding to score position, as well as a latent continuous tempo process. We use a simple data model based only on the pitch content of the audio signal. The data interpretation is defined to be the most likely configuration of the hidden variables, given the data, and we develop computational methodology to identify or approximate this configuration using a variant of dynamic programming involving parametrically represented continuous variables. Experiments are presented on a 55-minute hand-marked orchestral test set.

Journal ArticleDOI
TL;DR: The goal in this work is to present a unifying framework that supports all of the types of relational uncertainty yet is based on logic programming formalisms, and facilitates understanding the relationship between the frame-based approaches and alternate logic programming approaches, and allows greater transfer of ideas between them.
Abstract: In this paper, we describe the syntax and semantics for a probabilistic relational language (PRL). PRL is a recasting of recent work in Probabilistic Relational Models (PRMs) into a logic programming framework. We show how to represent varying degrees of complexity in the semantics including attribute uncertainty, structural uncertainty and identity uncertainty. Our approach is similar in spirit to the work in Bayesian Logic Programs (BLPs), and Logical Bayesian Networks (LBNs). However, surprisingly, there are still some important differences in the resulting formalism; for example, we introduce a general notion of aggregates based on the PRM approaches. One of our contributions is that we show how to support richer forms of structural uncertainty in a probabilistic logical language than have been previously described. Our goal in this work is to present a unifying framework that supports all of the types of relational uncertainty yet is based on logic programming formalisms. We also believe that it facilitates understanding the relationship between the frame-based approaches and alternate logic programming approaches, and allows greater transfer of ideas between them.

Proceedings ArticleDOI
TL;DR: This paper describes how Ocsigen uses the Objective Caml type system in a thoroughgoing way in order to produce valid XHTML and valid remote function calls through links and form clicking.
Abstract: Ocsigen is a framework for programming highly dynamic web sites in Objective Caml. It makes it possible to program sites as OCaml applications and introduces new concepts to take into account the particularities of Web interaction, especially the management of URLs and sessions. This paper describes how Ocsigen uses the Objective Caml type system in a thoroughgoing way in order to produce valid XHTML and valid remote function calls through links and form clicking. It also describes how Ocsigen handles the progression of a Web user through a site, using sophisticated and high-level session mechanisms.

Journal ArticleDOI
TL;DR: A representation for melodic segment classes is presented and it is applied to sequential pattern discovery and to the statistical modeling of musical style.
Abstract: This paper presents a representation for melodic segment classes and applies it to music data mining. Melody is modeled as a sequence of segments, each segment being a sequence of notes. These segments are assigned to classes through a knowledge representation scheme which allows the flexible construction of abstract views of the music surface. The representation is applied to sequential pattern discovery and to the statistical modeling of musical style.

Journal ArticleDOI
TL;DR: A new decision-making algorithm that allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information is presented.
Abstract: Bridge bidding is considered to be one of the most difficult problems for game-playing programs. It involves four agents rather than two, including a cooperative agent. In addition, the partial observability of the game makes it impossible to predict the outcome of each action. In this paper we present a new decision-making algorithm that is capable of overcoming these problems. The algorithm allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information. The paper also presents a learning framework that uses the above decision-making algorithm for co-training of partners. The agents refine their selection strategies during training and continuously exchange their refined strategies. The refinement is based on inductive learning applied to examples accumulated for classes of states with conflicting actions. The algorithm was empirically evaluated on a set of bridge deals. The pair of agents that co-trained significantly improved their bidding performance to a level surpassing that of the current state-of-the-art bidding algorithm.

Journal ArticleDOI
TL;DR: A simple and new approach to Boolean networks is suggested, and a randomized network search algorithm with average time complexity $$O(mn^{k+1}/(\log m)^{k-1})$$ is provided.
Abstract: Boolean networks provide a simple and intuitive model for gene regulatory networks, but a critical defect is the time required to learn the networks. In recent years, efficient network search algorithms have been developed for a noise-free case and for a limited function class. In general, the conventional algorithm has the high time complexity of $$O(2^{2k} mn^{k+1})$$ where m is the number of measurements, n is the number of nodes (genes), and k is the number of input parents. Here, we suggest a simple and new approach to Boolean networks, and provide a randomized network search algorithm with average time complexity $$O(mn^{k+1}/(\log m)^{k-1})$$. We show the efficiency of our algorithm via computational experiments, and present optimal parameters. Additionally, we provide tests for yeast expression data.
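
The search problem can be made concrete with a toy consistency check (our illustration; the paper's randomized algorithm and the complexity bound above rest on a more careful analysis): a candidate set of k parents for a node is consistent with the measurements when equal parent values never lead to different next values of the node.

```python
import itertools, random

def consistent_parent_sets(data, target, k, n_trials=2000):
    """Randomized search for k-input parent sets explaining one node.

    data: list of (state, next_state) pairs of 0/1 tuples.
    Returns parent index sets consistent with all transitions, i.e.
    equal parent values never lead to different target values."""
    n = len(data[0][0])
    found = set()
    for _ in range(n_trials):
        parents = tuple(sorted(random.sample(range(n), k)))
        table = {}
        ok = True
        for state, nxt in data:
            key = tuple(state[i] for i in parents)
            if table.setdefault(key, nxt[target]) != nxt[target]:
                ok = False
                break
        if ok:
            found.add(parents)
    return found

# toy network: next state of node 0 is x1 AND x2
random.seed(0)
data = [(s, (s[1] & s[2], 0, 0)) for s in itertools.product([0, 1], repeat=3)]
print(consistent_parent_sets(data, target=0, k=2))   # {(1, 2)}
```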

Proceedings ArticleDOI
TL;DR: This paper presents a new way to generate type-error messages in a polymorphic, implicitly and strongly typed language (specifically Caml), separating error-message generation from type-checking by taking a fundamentally new approach.
Abstract: We present a new way to generate type-error messages in a polymorphic, implicitly and strongly typed language (specifically Caml). Our method separates error-message generation from type-checking by taking a fundamentally new approach: we present to programmers small term-level modifications that cause an ill-typed program to become well-typed. This approach aims to improve feedback to programmers with no change to the underlying type-checker nor the compilation of well-typed programs. We have added a prototype implementation of our approach to the Objective Caml system by intercepting type-checker error messages and using the type-checker on candidate changes to see if they succeed. This novel front-end architecture naturally decomposes into (1) enumerating local changes to the abstract syntax tree that may remove type errors, (2) searching for places to try the changes, (3) using the type-checker to evaluate the changes, and (4) ranking the changes and presenting them to the user.

Journal ArticleDOI
TL;DR: This work investigates the problem of how a performance played at a particular tempo can be rendered automatically at another tempo, while preserving naturally sounding expressivity, and presents a case-based reasoning system called TempoExpress for addressing this problem.
Abstract: The research presented in this paper focuses on global tempo transformations of monophonic audio recordings of saxophone jazz performances. We are investigating the problem of how a performance played at a particular tempo can be rendered automatically at another tempo, while preserving naturally sounding expressivity. Or, differently stated, how does expressiveness change with global tempo. Changing the tempo of a given melody is a problem that cannot be reduced to just applying a uniform transformation to all the notes of a musical piece. The expressive resources for emphasizing the musical structure of the melody and the affective content differ depending on the performance tempo. We present a case-based reasoning system called TempoExpress for addressing this problem, and describe the experimental results obtained with our approach.