
Showing papers in "Machine Learning in 2006"


Journal ArticleDOI
TL;DR: A new tree-based ensemble method for supervised classification and regression problems that strongly randomizes both attribute and cut-point choice when splitting a tree node and, in the extreme case, builds totally randomized trees whose structures are independent of the output values of the learning sample.
Abstract: This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.

5,246 citations
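
The split procedure at the heart of the method is compact enough to sketch. The following Python fragment is a minimal illustration of extremely randomized splitting (our reconstruction, not the authors' code; the helper names and the Gini-based score are our choices): each of K randomly chosen attributes gets a single uniformly drawn cut-point, and the best-scoring pair is kept. With K = 1 the score is never compared, which yields the totally randomized trees mentioned in the abstract.

```python
import random

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_score(left, right):
    # negative weighted Gini impurity: higher is better
    n = len(left) + len(right)
    return -(len(left) * gini(left) + len(right) * gini(right)) / n

def pick_random_split(X, y, K):
    """Extra-Trees-style node splitting: ONE uniform random cut-point
    per candidate attribute, instead of an exhaustive search."""
    best = None
    for a in random.sample(range(len(X[0])), K):
        values = [row[a] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:                       # constant attribute: cannot split
            continue
        t = random.uniform(lo, hi)         # the single random cut-point
        left = [yi for row, yi in zip(X, y) if row[a] < t]
        right = [yi for row, yi in zip(X, y) if row[a] >= t]
        if left and right:
            s = split_score(left, right)
            if best is None or s > best[0]:
                best = (s, a, t)
    return best                            # (score, attribute, threshold)

X = [[0.1, 3.0], [0.9, 2.0], [0.4, 1.0], [0.7, 4.0]]
y = [0, 1, 0, 1]
print(pick_random_split(X, y, K=2))
```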


Journal ArticleDOI
TL;DR: Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach to combining first-order logic and probabilistic graphical models in a single representation.
Abstract: We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.

2,916 citations
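
The semantics is concrete enough to demonstrate in a few lines. Below is a toy, brute-force rendering of a ground Markov network in Python (our illustration only: one hand-coded clause, two constants, and exhaustive enumeration in place of the paper's MCMC inference and learned weights). The probability of a world is proportional to the exponentiated weighted count of true groundings.

```python
import itertools, math

# One weighted clause: Smokes(x) => Cancer(x), with an arbitrary weight.
constants = ["Anna", "Bob"]
w = 1.5

atoms = [("Smokes", c) for c in constants] + [("Cancer", c) for c in constants]

def n_true_groundings(world):
    # world: dict atom -> bool; count constants x for which the clause holds
    return sum(1 for c in constants
               if (not world[("Smokes", c)]) or world[("Cancer", c)])

worlds = [dict(zip(atoms, bits))
          for bits in itertools.product([False, True], repeat=len(atoms))]
weights = [math.exp(w * n_true_groundings(wld)) for wld in worlds]

# Conditional query by enumeration (the paper uses MCMC instead):
num = sum(wt for wld, wt in zip(worlds, weights)
          if wld[("Smokes", "Anna")] and wld[("Cancer", "Anna")])
den = sum(wt for wld, wt in zip(worlds, weights) if wld[("Smokes", "Anna")])
print("P(Cancer(Anna) | Smokes(Anna)) =", num / den)
```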


Journal ArticleDOI
TL;DR: The first empirical results simultaneously comparing most of the major Bayesian network learning algorithms (PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search) against each other are presented.
Abstract: We present a new algorithm for Bayesian network structure learning, called Max-Min Hill-Climbing (MMHC). The algorithm combines ideas from local learning, constraint-based, and search-and-score techniques in a principled and effective way. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. In our extensive empirical evaluation MMHC outperforms on average and in terms of various metrics several prototypical and state-of-the-art algorithms, namely the PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search. These are the first empirical results simultaneously comparing most of the major Bayesian network algorithms against each other. MMHC offers certain theoretical advantages, specifically over the Sparse Candidate algorithm, corroborated by our experiments. MMHC and detailed results of our study are publicly available at http://www.dsl-lab.org/supplements/mmhc_paper/mmhc_index.html.

1,682 citations


Journal ArticleDOI
TL;DR: A theoretical analysis of six existing diversity measures is presented, showing underlying relationships between them and relating them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms.
Abstract: Diversity among the base classifiers is deemed to be important when constructing a classifier ensemble. Numerous algorithms have been proposed to construct a good classifier ensemble by seeking both the accuracy of the base classifiers and the diversity among them. However, there is no generally accepted definition of diversity, and measuring the diversity explicitly is very difficult. Although researchers have designed several experimental studies to compare different diversity measures, usually confusing results were observed. In this paper, we present a theoretical analysis on six existing diversity measures (namely disagreement measure, double fault measure, KW variance, inter-rater agreement, generalized diversity and measure of difficulty), show underlying relationships between them, and relate them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms. We illustrate why confusing experimental results were observed and show that the discussed diversity measures are naturally ineffective. Our analysis provides a deeper understanding of the concept of diversity, and hence can help design better ensemble learning algorithms.

380 citations
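
Two of the analyzed measures are easy to state concretely. The sketch below (our own minimal Python, using the standard "oracle output" convention of recording only whether each classifier is correct) computes the disagreement and double-fault measures for one pair of base classifiers.

```python
def pairwise_diversity(correct_i, correct_j):
    """Disagreement and double-fault for one pair of base classifiers.
    Inputs are 'oracle' outputs: True where the classifier is correct."""
    pairs = list(zip(correct_i, correct_j))
    n = len(pairs)
    n00 = sum(1 for a, b in pairs if not a and not b)   # both wrong
    n10 = sum(1 for a, b in pairs if a and not b)
    n01 = sum(1 for a, b in pairs if not a and b)
    disagreement = (n10 + n01) / n    # higher = more diverse
    double_fault = n00 / n            # lower  = more diverse
    return disagreement, double_fault

print(pairwise_diversity([True, True, False, True],
                         [True, False, False, True]))   # (0.25, 0.25)
```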


Journal ArticleDOI
TL;DR: This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs.
Abstract: This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing classifier performance for most purposes. This is because they visually support several crucial types of performance assessment that cannot be done easily with ROC curves, such as showing confidence intervals on a classifier's performance, and visualizing the statistical significance of the difference in performance of two classifiers. A software tool supporting all the cost curve analysis described in this paper is available from the authors.

313 citations
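
The construction itself is simple: under equal misclassification costs, a classifier with false positive rate FPR and false negative rate FNR = 1 - TPR appears in cost space as the line FNR·p + FPR·(1 - p) over the class prior p, and the lower envelope of a set of such lines shows the best classifier at each operating condition. A minimal numpy sketch (ours, not the authors' tool, which also handles unequal costs via a normalized probability-cost axis):

```python
import numpy as np

def cost_line(fpr, fnr, pc):
    """Expected error of one classifier as a function of pc = P(+),
    assuming equal misclassification costs. Each classifier is a
    straight line in cost space."""
    return fnr * pc + fpr * (1.0 - pc)

pc = np.linspace(0.0, 1.0, 101)
clf_a = cost_line(fpr=0.10, fnr=0.40, pc=pc)
clf_b = cost_line(fpr=0.30, fnr=0.15, pc=pc)
envelope = np.minimum(clf_a, clf_b)   # pointwise best of the two classifiers
print("crossover near pc =", pc[np.argmin(np.abs(clf_a - clf_b))])
```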


Journal ArticleDOI
TL;DR: An algorithm that predicts musical genre and artist from an audio waveform using the ensemble learner ADABOOST, together with evidence collected from a variety of popular features and classifiers that classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.
Abstract: We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner ADABOOST to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.

296 citations
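
The key design decision reported here, aggregating short-timescale features over segments before classification, is easy to sketch. The following Python uses scikit-learn's AdaBoostClassifier as a stand-in for the authors' AdaBoost variant and random noise as a stand-in for real audio features such as MFCCs; only the aggregate-then-classify-then-vote shape of the pipeline is faithful to the description above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def aggregate_segments(frames, seg_len=128):
    """Mean+variance of frame-level features over fixed-length segments.
    frames: (n_frames, n_features); returns one aggregated row per segment."""
    segs = []
    for start in range(0, len(frames) - seg_len + 1, seg_len):
        chunk = frames[start:start + seg_len]
        segs.append(np.concatenate([chunk.mean(axis=0), chunk.var(axis=0)]))
    return np.array(segs)

# toy demo: two "songs" with different feature statistics
rng = np.random.default_rng(0)
song_a = aggregate_segments(rng.normal(0.0, 1.0, (1024, 13)))
song_b = aggregate_segments(rng.normal(0.5, 1.5, (1024, 13)))
X = np.vstack([song_a, song_b])
y = np.array([0] * len(song_a) + [1] * len(song_b))

clf = AdaBoostClassifier(n_estimators=50).fit(X, y)
# song-level decision: majority vote over the song's segments
print("song A vote:", np.bincount(clf.predict(song_a)).argmax())
```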


Journal ArticleDOI
TL;DR: By implementing the technique in the game Neverwinter Nights, it is shown that dynamic scripting can be successfully applied to the online adaptation of game AI in commercial computer games.
Abstract: Online learning in commercial computer games allows computer-controlled opponents to adapt to the way the game is being played. As such it provides a mechanism to deal with weaknesses in the game AI, and to respond to changes in human player tactics. We argue that online learning of game AI should meet four computational and four functional requirements. The computational requirements are speed, effectiveness, robustness and efficiency. The functional requirements are clarity, variety, consistency and scalability. This paper investigates a novel online learning technique for game AI called "dynamic scripting", which uses an adaptive rulebase for the generation of game AI on the fly. The performance of dynamic scripting is evaluated in experiments in which adaptive agents are pitted against a collection of manually-designed tactics in a simulated computer role-playing game. Experimental results indicate that dynamic scripting succeeds in endowing computer-controlled opponents with adaptive performance. To further improve the dynamic-scripting technique, an enhancement is investigated that allows scaling of the difficulty level of the game AI to the human player's skill level. With the enhancement, dynamic scripting meets all computational and functional requirements. The applicability of dynamic scripting in state-of-the-art commercial games is demonstrated by implementing the technique in the game Neverwinter Nights. We conclude that dynamic scripting can be successfully applied to the online adaptation of game AI in commercial computer games.

274 citations
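
A minimal reconstruction of the dynamic-scripting loop (ours, with made-up parameter values; the paper's weight-update function and clamping rules are more elaborate): rules carry weights, a script is sampled with weight-proportional probability, and after each encounter the weights of the used rules are adjusted by a fitness signal while the change is redistributed over the unused rules.

```python
import random

class Rulebase:
    """Minimal dynamic-scripting sketch: rules carry weights, scripts
    are sampled by weight, and weights adapt after each encounter."""

    def __init__(self, rules, w0=100, w_min=5, w_max=400):
        self.weights = {r: w0 for r in rules}
        self.w_min, self.w_max = w_min, w_max

    def _clamp(self, w):
        return max(self.w_min, min(self.w_max, w))

    def generate_script(self, size):
        pool = list(self.weights)
        script = []
        while len(script) < size and pool:
            pick = random.choices(pool, [self.weights[r] for r in pool])[0]
            script.append(pick)
            pool.remove(pick)              # no duplicate rules in one script
        return script

    def update(self, script, fitness, max_adj=30):
        """fitness in [0, 1]; 0.5 is break-even. Used rules are rewarded
        or punished; the total change is spread over unused rules so the
        overall weight mass stays roughly constant."""
        adj = int(max_adj * (fitness - 0.5) * 2)
        delta = 0
        for r in script:
            new = self._clamp(self.weights[r] + adj)
            delta += new - self.weights[r]
            self.weights[r] = new
        rest = [r for r in self.weights if r not in script]
        for r in rest:
            self.weights[r] = self._clamp(self.weights[r] - delta // len(rest))

rb = Rulebase(["charge", "heal", "fireball", "flee", "defend"])
for _ in range(20):                        # pretend 'charge' keeps winning
    s = rb.generate_script(3)
    rb.update(s, fitness=1.0 if "charge" in s else 0.2)
print(sorted(rb.weights.items(), key=lambda kv: -kv[1]))
```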


Journal ArticleDOI
TL;DR: This paper reviews the literature on deterministic and stochastic stepsize rules, derives formulas for optimal stepsizes that minimize estimation error, and proposes an approximation for the case where the parameters are unknown.
Abstract: We address the problem of determining optimal stepsizes for estimating parameters in the context of approximate dynamic programming. The sufficient conditions for convergence of the stepsize rules have been known for 50 years, but practical computational work tends to use formulas with parameters that have to be tuned for specific applications. The problem is that in most applications in dynamic programming, observations for estimating a value function typically come from a data series that can be initially highly transient. The degree of transience affects the choice of stepsize parameters that produce the fastest convergence. In addition, the degree of initial transience can vary widely among the value function parameters for the same dynamic program. This paper reviews the literature on deterministic and stochastic stepsize rules, and derives formulas for optimal stepsizes for minimizing estimation error. This formula assumes certain parameters are known, and an approximation is proposed for the case where the parameters are unknown. Experimental work shows that the approximation provides faster convergence than other popular formulas.

201 citations
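
Two deterministic rules of the kind reviewed here can be stated in a few lines. The sketch below (ours; parameter values are arbitrary) contrasts a generalized harmonic stepsize a/(a + n - 1), which declines more slowly for larger a and so copes better with an initial transient, with McClain's rule, which behaves like 1/n early on but levels off at a target stepsize so the estimate keeps adapting.

```python
def harmonic(a):
    # alpha_n = a / (a + n - 1): larger a keeps the stepsize big longer,
    # which helps when the early observations are transient.
    def rule(n):
        return a / (a + n - 1)
    return rule

def mcclain(target):
    # McClain's rule: alpha_n = alpha_{n-1} / (1 + alpha_{n-1} - target);
    # decays like 1/n at first but levels off at `target`, so the
    # estimate keeps adapting in nonstationary settings.
    state = {"alpha": 1.0}
    def rule(n):
        if n > 1:
            a = state["alpha"]
            state["alpha"] = a / (1.0 + a - target)
        return state["alpha"]
    return rule

def smooth(observations, rule):
    est = 0.0
    for n, obs in enumerate(observations, start=1):
        alpha = rule(n)
        est = (1 - alpha) * est + alpha * obs   # exponential smoothing step
    return est

data = [10, 9, 7, 5, 5, 5, 5, 5]   # initially transient series
print(smooth(data, harmonic(a=10)), smooth(data, mcclain(target=0.1)))
```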


Journal ArticleDOI
Marc Boullé
TL;DR: A new discretization method, MODL, is proposed, founded on a Bayesian approach, which results in the definition of a Bayes optimal evaluation criterion of discretizations, together with a new super-linear optimization algorithm that finds near-optimal discretizations.
Abstract: While real data often comes in mixed format, discrete and continuous, many supervised induction algorithms require discrete data. Efficient discretization of continuous attributes is an important problem that has effects on speed, accuracy and understandability of the induction models. In this paper, we propose a new discretization method MODL, founded on a Bayesian approach. We introduce a space of discretization models and a prior distribution defined on this model space. This results in the definition of a Bayes optimal evaluation criterion of discretizations. We then propose a new super-linear optimization algorithm that manages to find near-optimal discretizations. Extensive comparative experiments both on real and synthetic data demonstrate the high inductive performances obtained by the new discretization method.

175 citations


Journal ArticleDOI
TL;DR: The question of whether one can efficiently produce low-dimensional mappings, using only black-box access to a kernel function, is explored, and it is found that if the data was linearly separable with margin γ under the kernel, then it is approximately separable in this new feature space.
Abstract: Kernel functions are typically viewed as providing an implicit mapping of points into a high-dimensional space, with the ability to gain much of the power of that space without incurring a high cost if the result is linearly-separable by a large margin γ. However, the Johnson-Lindenstrauss lemma suggests that in the presence of a large margin, a kernel function can also be viewed as a mapping to a low-dimensional space, one of dimension only $$\tilde{O}(1/\gamma^2)$$. In this paper, we explore the question of whether one can efficiently produce such low-dimensional mappings, using only black-box access to a kernel function. That is, given just a program that computes K(x,y) on inputs x,y of our choosing, can we efficiently construct an explicit (small) set of features that effectively capture the power of the implicit high-dimensional space? We answer this question in the affirmative if our method is also allowed black-box access to the underlying data distribution (i.e., unlabeled examples). We also give a lower bound, showing that if we do not have access to the distribution, then this is not possible for an arbitrary black-box kernel function; we leave as an open problem, however, whether this can be done for standard kernel functions such as the polynomial kernel. Our positive result can be viewed as saying that designing a good kernel function is much like designing a good feature space. Given a kernel, by running it in a black-box manner on random unlabeled examples, we can efficiently generate an explicit set of $$\tilde{O}(1/\gamma^2)$$ features, such that if the data was linearly separable with margin γ under the kernel, then it is approximately separable in this new feature space.

155 citations
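
The flavor of the construction can be shown concretely. The sketch below (our simplification: it builds only the raw kernel-evaluation map and omits the linear transformation the paper applies on top of it to preserve the margin) maps each point to its kernel values against d unlabeled draws from the distribution, using nothing but black-box access to K.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_feature_map(unlabeled, kernel):
    """Map x -> (K(x, x_1), ..., K(x, x_d)) for d random unlabeled
    draws x_1..x_d: an explicit low-dimensional feature space built
    from black-box kernel access plus unlabeled data."""
    def phi(x):
        return np.array([kernel(x, u) for u in unlabeled])
    return phi

rng = np.random.default_rng(1)
unlabeled = rng.normal(size=(50, 20))   # d = 50 draws from the distribution
phi = kernel_feature_map(unlabeled, rbf)
x = rng.normal(size=20)
print(phi(x).shape)                     # (50,) explicit features
```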


Journal ArticleDOI
TL;DR: This paper provides extensive empirical evidence that the distribution-based aggregators indeed do facilitate modeling with high-dimensional categorical attributes, and conjectures special properties of identifier attributes, e.g., that they proxy for unobserved attributes and for information deeper in the relationship network.
Abstract: Identifier attributes--very high-dimensional categorical attributes such as particular product ids or people's names--rarely are incorporated in statistical modeling. However, they can play an important role in relational modeling: it may be informative to have communicated with a particular set of people or to have purchased a particular set of products. A key limitation of existing relational modeling techniques is how they aggregate bags (multisets) of values from related entities. The aggregations used by existing methods are simple summaries of the distributions of features of related entities: e.g., MEAN, MODE, SUM, or COUNT. This paper's main contribution is the introduction of aggregation operators that capture more information about the value distributions, by storing meta-data about value distributions and referencing this meta-data when aggregating--for example by computing class-conditional distributional distances. Such aggregations are particularly important for aggregating values from high-dimensional categorical attributes, for which the simple aggregates provide little information. In the first half of the paper we provide general guidelines for designing aggregation operators, introduce the new aggregators in the context of the relational learning system ACORA (Automated Construction of Relational Attributes), and provide theoretical justification. We also conjecture special properties of identifier attributes, e.g., they proxy for unobserved attributes and for information deeper in the relationship network. In the second half of the paper we provide extensive empirical evidence that the distribution-based aggregators indeed do facilitate modeling with high-dimensional categorical attributes, and in support of the aforementioned conjectures.
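
A minimal version of a distribution-based aggregate (our illustration; ACORA supports several distances and meta-data schemes): store class-conditional value distributions as meta-data, then replace each bag of identifier values with its distances to those reference distributions.

```python
from collections import Counter
import math

def distribution(bag):
    n = len(bag)
    return {v: k / n for v, k in Counter(bag).items()}

def cosine_distance(p, q):
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)

# class-conditional reference distributions built from training bags
pos_ref = distribution(["p1", "p2", "p2", "p7"])   # values seen with class +
neg_ref = distribution(["p3", "p3", "p9"])         # values seen with class -

def aggregate(bag):
    """Replace a high-dimensional bag of ids by two numeric features:
    its distributional distance to each class-conditional reference."""
    d = distribution(bag)
    return [cosine_distance(d, pos_ref), cosine_distance(d, neg_ref)]

print(aggregate(["p2", "p7", "p9"]))
```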

Journal ArticleDOI
TL;DR: This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction, which was successfully applied to standard ILP problems and real-life problems.
Abstract: Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can be adapted also to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame and mutagenicity prediction) and two real-life problems (analysis of telephone calls and traffic accident analysis).

Proceedings ArticleDOI
TL;DR: This talk will describe MLton's approach to whole-program compilation, covering the optimizations and the intermediate languages, as well as some of the engineering challenges that were overcome to make it feasible to use MLton on programs with over one hundred thousand lines.
Abstract: MLton is a stable, robust, widely ported, Standard ML (SML) compiler that generates efficient executables. Whole-program compilation is the key to MLton's success, significantly improving both correctness and efficiency. Whole-program compilation makes possible a number of optimizations that reduce or eliminate the cost of SML's powerful abstraction mechanisms, such as parametric modules, polymorphism, and higher-order functions. It also allows MLton to use a simply-typed, first-order, intermediate language. By structuring the bulk of MLton's optimizer as small passes on whole programs in this simple intermediate language, it is easy to implement and debug new optimizations. This intermediate language uses a variant of standard control-flow graphs and static single assignment form, which makes it easy to implement traditional local optimizations as well. Having the whole program also enables standard data representations such as unboxed integers and arrays, as well as efficient representations for user-defined data structures. This talk will describe MLton's approach to whole-program compilation, covering the optimizations and the intermediate languages, as well as some of the engineering challenges that were overcome to make it feasible to use MLton on programs with over one hundred thousand lines. It will also cover the history of the MLton project from its inception in 1997 until now, and give some lessons learned and thoughts on the future of MLton.

Proceedings ArticleDOI
TL;DR: This paper introduces an OCaml library that encapsulates hash-consed terms in an abstract datatype, thus safely ensuring maximal sharing; the library is also parameterized by an equality that allows the user to identify terms according to an arbitrary equivalence relation.
Abstract: Hash-consing is a technique to share values that are structurally equal. Beyond the obvious advantage of saving memory blocks, hash-consing may also be used to speed up fundamental operations and data structures by several orders of magnitude when sharing is maximal. This paper introduces an OCaml hash-consing library that encapsulates hash-consed terms in an abstract datatype, thus safely ensuring maximal sharing. This library is also parameterized by an equality that allows the user to identify terms according to an arbitrary equivalence relation.
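
The paper's library is in OCaml, but the underlying idea fits in a few lines of Python (a sketch under simplifying assumptions: a strong rather than weak hash table, and terms built only from a head symbol and already-interned children): structurally equal terms are interned in a table, so maximal sharing is guaranteed and deep equality collapses to pointer identity.

```python
class Term:
    """Hash-consed terms: structurally equal terms share one physical
    node, so deep equality collapses to pointer identity (`is`)."""
    _table = {}                      # strong table; the real library is weak
    __slots__ = ("head", "args")

    def __new__(cls, head, *args):
        # children are themselves hash-consed, so their ids are canonical
        key = (head, tuple(id(a) for a in args))
        term = cls._table.get(key)
        if term is None:
            term = object.__new__(cls)
            term.head, term.args = head, args
            cls._table[key] = term
        return term

t1 = Term("plus", Term("var"), Term("one"))
t2 = Term("plus", Term("var"), Term("one"))
print(t1 is t2)        # True: maximal sharing, O(1) equality test
```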

Proceedings ArticleDOI
Don Syme
TL;DR: This paper explores the use of a modest meta-programming extension to F# to access and leverage the functionality of LINQ and other components, and demonstrates an implementation of language integrated SQL queries using the LINQ/SQLMetal libraries.
Abstract: Language-integrated meta-programming and extensible compilation have been recurring themes of programming languages since the invention of LISP. A recent real-world application of these techniques is the use of small meta-programs to specify database queries, as used in the Microsoft LINQ extensions for .NET. It is important that .NET languages such as F# are able to leverage the functionality provided by LINQ and related components for heterogeneous execution, both for pragmatic reasons and as a first step toward applying more disciplined, formal approaches to these problems. This paper explores the use of a modest meta-programming extension to F# to access and leverage the functionality of LINQ and other components. We do this by demonstrating an implementation of language integrated SQL queries using the LINQ/SQLMetal libraries. We also sketch two other applications: the execution of data-parallel quoted F# programs on a GPU via the Accelerator libraries, and dynamic native-code compilation via LINQ.

Journal ArticleDOI
TL;DR: In this paper, Hierarchical Naive Bayes models are extended with latent variables to relax the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance.
Abstract: Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naive Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance. When this assumption is violated (which is often the case in practice) it can reduce classification accuracy due to "information double-counting" and interaction omission. In this paper we focus on a relatively new set of models, termed Hierarchical Naive Bayes models. Hierarchical Naive Bayes models extend the modeling flexibility of Naive Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naive Bayes models in the context of classification. Experimental results show that the learned models can significantly improve classification accuracy as compared to other frameworks.

Journal ArticleDOI
TL;DR: Deterministic annealing can often significantly improve the performance of semi-supervised clustering; the constrained approach is best when available labels are complete, whereas the feedback-based approach excels when available labels are incomplete.
Abstract: Semi-supervised learning has become an attractive methodology for improving classification models and is often viewed as using unlabeled data to aid supervised learning. However, it can also be viewed as using labeled data to help clustering, namely, semi-supervised clustering. Viewing semi-supervised learning from a clustering angle is useful in practical situations when the set of labels available in labeled data are not complete, i.e., unlabeled data contain new classes that are not present in labeled data. This paper analyzes several multinomial model-based semi-supervised document clustering methods under a principled model-based clustering framework. The framework naturally leads to a deterministic annealing extension of existing semi-supervised clustering approaches. We compare three (slightly) different semi-supervised approaches for clustering documents: Seeded damnl, Constrained damnl, and Feedback-based damnl, where damnl stands for multinomial model-based deterministic annealing algorithm. The first two are extensions of the seeded k-means and constrained k-means algorithms studied by Basu et al. (2002); the last one is motivated by Cohn et al. (2003). Through empirical experiments on text datasets, we show that: (a) deterministic annealing can often significantly improve the performance of semi-supervised clustering; (b) the constrained approach is the best when available labels are complete whereas the feedback-based approach excels when available labels are incomplete.
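
The seeded family of methods is the easiest to sketch. The Python below implements plain seeded k-means in the sense of Basu et al., the starting point that the paper's Seeded damnl extends with multinomial models and deterministic annealing (so this is the baseline idea, not damnl itself): cluster centers are initialized from the labeled documents of each class, then refined on all data.

```python
import numpy as np

def seeded_kmeans(X, seeds, n_iter=20):
    """Seeded k-means sketch.
    X:     (n, d) data matrix
    seeds: dict class_label -> array of labeled row indices; centers
           are initialized from the labeled seed points."""
    centers = np.stack([X[idx].mean(axis=0) for idx in seeds.values()])
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for k in range(len(centers)):
            members = X[assign == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return assign, centers

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
assign, _ = seeded_kmeans(X, {"c0": np.arange(3), "c1": np.arange(30, 33)})
print(assign[:5], assign[30:35])
```

The constrained variant discussed above would additionally clamp the labeled points to their known clusters on every iteration rather than letting them be reassigned.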

Journal ArticleDOI
TL;DR: This paper discusses the problem of rule-based classification of XML data by using frequent discriminatory substructures within XML documents and shows the effectiveness of the method with respect to other classifiers.
Abstract: XML documents have recently become ubiquitous because of their varied applicability in a number of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of information hidden inside the documents. In this paper we discuss the problem of rule-based classification of XML data by using frequent discriminatory substructures within XML documents. Such a technique is more capable of finding the classification characteristics of documents. In addition, the technique can also be extended to cost sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data.

Journal ArticleDOI
TL;DR: AI research on computer games began to follow developments in the games industry early on, and since John Laird's keynote address at the AAAI 2000 conference, numerous workshops, conferences, and special issues of journals have demonstrated the growing importance of game-playing applications for Artificial Intelligence.
Abstract: The history of the interaction of machine learning and computer game-playing goes back to the earliest days of Artificial Intelligence, when Arthur Samuel worked on his famous checker-playing program, pioneering many machine-learning and game-playing techniques (Samuel, 1959, 1967). Since then, both fields have advanced considerably, and research in the intersection of the two can be found regularly in conferences in their respective fields and in general AI conferences. For surveys of the field we refer to Ginsberg (1998), Schaeffer (2000), Furnkranz (2001); edited volumes have been compiled by Schaeffer and van den Herik (2002) and by Furnkranz and Kubat (2001). In recent years, the computer games industry has discovered AI as a necessary ingredient to make games more entertaining and challenging and, vice versa, AI has discovered computer games as an interesting and rewarding application area. The industry’s perspective is witnessed by a plethora of recent books on gentle introductions to AI techniques for game programmers (Collins, 2002; Champanard, 2003; Bourg & Seemann, 2004; Schwab, 2004) or a series of edited collections of articles (Rabin, 2002, 2003, 2006). AI research on computer games began to follow developments in the games industry early on, but since John Laird’s keynote address at the AAAI 2000 conference, in which he advocated Interactive Computer Games as a challenging and rewarding application area for AI (Laird & van Lent, 2001), numerous workshops (Fu & Orkin, 2004; Aha et al., 2005), conferences, and special issues of journals (Forbus & Laird, 2002) demonstrate the growing importance of game-playing applications for Artificial Intelligence.

Journal ArticleDOI
TL;DR: This work presents a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data, and shows that a simple frame-level note classifier, temporally smoothed by post processing with a hidden Markov model, produces results comparable to state of the art model-based transcription systems.
Abstract: The melody of a musical piece--informally, the part you would hum along with--is a useful and compact summary of a full audio recording. The extraction of melodic content has practical applications ranging from content-based audio retrieval to the analysis of musical structure. Whereas previous systems generate transcriptions based on a model of the harmonic (or periodic) structure of musical pitches, we present a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data. We evaluate the success of our algorithm by predicting the melody of the ADC 2004 Melody Competition evaluation set, and we show that a simple frame-level note classifier, temporally smoothed by post processing with a hidden Markov model, produces results comparable to state of the art model-based transcription systems.
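
The post-processing stage is the most reusable piece of this design: per-frame classifier posteriors are decoded through an HMM so that note labels cannot flicker frame to frame. A numpy Viterbi sketch (ours; the self-transition probability and the uniform switching model are guesses, not the paper's trained transition matrix):

```python
import numpy as np

def viterbi_smooth(frame_probs, stay=0.9):
    """Temporal smoothing of per-frame classifier posteriors with a
    simple HMM: probability `stay` of keeping the current note,
    uniform probability of switching otherwise.

    frame_probs: (T, N) posteriors over N note classes per frame.
    Returns the Viterbi path of note labels."""
    T, N = frame_probs.shape
    log_a = np.full((N, N), np.log((1 - stay) / (N - 1)))
    np.fill_diagonal(log_a, np.log(stay))
    log_p = np.log(frame_probs + 1e-12)

    delta = log_p[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_a          # (prev state, current state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_p[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.45, 0.55], [0.9, 0.1]])
print(viterbi_smooth(probs))   # isolated flickers get smoothed away
```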

Journal ArticleDOI
TL;DR: The results show that the character-level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy when compared with the currently popular methods, such as naive Bayes.
Abstract: We present an approach to email filtering based on the suffix tree data structure. A method for the scoring of emails using the suffix tree is developed and a number of scoring and score normalisation functions are tested. Our results show that the character-level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy when compared with the currently popular methods, such as naive Bayes. We believe the method can be extended to the classification of documents in other domains.

Journal ArticleDOI
TL;DR: A unified view of binary data clustering is presented by examining the connections among various clustering criteria, and experimental studies are conducted to empirically verify the relationships.
Abstract: Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This paper studies the problem of clustering binary data. Binary data have been occupying a special place in the domain of data analysis. A unified view of binary data clustering is presented by examining the connections among various clustering criteria. Experimental studies are conducted to empirically verify the relationships.

Journal ArticleDOI
TL;DR: A new method for establishing an alignment between a polyphonic musical score and a corresponding sampled audio performance using a graphical model containing both latent discrete variables, corresponding to score position, as well as a latent continuous tempo process.
Abstract: We present a new method for establishing an alignment between a polyphonic musical score and a corresponding sampled audio performance. The method uses a graphical model containing both latent discrete variables, corresponding to score position, as well as a latent continuous tempo process. We use a simple data model based only on the pitch content of the audio signal. The data interpretation is defined to be the most likely configuration of the hidden variables, given the data, and we develop computational methodology to identify or approximate this configuration using a variant of dynamic programming involving parametrically represented continuous variables. Experiments are presented on a 55-minute hand-marked orchestral test set.

Journal ArticleDOI
TL;DR: The goal in this work is to present a unifying framework that supports all of the types of relational uncertainty yet is based on logic programming formalisms, and facilitates understanding the relationship between the frame-based approaches and alternate logic programming approaches, and allows greater transfer of ideas between them.
Abstract: In this paper, we describe the syntax and semantics for a probabilistic relational language (PRL). PRL is a recasting of recent work in Probabilistic Relational Models (PRMs) into a logic programming framework. We show how to represent varying degrees of complexity in the semantics including attribute uncertainty, structural uncertainty and identity uncertainty. Our approach is similar in spirit to the work in Bayesian Logic Programs (BLPs), and Logical Bayesian Networks (LBNs). However, surprisingly, there are still some important differences in the resulting formalism; for example, we introduce a general notion of aggregates based on the PRM approaches. One of our contributions is that we show how to support richer forms of structural uncertainty in a probabilistic logical language than have been previously described. Our goal in this work is to present a unifying framework that supports all of the types of relational uncertainty yet is based on logic programming formalisms. We also believe that it facilitates understanding the relationship between the frame-based approaches and alternate logic programming approaches, and allows greater transfer of ideas between them.

Proceedings ArticleDOI
TL;DR: This paper describes how Ocsigen uses the Objective Caml type system in a thoroughgoing way in order to produce valid XHTML and valid remote function calls through links and form clicking.
Abstract: Ocsigen is a framework for programming highly dynamic web sites in Objective Caml. It makes it possible to program sites as OCaml applications and introduces new concepts to take into account the particularities of Web interaction, especially the management of URLs and sessions. This paper describes how Ocsigen uses the Objective Caml type system in a thoroughgoing way in order to produce valid XHTML and valid remote function calls through links and form clicking. It also describes how Ocsigen handles the progression of a Web user through a site, using sophisticated and high-level session mechanisms.

Journal ArticleDOI
TL;DR: A representation for melodic segment classes is presented and it is applied to sequential pattern discovery and to the statistical modeling of musical style.
Abstract: This paper presents a representation for melodic segment classes and applies it to music data mining. Melody is modeled as a sequence of segments, each segment being a sequence of notes. These segments are assigned to classes through a knowledge representation scheme which allows the flexible construction of abstract views of the music surface. The representation is applied to sequential pattern discovery and to the statistical modeling of musical style.

Journal ArticleDOI
TL;DR: A new decision-making algorithm that allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information is presented.
Abstract: Bridge bidding is considered to be one of the most difficult problems for game-playing programs. It involves four agents rather than two, including a cooperative agent. In addition, the partial observability of the game makes it impossible to predict the outcome of each action. In this paper we present a new decision-making algorithm that is capable of overcoming these problems. The algorithm allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information. The paper also presents a learning framework that uses the above decision-making algorithm for co-training of partners. The agents refine their selection strategies during training and continuously exchange their refined strategies. The refinement is based on inductive learning applied to examples accumulated for classes of states with conflicting actions. The algorithm was empirically evaluated on a set of bridge deals. The pair of agents that co-trained significantly improved their bidding performance to a level surpassing that of the current state-of-the-art bidding algorithm.

Journal ArticleDOI
TL;DR: A simple and new approach to Boolean networks is suggested, and a randomized network search algorithm with average time complexity $$O(mn^{k+1}/(\log m)^{k-1})$$ is provided.
Abstract: Boolean networks provide a simple and intuitive model for gene regulatory networks, but a critical defect is the time required to learn the networks. In recent years, efficient network search algorithms have been developed for a noise-free case and for a limited function class. In general, the conventional algorithm has the high time complexity of $$O(2^{2k} mn^{k+1})$$ where m is the number of measurements, n is the number of nodes (genes), and k is the number of input parents. Here, we suggest a simple and new approach to Boolean networks, and provide a randomized network search algorithm with average time complexity $$O(mn^{k+1}/(\log m)^{k-1})$$. We show the efficiency of our algorithm via computational experiments, and present optimal parameters. Additionally, we provide tests for yeast expression data.
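
The search problem can be made concrete with a toy consistency check (our illustration; the paper's randomized algorithm and the complexity bound above rest on a more careful analysis): a candidate set of k parents for a node is consistent with the measurements when equal parent values never lead to different next values of the node.

```python
import itertools, random

def consistent_parent_sets(data, target, k, n_trials=2000):
    """Randomized search for k-input parent sets explaining one node.

    data: list of (state, next_state) pairs of 0/1 tuples.
    Returns parent index sets consistent with all transitions, i.e.
    equal parent values never lead to different target values."""
    n = len(data[0][0])
    found = set()
    for _ in range(n_trials):
        parents = tuple(sorted(random.sample(range(n), k)))
        table = {}
        ok = True
        for state, nxt in data:
            key = tuple(state[i] for i in parents)
            if table.setdefault(key, nxt[target]) != nxt[target]:
                ok = False
                break
        if ok:
            found.add(parents)
    return found

# toy network: next state of node 0 is x1 AND x2
random.seed(0)
data = [(s, (s[1] & s[2], 0, 0)) for s in itertools.product([0, 1], repeat=3)]
print(consistent_parent_sets(data, target=0, k=2))   # {(1, 2)}
```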

Proceedings ArticleDOI
TL;DR: This paper presents a new way to generate type-error messages in a polymorphic, implicitly and strongly typed language (specifically Caml), separating error-message generation from type-checking by taking a fundamentally new approach.
Abstract: We present a new way to generate type-error messages in a polymorphic, implicitly and strongly typed language (specifically Caml). Our method separates error-message generation from type-checking by taking a fundamentally new approach: we present to programmers small term-level modifications that cause an ill-typed program to become well-typed. This approach aims to improve feedback to programmers with no change to the underlying type-checker nor the compilation of well-typed programs. We have added a prototype implementation of our approach to the Objective Caml system by intercepting type-checker error messages and using the type-checker on candidate changes to see if they succeed. This novel front-end architecture naturally decomposes into (1) enumerating local changes to the abstract syntax tree that may remove type errors, (2) searching for places to try the changes, (3) using the type-checker to evaluate the changes, and (4) ranking the changes and presenting them to the user.

Journal ArticleDOI
TL;DR: This work investigates the problem of how a performance played at a particular tempo can be rendered automatically at another tempo, while preserving naturally sounding expressivity, and presents a case-based reasoning system called TempoExpress for addressing this problem.
Abstract: The research presented in this paper focuses on global tempo transformations of monophonic audio recordings of saxophone jazz performances. We are investigating the problem of how a performance played at a particular tempo can be rendered automatically at another tempo, while preserving naturally sounding expressivity. Or, differently stated, how does expressiveness change with global tempo. Changing the tempo of a given melody is a problem that cannot be reduced to just applying a uniform transformation to all the notes of a musical piece. The expressive resources for emphasizing the musical structure of the melody and the affective content differ depending on the performance tempo. We present a case-based reasoning system called TempoExpress for addressing this problem, and describe the experimental results obtained with our approach.