Journal Article

Pattern Recognition and Machine Learning

01 Aug 2007-Technometrics (Taylor & Francis)-Vol. 49, Iss: 3, pp 366-366
TL;DR: This book covers a broad range of topics for regular factorial designs and presents all of the material in very mathematical fashion and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.
Citations
Proceedings Article
01 Jan 2008
TL;DR: A new model is proposed that can be used to provide a user with highly influential blog postings on the topic of the user’s interest and which shows that the new PageRank performs better than the traditional PageRank on key-word search.
Abstract: In this work, we address the twin problems of unsupervised topic discovery and estimation of topic specific influence of blogs. We propose a new model that can be used to provide a user with highly influential blog postings on the topic of the user’s interest. We adopt the framework of an unsupervised model called Latent Dirichlet Allocation (Blei, Ng, & Jordan 2003), known for its effectiveness in topic discovery. An extension of this model, which we call Link-LDA (Erosheva, Fienberg, & Lafferty 2004), defines a generative model for hyperlinks and thereby models topic specific influence of documents, the problem of our interest. However, this model does not exploit the topical relationship between the documents on either side of a hyperlink, i.e., the notion that documents tend to link to other documents on the same topic. We propose a new model, called Link-PLSA-LDA, that combines PLSA (Hofmann 1999) and LDA (Blei, Ng, & Jordan 2003) into a single framework, and explicitly models the topical relationship between the linking and the linked document. The output of the new model on blog data reveals very interesting visualizations of topics and influential blogs on each topic. We also perform quantitative evaluation of the model using log-likelihood of unseen data and on the task of link prediction. Both experiments show that the new model performs better, suggesting its superiority over Link-LDA in modeling topics and topic specific influence of blogs.
Introduction: The proliferation of blogs in the recent past has posed several new, interesting challenges to researchers in the information retrieval and data mining community. In particular, there is an increasing need for automatic techniques to help users quickly access blogs that are not only informative and popular, but also relevant to the user’s topics of interest. Significant progress has been made in the recent past towards this objective. For example, Java et al. (Java et al. 2006) studied the performance of various algorithms such as PageRank, HITS and in-degree on modeling the influence of blogs. Kale et al. (Kale et al. 2006) exploited the polarity (agreement/disagreement) of the hyperlinks and applied a trust propagation algorithm to model the propagation of influence between blogs.
Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
The above mentioned papers address modeling influence in general, but it is also important to model the influence of blogs with respect to the topic of the user’s interest. This problem has been addressed by the work of Haveliwala (Haveliwala 2002) in the context of key-word search. In that paper, PageRanks of documents are pre-computed for a certain number of topics. At query time, for each document matching the query, its PageRanks for various topics are combined based on the similarity of the query to each topic, to obtain a topic-sensitive PageRank. The author shows that the new PageRank performs better than the traditional PageRank on key-word search. The topics used in the algorithm are, however, obtained from an external repository. Ideally, it would be very useful to mine these topics automatically as well. The problem of automatic topic mining from blogs has been addressed by Glance et al. (Natalie S. Glance & Tomokiyo 2006), where the authors used a combination of NLP techniques, clustering and heuristics to mine topics and trends from blogs. However, this work does not address modeling the influence of blog postings with respect to the topics discovered. In our work, we aim at addressing both these problems simultaneously, i.e., topic discovery as well as modeling topic specific influence of blogs, in a completely unsupervised fashion.
Towards this objective, we employ the probabilistic framework of latent topic models such as the Latent Dirichlet Allocation (Blei, Ng, & Jordan 2003), and propose a new model in this framework. The rest of the paper is organized as follows. In section , we discuss some of the past work done on joint models of topics and influence in the framework of latent topic models. We describe our new model in section . In section , we report the results of our experiments on blog data. We conclude the discussion in section with a few remarks on directions for future work. Note that in the rest of the paper, we use the terms ‘citation’ and ‘hyperlink’ interchangeably. Likewise, note that the term ‘citing’ is synonymous with ‘linking’ and so is ‘cited’ with ‘linked’. The reader may also refer to Table 1 for notation used frequently in this paper.
Table 1 (notation):
M : Total number of documents
M← : Number of cited documents
M→ : Number of citing documents
V : Vocabulary size
K : Number of topics
N← : Total number of words in the cited set
d : A citing document
d′ : A cited document
∆(p) : A simplex of dimension (p − 1)
c(d, d′) : Citation from d to d′
Dir(·|α) : Dirichlet distribution with parameter α
Mult(·|β) : Multinomial distribution with parameter β
Ld : Number of hyperlinks in document d
Nd : Number of words in document d
βkw : Probability of word w w.r.t. topic k
Ωkd′ : Probability of hyperlink to document d′ w.r.t. topic k
πk : Probability of topic k in the cited document set
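
The query-time step that the abstract above attributes to Haveliwala (2002), combining a document's pre-computed per-topic PageRanks according to the query's similarity to each topic, can be sketched roughly as below. This is an illustrative sketch only, not the cited implementation: the function name, the dictionary layout, the normalisation of the similarities, and the toy numbers are all assumptions made for the example.

```python
from typing import Dict, List


def topic_sensitive_scores(
    topic_pagerank: Dict[str, Dict[str, float]],
    query_topic_weight: Dict[str, float],
    matching_docs: List[str],
) -> Dict[str, float]:
    """Combine pre-computed per-topic PageRanks into one query-specific score.

    topic_pagerank[k][d] is the (assumed, pre-computed) PageRank of document d
    for topic k; query_topic_weight[k] is the similarity of the query to topic k.
    """
    total = sum(query_topic_weight.values())
    if total <= 0:
        raise ValueError("query must have positive similarity to some topic")
    # Normalise the query-topic similarities so that they sum to 1.
    weights = {k: w / total for k, w in query_topic_weight.items()}
    scores = {}
    for doc in matching_docs:
        # Weighted sum of the document's per-topic PageRanks.
        scores[doc] = sum(
            weights[k] * topic_pagerank.get(k, {}).get(doc, 0.0) for k in weights
        )
    return scores


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
topic_pagerank = {
    "sports": {"blog_a": 0.6, "blog_b": 0.1},
    "politics": {"blog_a": 0.2, "blog_b": 0.7},
}
query_topic_weight = {"sports": 3.0, "politics": 1.0}  # query is mostly about sports
scores = topic_sensitive_scores(topic_pagerank, query_topic_weight, ["blog_a", "blog_b"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With these toy values the sports-heavy query ranks blog_a first, even though blog_b has the higher PageRank under the politics topic.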

137 citations


Cites background from "Pattern Recognition and Machine Lea..."

  • ...One can see that information flows from the cited documents to the citing documents through the unobserved nodes β and Ω, as per the D-separation principle in Bayesian networks (Bishop 2006)....

    [...]

  • ...One can see that information flows from the cited documents to the citing documents through the unobserved nodes β and Ω, as per the D-separation principle in Bayesian networks [2]....

    [...]

Journal Article
TL;DR: Combining the power of automated ensembles and locality can lead to competitive results in SEE by analysing such approaches and providing several insights that can be used by future research in the area.
Abstract: Context: Ensembles of learning machines and locality are considered two important topics for the next research frontier on Software Effort Estimation (SEE). Objectives: We aim at (1) evaluating whether existing automated ensembles of learning machines generally improve SEEs given by single learning machines and which of them would be more useful; (2) analysing the adequacy of different locality approaches; and getting insight on (3) how to improve SEE and (4) how to evaluate/choose machine learning (ML) models for SEE. Method: A principled experimental framework is used for the analysis and to provide insights that are not based simply on intuition or speculation. A comprehensive experimental study of several automated ensembles, single learning machines and locality approaches, which present features potentially beneficial for SEE, is performed. Additionally, an analysis of feature selection and regression trees (RTs), and an investigation of two tailored forms of combining ensembles and locality are performed to provide further insight on improving SEE. Results: Bagging ensembles of RTs are shown to perform well, being highly ranked in terms of performance across different data sets, being frequently among the best approaches for each data set and rarely performing considerably worse than the best approach for any data set. They are recommended over other learning machines should an organisation have no resources to perform experiments to choose a model. Even though RTs have been shown to be more reliable locality approaches, other approaches such as k-Means and k-Nearest Neighbours can also perform well, in particular for more heterogeneous data sets. Conclusion: Combining the power of automated ensembles and locality can lead to competitive results in SEE. By analysing such approaches, we provide several insights that can be used by future research in the area.
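
As a rough illustration of the bagging-of-regression-trees idea mentioned in the abstract above, the sketch below trains depth-1 regression trees (stumps) on bootstrap resamples and averages their predictions. It is not the study's experimental setup: the stump base learner, the single feature, the number of models, and the toy (size, effort) pairs are all assumptions introduced for the example.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # (feature value, effort); hypothetical toy format


def fit_stump(data: Sequence[Sample]) -> Callable[[float], float]:
    """Fit a depth-1 regression tree by choosing the split with least squared error."""
    xs = sorted({x for x, _ in data})
    overall = mean(y for _, y in data)
    best = None  # (sse, threshold, left_mean, right_mean)
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2  # candidate split between two distinct feature values
        left = [y for x, y in data if x <= thr]
        right = [y for x, y in data if x > thr]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # only one distinct feature value: predict the mean
        return lambda x: overall
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def bagging(data: Sequence[Sample], n_models: int = 25, seed: int = 0) -> Callable[[float], float]:
    """Train stumps on bootstrap resamples and average their predictions."""
    rng = random.Random(seed)
    models: List[Callable[[float], float]] = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) samples with replacement.
        resample = [rng.choice(data) for _ in data]
        models.append(fit_stump(resample))
    return lambda x: mean(m(x) for m in models)


# Hypothetical toy data: small projects need little effort, large ones much more.
data = [(1, 10), (2, 12), (3, 11), (10, 50), (11, 52), (12, 49)]
predict = bagging(data)
print(predict(2.0), predict(11.0))
```

Averaging over resamples is what distinguishes the ensemble from a single tree: each stump sees a slightly different data set, so an unlucky split in one model is damped by the others.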

137 citations

Journal Article
TL;DR: This paper combines the concepts of descriptive learning, predictive learning, and prescriptive learning into a uniform framework to build a parallel system in which the learning system improves through self-boosting, with the aim of designing machine learning systems for real-world problems.
Abstract: The development of machine learning in complex systems is currently hindered by two problems. The first is inefficient exploration of the state and action space, which makes some state-of-the-art data-driven algorithms data-hungry. The second is the lack of a general theory that can be used to analyze and implement a complex learning system. In this paper, we propose a general method that addresses both issues. We combine the concepts of descriptive learning, predictive learning, and prescriptive learning into a uniform framework, so as to build a parallel system that allows the learning system to improve through self-boosting. Formulating a new perspective on data, knowledge and action, we provide a new methodology called parallel learning for designing machine learning systems for real-world problems.

137 citations


Cites background from "Pattern Recognition and Machine Lea..."

  • ...Conventionally, it is popular to assume that data was sampled independently from a distribution [5], and action was taken by a policy given certain data [6]....

    [...]

Journal Article
TL;DR: It is demonstrated that mRVMs can produce state-of-the-art results on multiclass discrimination problems and this is achieved by utilizing only a very small fraction of the available observation data.
Abstract: In this paper, we investigate the sparsity and recognition capabilities of two approximate Bayesian classification algorithms, the multiclass multi-kernel relevance vector machines (mRVMs) that have been recently proposed. We provide an insight into the behavior of the mRVM models by performing a wide experimentation on a large range of real-world datasets. Furthermore, we monitor various model fitting characteristics that identify the predictive nature of the proposed methods and compare against existing classification techniques. By introducing novel convergence measures, sample selection strategies and model improvements, it is demonstrated that mRVMs can produce state-of-the-art results on multiclass discrimination problems. In addition, this is achieved by utilizing only a very small fraction of the available observation data.

137 citations


Cites methods or result from "Pattern Recognition and Machine Lea..."

  • ...Although the RVM provides significantly competitive results in contrast to the traditional SVM, its adaptation to the multiclass setting has been problematic, due to the bad scaling of the type-II ML procedure with respect to the number of classes C [6] and the dimensionality of the Hessian required for the Laplace approximation [7]....

    [...]

  • ...MODEL FORMULATION Following the standard approach in the machine learning literature [7], [1], in classification we are given a training set {x_i, t_i}_{i=1}^N, where x ∈ ℝ^D our D featured observations and t ∈ {1....

    [...]

Journal Article
TL;DR: A portfolio allocation model that converts the portfolio optimization problem into an integer linear programming problem is proposed as a decision support system for unprofessional lenders, incorporating cost-sensitive learning and extreme gradient boosting to better discriminate potential default borrowers.

137 citations


Cites background from "Pattern Recognition and Machine Lea..."

  • ...A complex decision tree or a neural network can be easily overfitting, a phenomenon in which a model has a small error rate for a training set while having poor predictive performance (Bishop, 2006; Sun et al., 2007)....

    [...]