Showing papers on "Unsupervised learning" published in 2006


Open Access

Journal Article•DOI•

Extreme learning machine: Theory and applications


Guang-Bin Huang, Qin-Yu Zhu, Chee Kheong Siew

01 Dec 2006-Neurocomputing

TL;DR: A new learning algorithm called the extreme learning machine (ELM) is proposed for single-hidden-layer feedforward neural networks (SLFNs). It randomly chooses the hidden nodes and analytically determines the output weights of the SLFN, and it tends to provide good generalization performance at extremely fast learning speed.
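The two steps named in the summary (random hidden nodes, output weights found analytically) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: inputs are scalars, the activation is a sigmoid, and a small ridge term (`ridge`, an assumption of this sketch) replaces the pseudoinverse so that the normal equations stay well conditioned. All function names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hidden_layer(x, w, b):
    """Sigmoid activations of the random hidden nodes for one scalar input."""
    return [1.0 / (1.0 + math.exp(-(wj * x + bj))) for wj, bj in zip(w, b)]


def train_elm(xs, ys, n_hidden=20, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    # Hidden-node parameters are drawn at random and never updated.
    w = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    b = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    h = [hidden_layer(x, w, b) for x in xs]
    # Output weights solve (H^T H + ridge * I) beta = H^T y in a single step.
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    return w, b, solve(hth, hty)


def predict_elm(model, x):
    w, b, beta = model
    return sum(bj * hj for bj, hj in zip(beta, hidden_layer(x, w, b)))


# Toy regression problem (hypothetical data): fit y = x^2 on [-1, 1].
xs = [i / 25.0 - 1.0 for i in range(51)]
ys = [x * x for x in xs]
model = train_elm(xs, ys)
mse = sum((predict_elm(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

No iterative tuning of the hidden layer takes place, which is where the speed claim in the summary comes from: training reduces to one linear solve.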


10,217 citations

Book•

Pattern Recognition and Machine Learning (Information Science and Statistics)


Christopher M. Bishop

01 Aug 2006


8,923 citations

Journal Article•

Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples


Mikhail Belkin, Partha Niyogi, Vikas Sindhwani

01 Dec 2006-Journal of Machine Learning Research

TL;DR: A semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner is proposed and properties of reproducing kernel Hilbert spaces are used to prove new Representer theorems that provide theoretical basis for the algorithms.


Abstract: We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning algorithms and standard methods including support vector machines and regularized least squares can be obtained as special cases. We use properties of reproducing kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graph-based approaches) we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework.


3,919 citations

Proceedings Article•DOI•

An empirical comparison of supervised learning algorithms


Rich Caruana¹, Alexandru Niculescu-Mizil¹•Institutions (1)

Cornell University¹

25 Jun 2006

TL;DR: A large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps is presented.


Abstract: A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.
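As a rough illustration of the calibration step mentioned in the abstract, Platt scaling fits a sigmoid that maps raw classifier scores to probabilities. The sketch below is a simplified version under stated assumptions: it uses plain gradient descent on the log loss rather than the optimizer and smoothed targets of Platt's original formulation, and the scores and labels are hypothetical. It is not the procedure used in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.5, steps=5000):
    """Fit p(y=1 | s) = sigmoid(a * s + b) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of the log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical uncalibrated scores (e.g. SVM margins) with overlapping classes.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit_platt(scores, labels)
calibrated = [sigmoid(a * s + b) for s in scores]
```

The calibration set should be held out from the data used to train the underlying classifier; otherwise the fitted sigmoid tends to be overconfident.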


2,450 citations

Journal Article•DOI•

Machine learning: a review of classification and combining techniques


Sotiris Kotsiantis¹, Ioannis D. Zaharakis², P. E. Pintelas¹•Institutions (2)

University of Peloponnese¹, Research Academic Computer Technology Institute²

01 Nov 2006-Artificial Intelligence Review

TL;DR: Describes various classification algorithms and a recent approach to improving classification accuracy: ensembles of classifiers.


Abstract: Supervised classification is one of the tasks most frequently carried out by so-called Intelligent Systems. Thus, a large number of techniques have been developed based on Artificial Intelligence (Logic-based techniques, Perceptron-based techniques) and Statistics (Bayesian Networks, Instance-based techniques). The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various classification algorithms and the recent attempt for improving classification accuracy--ensembles of classifiers.


1,127 citations

Proceedings Article•DOI•

Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words.


Juan Carlos Niebles¹, Hongcheng Wang, Li Fei-Fei²•Institutions (2)

Universidad del Norte, Colombia¹, Princeton University²

01 Jan 2006

TL;DR: The approach is not only able to classify different actions, but also to localize different actions simultaneously in a novel and complex video sequence.


Abstract: We present a novel unsupervised learning method for human action categories. A video sequence is represented as a collection of spatial-temporal words by extracting space-time interest points. The algorithm automatically learns the probability distributions of the spatial-temporal words and the intermediate topics corresponding to human action categories. This is achieved by using latent topic models such as the probabilistic Latent Semantic Analysis (pLSA) model and Latent Dirichlet Allocation (LDA). Because it applies these probabilistic models, our approach can handle noisy feature points arising from dynamic backgrounds and moving cameras. Given a novel video sequence, the algorithm can categorize and localize the human action(s) contained in the video. We test our algorithm on three challenging datasets: the KTH human motion dataset, the Weizmann human action dataset, and a recent dataset of figure skating actions. Our results reflect the promise of such a simple approach. In addition, our algorithm can recognize and localize multiple actions in long and complex video sequences containing multiple motions.


927 citations

Link prediction using supervised learning


Mohammad Al Hasan, Vineet Chaoji, Saeed Salem¹, Mohammed J. Zaki•Institutions (1)

Rensselaer Polytechnic Institute¹

01 Jan 2006

TL;DR: This research identifies a set of features that are key to the superior performance under the supervised learning setup, and shows that a small subset of features always plays a significant role in the link prediction job.


Abstract: Social network analysis has attracted much attention in recent years. Link prediction is a key research direction within this area. In this research, we study link prediction as a supervised learning task. Along the way, we identify a set of features that are key to the superior performance under the supervised learning setup. The identified features are very easy to compute, and at the same time surprisingly effective in solving the link prediction problem. We also explain the effectiveness of the features from their class density distribution. Then we compare different classes of supervised learning algorithms in terms of their prediction performance using various performance metrics, such as accuracy, precision-recall, F-values, and squared error, with 5-fold cross validation. Our results on two practical social network datasets show that most of the well-known classification algorithms (decision tree, k-NN, multilayer perceptron, SVM, RBF network) can predict links with strong performance, but SVM outperforms all of them by a narrow margin on all performance measures. In addition, ranking the features with popular feature ranking algorithms shows that a small subset of features always plays a significant role in the link prediction job.


883 citations

Journal Article•DOI•

Unsupervised Learning of Image Manifolds by Semidefinite Programming


Kilian Q. Weinberger¹, Lawrence K. Saul¹•Institutions (1)

University of Pennsylvania¹

01 Oct 2006-International Journal of Computer Vision

TL;DR: An algorithm for unsupervised learning of image manifolds by semidefinite programming that computes a low dimensional representation of each image with the property that distances between nearby images are preserved.


Abstract: Can we detect low dimensional structure in high dimensional data sets of images? In this paper, we propose an algorithm for unsupervised learning of image manifolds by semidefinite programming. Given a data set of images, our algorithm computes a low dimensional representation of each image with the property that distances between nearby images are preserved. More generally, it can be used to analyze high dimensional data that lies on or near a low dimensional manifold. We illustrate the algorithm on easily visualized examples of curves and surfaces, as well as on actual images of faces, handwritten digits, and solid objects.


590 citations

Unsupervised Learning of Human Action Categories


Juan Carlos Niebles¹, Hongcheng Wang¹, Li Fei-Fei¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Jan 2006

TL;DR: The approach is not only able to classify different actions, but also to localize different actions simultaneously in a novel and complex video sequence.


Abstract: Imagine a video taken on a sunny beach. Can a computer automatically tell what is happening in the scene? Can it identify different human activities in the video, such as water surfing, people walking and lying on the beach? To automatically classify or localize different actions in video sequences is very useful for a variety of tasks, such as video surveillance, object-level video summarization, video indexing, digital library organization, etc. However, it remains a challenging task for computers to achieve robust action recognition due to cluttered background, camera motion, occlusion, and geometric and photometric variances of objects. For example, in a live video of a skating competition, the skater moves rapidly across the rink, and the camera also moves to follow the skater. With a moving camera, non-stationary background, and moving target, few vision algorithms could identify, categorize and localize such motions well. In addition, the challenge is even greater when there are multiple activities in a complex video sequence (Figure 1). We present a video demo for our novel unsupervised learning method for human action categories [1]. A video sequence is represented as a collection of spatial-temporal words by extracting space-time interest points. The algorithm learns the probability distributions of the spatial-temporal words and intermediate topics corresponding to human action categories automatically using a probabilistic Latent Semantic Analysis (pLSA) model [4]. The learned model is then used for human action categorization and localization in a novel video, by maximizing the posterior of action category (topic) distributions. The contributions of this work are as follows:
• Unsupervised learning of actions using ‘video words’ representation. We deploy a pLSA model with ‘bag of video words’ representation for video analysis.
• Multiple action localization and categorization. Our approach is not only able to classify different actions, but also to localize different actions simultaneously in a novel and complex video sequence.


535 citations

Journal Article•DOI•

Unsupervised Learning With Random Forest Predictors


Tao Shi¹, Steve Horvath¹•Institutions (1)

University of California, Los Angeles¹

01 Mar 2006-Journal of Computational and Graphical Statistics

TL;DR: The RF dissimilarity is useful for detecting tumor sample clusters on the basis of tumor marker expressions and can be described with simple thresholding rules in this application.


Abstract: A random forest (RF) predictor is an ensemble of individual tree predictors. As part of their construction, RF predictors naturally lead to a dissimilarity measure between the observations. One can also define an RF dissimilarity measure between unlabeled data: the idea is to construct an RF predictor that distinguishes the “observed” data from suitably generated synthetic data. The observed data are the original unlabeled data and the synthetic data are drawn from a reference distribution. Here we describe the properties of the RF dissimilarity and make recommendations on how to use it in practice. An RF dissimilarity can be attractive because it handles mixed variable types well, is invariant to monotonic transformations of the input variables, and is robust to outlying observations. The RF dissimilarity easily deals with a large number of variables due to its intrinsic variable selection; for example, the Addcl 1 RF dissimilarity weighs the contribution of each variable according to how dependent it is ...


460 citations

Proceedings Article•

Multi-Instance Multi-Label Learning with Application to Scene Classification


Zhi-Hua Zhou¹, Min-Ling Zhang¹•Institutions (1)

Nanjing University¹

04 Dec 2006

TL;DR: This paper formalizes multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels, and proposes the MIMLBOOST and MIMLSVM algorithms which achieve good performance in an application to scene classification.


Abstract: In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g., an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multi-instance learning and multi-label learning. Then, we propose the MIMLBOOST and MIMLSVM algorithms, which achieve good performance in an application to scene classification.


Proceedings Article•DOI•

Higher order learning with graphs


Sameer Agarwal¹, Kristin Branson¹, Serge Belongie¹•Institutions (1)

University of California, San Diego¹

25 Jun 2006

TL;DR: It is shown that various formulations of the semi-supervised and the unsupervised learning problem on hypergraphs result in the same graph theoretic problem and can be analyzed using existing tools.


Abstract: Recently there has been considerable interest in learning with higher order relations (i.e., three-way or higher) in the unsupervised and semi-supervised settings. Hypergraphs and tensors have been proposed as the natural way of representing these relations and their corresponding algebra as the natural tools for operating on them. In this paper we argue that hypergraphs are not a natural representation for higher order relations, indeed pairwise as well as higher order relations can be handled using graphs. We show that various formulations of the semi-supervised and the unsupervised learning problem on hypergraphs result in the same graph theoretic problem and can be analyzed using existing tools.
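One common way to handle a higher order relation with an ordinary graph is clique expansion, where every hyperedge is replaced by pairwise edges among its members. The sketch below illustrates that construction only; whether it matches the exact formulation analyzed in the paper is an assumption, and the function name and the additive weighting rule are hypothetical choices.

```python
from collections import defaultdict
from itertools import combinations


def clique_expansion(hyperedges):
    """Turn weighted hyperedges into a weighted pairwise graph.

    hyperedges: iterable of (nodes, weight) pairs, where nodes is any
    collection of hashable, sortable node ids.
    Returns a dict mapping sorted node pairs (u, v) to accumulated weight.
    """
    weights = defaultdict(float)
    for nodes, weight in hyperedges:
        # Every pair of nodes inside the hyperedge becomes one graph edge.
        for u, v in combinations(sorted(set(nodes)), 2):
            weights[(u, v)] += weight
    return dict(weights)


# A three-way relation plus an ordinary pairwise relation (hypothetical data).
graph = clique_expansion([({"a", "b", "c"}, 1.0), ({"b", "c"}, 2.0)])
```

Once the relation is in pairwise form, standard graph tools such as the graph Laplacian and spectral clustering apply unchanged.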


Journal Article•DOI•

Learning from neural control


Cong Wang¹, David J. Hill²•Institutions (2)

South China University of Technology¹, Australian National University²

01 Jan 2006-IEEE Transactions on Neural Networks

TL;DR: The presented deterministic learning mechanism and the neural learning control scheme provide elementary components toward the development of a biologically-plausible learning and control methodology.


Abstract: One of the amazing successes of biological systems is their ability to "learn by doing" and so adapt to their environment. In this paper, first, a deterministic learning mechanism is presented, by which an appropriately designed adaptive neural controller is capable of learning closed-loop system dynamics during tracking control to a periodic reference orbit. Among various neural network (NN) architectures, the localized radial basis function (RBF) network is employed. A property of persistence of excitation (PE) for RBF networks is established, and a partial PE condition of closed-loop signals, i.e., the PE condition of a regression subvector constructed out of the RBFs along a periodic state trajectory, is proven to be satisfied. Accurate NN approximation for closed-loop system dynamics is achieved in a local region along the periodic state trajectory, and a learning ability is implemented during a closed-loop feedback control process. Second, based on the deterministic learning mechanism, a neural learning control scheme is proposed which can effectively recall and reuse the learned knowledge to achieve closed-loop stability and improved control performance. The significance of this paper is that the presented deterministic learning mechanism and the neural learning control scheme provide elementary components toward the development of a biologically-plausible learning and control methodology. Simulation studies are included to demonstrate the effectiveness of the approach.


Proceedings Article•DOI•

Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization


Andrew Goldberg¹, Xiaojin Zhu¹•Institutions (1)

University of Wisconsin-Madison¹

09 Jun 2006

TL;DR: A graph-based semi-supervised learning algorithm is presented to address the sentiment analysis task of rating inference and achieves significantly better predictive accuracy over other methods that ignore the unlabeled examples during training.


Abstract: We present a graph-based semi-supervised learning algorithm to address the sentiment analysis task of rating inference. Given a set of documents (e.g., movie reviews) and accompanying ratings (e.g., "4 stars"), the task calls for inferring numerical ratings for unlabeled documents based on the perceived sentiment expressed by their text. In particular, we are interested in the situation where labeled data is scarce. We place this task in the semi-supervised setting and demonstrate that considering unlabeled reviews in the learning process can improve rating-inference performance. We do so by creating a graph on both labeled and unlabeled data to encode certain assumptions for this task. We then solve an optimization problem to obtain a smooth rating function over the whole graph. When only limited labeled data is available, this method achieves significantly better predictive accuracy over other methods that ignore the unlabeled examples during training.
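The idea of a smooth rating function over a graph can be illustrated with a simplified harmonic-function solver: labeled nodes keep their ratings, and every unlabeled node is repeatedly set to the weighted average of its neighbors. This is a sketch under stated assumptions and not the paper's objective, which (as far as the abstract says) is a more general optimization problem; the hard clamping of labels, the function name, and the toy graph are all assumptions.

```python
def propagate_ratings(edges, labeled, n_nodes, iters=200):
    """Spread ratings from labeled to unlabeled nodes on a weighted graph.

    edges: list of (u, v, weight) with integer node ids in [0, n_nodes).
    labeled: dict mapping node id to its known rating.
    """
    neighbors = [[] for _ in range(n_nodes)]
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))
    # Unlabeled nodes start at the mean of the known ratings.
    start = sum(labeled.values()) / len(labeled)
    f = [labeled.get(i, start) for i in range(n_nodes)]
    for _ in range(iters):
        for i in range(n_nodes):
            if i in labeled or not neighbors[i]:
                continue  # labeled ratings are clamped; isolated nodes stay put
            total = sum(w for _, w in neighbors[i])
            f[i] = sum(w * f[j] for j, w in neighbors[i]) / total
    return f


# Chain of five reviews; only the two ends have star ratings (hypothetical data).
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
ratings = propagate_ratings(edges, {0: 1.0, 4: 5.0}, n_nodes=5)
```

On this chain the unlabeled reviews end up interpolating between the two labeled ends, which is the smoothness assumption in its simplest form: similar documents should receive similar ratings.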


Journal Article•DOI•

Data driven image models through continuous joint alignment


Erik Learned-Miller

01 Feb 2006-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A family of techniques called congealing is presented for modeling image classes from data; it can be used to eliminate "nuisance" variables such as affine deformations from handwritten digits or unwanted bias fields from magnetic resonance images.


Abstract: This paper presents a family of techniques that we call congealing for modeling image classes from data. The idea is to start with a set of images and make them appear as similar as possible by removing variability along the known axes of variation. This technique can be used to eliminate "nuisance" variables such as affine deformations from handwritten digits or unwanted bias fields from magnetic resonance images. In addition to separating and modeling the latent images - i.e., the images without the nuisance variables - we can model the nuisance variables themselves, leading to factorized generative image models. When nuisance variable distributions are shared between classes, one can share the knowledge learned in one task with another task, leading to efficient learning. We demonstrate this process by building a handwritten digit classifier from just a single example of each class. In addition to applications in handwritten character recognition, we describe in detail the application of bias removal from magnetic resonance images. Unlike previous methods, we use a separate, nonparametric model for the intensity values at each pixel. This allows us to leverage the data from the MR images of different patients to remove bias from each other. Only very weak assumptions are made about the distributions of intensity values in the images. In addition to the digit and MR applications, we discuss a number of other uses of congealing and describe experiments about the robustness and consistency of the method.


Proceedings Article•DOI•

Constructing informative priors using transfer learning


Rajat Raina¹, Andrew Y. Ng¹, Daphne Koller¹•Institutions (1)

Stanford University¹

25 Jun 2006

TL;DR: An algorithm for automatically constructing a multivariate Gaussian prior with a full covariance matrix for a given supervised learning task, which relaxes a commonly used but overly simplistic independence assumption, and allows parameters to be dependent.


Abstract: Many applications of supervised learning require good generalization from limited labeled data. In the Bayesian setting, we can try to achieve this goal by using an informative prior over the parameters, one that encodes useful domain knowledge. Focusing on logistic regression, we present an algorithm for automatically constructing a multivariate Gaussian prior with a full covariance matrix for a given supervised learning task. This prior relaxes a commonly used but overly simplistic independence assumption, and allows parameters to be dependent. The algorithm uses other "similar" learning problems to estimate the covariance of pairs of individual parameters. We then use a semidefinite program to combine these estimates and learn a good prior for the current learning task. We apply our methods to binary text classification, and demonstrate a 20 to 40% test error reduction over a commonly used prior.


Journal Article•DOI•

Machine Learning for Detection and Diagnosis of Disease


Paul Sajda¹•Institutions (1)

Columbia University¹

11 Jul 2006-Annual Review of Biomedical Engineering

TL;DR: The review describes recent developments in machine learning, focusing on supervised and unsupervised linear methods and Bayesian inference, which have made significant impacts in the detection and diagnosis of disease in biomedicine.


Abstract: Machine learning offers a principled approach for developing sophisticated, automatic, and objective algorithms for analysis of high-dimensional and multimodal biomedical data. This review focuses on several advances in the state of the art that have shown promise in improving detection, diagnosis, and therapeutic monitoring of disease. Key in the advancement has been the development of a more in-depth understanding and theoretical analysis of critical issues related to algorithmic construction and learning theory. These include trade-offs for maximizing generalization performance, use of physically realistic constraints, and incorporation of prior knowledge and uncertainty. The review describes recent developments in machine learning, focusing on supervised and unsupervised linear methods and Bayesian inference, which have made significant impacts in the detection and diagnosis of disease in biomedicine. We describe the different methodologies and, for each, provide examples of their application to specific domains in biomedical diagnostics.


Proceedings Article•DOI•

Agnostic active learning


Maria-Florina Balcan¹, Alina Beygelzimer², John Langford³•Institutions (3)

Carnegie Mellon University¹, IBM², Toyota Technological Institute at Chicago³

25 Jun 2006

TL;DR: The first active learning algorithm which works in the presence of arbitrary forms of noise is stated and analyzed, and it is shown that A2 achieves an exponential improvement over the usual sample complexity of supervised learning.


Abstract: We state and analyze the first active learning algorithm which works in the presence of arbitrary forms of noise. The algorithm, A2 (for Agnostic Active), relies only upon the assumption that the samples are drawn i.i.d. from a fixed distribution. We show that A2 achieves an exponential improvement (i.e., requires only O(ln(1/ε)) samples to find an ε-optimal classifier) over the usual sample complexity of supervised learning, for several settings considered before in the realizable case. These include learning threshold classifiers and learning homogeneous linear separators with respect to an input distribution which is uniform over the unit sphere.
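To see where a logarithmic label count can come from for threshold classifiers, consider the noise-free (realizable) case: binary search over a sorted pool of unlabeled points finds the decision boundary with a number of label queries that is logarithmic in the pool size. The sketch below shows only that textbook baseline. It is not the A2 algorithm, which is designed to tolerate noise; the function name and the labeling oracle are hypothetical.

```python
def learn_threshold(pool, query_label):
    """Locate the first positive point in a pool labeled by a threshold rule.

    pool: unlabeled real-valued points.
    query_label: oracle returning True for points at or above the threshold.
    Returns (index of the first positive point in sorted order, labels requested).
    """
    xs = sorted(pool)
    lo, hi = 0, len(xs)  # the first positive index lies somewhere in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if query_label(xs[mid]):
            hi = mid      # boundary is at mid or to its left
        else:
            lo = mid + 1  # boundary is strictly to the right of mid
    return lo, queries


# 1000 unlabeled points; the hidden threshold is 0.3705 (hypothetical oracle).
pool = [i / 1000 for i in range(1000)]
first_positive, n_queries = learn_threshold(pool, lambda x: x >= 0.3705)
```

A passive learner would need labels on the order of the pool size to pin the boundary down to the same resolution, which is the gap the abstract describes as exponential.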


Proceedings Article•DOI•

Exploratory Under-Sampling for Class-Imbalance Learning


Xu-Ying Liu¹, Jianxin Wu², Zhi-Hua Zhou¹•Institutions (2)

Nanjing University¹, Georgia Institute of Technology²

18 Dec 2006

TL;DR: Experiments show that the proposed algorithms, BalanceCascade and EasyEnsemble, have better AUC scores than many existing class-imbalance learning methods and have approximately the same training time as that of under-sampling, which trains significantly faster than other methods.


Abstract: Under-sampling is a class-imbalance learning method which uses only a subset of major class examples and thus is very efficient. The main deficiency is that many major class examples are ignored. We propose two algorithms to overcome the deficiency. EasyEnsemble samples several subsets from the major class, trains a learner using each of them, and combines the outputs of those learners. BalanceCascade is similar to EasyEnsemble except that it removes correctly classified major class examples of trained learners from further consideration. Experiments show that both of the proposed algorithms have better AUC scores than many existing class-imbalance learning methods. Moreover, they have approximately the same training time as that of under-sampling, which trains significantly faster than other methods.
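The EasyEnsemble recipe described above (sample several majority-class subsets, train one learner per subset, combine the outputs) can be sketched as follows. The base learner here is a deliberately trivial nearest-centroid rule on scalar features, swapped in to keep the example short; it is an assumption of this sketch and not the learner used in the paper. All names and data are hypothetical.

```python
import random


def train_centroid(positives, negatives):
    """Return a scorer that is positive when x is closer to the positive centroid."""
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return lambda x: abs(x - mean_neg) - abs(x - mean_pos)


def easy_ensemble(minority, majority, n_subsets=5, seed=0):
    rng = random.Random(seed)
    learners = []
    for _ in range(n_subsets):
        # Each learner sees all minority examples and an equally sized
        # random subset of the majority class.
        subset = rng.sample(majority, len(minority))
        learners.append(train_centroid(minority, subset))

    def predict(x):
        # Combine the learners by summing their scores.
        return sum(learner(x) for learner in learners) > 0

    return predict


# Imbalanced toy data: 4 minority examples versus 60 majority examples.
minority = [4.0, 4.5, 5.0, 5.5]
majority = [i * 0.05 for i in range(60)]
predict = easy_ensemble(minority, majority)
```

Because each subset is balanced, no single learner is swamped by the majority class, while the ensemble as a whole still sees many more majority examples than plain under-sampling would.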


Book Chapter•DOI•

Learning semantic scene models by trajectory analysis


Xiaogang Wang¹, Kinh Tieu¹, Eric Grimson¹•Institutions (1)

Massachusetts Institute of Technology¹

07 May 2006

TL;DR: An unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene is described and novel clustering algorithms which use both similarity and comparison confidence are introduced.


Abstract: In this paper, we describe an unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene. First, we introduce two novel similarity measures for comparing trajectories in far-field visual surveillance. The measures simultaneously compare the spatial distribution of trajectories and other attributes, such as velocity and object size, along the trajectories. They also provide a comparison confidence measure which indicates how well the measured image-based similarity approximates true physical similarity. We also introduce novel clustering algorithms which use both similarity and comparison confidence. Based on the proposed similarity measures and clustering methods, a framework to learn semantic scene models by trajectory analysis is developed. Trajectories are first clustered into vehicles and pedestrians, and then further grouped based on spatial and velocity distributions. Different trajectory clusters represent different activities. The geometric and statistical models of structures in the scene, such as roads, walk paths, sources and sinks, are automatically learned from the trajectory clusters. Abnormal activities are detected using the semantic scene models. The system is robust to low-level tracking errors.


Journal Article•DOI•

An incremental network for on-line unsupervised classification and topology learning


Shen Furao, Osamu Hasegawa¹•Institutions (1)

Tokyo Institute of Technology¹

01 Jan 2006-Neural Networks

TL;DR: The design of a two-layer neural network enables this system to represent the topological structure of unsupervised on-line data, report a reasonable number of clusters, and give typical prototype patterns of every cluster without prior conditions such as a suitable number of nodes or a good initial codebook.


Journal Article•DOI•

Incremental nonlinear dimensionality reduction by manifold learning


M.H.C. Law¹, Anil K. Jain¹•Institutions (1)

Michigan State University¹

01 Mar 2006-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An incremental version of ISOMAP, one of the key manifold learning algorithms, is described and it is demonstrated that this modified algorithm can maintain an accurate low-dimensional representation of the data in an efficient manner.


Abstract: Understanding the structure of multidimensional patterns, especially in unsupervised cases, is of fundamental importance in data mining, pattern recognition, and machine learning. Several algorithms have been proposed to analyze the structure of high-dimensional data based on the notion of manifold learning. These algorithms have been used to extract the intrinsic characteristics of different types of high-dimensional data by performing nonlinear dimensionality reduction. Most of these algorithms operate in a "batch" mode and cannot be efficiently applied when data are collected sequentially. In this paper, we describe an incremental version of ISOMAP, one of the key manifold learning algorithms. Our experiments on synthetic data as well as real world images demonstrate that our modified algorithm can maintain an accurate low-dimensional representation of the data in an efficient manner.


Spectral Methods for Dimensionality Reduction.


Lawrence K. Saul, Kilian Q. Weinberger, Fei Sha, Jihun Ham, Daniel D. Lee

01 Jan 2006

TL;DR: This chapter provides an overview of unsupervised learning algorithms that can be viewed as spectral methods for linear and nonlinear dimensionality reduction and manifold learning.


Abstract: How can we search for low dimensional structure in high dimensional data? If the data is mainly confined to a low dimensional subspace, then simple linear methods can be used to discover the subspace and estimate its dimensionality. More generally, though, if the data lies on (or near) a low dimensional submanifold, then its structure may be highly nonlinear, and linear methods are bound to fail. Spectral methods have recently emerged as a powerful tool for nonlinear dimensionality reduction and manifold learning. These methods are able to reveal low dimensional structure in high dimensional data from the top or bottom eigenvectors of specially constructed matrices. To analyze data that lies on a low dimensional submanifold, the matrices are constructed from sparse weighted graphs whose vertices represent input patterns and whose edges indicate neighborhood relations. The main computations for manifold learning are based on tractable, polynomial-time optimizations, such as shortest path problems, least squares fits, semidefinite programming, and matrix diagonalization. This chapter provides an overview of unsupervised learning algorithms that can be viewed as spectral methods for linear and nonlinear dimensionality reduction.

Proceedings Article•DOI•

Adaptive event detection with time-varying Poisson processes

Alexander T. Ihler¹, Jon Hutchins¹, Padhraic Smyth¹•Institutions (1)

University of California, Irvine¹

20 Aug 2006

TL;DR: The experimental results indicate that the proposed time-varying Poisson model provides a robust and accurate framework for adaptively and autonomously learning how to separate unusual bursty events from traces of normal human activity.

Abstract: Time-series of count data are generated in many different contexts, such as web access logging, freeway traffic monitoring, and security logs associated with buildings. Since this data measures the aggregated behavior of individual human beings, it typically exhibits a periodicity in time on a number of scales (daily, weekly, etc.) that reflects the rhythms of the underlying human activity and makes the data appear non-homogeneous. At the same time, the data is often corrupted by a number of bursty periods of unusual behavior such as building events, traffic accidents, and so forth. The data mining problem of finding and extracting these anomalous events is made difficult by both of these elements. In this paper we describe a framework for unsupervised learning in this context, based on a time-varying Poisson process model that can also account for anomalous events. We show how the parameters of this model can be learned from count time series using statistical estimation techniques. We demonstrate the utility of this model on two datasets for which we have partial ground truth in the form of known events, one from freeway traffic data and another from building access data, and show that the model performs significantly better than a non-probabilistic, threshold-based technique. We also describe how the model can be used to investigate different degrees of periodicity in the data, including systematic day-of-week and time-of-day effects, and make inferences about the detected events (e.g., popularity or level of attendance). Our experimental results indicate that the proposed time-varying Poisson model provides a robust and accurate framework for adaptively and autonomously learning how to separate unusual bursty events from traces of normal human activity.
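
To make the idea of a periodic Poisson rate concrete, here is a deliberately simplified sketch. It estimates one rate per slot of a repeating cycle and flags counts that are improbably high under a Poisson distribution with that rate. This is not the model from the paper, which learns the normal rate and the events jointly. The names `poisson_tail` and `flag_bursts`, the `period` and `alpha` parameters, and the use of a per-slot median are all assumptions made for illustration.

```python
import math
import statistics
from collections import defaultdict

def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def flag_bursts(counts, period, alpha=1e-3):
    """Flag indices whose count is improbably high for their slot in the cycle."""
    slots = defaultdict(list)
    for t, c in enumerate(counts):
        slots[t % period].append(c)
    # The median is less distorted by bursts than the mean would be.
    rates = {s: max(statistics.median(v), 1e-9) for s, v in slots.items()}
    return [
        t for t, c in enumerate(counts)
        if poisson_tail(c, rates[t % period]) < alpha
    ]

# Two alternating slots (quiet, busy) with one unusual burst in a quiet slot.
bursts = flag_bursts([2, 10, 3, 11, 2, 9, 30, 10], period=2)
```

Because the rate is estimated per slot, a count of 10 is treated as normal in the busy slot while 30 in the quiet slot is flagged, which a single global threshold could not distinguish as cleanly.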

Journal Article•DOI•

Statistical Learning Within and Between Modalities: Pitting Abstract Against Stimulus-Specific Representations

Christopher M. Conway¹, Morten H. Christiansen²•Institutions (2)

Indiana University¹, Cornell University²

01 Oct 2006

TL;DR: The findings show that statistical learning results in knowledge that is stimulus-specific rather than abstract, and show furthermore that learning can proceed in parallel for multiple input streams along separate perceptual dimensions or sense modalities.

Abstract: When learners encode sequential patterns and generalize their knowledge to novel instances, are they relying on abstract or stimulus-specific representations? Research on artificial grammar learning (AGL) has shown transfer of learning from one stimulus set to another, and such findings have encouraged the view that statistical learning is mediated by abstract representations that are independent of the sense modality or perceptual features of the stimuli. Using a novel modification of the standard AGL paradigm, we obtained data to the contrary. These experiments pitted abstract processing against stimulus-specific learning. The findings show that statistical learning results in knowledge that is stimulus-specific rather than abstract. They show furthermore that learning can proceed in parallel for multiple input streams along separate perceptual dimensions or sense modalities. We conclude that learning sequential structure and generalizing to novel stimuli inherently involve learning mechanisms that are ...

Book Chapter•DOI•

Kernel-Based reinforcement learning

Guanghua Hu¹, Yuqin Qiu¹, Liming Xiang²•Institutions (2)

Yunnan University¹, City University of Hong Kong²

16 Aug 2006

TL;DR: Two kernel-based reinforcement learning algorithms, ε-KRL and least squares kernel-based reinforcement learning (LS-KRL), are proposed, and an example shows that the proposed methods can deal effectively with the reinforcement learning problem without having to explore many states.

Abstract: We consider the problem of approximating the cost-to-go functions in reinforcement learning. By mapping the state implicitly into a feature space, we perform a simple algorithm in the feature space, which corresponds to a complex algorithm in the original state space. Two kernel-based reinforcement learning algorithms, the ε-insensitive kernel-based reinforcement learning (ε-KRL) and the least squares kernel-based reinforcement learning (LS-KRL), are proposed. An example shows that the proposed methods can deal effectively with the reinforcement learning problem without having to explore many states.

Book•

Semi-Supervised Learning (Adaptive Computation and Machine Learning)

Olivier Chapelle, Bernhard Schölkopf, Alexander Zien

01 Sep 2006

Journal Article•DOI•

Real-time learning capability of neural networks

Guang-Bin Huang¹, Qin-Yu Zhu, Chee-Kheong Siew•Institutions (1)

Nanyang Technological University¹

01 Jul 2006-IEEE Transactions on Neural Networks

TL;DR: A simple learning algorithm capable of real-time learning which can automatically select appropriate values of neural quantizers and analytically determine the parameters (weights and bias) of the network at one time only is proposed.

Abstract: In some practical applications of neural networks, fast response to external events within an extremely short time is highly demanded and expected. However, the extensively used gradient-descent-based learning algorithms obviously cannot satisfy the real-time learning needs in many applications, especially for large-scale applications and/or when higher generalization performance is required. Based on Huang's constructive network model, this paper proposes a simple learning algorithm capable of real-time learning which can automatically select appropriate values of neural quantizers and analytically determine the parameters (weights and bias) of the network at one time only. The performance of the proposed algorithm has been systematically investigated on a large batch of benchmark real-world regression and classification problems. The experimental results demonstrate that our algorithm can not only produce good generalization performance but also have real-time learning and prediction capability. Thus, it may provide an alternative approach for the practical applications of neural networks where real-time learning and prediction implementation is required.

Proceedings Article•DOI•

Prototype-Driven Learning for Sequence Models

Aria Haghighi¹, Dan Klein¹•Institutions (1)

University of California, Berkeley¹

04 Jun 2006

TL;DR: This work investigates prototype-driven learning for primarily unsupervised sequence modeling, where prior knowledge is specified declaratively, by providing a few canonical examples of each target annotation label, then propagated across a corpus using distributional similarity features in a log-linear generative model.

Abstract: We investigate prototype-driven learning for primarily unsupervised sequence modeling. Prior knowledge is specified declaratively, by providing a few canonical examples of each target annotation label. This sparse prototype information is then propagated across a corpus using distributional similarity features in a log-linear generative model. On part-of-speech induction in English and Chinese, as well as an information extraction task, prototype features provide substantial error rate reductions over competitive baselines and outperform previous work. For example, we can achieve an English part-of-speech tagging accuracy of 80.5% using only three examples of each tag and no dictionary constraints. We also compare to semi-supervised learning and discuss the system's error trends.

Book Chapter•DOI•

The intention behind web queries

Ricardo Baeza-Yates¹, Liliana Calderón-Benavides², Cristina N. González-Caro²•Institutions (2)

Yahoo!¹, Pompeu Fabra University²

11 Oct 2006

TL;DR: This work presents a framework for the identification of user’s interest in an automatic way, based on the analysis of query logs, and establishes that the combination of supervised and unsupervised learning is a good alternative to find user’s goals.

Abstract: The identification of the user’s intention or interest through queries that they submit to a search engine can be very useful to offer them more adequate results. In this work we present a framework for the identification of user’s interest in an automatic way, based on the analysis of query logs. This identification is made from two perspectives, the objectives or goals of a user and the categories in which these aims are situated. A manual classification of the queries was made in order to have a reference point and then we applied supervised and unsupervised learning techniques. The results obtained show that for a considerable amount of cases supervised learning is a good option, however through unsupervised learning we found relationships between users and behaviors that are not easy to detect just taking the query words. Also, through unsupervised learning we established that there are categories that we are not able to determine in contrast with other classes that were not considered but naturally appear after the clustering process. This allowed us to establish that the combination of supervised and unsupervised learning is a good alternative to find user’s goals. From supervised learning we can identify the user interest given certain established goals and categories; on the other hand, with unsupervised learning we can validate the goals and categories used, refine them and select the most appropriate to the user’s needs.
