scispace - formally typeset
Search or ask a question
Reference EntryDOI

IEEE Transactions on Pattern Analysis and Machine Intelligence

15 Oct 2004-
About: The article was published on 2004-10-15. It has received 2118 citations till now.
Citations
More filters
Proceedings Article
21 Aug 2003
TL;DR: An approach to semi-supervised learning is proposed that is based on a Gaussian random field model, and methods to incorporate class priors and the predictions of classifiers obtained by supervised learning are discussed.
Abstract: An approach to semi-supervised learning is proposed that is based on a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The learning problem is then formulated in terms of a Gaussian random field on this graph, where the mean of the field is characterized in terms of harmonic functions, and is efficiently obtained using matrix methods or belief propagation. The resulting learning algorithms have intimate connections with random walks, electric networks, and spectral graph theory. We discuss methods to incorporate class priors and the predictions of classifiers obtained by supervised learning. We also propose a method of parameter learning by entropy minimization, and show the algorithm's ability to perform feature selection. Promising experimental results are presented for synthetic data, digit classification, and text classification tasks.

3,908 citations


Cites methods from "IEEE Transactions on Pattern Analys..."

  • ...The normalized cut approach of Shi and Malik (2000) has as its objective function the minimization of the Raleigh quotient R(f) = f> f f>Df = Pij wij(f(i) f(j))2 Pi dif(i)2 (8)...

    [...]

Journal ArticleDOI
TL;DR: A technique for automatically assigning a neuroanatomical label to each location on a cortical surface model based on probabilistic information estimated from a manually labeled training set is presented, comparable in accuracy to manual labeling.
Abstract: We present a technique for automatically assigning a neuroanatomical label to each location on a cortical surface model based on probabilistic information estimated from a manually labeled training set. This procedure incorporates both geometric information derived from the cortical model, and neuroanatomical convention, as found in the training set. The result is a complete labeling of cortical sulci and gyri. Examples are given from two different training sets generated using different neuroanatomical conventions, illustrating the flexibility of the algorithm. The technique is shown to be comparable in accuracy to manual labeling.

3,880 citations


Cites methods from "IEEE Transactions on Pattern Analys..."

  • ...Techniques for labeling geometric features of the cerebral cortex are useful for analyzing a variety of functional and structural neuroimaging data (Rademacher et al., 1992; Caviness et al., 1996; Paus et al., 1996; Seidman et al., 1997; Goldstein et al., 1999, 2001, 2002; Lohmann et al., 1999)....

    [...]

  • ...Here, we use the results of the manual labeling using the validated techniques of the Center for Morphometric Analysis (CMA) (Rademacher et al., 1992; Caviness et al., 1996) to automatically extract the information required for automating the parcellation procedure....

    [...]

  • ...The first is the volumetric one in use at the MGH CMA as described in (Rademacher et al., 1992; Caviness et al., 1996), which defines ∼58 separate labels (see Appendix for a complete list), while the second is an intrinsically surfacebased (SB) parcellation (Destrieux et al., 1998) based on the p P…...

    [...]

Proceedings ArticleDOI
17 Oct 2005
TL;DR: The method is fast, does not require video alignment and is applicable in many scenarios where the background is known, and the robustness of the method is demonstrated to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action and low quality video.
Abstract: Human action in video sequences can be seen as silhouettes of a moving torso and protruding limbs undergoing articulated motion. We regard human actions as three-dimensional shapes induced by the silhouettes in the space-time volume. We adopt a recent approach by Gorelick et al. (2004) for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes. Our method utilizes properties of the solution to the Poisson equation to extract space-time features such as local space-time saliency, action dynamics, shape structure and orientation. We show that these features are useful for action recognition, detection and clustering. The method is fast, does not require video alignment and is applicable in (but not limited to) many scenarios where the background is known. Moreover, we demonstrate the robustness of our method to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action and low quality video

2,186 citations


Cites background or methods from "IEEE Transactions on Pattern Analys..."

  • ...Below we generalize the approach in [14] from 2D shapes in images to to deal with volumetric space-time shapes....

    [...]

  • ...This measure can be computed [14] by solving a Poisson equation of the form: ∆U(x, y, t) = −1, with (x, y, t) ∈ S, where the Laplacian of U is defined as∆U = Uxx + Uyy + Utt, subject to the Dirichlet boundary conditions U(x, y, t) = 0 at the bounding surface∂S....

    [...]

  • ...We adopt a recent approach [14] for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes....

    [...]

  • ...Recently, [14] presented an alternative approach based on a solution to a Poisson equation....

    [...]

  • ...The eigenvectors and eigenvalues of H then reveal the local orientation and aspect ratio of the shape [14]....

    [...]

Journal ArticleDOI
TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

1,897 citations


Cites background from "IEEE Transactions on Pattern Analys..."

  • ...In response, there has been growing research interest in designing compact and lightweight networks [25, 4, 95, 88, 132, 231], network compression and acceleration [34, 97, 195, 121, 124], and network interpretation and understanding [19, 142, 146]....

    [...]

Journal ArticleDOI
TL;DR: Algorithmic techniques are presented that substantially improve the running time of the loopy belief propagation approach and reduce the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as image restoration that have a large label set.
Abstract: Markov random field models provide a robust and unified framework for early vision problems such as stereo and image restoration. Inference algorithms based on graph cuts and belief propagation have been found to yield accurate results, but despite recent advances are often too slow for practical use. In this paper we present some algorithmic techniques that substantially improve the running time of the loopy belief propagation approach. One of the techniques reduces the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as image restoration that have a large label set. Another technique speeds up and reduces the memory requirements of belief propagation on grid graphs. A third technique is a multi-grid method that makes it possible to obtain good results with a small fixed number of message passing iterations, independent of the size of the input images. Taken together these techniques speed up the standard algorithm by several orders of magnitude. In practice we obtain results that are as accurate as those of other global methods (e.g., using the Middlebury stereo benchmark) while being nearly as fast as purely local methods.

1,560 citations


Cites background from "IEEE Transactions on Pattern Analys..."

  • ...While the MRF formulation of these problems yields an energy minimization problem that is NP hard, good approximation algorithms based on graph cuts [3] and on belief propagation [10, 8] have been developed and demonstrated for the problems of stereo and image restoration....

    [...]

  • ...The Potts model [3] captures the assumption that labelings should be piecewise constant....

    [...]

  • ...The general framework for the problems we consider can be defined as follows (we use the notation from [3])....

    [...]

References
More filters
Proceedings Article
21 Aug 2003
TL;DR: An approach to semi-supervised learning is proposed that is based on a Gaussian random field model, and methods to incorporate class priors and the predictions of classifiers obtained by supervised learning are discussed.
Abstract: An approach to semi-supervised learning is proposed that is based on a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The learning problem is then formulated in terms of a Gaussian random field on this graph, where the mean of the field is characterized in terms of harmonic functions, and is efficiently obtained using matrix methods or belief propagation. The resulting learning algorithms have intimate connections with random walks, electric networks, and spectral graph theory. We discuss methods to incorporate class priors and the predictions of classifiers obtained by supervised learning. We also propose a method of parameter learning by entropy minimization, and show the algorithm's ability to perform feature selection. Promising experimental results are presented for synthetic data, digit classification, and text classification tasks.

3,908 citations

Journal ArticleDOI
TL;DR: A technique for automatically assigning a neuroanatomical label to each location on a cortical surface model based on probabilistic information estimated from a manually labeled training set is presented, comparable in accuracy to manual labeling.
Abstract: We present a technique for automatically assigning a neuroanatomical label to each location on a cortical surface model based on probabilistic information estimated from a manually labeled training set. This procedure incorporates both geometric information derived from the cortical model, and neuroanatomical convention, as found in the training set. The result is a complete labeling of cortical sulci and gyri. Examples are given from two different training sets generated using different neuroanatomical conventions, illustrating the flexibility of the algorithm. The technique is shown to be comparable in accuracy to manual labeling.

3,880 citations

Proceedings ArticleDOI
17 Oct 2005
TL;DR: The method is fast, does not require video alignment and is applicable in many scenarios where the background is known, and the robustness of the method is demonstrated to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action and low quality video.
Abstract: Human action in video sequences can be seen as silhouettes of a moving torso and protruding limbs undergoing articulated motion. We regard human actions as three-dimensional shapes induced by the silhouettes in the space-time volume. We adopt a recent approach by Gorelick et al. (2004) for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes. Our method utilizes properties of the solution to the Poisson equation to extract space-time features such as local space-time saliency, action dynamics, shape structure and orientation. We show that these features are useful for action recognition, detection and clustering. The method is fast, does not require video alignment and is applicable in (but not limited to) many scenarios where the background is known. Moreover, we demonstrate the robustness of our method to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action and low quality video

2,186 citations

Journal ArticleDOI
TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

1,897 citations

Journal ArticleDOI
TL;DR: Algorithmic techniques are presented that substantially improve the running time of the loopy belief propagation approach and reduce the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as image restoration that have a large label set.
Abstract: Markov random field models provide a robust and unified framework for early vision problems such as stereo and image restoration. Inference algorithms based on graph cuts and belief propagation have been found to yield accurate results, but despite recent advances are often too slow for practical use. In this paper we present some algorithmic techniques that substantially improve the running time of the loopy belief propagation approach. One of the techniques reduces the complexity of the inference algorithm to be linear rather than quadratic in the number of possible labels for each pixel, which is important for problems such as image restoration that have a large label set. Another technique speeds up and reduces the memory requirements of belief propagation on grid graphs. A third technique is a multi-grid method that makes it possible to obtain good results with a small fixed number of message passing iterations, independent of the size of the input images. Taken together these techniques speed up the standard algorithm by several orders of magnitude. In practice we obtain results that are as accurate as those of other global methods (e.g., using the Middlebury stereo benchmark) while being nearly as fast as purely local methods.

1,560 citations