Showing papers on "Conditional random field" published in 2014

Open Access

Posted Content•

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

Liang-Chieh Chen¹, George Papandreou², Iasonas Kokkinos³, Kevin Murphy², Alan L. Yuille¹•Institutions (3)

University of California, Los Angeles¹, Google², CentraleSupélec³

22 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).

Abstract: Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification (also called "semantic image segmentation"). We show that responses at the final layer of DCNNs are not sufficiently localized for accurate object segmentation. This is due to the very invariance properties that make DCNNs good for high level tasks. We overcome this poor localization property of deep networks by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Qualitatively, our "DeepLab" system is able to localize segment boundaries at a level of accuracy which is beyond previous methods. Quantitatively, our method sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 71.6% IOU accuracy in the test set. We show how these results can be obtained efficiently: Careful network re-purposing and a novel application of the 'hole' algorithm from the wavelet community allow dense computation of neural net responses at 8 frames per second on a modern GPU.
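
The 'hole' algorithm mentioned in this abstract is commonly described as atrous (dilated) filtering: the filter taps are spaced `rate` samples apart, so the receptive field widens without extra parameters or downsampling. The following is a minimal 1D sketch of that idea only, not the DeepLab implementation; the function name and toy values are hypothetical.

```python
def atrous_conv1d(signal, kernel, rate):
    """Correlate `signal` with `kernel`, reading inputs `rate` steps apart.

    Only positions where the dilated kernel fits entirely are returned
    ('valid' output). rate=1 gives an ordinary dense correlation.
    """
    span = (len(kernel) - 1) * rate  # distance covered by the dilated kernel
    return [
        sum(k * signal[start + j * rate] for j, k in enumerate(kernel))
        for start in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]  # toy difference filter (illustrative values)

dense = atrous_conv1d(signal, kernel, rate=1)  # taps 1 sample apart
holed = atrous_conv1d(signal, kernel, rate=2)  # same 3 taps, 2 samples apart
```

With the same three weights, the `rate=2` call compares samples four positions apart instead of two, which is the sense in which holes enlarge the field of view.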

3,389 citations

Posted Content•

Deep Convolutional Neural Fields for Depth Estimation from a Single Image

Fayao Liu¹, Chunhua Shen¹, Guosheng Lin¹•Institutions (1)

University of Adelaide¹

24 Nov 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: A deep structured learning scheme which learns the unary and pairwise potentials of continuous CRF in a unified deep CNN framework and can be used for depth estimations of general scenes with no geometric priors nor any extra information injected.

Abstract: We consider the problem of depth estimation from a single monocular image in this work. It is a challenging task as no reliable depth cues are available, e.g., stereo correspondences, motions, etc. Previous efforts have been focusing on exploiting geometric priors or additional sources of information, with all using hand-crafted features. Recently, there is mounting evidence that features from deep convolutional neural networks (CNN) are setting new records for various vision applications. On the other hand, considering the continuous characteristic of the depth values, depth estimations can be naturally formulated into a continuous conditional random field (CRF) learning problem. Therefore, we in this paper present a deep convolutional neural field model for estimating depths from a single image, aiming to jointly explore the capacity of deep CNN and continuous CRF. Specifically, we propose a deep structured learning scheme which learns the unary and pairwise potentials of continuous CRF in a unified deep CNN framework. The proposed method can be used for depth estimations of general scenes with no geometric priors nor any extra information injected. In our case, the integral of the partition function can be analytically calculated, thus we can exactly solve the log-likelihood optimization. Moreover, solving the MAP problem for predicting depths of a new image is highly efficient as closed-form solutions exist. We experimentally demonstrate that the proposed method outperforms state-of-the-art depth estimation methods on both indoor and outdoor scene datasets.

643 citations

Book•

Automatic Speech Recognition: A Deep Learning Approach

Dong Yu, Li Deng

11 Nov 2014

TL;DR: This book summarizes the recent advancement in the field of automatic speech recognition with a focus on discriminative and hierarchical models and presents insights and theoretical foundation of a series of recent models such as conditional random field, semi-Markov and hidden conditional random field, deep neural network, deep belief network, and deep stacking models for sequential learning.

Abstract: This book summarizes the recent advancement in the field of automatic speech recognition with a focus on discriminative and hierarchical models. This will be the first automatic speech recognition book to include a comprehensive coverage of recent developments such as conditional random field and deep learning techniques. It presents insights and theoretical foundation of a series of recent models such as conditional random field, semi-Markov and hidden conditional random field, deep neural network, deep belief network, and deep stacking models for sequential learning. It also discusses practical considerations of using these models in both acoustic and language modeling for continuous speech recognition.

520 citations

Journal Article•DOI•

Contextual classification of lidar data and building object detection in urban areas

Joachim Niemeyer¹, Franz Rottensteiner¹, Uwe Soergel²•Institutions (2)

Leibniz University of Hanover¹, Technische Universität Darmstadt²

01 Jan 2014-ISPRS Journal of Photogrammetry and Remote Sensing

TL;DR: This work integrates a Random Forest classifier into a Conditional Random Field framework, a flexible approach for obtaining a reliable classification result even in complex urban scenes, and investigates the relevance of different features for the LiDAR points as well as for the interaction of neighbouring points.

Abstract: In this work we address the task of the contextual classification of an airborne LiDAR point cloud. For that purpose, we integrate a Random Forest classifier into a Conditional Random Field (CRF) framework. It is a flexible approach for obtaining a reliable classification result even in complex urban scenes. In this way, we benefit from the consideration of context on the one hand and from the opportunity to use a large amount of features on the other hand. Considering the interactions in our experiments increases the overall accuracy by 2%, though a larger improvement becomes apparent in the completeness and correctness of some of the seven classes discerned in our experiments. We compare the Random Forest approach to linear models for the computation of unary and pairwise potentials of the CRF, and investigate the relevance of different features for the LiDAR points as well as for the interaction of neighbouring points. In a second step, building objects are detected based on the classified point cloud. For that purpose, the CRF probabilities for the classes are plugged into a Markov Random Field as unary potentials, in which the pairwise potentials are based on a Potts model. The 2D binary building object masks are extracted and evaluated by the benchmark ISPRS Test Project on Urban Classification and 3D Building Reconstruction. The evaluation shows that the main buildings (larger than 50 m²) can be detected very reliably with a correctness larger than 96% and a completeness of 100%.
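
To make the second step concrete, here is a rough sketch of an energy that uses class probabilities as unary terms and a Potts penalty on neighbouring sites. It is an illustration under assumptions, not the authors' formulation: the negative-log unary cost, the weight `beta`, the toy graph and all names are hypothetical.

```python
import math


def potts_energy(labels, class_probs, edges, beta):
    """Energy of a labelling: unary costs plus a Potts smoothness penalty.

    labels[i]         : label assigned to site i
    class_probs[i][c] : probability of class c at site i (the unary input)
    edges             : (i, j) pairs of neighbouring sites
    beta              : cost paid once per edge whose endpoints disagree
    """
    unary = sum(-math.log(class_probs[i][lab]) for i, lab in enumerate(labels))
    pairwise = beta * sum(1 for i, j in edges if labels[i] != labels[j])
    return unary + pairwise


# Toy chain of three sites; the middle one weakly prefers class 1.
class_probs = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]
edges = [(0, 1), (1, 2)]

noisy = potts_energy([0, 1, 0], class_probs, edges, beta=1.0)
smooth = potts_energy([0, 0, 0], class_probs, edges, beta=1.0)
```

In this toy case the smooth labelling has the lower energy: overriding the weak local preference of the middle site costs less than the two label disagreements it would otherwise cause.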

455 citations

Posted Content•

Material recognition in the wild with the Materials in Context Database

Sean Bell¹, Paul Upchurch¹, Noah Snavely¹, Kavita Bala¹•Institutions (1)

Cornell University¹

01 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: A new, large-scale, open dataset of materials in the wild, the Materials in Context Database (MINC), is introduced, and convolutional neural networks are trained for two tasks: classifying materials from patches, and simultaneous material recognition and segmentation in full images.

Abstract: Recognizing materials in real-world images is a challenging task. Real-world materials have rich surface texture, geometry, lighting conditions, and clutter, which combine to make the problem particularly difficult. In this paper, we introduce a new, large-scale, open dataset of materials in the wild, the Materials in Context Database (MINC), and combine this dataset with deep learning to achieve material recognition and segmentation of images in the wild. MINC is an order of magnitude larger than previous material databases, while being more diverse and well-sampled across its 23 categories. Using MINC, we train convolutional neural networks (CNNs) for two tasks: classifying materials from patches, and simultaneous material recognition and segmentation in full images. For patch-based classification on MINC we found that the best performing CNN architectures can achieve 85.2% mean class accuracy. We convert these trained CNN classifiers into an efficient fully convolutional framework combined with a fully connected conditional random field (CRF) to predict the material at every pixel in an image, achieving 73.1% mean class accuracy. Our experiments demonstrate that having a large, well-sampled dataset such as MINC is crucial for real-world material recognition and segmentation.

365 citations

Proceedings Article•DOI•

Spoken language understanding using long short-term memory neural networks

Kaisheng Yao¹, Baolin Peng¹, Yu Zhang¹, Dong Yu¹, Geoffrey Zweig¹, Yangyang Shi¹•Institutions (1)

Microsoft¹

01 Dec 2014

TL;DR: This paper investigates using long short-term memory (LSTM) neural networks, which contain input, output and forgetting gates and are more advanced than simple RNN, for the word labeling task and proposes a regression model on top of the LSTM un-normalized scores to explicitly model output-label dependence.

Abstract: Neural network based approaches have recently produced record-setting performances in natural language understanding tasks such as word labeling. In the word labeling task, a tagger is used to assign a label to each word in an input sequence. Specifically, simple recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have been shown to significantly outperform the previous state-of-the-art, conditional random fields (CRFs). This paper investigates using long short-term memory (LSTM) neural networks, which contain input, output and forgetting gates and are more advanced than simple RNN, for the word labeling task. To explicitly model output-label dependence, we propose a regression model on top of the LSTM un-normalized scores. We also propose to apply deep LSTM to the task. We investigated the relative importance of each gate in the LSTM by setting other gates to a constant and only learning particular gates. Experiments on the ATIS dataset validated the effectiveness of the proposed models.

350 citations

Journal Article•DOI•

A Joint Model for Entity Analysis: Coreference, Typing, and Linking

Greg Durrett¹, Dan Klein¹•Institutions (1)

University of California, Berkeley¹

01 Nov 2014-Transactions of the Association for Computational Linguistics

TL;DR: A joint model of three core tasks in the entity analysis stack: coreference resolution, named entity recognition, and entity linking, which achieves state-of-the-art results for all three tasks.

Abstract: We present a joint model of three core tasks in the entity analysis stack: coreference resolution (within-document clustering), named entity recognition (coarse semantic typing), and entity linking (matching to Wikipedia entities). Our model is formally a structured conditional random field. Unary factors encode local features from strong baselines for each task. We then add binary and ternary factors to capture cross-task interactions, such as the constraint that coreferent mentions have the same semantic type. On the ACE 2005 and OntoNotes datasets, we achieve state-of-the-art results for all three tasks. Moreover, joint modeling improves performance on each task over strong independent baselines.

284 citations

Book Chapter•DOI•

Joint Semantic Segmentation and 3D Reconstruction from Monocular Video

Abhijit Kundu¹, Yin Li¹, Frank Dellaert¹, Fuxin Li¹, James M. Rehg¹•Institutions (1)

Georgia Institute of Technology¹

06 Sep 2014

TL;DR: Improved 3D structure and temporally consistent semantic segmentation for difficult, large scale, forward moving monocular image sequences is demonstrated.

Abstract: We present an approach for joint inference of 3D scene structure and semantic labeling for monocular video. Starting with a monocular image stream, our framework produces a 3D volumetric semantic + occupancy map, which is much more useful than a series of 2D semantic label images or a sparse point cloud produced by traditional semantic segmentation and Structure from Motion (SfM) pipelines respectively. We derive a Conditional Random Field (CRF) model defined in the 3D space, that jointly infers the semantic category and occupancy for each voxel. Such a joint inference in the 3D CRF paves the way for more informed priors and constraints, which is otherwise not possible if solved separately in their traditional frameworks. We make use of class specific semantic cues that constrain the 3D structure in areas where multiview constraints are weak. Our model comprises higher order factors, which help when the depth is unobservable. We also make use of class specific semantic cues to reduce either the degree of such higher order factors, or to approximately model them with unaries if possible. We demonstrate improved 3D structure and temporally consistent semantic segmentation for difficult, large scale, forward moving monocular image sequences.

282 citations

Proceedings Article•DOI•

Code Mixing: A Challenge for Language Identification in the Language of Social Media

Utsab Barman¹, Amitava Das¹, Joachim Wagner¹, Jennifer Foster²•Institutions (2)

Dublin City University¹, University of North Texas²

01 Oct 2014

TL;DR: A new dataset is described, which contains Facebook posts and comments that exhibit code mixing between Bengali, English and Hindi, and it is found that the dictionary-based approach is surpassed by supervised classification and sequence labelling, and that it is important to take contextual clues into consideration.

Abstract: In social media communication, multilingual speakers often switch between languages, and, in such an environment, automatic language identification becomes both a necessary and challenging task. In this paper, we describe our work in progress on the problem of automatic language identification for the language of social media. We describe a new dataset that we are in the process of creating, which contains Facebook posts and comments that exhibit code mixing between Bengali, English and Hindi. We also present some preliminary word-level language identification experiments using this dataset. Different techniques are employed, including a simple unsupervised dictionary-based approach, supervised word-level classification with and without contextual clues, and sequence labelling using Conditional Random Fields. We find that the dictionary-based approach is surpassed by supervised classification and sequence labelling, and that it is important to take contextual clues into consideration.

273 citations

Proceedings Article•DOI•

Lightweight map matching for indoor localisation using conditional random fields

Zhuoling Xiao¹, Hongkai Wen¹, Andrew Markham¹, Niki Trigoni¹•Institutions (1)

University of Oxford¹

15 Apr 2014

TL;DR: MapCraft is presented, a novel, robust and responsive technique that is extremely computationally efficient, does not require training in different sites, and tracks well even when presented with very noisy sensor data, enabling a new era of location-aware applications to be developed.

Abstract: Indoor tracking and navigation is a fundamental need for pervasive and context-aware smartphone applications. Although indoor maps are becoming increasingly available, there is no practical and reliable indoor map matching solution available at present. We present MapCraft, a novel, robust and responsive technique that is extremely computationally efficient (running in under 10 ms on an Android smartphone), does not require training in different sites, and tracks well even when presented with very noisy sensor data. Key to our approach is expressing the tracking problem as a conditional random field (CRF), a technique which has had great success in areas such as natural language processing, but has yet to be considered for indoor tracking. Unlike directed graphical models like Hidden Markov Models, CRFs capture arbitrary constraints that express how well observations support state transitions, given map constraints. Extensive experiments in multiple sites show how MapCraft outperforms state-of-the art approaches, demonstrating excellent tracking error and accurate reconstruction of tortuous trajectories with zero training effort. As proof of its robustness, we also demonstrate how it is able to accurately track the position of a user from accelerometer and magnetometer measurements only (i.e. gyro- and WiFi-free). We believe that such an energy-efficient approach will enable always-on background localisation, enabling a new era of location-aware applications to be developed.

157 citations

Journal Article•DOI•

A comprehensive study of named entity recognition in Chinese clinical text

Jianbo Lei¹, Buzhou Tang², Xueqin Lu³, Kaihua Gao³, Min Jiang¹, Hua Xu¹•Institutions (3)

University of Texas at Austin¹, Harbin Institute of Technology Shenzhen Graduate School², Peking University³

01 Sep 2014-Journal of the American Medical Informatics Association

TL;DR: The authors' evaluation on the independent test set showed that most types of feature were beneficial to Chinese NER systems, although the improvements were limited, and the system achieved the highest performance by combining word segmentation and section information, indicating that these two types of feature complement each other.

Proceedings Article•DOI•

Recurrent conditional random field for language understanding

Kaisheng Yao¹, Baolin Peng², Geoffrey Zweig¹, Dong Yu¹, Xiaolong Li¹, Feng Gao¹•Institutions (2)

Microsoft¹, Beihang University²

04 May 2014

TL;DR: This paper shows that the performance of an RNN tagger can be significantly improved by incorporating elements of the CRF model; specifically, the explicit modeling of output-label dependencies with transition features, its global sequence-level objective function, and offline decoding.

Abstract: Recurrent neural networks (RNNs) have recently produced record setting performance in language modeling and word-labeling tasks. In the word-labeling task, the RNN is used analogously to the more traditional conditional random field (CRF) to assign a label to each word in an input sequence, and has been shown to significantly outperform CRFs. In contrast to CRFs, RNNs operate in an online fashion to assign labels as soon as a word is seen, rather than after seeing the whole word sequence. In this paper, we show that the performance of an RNN tagger can be significantly improved by incorporating elements of the CRF model; specifically, the explicit modeling of output-label dependencies with transition features, its global sequence-level objective function, and offline decoding. We term the resulting model a “recurrent conditional random field” and demonstrate its effectiveness on the ATIS travel domain dataset and a variety of web-search language understanding datasets.
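
The contrast between online labelling and offline decoding with transition scores can be sketched with a standard Viterbi pass over a linear chain. This is a generic illustration rather than the recurrent CRF of the paper; the per-word scores stand in for tagger outputs, and all names and numbers are hypothetical.

```python
def viterbi(emissions, transitions):
    """Return the best label path and its total score.

    emissions[t][y]   : score of label y at position t
    transitions[a][b] : score of label a being followed by label b
    """
    n_labels = len(emissions[0])
    best = list(emissions[0])  # best score of any path ending in each label
    backptrs = []
    for scores in emissions[1:]:
        step, ptr = [], []
        for y in range(n_labels):
            prev = max(range(n_labels), key=lambda a: best[a] + transitions[a][y])
            ptr.append(prev)
            step.append(best[prev] + transitions[prev][y] + scores[y])
        best = step
        backptrs.append(ptr)
    last = max(range(n_labels), key=lambda y: best[y])
    path = [last]
    for ptr in reversed(backptrs):  # follow back-pointers to the start
        path.append(ptr[path[-1]])
    path.reverse()
    return path, best[last]


emissions = [[2.0, 1.0], [1.0, 1.2], [2.0, 1.0]]  # 3 words, 2 labels
transitions = [[0.5, -1.0], [-1.0, 0.5]]          # switching labels is penalised

# Online-style choice: pick the best label for each word independently.
greedy = [max(range(2), key=row.__getitem__) for row in emissions]
path, score = viterbi(emissions, transitions)
```

Here the independent per-word choice flips to label 1 on the middle word, while decoding the whole sequence with transition scores keeps label 0 throughout.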

Book Chapter•DOI•

Learning Fully-Connected CRFs for Blood Vessel Segmentation in Retinal Images

José Ignacio Orlando¹, Matthew B. Blaschko¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

14 Sep 2014

TL;DR: This work presents a novel method for blood vessel segmentation in fundus images based on a discriminatively trained, fully connected conditional random field model with more expressive potentials, and employs recent results enabling extremely fast inference in a fully connected model.

Abstract: In this work, we present a novel method for blood vessel segmentation in fundus images based on a discriminatively trained, fully connected conditional random field model. Retinal image analysis is greatly aided by blood vessel segmentation as the vessel structure may be considered both a key source of signal, e.g. in the diagnosis of diabetic retinopathy, or a nuisance, e.g. in the analysis of pigment epithelium or choroid related abnormalities. Blood vessel segmentation in fundus images has been considered extensively in the literature, but remains a challenge largely due to the desired structures being thin and elongated, a setting that performs particularly poorly using standard segmentation priors such as a Potts model or total variation. In this work, we overcome this difficulty using a discriminatively trained conditional random field model with more expressive potentials. In particular, we employ recent results enabling extremely fast inference in a fully connected model. We find that this rich but computationally efficient model family, combined with principled discriminative training based on a structured output support vector machine yields a fully automated system that achieves results statistically indistinguishable from an expert human annotator. Implementation details are available at http://pages.saclay.inria.fr/matthew.blaschko/projects/retina/.

Proceedings Article•DOI•

DLIREC: Aspect Term Extraction and Term Polarity Classification System

Zhiqiang Toh¹, Wenting Wang•Institutions (1)

Agency for Science, Technology and Research¹

11 Aug 2014

TL;DR: This paper describes the system used in the Aspect Based Sentiment Analysis Task 4 at SemEval-2014, which consists of a Conditional Random Field based classifier for Aspect Term Extraction (ATE) and a linear classifier for Aspect Term Polarity Classification (ATP).

Abstract: This paper describes our system used in the Aspect Based Sentiment Analysis Task 4 at the SemEval-2014. Our system consists of two components to address two of the subtasks respectively: a Conditional Random Field (CRF) based classifier for Aspect Term Extraction (ATE) and a linear classifier for Aspect Term Polarity Classification (ATP). For the ATE subtask, we implement a variety of lexicon, syntactic and semantic features, as well as cluster features induced from unlabeled data. Our system achieves state-of-the-art performances in ATE, ranking 1st (among 28 submissions) and 2nd (among 27 submissions) for the restaurant and laptop domain respectively.

Proceedings Article•DOI•

Automatic Feature Learning for Robust Shadow Detection

Salman H. Khan¹, Mohammed Bennamoun¹, Ferdous Sohel¹, Roberto Togneri¹•Institutions (1)

University of Western Australia¹

23 Jun 2014

TL;DR: This work presents a practical framework to automatically detect shadows in real world scenes from a single photograph using multiple convolutional deep neural networks (ConvNets) and learns features at the super-pixel level and along the object boundaries.

Abstract: We present a practical framework to automatically detect shadows in real world scenes from a single photograph. Previous works on shadow detection put a lot of effort in designing shadow variant and invariant hand-crafted features. In contrast, our framework automatically learns the most relevant features in a supervised manner using multiple convolutional deep neural networks (ConvNets). The 7-layer network architecture of each ConvNet consists of alternating convolution and sub-sampling layers. The proposed framework learns features at the super-pixel level and along the object boundaries. In both cases, features are extracted using a context aware window centered at interest points. The predicted posteriors based on the learned features are fed to a conditional random field model to generate smooth shadow contours. Our proposed framework consistently performed better than the state-of-the-art on all major shadow databases collected under a variety of conditions.

Journal Article•DOI•

Brain tumor detection and segmentation in a CRF (conditional random fields) framework with pixel-pairwise affinity and superpixel-level features.

Wei Wu¹, Wei Wu², Albert Y. C. Chen¹, Liang Zhao¹, Jason J. Corso¹•Institutions (2)

University at Buffalo¹, Wuhan University of Technology²

01 Jan 2014

TL;DR: A robust segmentation method using model-aware affinity demonstrates performance comparable to other state-of-the-art algorithms for brain tumor MRI scans.

Abstract: Detection and segmentation of a brain tumor such as glioblastoma multiforme (GBM) in magnetic resonance (MR) images are often challenging due to its intrinsically heterogeneous signal characteristics. A robust segmentation method for brain tumor MRI scans was developed and tested. Simple thresholds and statistical methods are unable to adequately segment the various elements of the GBM, such as local contrast enhancement, necrosis, and edema. Most voxel-based methods cannot achieve satisfactory results in larger data sets, and the methods based on generative or discriminative models have intrinsic limitations during application, such as small sample set learning and transfer. A new method was developed to overcome these challenges. Multimodal MR images are segmented into superpixels using algorithms to alleviate the sampling issue and to improve the sample representativeness. Next, features were extracted from the superpixels using multi-level Gabor wavelet filters. Based on the features, a support vector machine (SVM) model and an affinity metric model for tumors were trained to overcome the limitations of previous generative models. Based on the output of the SVM and spatial affinity models, conditional random fields theory was applied to segment the tumor in a maximum a posteriori fashion given the smoothness prior defined by our affinity model. Finally, labeling noise was removed using “structural knowledge” such as the symmetrical and continuous characteristics of the tumor in spatial domain. The system was evaluated with 20 GBM cases and the BraTS challenge data set. Dice coefficients were computed, and the results were highly consistent with those reported by Zikic et al. (MICCAI 2012, Lecture notes in computer science. vol 7512, pp 369–376, 2012). A brain tumor segmentation method using model-aware affinity demonstrates comparable performance with other state-of-the art algorithms.
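
For readers unfamiliar with the metric used in this evaluation, the Dice coefficient of two segmentations A and B is 2|A ∩ B| / (|A| + |B|). A small sketch over sets of voxel indices follows; the function name and toy data are hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
def dice_coefficient(predicted, reference):
    """Dice overlap between two segmentations given as collections of voxel ids."""
    predicted, reference = set(predicted), set(reference)
    total = len(predicted) + len(reference)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(predicted & reference) / total


# Two toy masks of four voxels each that share two voxels.
overlap = dice_coefficient({1, 2, 3, 4}, {3, 4, 5, 6})
```

The value ranges from 0 for disjoint masks to 1 for identical ones.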

Journal Article•DOI•

Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text

Maria Skeppstedt¹, Maria Kvist², Gunnar Nilsson², Hercules Dalianis¹•Institutions (2)

Stockholm University¹, Karolinska Institutet²

01 Jun 2014-Journal of Biomedical Informatics

TL;DR: The entity recognition results for the individual entities Disorder and Finding show that it is meaningful to separate the general category Medical Problem into these two more granular entity types, e.g. for knowledge mining of co-morbidity relations and disorder-finding relations.

Journal Article•DOI•

Associative Hierarchical Random Fields

Lubor Ladicky¹, Chris Russell², Pushmeet Kohli³, Philip H. S. Torr⁴•Institutions (4)

ETH Zurich¹, University College London², Microsoft³, Oxford Brookes University⁴

01 Jun 2014-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper proposes a new model, the associative hierarchical random field (AHRF), together with a novel algorithm for its optimization, and applies this model to the problem of semantic segmentation.

Abstract: This paper makes two contributions: the first is the proposal of a new model—The associative hierarchical random field (AHRF), and a novel algorithm for its optimization; the second is the application of this model to the problem of semantic segmentation. Most methods for semantic segmentation are formulated as a labeling problem for variables that might correspond to either pixels or segments such as super-pixels. It is well known that the generation of super pixel segmentations is not unique. This has motivated many researchers to use multiple super pixel segmentations for problems such as semantic segmentation or single view reconstruction. These super-pixels have not yet been combined in a principled manner, this is a difficult problem, as they may overlap, or be nested in such a way that the segmentations form a segmentation tree. Our new hierarchical random field model allows information from all of the multiple segmentations to contribute to a global energy. MAP inference in this model can be performed efficiently using powerful graph cut based move making algorithms. Our framework generalizes much of the previous work based on pixels or segments, and the resulting labelings can be viewed both as a detailed segmentation at the pixel level, or at the other extreme, as a segment selector that pieces together a solution like a jigsaw, selecting the best segments from different segmentations as pieces. We evaluate its performance on some of the most challenging data sets for object class segmentation, and show that this ability to perform inference using multiple overlapping segmentations leads to state-of-the-art results.

Proceedings Article•DOI•

Context-aware Learning for Sentence-level Sentiment Analysis with Posterior Regularization

Bishan Yang¹, Claire Cardie¹•Institutions (1)

Cornell University¹

01 Jun 2014

TL;DR: A novel context-aware method for analyzing sentiment at the level of individual sentences is proposed; it encodes intuitive lexical and discourse knowledge as expressive constraints and integrates them into the learning of conditional random field models via posterior regularization.

Abstract: This paper proposes a novel context-aware method for analyzing sentiment at the level of individual sentences. Most existing machine learning approaches suffer from limitations in the modeling of complex linguistic structures across sentences and often fail to capture nonlocal contextual cues that are important for sentiment interpretation. In contrast, our approach allows structured modeling of sentiment while taking into account both local and global contextual information. Specifically, we encode intuitive lexical and discourse knowledge as expressive constraints and integrate them into the learning of conditional random field models via posterior regularization. The context-aware constraints provide additional power to the CRF model and can guide semi-supervised learning when labeled data is limited. Experiments on standard product review datasets show that our method outperforms the state-of-the-art methods in both the supervised and semi-supervised settings.

Journal Article•DOI•

Parsing the Hand in Depth Images

Hui Liang¹, Junsong Yuan¹, Daniel Thalmann¹•Institutions (1)

Nanyang Technological University¹

13 Feb 2014-IEEE Transactions on Multimedia

TL;DR: A robust hand parsing scheme to extract a high-level description of the hand from the depth image is presented and a Superpixel-Markov Random Field (SMRF) parsing scheme is proposed to enforce the spatial smoothness and the label co-occurrence prior to remove the misclassified regions.

Abstract: Hand pose tracking and gesture recognition are useful for human-computer interaction, while a major problem is the lack of discriminative features for compact hand representation. We present a robust hand parsing scheme to extract a high-level description of the hand from the depth image. A novel distance-adaptive selection method is proposed to get more discriminative depth-context features. Besides, we propose a Superpixel-Markov Random Field (SMRF) parsing scheme to enforce the spatial smoothness and the label co-occurrence prior to remove the misclassified regions. Compared to pixel-level filtering, the SMRF scheme is more suitable for modeling the misclassified regions. By fusing the temporal constraints, its performance can be further improved. Overall, the proposed hand parsing scheme is accurate and efficient. Tests on a synthesized dataset show that it gives much higher accuracy for single-frame parsing and enhanced robustness for continuous sequence parsing compared to benchmarks. Tests on real-world depth images of the hand and human body show the robustness of our method to complex hand configurations and its generalization power to different kinds of articulated objects.

Book Chapter•DOI•

A High Performance CRF Model for Clothes Parsing

Edgar Simo-Serra¹, Sanja Fidler², Francesc Moreno-Noguer¹, Raquel Urtasun²•Institutions (2)

Spanish National Research Council¹, University of Toronto²

01 Nov 2014

TL;DR: This paper frames the problem of clothing parsing as one of inference in a pose-aware Conditional Random Field which exploits appearance, figure/ground segmentation, shape and location priors for each garment, as well as similarities between segments and symmetries between different human body parts.

Abstract: In this paper we tackle the problem of clothing parsing: our goal is to segment and classify the different garments a person is wearing. We frame the problem as one of inference in a pose-aware Conditional Random Field (CRF) which exploits appearance, figure/ground segmentation, shape and location priors for each garment, as well as similarities between segments and symmetries between different human body parts. We demonstrate the effectiveness of our approach on the Fashionista dataset [1] and show that we can obtain a significant improvement over the state-of-the-art.

Proceedings Article•

Multi-label image classification with a probabilistic label enhancement model

Xin Li¹, Feipeng Zhao¹, Yuhong Guo¹•Institutions (1)

Temple University¹

23 Jul 2014

TL;DR: The experimental results demonstrate the superiority of the label enhancement model in terms of both prediction performance and running time compared to state-of-the-art multi-label learning methods.

Abstract: In this paper, we present a novel probabilistic label enhancement model to tackle the multi-label image classification problem. Recognizing multiple objects in images is a challenging problem due to label sparsity, appearance variations of the objects and occlusions. We propose to tackle these difficulties from a novel perspective by constructing auxiliary labels in the output space. Our idea is to exploit label combinations to enrich the label space and improve the label identification capacity in the original label space. In particular, we identify a set of informative label combination pairs by constructing a tree-structured graph in the label space using the maximum spanning tree algorithm, which naturally forms a conditional random field. We then use the produced label pairs as auxiliary new labels to augment the original labels and perform piecewise training under the framework of conditional random fields. In the test phase, max-product message passing is used to perform efficient inference on the tree graph, which integrates the augmented label pair classifiers and the standard individual binary classifiers for multi-label prediction. We evaluate the proposed approach on several image classification datasets. The experimental results demonstrate the superiority of our label enhancement model in terms of both prediction performance and running time compared to state-of-the-art multi-label learning methods.
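
The two generic building blocks named in this abstract, a maximum spanning tree over labels and max-product inference on the resulting tree, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the edge weights, and the unary and pairwise scores are all hypothetical, labels are assumed binary, the tree is assumed connected, and scores are treated as log-potentials so that max-product becomes max-sum.

```python
def maximum_spanning_tree(n, weights):
    """Kruskal's algorithm with edges visited in descending weight order.
    weights: dict {(i, j): w} over label indices 0..n-1 (hypothetical scores)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components, so it cannot close a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


def max_product(n, edges, unary, pairwise):
    """MAP assignment of n binary labels on a connected tree (max-sum form).
    unary[i][a]: score for label i taking value a
    pairwise[(i, j)][a][b]: score for (label i = a, label j = b)"""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def pair(i, a, j, b):
        if (i, j) in pairwise:
            return pairwise[(i, j)][a][b]
        return pairwise[(j, i)][b][a]

    # Depth-first preorder from node 0: every parent precedes its descendants.
    order, parent_of, stack = [], {0: None}, [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in nbrs[u]:
            if v not in parent_of:
                parent_of[v] = u
                stack.append(v)

    # Upward pass: each node sends its best achievable score to its parent.
    belief = [list(unary[i]) for i in range(n)]
    best_child = {}  # (child, parent value) -> best child value
    for u in reversed(order):
        p = parent_of[u]
        if p is None:
            continue
        for a in (0, 1):
            scores = [belief[u][b] + pair(p, a, u, b) for b in (0, 1)]
            b_star = 0 if scores[0] >= scores[1] else 1
            best_child[(u, a)] = b_star
            belief[p][a] += scores[b_star]

    # Downward pass: fix the root, then read off each child's best value.
    labels = [0] * n
    labels[0] = 0 if belief[0][0] >= belief[0][1] else 1
    for u in order[1:]:
        labels[u] = best_child[(u, labels[parent_of[u]])]
    return labels
```

With three labels, edge weights `{(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.1}`, and pairwise scores that reward agreement, the tree keeps edges (0, 1) and (1, 2), and a strong unary score on label 0 can pull the other two labels on through the pairwise terms.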

Journal Article•

PyStruct: learning structured prediction in python

Andreas Müller¹, Sven Behnke¹•Institutions (1)

University of Bonn¹

01 Jan 2014-Journal of Machine Learning Research

TL;DR: PyStruct aims at providing a general purpose implementation of standard structured prediction methods, both for practitioners and as a baseline for researchers, written in Python and adapts paradigms and types from the scientific Python community for seamless integration with other projects.

Abstract: Structured prediction methods have become a central tool for many machine learning applications. While more and more algorithms are developed, only very few implementations are available. PyStruct aims at providing a general purpose implementation of standard structured prediction methods, both for practitioners and as a baseline for researchers. It is written in Python and adapts paradigms and types from the scientific Python community for seamless integration with other projects.

Journal Article•DOI•

Sign Language Recognition with the Kinect Sensor Based on Conditional Random Fields

Hee-Deok Yang¹•Institutions (1)

Chosun University¹

24 Dec 2014-Sensors

TL;DR: This research uses 3D depth information from hand motions, generated from Microsoft's Kinect sensor, and applies a hierarchical conditional random field (CRF) to detect candidate segments of signs from the hand motions, followed by a BoostMap embedding method to verify the hand shapes of the segmented signs.

Abstract: Sign language is a visual language used by deaf people. One difficulty of sign language recognition is that sign instances vary in both motion and shape in three-dimensional (3D) space. In this research, we use 3D depth information from hand motions, generated from Microsoft's Kinect sensor, and apply a hierarchical conditional random field (CRF) that recognizes hand signs from the hand motions. The proposed method uses a hierarchical CRF to detect candidate segments of signs using hand motions, and then a BoostMap embedding method to verify the hand shapes of the segmented signs. Experiments demonstrated that the proposed method could recognize signs from signed sentence data at a rate of 90.4%.

Book Chapter•DOI•

Human Pose Estimation with Fields of Parts

Martin Kiefel¹, Peter V. Gehler¹•Institutions (1)

Max Planck Society¹

06 Sep 2014

TL;DR: This paper proposes a new formulation of the human pose estimation problem, a binary Conditional Random Field model designed to detect human body parts of articulated people in single images.

Abstract: This paper proposes a new formulation of the human pose estimation problem. We present the Fields of Parts model, a binary Conditional Random Field model designed to detect human body parts of articulated people in single images.

Journal Article•DOI•

A Hybrid Object-Oriented Conditional Random Field Classification Framework for High Spatial Resolution Remote Sensing Imagery

Yanfei Zhong¹, Ji Zhao¹, Liangpei Zhang¹•Institutions (1)

Wuhan University¹

11 Mar 2014-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: A hybrid object-oriented CRF classification framework for HSR imagery, namely CRF + OO, is proposed to address the problems of segmentation scale choice and oversmoothing; it shows competitive quantitative and qualitative performance when compared with other state-of-the-art classification algorithms.

Abstract: High spatial resolution (HSR) remote sensing imagery provides abundant geometric and detailed information, which is important for classification. In order to make full use of the spatial contextual information, object-oriented classification and pairwise conditional random fields (CRFs) are widely used. However, the segmentation scale choice is a challenging problem in object-oriented classification, and the classification result of pairwise CRF always has an oversmooth appearance. In this paper, a hybrid object-oriented CRF classification framework for HSR imagery, namely, CRF + OO, is proposed to address these problems by integrating object-oriented classification and CRF classification. In CRF + OO, a probabilistic pixel classification is first performed, and then, the classification results of two CRF models with different potential functions are used to obtain the segmentation map by a connected-component labeling algorithm. As a result, an object-level classification fusion scheme can be used, which integrates the object-oriented classifications using a majority voting strategy at the object level to obtain the final classification result. The experimental results using two multispectral HSR images (QuickBird and IKONOS) and a hyperspectral HSR image (HYDICE) demonstrate that the proposed classification framework has a competitive quantitative and qualitative performance for HSR image classification when compared with other state-of-the-art classification algorithms.
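
The pipeline described above, connected-component labeling followed by object-level majority voting, might look roughly like the sketch below. It is not the paper's method: the abstract does not say how the two CRF results are combined into a segmentation map, so this sketch assumes (hypothetically) that a segment is a 4-connected region on which both CRF label maps are constant, and it votes over a single pixelwise label map. All names and toy grids are illustrative.

```python
from collections import Counter, deque


def connected_components(label_map):
    """4-connected components of equal-valued cells.
    Returns (grid of component ids, number of components)."""
    h, w = len(label_map), len(label_map[0])
    comp = [[-1] * w for _ in range(h)]
    next_id = 0
    for r in range(h):
        for c in range(w):
            if comp[r][c] != -1:
                continue  # already assigned to a component
            comp[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and comp[ny][nx] == -1
                            and label_map[ny][nx] == label_map[y][x]):
                        comp[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return comp, next_id


def majority_vote(comp, n_comp, pixel_labels):
    """Relabel every pixel with the most frequent label in its component."""
    votes = [Counter() for _ in range(n_comp)]
    for comp_row, label_row in zip(comp, pixel_labels):
        for k, lab in zip(comp_row, label_row):
            votes[k][lab] += 1
    winners = [v.most_common(1)[0][0] for v in votes]
    return [[winners[k] for k in row] for row in comp]


def fuse(crf_a, crf_b, pixel_labels):
    """Assumed combination rule: segment on the pair of CRF labels per pixel."""
    joint = [list(zip(ra, rb)) for ra, rb in zip(crf_a, crf_b)]
    comp, n_comp = connected_components(joint)
    return majority_vote(comp, n_comp, pixel_labels)
```

Segmenting on the joint labels means no segmentation scale parameter has to be chosen by hand, which is the design motivation the abstract gives; ties in the vote are broken arbitrarily here.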

Journal Article•DOI•

Towards subject independent continuous sign language recognition

W.W. Kong¹, Surendra Ranganath²•Institutions (2)

National University of Singapore¹, Sri Jayachamarajendra College of Engineering²

01 Mar 2014-Pattern Recognition

TL;DR: A segment-based probabilistic approach to robustly recognize continuous sign language sentences using a two-layer conditional random field model and a novel decoding scheme for the semi-Markov CRF used in the 2-layer CRF.

Proceedings Article•DOI•

Learning Depth-Sensitive Conditional Random Fields for Semantic Segmentation of RGB-D Images

Andreas Müller¹, Sven Behnke¹•Institutions (1)

University of Bonn¹

29 Sep 2014

TL;DR: This work presents a structured learning approach to semantic annotation of RGB-D images, and finds that the conditional random field approach improves upon previous work, setting a new state-of-the-art for the dataset.

Abstract: We present a structured learning approach to semantic annotation of RGB-D images. Our method learns to reason about spatial relations of objects and fuses low-level class predictions to a consistent interpretation of a scene. Our model incorporates color, depth and 3D scene features, on which an energy function is learned to directly optimize object class prediction using the loss-based maximum-margin principle of structural support vector machines. We evaluate our approach on the NYU V2 dataset of indoor scenes, a challenging dataset covering a wide variety of scene layouts and object classes. We hard-code much less information about the scene layout into our model than previous approaches, and instead learn object relations directly from the data. We find that our conditional random field approach improves upon previous work, setting a new state-of-the-art for the dataset.

I. INTRODUCTION

For robots to perform varied tasks in unstructured environments, understanding their surroundings is essential. We formulate the problem of semantic annotation of maps as a dense labeling of RGB-D images into semantic classes. Dense labeling of measured surfaces allows for a detailed reasoning about the scene. In this work, we propose the use of random forests combined with conditional random fields (CRF) to perform robust estimation of structure classes in RGB-D images. The CRF is learned using a structural support vector machine, allowing it to integrate the noisy categorization produced by a pixel-based random forest to a consistent interpretation of the scene. We thereby extend the success of learned CRF models for semantic segmentation in RGB images to the domain of 3D scenes. Our emphasis lies on exploiting the additional depth and 3D information in all processing steps, while relying on learning to create a model that is adjusted to the properties of the sensor input and environment.
Our approach starts with a random forest, providing a noisy local estimate of semantic classes based on color and depth information. These estimates are grouped together using a superpixel approach, for which we extend previous superpixel algorithms from the RGB to the RGB-D domain. We then build a geometric model of the scene, based on the neighborhood graph of superpixels. We use this graph not only to capture spatial relations in the 2D plane of the image, but also to model object distances and surface angles in 3D, using a point cloud generated from the RGB-D image. The process is illustrated in Figure 1.

Book Chapter•DOI•

Emotion Cause Detection with Linguistic Construction in Chinese Weibo Text

Lin Gui¹, Li Yuan¹, Ruifeng Xu¹, Bin Liu¹, Qin Lu², Yu Zhou¹•Institutions (2)

Harbin Institute of Technology¹, Hong Kong Polytechnic University²

05 Dec 2014

TL;DR: A rule-based emotion cause detection method is developed which uses 25 manually compiled rules, and two machine learning based cause detection methods are developed, including a classification-based method using support vector machines and a sequence labeling based method using a conditional random fields model.

Abstract: Identifying the cause of emotion is a new challenge for researchers in natural language processing. Currently, there are no existing works on emotion cause detection from Chinese micro-blogging (Weibo) text. In this study, an emotion cause annotated corpus is first designed and developed by annotating the emotion cause expressions in Chinese Weibo text. Up to now, an emotion cause annotated corpus which consists of the annotations for 1,333 Chinese Weibo has been constructed. Based on observations of this corpus, the characteristics of emotion cause expressions are identified. Accordingly, a rule-based emotion cause detection method is developed which uses 25 manually compiled rules. Furthermore, two machine learning based cause detection methods are developed, including a classification-based method using support vector machines and a sequence labeling based method using a conditional random fields model. It is the largest available resource in this research area. The experimental results show that the rule-based method achieves a 68.30% accuracy rate. Furthermore, the method based on the conditional random fields model achieved 77.57% accuracy, which is 37.45% higher than the reference baseline method. These results show the effectiveness of our proposed emotion cause detection method.

Proceedings Article•DOI•

Hierarchical Semantic Labeling for Task-Relevant RGB-D Perception

Chenxia Wu¹, Ian Lenz², Ashutosh Saxena²•Institutions (2)

Zhejiang University¹, Cornell University²

12 Jul 2014

TL;DR: This work presents an algorithm that produces hierarchical labelings of a scene, following is-part-of and is-type-of relationships, based on a Conditional Random Field that relates pixel-wise and pair-wise observations to labels.

Abstract: Semantic labeling of RGB-D scenes is very important in enabling robots to perform mobile manipulation tasks, but different tasks may require entirely different sets of labels. For example, when navigating to an object, we may need only a single label denoting its class, but to manipulate it, we might need to identify individual parts. In this work, we present an algorithm that produces hierarchical labelings of a scene, following is-part-of and is-type-of relationships. Our model is based on a Conditional Random Field that relates pixel-wise and pair-wise observations to labels. We encode hierarchical labeling constraints into the model while keeping inference tractable. Our model thus predicts different specificities in labeling based on its confidence: if it is not sure whether an object is Pepsi or Sprite, it will predict soda rather than making an arbitrary choice. In extensive experiments, both offline on standard datasets as well as in online robotic experiments, we show that our model outperforms other state-of-the-art methods in labeling performance as well as in success rate for robotic tasks.
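
The Pepsi/Sprite/soda behavior described above can be illustrated with a much simpler stand-in than the paper's CRF: a confidence-thresholded walk down an is-type-of tree. This is only a toy sketch of the idea, not the authors' inference procedure. The tree, the probabilities, the threshold, the `bowl` and `object` labels, and both function names are hypothetical.

```python
def subtree_prob(node, children, leaf_prob):
    """Probability mass of a node: its own leaf probability,
    or the sum over its descendants if it is an internal node."""
    kids = children.get(node, [])
    if not kids:
        return leaf_prob.get(node, 0.0)
    return sum(subtree_prob(k, children, leaf_prob) for k in kids)


def most_specific_label(root, children, leaf_prob, threshold):
    """Descend from the root toward the most probable child, stopping
    as soon as that child's mass falls below the threshold."""
    node = root
    while children.get(node):
        best = max(children[node],
                   key=lambda k: subtree_prob(k, children, leaf_prob))
        if subtree_prob(best, children, leaf_prob) < threshold:
            break  # not confident enough to be more specific
        node = best
    return node


# Hypothetical is-type-of hierarchy and per-leaf scores.
children = {"object": ["soda", "bowl"], "soda": ["Pepsi", "Sprite"]}
unsure = {"Pepsi": 0.45, "Sprite": 0.40, "bowl": 0.15}
sure = {"Pepsi": 0.90, "Sprite": 0.05, "bowl": 0.05}

label_when_unsure = most_specific_label("object", children, unsure, 0.6)
label_when_sure = most_specific_label("object", children, sure, 0.6)
```

With these made-up numbers, the first call stops at `soda` because neither brand alone reaches the threshold, while the second descends all the way to `Pepsi`.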
