
Showing papers by "Guosheng Lin published in 2017"


Proceedings ArticleDOI
21 Jul 2017
TL;DR: RefineNet is presented, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections and introduces chained residual pooling, which captures rich background context in an efficient manner.
Abstract: Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense classification problems such as semantic segmentation. However, repeated subsampling operations like pooling or convolution striding in deep CNNs lead to a significant decrease in the initial image resolution. Here, we present RefineNet, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections. In this way, the deeper layers that capture high-level semantic features can be directly refined using fine-grained features from earlier convolutions. The individual components of RefineNet employ residual connections following the identity mapping mindset, which allows for effective end-to-end training. Further, we introduce chained residual pooling, which captures rich background context in an efficient manner. We carry out comprehensive experiments and set new state-of-the-art results on seven public datasets. In particular, we achieve an intersection-over-union score of 83.4 on the challenging PASCAL VOC 2012 dataset, which is the best reported result to date.
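
The chained residual pooling component lends itself to a compact sketch. Below is a minimal PyTorch rendition of the idea as described in the abstract: a chain of pooling-plus-convolution blocks whose outputs are summed back onto the input through residual connections, so each block can reuse the previous block's pooled context. The channel count, kernel sizes, and chain length here are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ChainedResidualPooling(nn.Module):
    """Chain of {pool, conv} blocks fused by residual sums (sketch)."""
    def __init__(self, channels, n_blocks=2):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.MaxPool2d(kernel_size=5, stride=1, padding=2),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            )
            for _ in range(n_blocks)
        ])

    def forward(self, x):
        out = torch.relu(x)
        path = out
        for block in self.blocks:
            path = block(path)  # pool the previous block's output, then convolve
            out = out + path    # fuse back via a residual sum
        return out

x = torch.randn(1, 256, 32, 32)
print(ChainedResidualPooling(256)(x).shape)  # torch.Size([1, 256, 32, 32])
```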

2,260 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work proposes a novel recurrent network architecture in which relational information between instance labels and appearance is modeled jointly, and demonstrates that this simple but elegant formulation achieves state-of-the-art performance on the newly released People In Photo Albums (PIPA) dataset.
Abstract: Recognizing the identities of people in everyday photos is still a very challenging problem for machine vision, due to issues such as non-frontal faces and changes in clothing, location, and lighting. Recent studies have shown that rich relational information between people in the same photo can help in recognizing their identities. In this work, we propose to model the relational information between people as a sequence prediction task. At the core of our work is a novel recurrent network architecture, in which relational information between instance labels and appearance is modeled jointly. In addition to relational cues, scene context is incorporated in our sequence prediction model at no additional cost. In this sense, our approach is a unified framework for modeling both contextual cues and the visual appearance of person instances. Our model is trained end-to-end with a sequence of annotated instances in a photo as inputs, and a sequence of corresponding labels as targets. We demonstrate that this simple but elegant formulation achieves state-of-the-art performance on the newly released People In Photo Albums (PIPA) dataset.
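
To make the sequence-prediction formulation concrete, here is a hedged PyTorch sketch: an LSTM walks over the person instances in a photo, consuming each instance's appearance feature together with an embedding of the previously predicted identity, so label relations and appearance are modeled jointly. The dimensions, the start token, and the greedy feedback decoding are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class IdentitySequenceModel(nn.Module):
    def __init__(self, feat_dim=512, n_identities=100, emb_dim=64, hidden=256):
        super().__init__()
        self.label_emb = nn.Embedding(n_identities + 1, emb_dim)  # +1: start token
        self.rnn = nn.LSTMCell(feat_dim + emb_dim, hidden)
        self.classifier = nn.Linear(hidden, n_identities)

    def forward(self, instance_feats):
        # instance_feats: (seq_len, feat_dim), one row per person in the photo
        h = torch.zeros(1, self.rnn.hidden_size)
        c = torch.zeros(1, self.rnn.hidden_size)
        prev = torch.tensor([self.label_emb.num_embeddings - 1])  # start token id
        logits = []
        for feat in instance_feats:
            inp = torch.cat([feat.unsqueeze(0), self.label_emb(prev)], dim=1)
            h, c = self.rnn(inp, (h, c))
            step_logits = self.classifier(h)
            logits.append(step_logits)
            prev = step_logits.argmax(dim=1)  # feed the prediction back in (greedy)
        return torch.cat(logits)

model = IdentitySequenceModel()
print(model(torch.randn(3, 512)).shape)  # torch.Size([3, 100]): one row per instance
```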

24 citations


Journal ArticleDOI
TL;DR: It is shown that although the proposed deep CRF model is continuously valued, when equipped with task-specific losses it achieves impressive results even on discrete labeling tasks.
Abstract: Recent works on deep conditional random fields (CRFs) have set new records on many vision tasks involving structured predictions. Here, we propose a fully connected deep continuous CRF model with task-specific losses for both discrete and continuous labeling problems. We exemplify the usefulness of the proposed model on multi-class semantic labeling (discrete) and robust depth estimation (continuous) problems. In our framework, we model both the unary and the pairwise potential functions as deep convolutional neural networks (CNNs), which are jointly learned in an end-to-end fashion. The proposed method possesses the main advantage of continuously valued CRFs, namely a closed-form solution for the maximum a posteriori (MAP) inference. To better take into account the quality of the predicted estimates during the course of learning, instead of using the commonly employed maximum likelihood CRF parameter learning protocol, we propose task-specific loss functions for learning the CRF parameters. This enables direct optimization of the quality of the MAP estimates during the learning process. Specifically, we optimize the multi-class classification loss for the semantic labeling task and Tukey’s biweight loss for the robust depth estimation problem. Experimental results on the semantic labeling and robust depth estimation tasks demonstrate that the proposed method compares favorably against both baseline and state-of-the-art methods. In particular, we show that although the proposed deep CRF model is continuously valued, when equipped with task-specific losses it achieves impressive results even on discrete labeling tasks.
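
The closed-form MAP property is easy to see in miniature. With quadratic unaries (y_p - z_p)^2 and pairwise terms w_pq (y_p - y_q)^2, the energy is quadratic in y, so the minimiser solves a linear system involving the graph Laplacian. The NumPy toy below assumes a hand-built 4-node graph; in the paper, both the unary predictions z and the affinities w come from CNNs.

```python
import numpy as np

z = np.array([1.0, 2.0, 10.0, 2.5])  # unary predictions (e.g. noisy depths)
W = np.array([[0.0, 1.0, 0.0, 0.0],  # symmetric pairwise affinities w_pq
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

L = np.diag(W.sum(axis=1)) - W       # graph Laplacian
# Energy: ||y - z||^2 + y^T L y  =>  gradient is zero at (I + L) y = z
y_map = np.linalg.solve(np.eye(len(z)) + L, z)
print(y_map)  # the outlier node 2 is pulled towards its neighbours
```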

23 citations


Posted Content
TL;DR: A new approach to image segmentation is proposed, which exploits the advantages of both conditional random fields (CRFs) and decision trees, formulating the unary and pairwise potentials as nonparametric forests (ensembles of decision trees) and learning the ensemble parameters and the trees in a unified optimization problem within the large-margin framework.
Abstract: We propose a new approach to image segmentation, which exploits the advantages of both conditional random fields (CRFs) and decision trees. In the literature, the potential functions of CRFs are mostly defined as a linear combination of some pre-defined parametric models, and then methods like structured support vector machines (SSVMs) are applied to learn those linear coefficients. We instead formulate the unary and pairwise potentials as nonparametric forests (ensembles of decision trees), and learn the ensemble parameters and the trees in a unified optimization problem within the large-margin framework. In this fashion, we easily achieve nonlinear learning of potential functions on both unary and pairwise terms in CRFs. Moreover, we learn class-wise decision trees for each object that appears in the image. Due to the rich structure and flexibility of decision trees, our approach is powerful in modelling complex data likelihoods and label relationships. The resulting optimization problem is very challenging because it can have exponentially many variables and constraints. We show that this challenging optimization can be efficiently solved by combining modified column generation and cutting-plane techniques. Experimental results on both binary (Graz-02, Weizmann horse, Oxford flower) and multi-class (MSRC-21, PASCAL VOC 2012) segmentation datasets demonstrate the power of the learned nonlinear nonparametric potentials.
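
To give a feel for the column generation loop, here is a heavily simplified, boosting-flavoured Python sketch: each round fits a new decision tree (a new "column") against the current ensemble's worst margin violations, then re-weights. It learns only a unary potential for a toy binary problem, with scikit-learn trees and a fixed step size in place of re-solving the master problem, so it illustrates the iteration structure rather than the paper's structured large-margin method.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])          # toy +/-1 labels

trees, weights = [], []
margin = np.zeros(len(y))
for _ in range(10):
    # "Pricing" step: fit a new tree against the hinge-loss subgradient,
    # i.e. point it at the currently violated examples.
    grad = -y * (margin * y < 1)
    tree = DecisionTreeRegressor(max_depth=2).fit(X, -grad)
    trees.append(tree)
    # "Master" step stand-in: a fixed step instead of re-solving the QP.
    weights.append(0.3)
    margin = sum(w * t.predict(X) for w, t in zip(weights, trees))

print("training accuracy:", np.mean(np.sign(margin) == y))
```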

14 citations


Journal ArticleDOI
TL;DR: This work proposes a column generation based binary code learning framework for data-dependent hash function learning, demonstrates its generality by applying it to ranking prediction and image retrieval, and shows that it outperforms several state-of-the-art hashing methods.
Abstract: Hashing methods aim to learn a set of hash functions which map the original features to compact binary codes that preserve similarity in the Hamming space. Hashing has proven a valuable tool for large-scale information retrieval. We propose a column generation based binary code learning framework for data-dependent hash function learning. Given a set of triplets that encode the pairwise similarity comparison information, our column generation based method learns hash functions that preserve the relative comparison relations within the large-margin learning framework. Our method iteratively learns the best hash functions during the column generation procedure. Existing hashing methods optimize over simple objectives such as the reconstruction error or graph Laplacian related loss functions, instead of the performance evaluation criteria of interest, namely multivariate performance measures such as the AUC and NDCG. Our column generation based method can be further generalized from the triplet loss to a general structured learning based framework that allows one to directly optimize multivariate performance measures. For optimizing general ranking measures, the resulting optimization problem can involve exponentially or infinitely many variables and constraints, which is more challenging than standard structured output learning. We use a combination of column generation and cutting-plane techniques to solve the optimization problem. To speed up training, we further explore stage-wise training and propose to optimize a simplified NDCG loss for efficient inference. We demonstrate the generality of our method by applying it to ranking prediction and image retrieval, and show that it outperforms several state-of-the-art hashing methods.
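
A toy sketch of the triplet-driven learning loop may help. Hash bits h(x) = sign(w·x) are added one at a time, each nudged perceptron-style so that, for the triplets the current code still ranks incorrectly, the query agrees with its similar item and disagrees with the dissimilar one in Hamming space. The bit learner and update rule below are simplifications standing in for the paper's column generation step.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))

# Toy triplets (query, similar, dissimilar), built from Euclidean distance.
triplets = []
for q in range(0, 100, 5):
    order = np.argsort(np.linalg.norm(X - X[q], axis=1))
    triplets.append((q, order[1], order[-1]))

def hamming(codes, a, b):
    return int(np.sum(codes[a] != codes[b]))

Ws = []
for _ in range(16):                      # learn 16 hash bits, one at a time
    w = rng.normal(size=8)
    for _ in range(20):                  # perceptron-style refinement of this bit
        for q, s, d in triplets:
            if np.sign(w @ X[q]) != np.sign(w @ X[s]):
                w += 0.1 * np.sign(w @ X[q]) * X[s]   # pull similar item in
            if np.sign(w @ X[q]) == np.sign(w @ X[d]):
                w -= 0.1 * np.sign(w @ X[q]) * X[d]   # push dissimilar item out
    Ws.append(w)

codes = (X @ np.array(Ws).T) > 0         # 16-bit binary codes
q, s, d = triplets[0]
print("d(query, similar)    =", hamming(codes, q, s))
print("d(query, dissimilar) =", hamming(codes, q, d))
```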

13 citations


Posted Content
Tong Shen, Guosheng Lin, Lingqiao Liu, Chunhua Shen, Ian Reid
25 May 2017
TL;DR: This work proposes a novel method for weakly supervised semantic segmentation with only image-level labels, which relies on a large scale co-segmentation framework that can produce object masks for a group of images containing objects belonging to the same semantic class.
Abstract: Training a Fully Convolutional Network (FCN) for semantic segmentation requires a large number of masks with pixel-level labelling, which involves a large amount of human labour and time for annotation. In contrast, web images and their image-level labels are much easier and cheaper to obtain. In this work, we propose a novel method for weakly supervised semantic segmentation with only image-level labels. The method utilizes the internet to retrieve a large number of images and uses a large-scale co-segmentation framework to generate masks for the retrieved images. We first retrieve images from search engines, e.g. Flickr and Google, using semantic class names as queries, e.g. class names in the dataset PASCAL VOC 2012. We then use high-quality masks produced by co-segmentation on the retrieved images, as well as the target dataset images with image-level labels, to train segmentation networks. We obtain an IoU score of 56.9 on the test set of PASCAL VOC 2012, which reaches state-of-the-art performance.
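
The data flow of this pipeline is easy to sketch. In the runnable toy below, the retrieval and co-segmentation stages are replaced by stub functions returning synthetic data, and a tiny two-class convolutional net stands in for the real segmentation network; only the flow (class-name queries -> web images -> co-segmentation masks -> network training) mirrors the paper.

```python
import torch
import torch.nn as nn

def retrieve_web_images(class_name, n=4):
    # Stub for querying Flickr/Google with a class name.
    return torch.rand(n, 3, 64, 64)

def co_segment(images):
    # Stub for the co-segmentation framework: a crude brightness threshold.
    return (images.mean(dim=1, keepdim=True) > 0.5).long().squeeze(1)

net = nn.Sequential(                          # tiny stand-in for the FCN
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 2, 1),                      # 2 classes: background / object
)
opt = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for class_name in ["aeroplane", "bicycle"]:   # e.g. VOC class names as queries
    images = retrieve_web_images(class_name)
    masks = co_segment(images)                # pseudo ground-truth masks
    loss = loss_fn(net(images), masks)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(class_name, float(loss))
```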

12 citations



Proceedings ArticleDOI
Tong Shen, Guosheng Lin, Lingqiao Liu, Chunhua Shen, Ian Reid
01 Jan 2017
TL;DR: In this paper, a co-segmentation based framework was proposed for weakly supervised semantic segmentation with only image-level labels, which utilizes the internet to retrieve a large number of images and uses a large-scale co-segmentation framework to generate masks for the retrieved images.
Abstract: Training a Fully Convolutional Network (FCN) for semantic segmentation requires a large number of masks with pixel-level labelling, which involves a large amount of human labour and time for annotation. In contrast, web images and their image-level labels are much easier and cheaper to obtain. In this work, we propose a novel method for weakly supervised semantic segmentation with only image-level labels. The method utilizes the internet to retrieve a large number of images and uses a large-scale co-segmentation framework to generate masks for the retrieved images. We first retrieve images from search engines, e.g. Flickr and Google, using semantic class names as queries, e.g. class names in the dataset PASCAL VOC 2012. We then use high-quality masks produced by co-segmentation on the retrieved images, as well as the target dataset images with image-level labels, to train segmentation networks. We obtain an IoU score of 56.9 on the test set of PASCAL VOC 2012, which reaches state-of-the-art performance.

10 citations


Posted Content
TL;DR: In this paper, a dense multi-label network module is proposed to encourage region consistency at different levels, which can improve the performance of semantic segmentation systems by removing noisy and implausible predictions.
Abstract: Semantic image segmentation is a fundamental task in image understanding. Per-pixel semantic labelling of an image benefits greatly from the ability to consider region consistency both locally and globally. However, many Fully Convolutional Network based methods do not impose such consistency, which may give rise to noisy and implausible predictions. We address this issue by proposing a dense multi-label network module that is able to encourage region consistency at different levels. This simple but effective module can be easily integrated into any semantic segmentation system. With comprehensive experiments, we show that dense multi-label successfully removes implausible labels and clears up confusion, boosting the performance of semantic segmentation systems.
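
The module itself can be sketched compactly. Below, per-pixel logits from any segmentation network are pooled over regions at several scales into multi-label predictions ("which classes appear in this region?"), which could then be supervised with a binary cross-entropy loss to encourage region consistency. The grid sizes and max-pool aggregation are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def dense_multilabel(logits, grid_sizes=(1, 2, 4)):
    # logits: (N, C, H, W) from any segmentation network
    preds = []
    for g in grid_sizes:
        region = F.adaptive_max_pool2d(logits, g)  # strongest evidence per region
        preds.append(torch.sigmoid(region))        # multi-label probabilities
    return preds

logits = torch.randn(1, 21, 32, 32)                # e.g. 21 PASCAL VOC classes
for p in dense_multilabel(logits):
    print(p.shape)  # (1, 21, 1, 1), (1, 21, 2, 2), (1, 21, 4, 4)
```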

8 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: This work proposes a dense multi-label network module that is able to encourage region consistency at different levels and can be easily integrated into any semantic segmentation system.
Abstract: Semantic image segmentation is a fundamental task in image understanding. Per-pixel semantic labelling of an image benefits greatly from the ability to consider region consistency both locally and globally. However, many Fully Convolutional Network based methods do not impose such consistency, which may give rise to noisy and implausible predictions. We address this issue by proposing a dense multi-label network module that is able to encourage region consistency at different levels. This simple but effective module can be easily integrated into any semantic segmentation system. With comprehensive experiments, we show that dense multi-label successfully removes implausible labels and clears up confusion, boosting the performance of semantic segmentation systems.

6 citations


Posted Content
TL;DR: In contrast to traditional approaches, which require large collections of annotated data and many hours of training, the task here was to obtain a robust perception pipeline with only a few minutes of data acquisition and training time, as discussed by the authors.
Abstract: We present our approach for robotic perception in cluttered scenes that led to winning the recent Amazon Robotics Challenge (ARC) 2017. Next to small objects with shiny and transparent surfaces, the biggest challenge of the 2017 competition was the introduction of unseen categories. In contrast to traditional approaches, which require large collections of annotated data and many hours of training, the task here was to obtain a robust perception pipeline with only a few minutes of data acquisition and training time. To that end, we present two strategies that we explored. One is a deep metric learning approach that works in three separate steps: semantic-agnostic boundary detection, patch classification and pixel-wise voting. The other is a fully-supervised semantic segmentation approach with efficient dataset collection. We conduct an extensive analysis of the two methods on our ARC 2017 dataset. Interestingly, only a few examples of each class are sufficient to fine-tune even very deep convolutional neural networks for this specific task.
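
The few-shot classification step of the metric learning strategy reduces to nearest-neighbour lookup in an embedding space, which the NumPy toy below illustrates: a handful of labelled examples per (possibly unseen) class form a gallery, and query patches take the label of their nearest gallery embedding. The random linear projection is a stand-in for the trained deep metric network.

```python
import numpy as np

rng = np.random.default_rng(0)
W_proj = rng.normal(size=(32, 8))      # stand-in for a learned embedding network

def embed(x):
    return x @ W_proj

# Few-shot gallery: 3 labelled examples for each of 4 (possibly new) classes.
gallery_x = rng.normal(size=(12, 32))
gallery_y = np.repeat(np.arange(4), 3)
gallery_e = embed(gallery_x)

queries = embed(rng.normal(size=(5, 32)))          # 5 query patches
dists = np.linalg.norm(queries[:, None] - gallery_e[None, :], axis=2)
print(gallery_y[dists.argmin(axis=1)])             # nearest-neighbour labels
```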

Posted Content
TL;DR: This work proposes an efficient algorithm that can predict the label of each sample in a sequence of human activities of arbitrary length using a fully convolutional network and overcomes the problems posed by the sliding window step.
Abstract: Recognizing human activities in a sequence is a challenging area of research in ubiquitous computing. Most approaches use a fixed-size sliding window over consecutive samples to extract features, either handcrafted or learned, and predict a single label for all samples in the window. Two key problems emanate from this approach: i) the samples in one window may not always share the same label, so using one label for all samples within a window inevitably leads to loss of information; ii) the testing phase is constrained by the window size selected during training, while the best window size is difficult to tune in practice. We propose an efficient algorithm that can predict the label of each sample, which we call dense labeling, in a sequence of human activities of arbitrary length using a fully convolutional network. In particular, our approach overcomes the problems posed by the sliding window step. Additionally, our algorithm learns both the features and the classifier automatically. We release a new daily-activity dataset based on a wearable sensor, collected from hospitalized patients. We conduct extensive experiments and demonstrate that our proposed approach is able to outperform the state of the art in terms of classification and label misalignment measures on three challenging datasets: Opportunity, Hand Gesture, and our new dataset.
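
The dense-labelling idea translates directly into a 1D fully convolutional network: because every layer is convolutional, the model maps a sensor sequence of any length to one label per time step, with no fixed window. The channel counts and depth below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DenseLabeler(nn.Module):
    def __init__(self, n_sensors=6, n_classes=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_sensors, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, n_classes, kernel_size=1),  # per-sample logits
        )

    def forward(self, x):        # x: (batch, n_sensors, seq_len)
        return self.net(x)       # (batch, n_classes, seq_len): dense labels

model = DenseLabeler()
for seq_len in (100, 357):       # arbitrary-length sequences, no sliding window
    print(model(torch.randn(1, 6, seq_len)).shape)
```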