Showing papers in "Image and Vision Computing in 2016"

Open Access

Journal Article

Christos Sagonas¹, Epameinondas Antonakos¹, Georgios Tzimiropoulos², Stefanos Zafeiriou¹, Maja Pantic³•Institutions (3)

Imperial College London¹, University of Nottingham², University of Twente³

01 Mar 2016-Image and Vision Computing

TL;DR: This paper proposes a semi-automatic annotation technique that was employed to re-annotate most existing facial databases under a unified protocol, and presents the 300 Faces In-The-Wild Challenge (300-W), the first facial landmark localization challenge that was organized twice, in 2013 and 2015.

672 citations

Journal Article

Violence detection using Oriented VIolent Flows

Yuan Gao¹, Hong Liu¹, Xiaohu Sun¹, Can Wang¹, Yi Liu²•Institutions (2)

Peking University¹, Hong Kong University of Science and Technology²

01 Apr 2016-Image and Vision Computing

TL;DR: A novel feature extraction method named Oriented VIolent Flows (OViF), which takes full advantage of the motion magnitude change information in statistical motion orientations, is proposed for practical violence detection in videos.

185 citations

Journal Article

3D-based Deep Convolutional Neural Network for action recognition with depth sequences

Zhi Liu¹, Chenyang Zhang², Yingli Tian²•Institutions (2)

Chongqing University of Technology¹, City College of New York²

01 Nov 2016-Image and Vision Computing

TL;DR: This paper constructs a 3D-based Deep Convolutional Neural Network to directly learn spatio-temporal features from raw depth sequences, then computes a joint-based feature vector named JointVector for each sequence by taking into account the simple position and angle information between skeleton joints.

145 citations

Journal Article

From handcrafted to learned representations for human action recognition

Xiantong Zhen¹, Ling Shao², Stephen J. Maybank³, Rama Chellappa⁴•Institutions (4)

University of Western Ontario¹, Northumbria University², Birkbeck, University of London³, University of Maryland, College Park⁴

01 Nov 2016-Image and Vision Computing

TL;DR: This work gives a detailed overview of recent advances in human action representations, together with a comprehensive analysis and comparison of learning-based and handcrafted action representations, so as to encourage action recognition researchers to study both kinds of representation techniques.

121 citations

Journal Article

A survey on heterogeneous face recognition

Shuxin Ouyang¹, Timothy M. Hospedales¹, Yi-Zhe Song¹, Xueming Li², Chen Change Loy³, Xiaogang Wang³•Institutions (3)

Queen Mary University of London¹, Beijing University of Posts and Telecommunications², The Chinese University of Hong Kong³

01 Dec 2016-Image and Vision Computing

TL;DR: This survey provides a comprehensive review of established techniques and recent developments in heterogeneous face recognition (HFR), and offers a detailed account of datasets and benchmarks commonly used for evaluation.

114 citations

Journal Article

Approaching human level facial landmark localization by deep learning

Haoqiang Fan, Erjin Zhou

01 Mar 2016-Image and Vision Computing

TL;DR: The authors present their solution to the 300 Faces in the Wild Facial Landmark Localization Challenge and demonstrate how to achieve very competitive localization performance with a simple deep-learning-based system.

107 citations

Journal Article

M3 CSR: Multi-view, multi-scale and multi-component cascade shape regression

Jiankang Deng¹, Qingshan Liu¹, Jing Yang¹, Dacheng Tao²•Institutions (2)

Nanjing University¹, University of Technology, Sydney²

01 Mar 2016-Image and Vision Computing

TL;DR: A multi-view, multi-scale and multi-component cascade shape regression (M3CSR) model for robust face alignment is presented and a component-based shape refinement process is developed to further improve the performance of face alignment.

55 citations

Journal Article

Event-based media processing and analysis

Christos Tzelepis¹, Zhigang Ma², Vasileios Mezaris, Bogdan Ionescu³, Ioannis Kompatsiaris, Giulia Boato⁴, Nicu Sebe⁴, Shuicheng Yan⁵•Institutions (5)

Queen Mary University of London¹, Carnegie Mellon University², Politehnica University of Bucharest³, University of Trento⁴, National University of Singapore⁵

01 Sep 2016-Image and Vision Computing

TL;DR: This paper extensively reviews the conceptualization of the notion of event in multimedia, the techniques for event representation and modeling, the feature representation and event inference approaches for event detection in audio, visual, and textual content, and some key event-based multimedia applications and benchmarking activities.

50 citations

Journal Article

Recent trends in gesture recognition

Tiziana D'Orazio, Roberto Marani, Vito Renò, Grazia Cicirelli

01 Aug 2016-Image and Vision Computing

TL;DR: This paper analyzes, from a new perspective, the recent state of the art in gesture recognition approaches that exploit both RGB and depth data (RGB-D images), to point out which features and classifiers work best with depth data and how depth information can improve gesture recognition beyond the limits of standard approaches based solely on color images.

47 citations

Journal Article

Deceiving faces: When plastic surgery challenges face recognition

Michele Nappi¹, Stefano Ricciardi¹, Massimo Tistarelli²•Institutions (2)

University of Salerno¹, University of Sassari²

01 Oct 2016-Image and Vision Computing

TL;DR: A survey of the state of the art in face recognition, starting with an analysis of the prevalence of facial plastic surgery and describing the key aspects of each of the most statistically relevant treatments available, which are summarized in a table.

45 citations

Journal Article

Face recognition for authentication on mobile devices

Esteban Vazquez-Fernandez¹, Daniel González-Jiménez¹•Institutions (1)

Gradiant (Galician Research and Development Center in Advanced Telecommunications)¹

01 Nov 2016-Image and Vision Computing

TL;DR: The use of face biometric technology is discussed, and thoughts are shared on key related issues and concerns, including usability, security, robustness against spoofing attacks, and user privacy.

Journal Article

Action recognition via spatio-temporal local features

Xiantong Zhen¹, Ling Shao²•Institutions (2)

University of Sheffield¹, Northumbria University²

01 Jun 2016-Image and Vision Computing

TL;DR: A comprehensive study on local methods for human action recognition based on spatio-temporal local features, which implements these techniques and conducts comparison under unified experimental settings on three widely used benchmarks, i.e., the KTH, UCF-YouTube and HMDB51 datasets.

Journal Article

Dual many-to-one-encoder-based transfer learning for cross-dataset human action recognition

Tiantian Xu¹, Fan Zhu², Edward K. Wong¹, Yi Fang²•Institutions (2)

New York University¹, New York University Abu Dhabi²

01 Nov 2016-Image and Vision Computing

TL;DR: This work proposes a novel dual many-to-one encoder architecture that extracts generalized features by mapping raw features from source and target datasets to the same feature space, and achieves an increase of over 10% in recognition accuracy over recent work.

Journal Article

Visual units and confusion modelling for automatic lip-reading

Dominic Howell¹, Stephen Cox¹, Barry-John Theobald¹•Institutions (1)

University of East Anglia¹

01 Jul 2016-Image and Vision Computing

TL;DR: An approach to automatic lip-reading (ALR) is proposed that acknowledges that some information is missing from the visual signal but assumes that it is substituted or deleted in a systematic way that can be modelled. The system learns such a model and then incorporates it into decoding, which is realised as a cascade of weighted finite-state transducers.

Journal Article

Is automatic facial expression recognition of emotions coming to a dead end? The rise of the new kids on the block

Hatice Gunes, Hayley Hung

01 Nov 2016-Image and Vision Computing

TL;DR: Hatice Gunes’ work is partially supported by the EPSRC under its IDEAS Factory Sandpits call on Digital Personhood (Grant Ref: EP/L00416X/1), and Hayley Hung was partially supported by the Dutch national program COMMIT.

Journal Article

L2,1-based regression and prediction accumulation across views for robust facial landmark detection

Brais Martinez¹, Michel Valstar¹•Institutions (1)

University of Nottingham¹

01 Mar 2016-Image and Vision Computing

TL;DR: A novel regression method that substitutes the commonly used Least Squares regressor and makes use of the L2,1 norm is proposed, designed to increase the robustness of the regressor to poor initialisations or partial occlusions.

Journal Article

Multi-view facial landmark detector learned by the Structured Output SVM

Michal Uřičář¹, Vojtech Franc¹, Diego Thomas², Akihiro Sugimoto², Václav Hlaváč¹•Institutions (2)

Czech Technical University in Prague¹, National Institute of Informatics²

01 Mar 2016-Image and Vision Computing

TL;DR: Empirical evaluation on "in the wild" images shows that the proposed detector is competitive with state-of-the-art methods in terms of speed and accuracy, yet, in contrast to other methods, it retains the guarantee of finding a globally optimal estimate.

Journal Article

Deep and fast: Deep learning hashing with semi-supervised graph construction

Jingkuan Song¹, Lianli Gao², Fuhao Zou³, Yan Yan¹, Nicu Sebe¹•Institutions (3)

University of Trento¹, University of Electronic Science and Technology of China², Huazhong University of Science and Technology³

01 Nov 2016-Image and Vision Computing

TL;DR: A semi-supervised deep learning hashing (DLH) method for fast multimedia retrieval that uses both visual and label information to learn an optimal similarity graph, which more precisely encodes the relationships among training data, and then generates the hash codes based on the graph.

Journal Article

Cross-view action recognition by cross-domain learning

Weizhi Nie¹, An-An Liu¹, Wenhui Li¹, Yuting Su¹•Institutions (1)

Tianjin University¹

01 Nov 2016-Image and Vision Computing

TL;DR: A novel cross-domain learning method to handle action recognition by discovering and sharing common knowledge among different video sets captured in multiple viewpoints and applying the block-wise weighted kernel function to leverage cross-view information.

Journal Article

Presentations and attacks, and spoofs, oh my

Stephanie Schuckers¹•Institutions (1)

Clarkson University¹

01 Nov 2016-Image and Vision Computing

TL;DR: An overview of this field is given, vocabulary formalized by the recent publication of an ISO standard for biometric “presentation attack detection” is described, and evaluating the performance of systems which incorporate methods to detect and reject presentation attacks is discussed.

Journal Article

Learning to detect video events from zero or very few video examples

Christos Tzelepis¹, Damianos Galanopoulos, Vasileios Mezaris, Ioannis Patras¹•Institutions (1)

Queen Mary University of London¹

01 Sep 2016-Image and Vision Computing

TL;DR: This work builds event detectors based solely on textual descriptions of the event classes, and learns event detectors from very few positive and related training samples, on a large-scale TRECVID MED video dataset.

Journal Article

Multi-view facial landmark detection by using a 3D shape model

Jan Cech¹, Vojtech Franc¹, Michal Uřičář¹, Jiří Matas¹•Institutions (1)

Czech Technical University in Prague¹

01 Mar 2016-Image and Vision Computing

TL;DR: An algorithm for accurate localization of facial landmarks coupled with a head pose estimation from a single monocular image is proposed which outperforms several state-of-the-art landmark detectors especially for non-frontal face images.

Journal Article

Multimodal classification of events in social media

Matthias Zeppelzauer¹, Daniel Schopfhauser²•Institutions (2)

St. Pölten University of Applied Sciences¹, Vienna University of Technology²

01 Sep 2016-Image and Vision Computing

TL;DR: In this paper, the authors provide an extensive study of textual, visual, and multimodal representations for social event classification and investigate the strengths and weaknesses of the modalities and study the synergy effects between them.

Journal Article

Infrared ship target segmentation through integration of multiple feature maps

Zhaoying Liu¹, Xiangzhi Bai², Changming Sun¹, Fugen Zhou², Yujian Li³•Institutions (3)

Commonwealth Scientific and Industrial Research Organisation¹, Beihang University², Beijing University of Technology³

01 Apr 2016-Image and Vision Computing

TL;DR: This work proposes an adaptive thresholding method to segment each local salient region, and a target selection procedure based on shape features is used to remove background and obtain the true target in IR ship target segmentation.

Journal Article

An automatic 3D point cloud registration method based on regional curvature maps

Junhua Sun¹, Jie Zhang¹, Guangjun Zhang¹•Institutions (1)

Beihang University¹

01 Dec 2016-Image and Vision Computing

TL;DR: An automatic 3D point cloud registration method based on regional curvature maps (RCMs) is introduced, and it is demonstrated that the RCM is discriminative and robust against normal errors and varying point cloud density.

Journal Article

Cross-domain action recognition via collective matrix factorization with graph Laplacian regularization

Jun Tang¹, Haiqun Jin¹, Tan Shoubiao¹, Dong Liang¹•Institutions (1)

Anhui University¹

01 Nov 2016-Image and Vision Computing

TL;DR: This paper presents a cross-domain action recognition framework by utilizing some labeled data from other data sets as the auxiliary source domain and obtains a graph Laplacian regularization term to enhance the discrimination of learned features.

Journal Article

Using the conflict in Dempster-Shafer evidence theory as a rejection criterion in classifier output combination for 3D human action recognition

Alexandre Perez¹, Hedi Tabia¹, David Declercq¹, Alain Zanotti•Institutions (1)

École nationale supérieure de l'électronique et de ses applications¹

01 Nov 2016-Image and Vision Computing

TL;DR: A new rejection criterion based on the conflict between the information sources (the classifier outputs) is proposed to enhance the recognition accuracy and outperform other state-of-the-art methods.

Journal Article

Highly accurate optical flow estimation on superpixel tree

Yinlin Hu¹, Rui Song², Yunsong Li¹, Peng Rao², Yangli Wang¹•Institutions (2)

Xidian University¹, Chinese Academy of Sciences²

01 Aug 2016-Image and Vision Computing

TL;DR: An effective and efficient two-level filter-based optical flow algorithm, connected by an accurate non-local matching, and a refined label selection strategy that is more accurate than the usual winner-takes-all approach are presented.

Journal Article

Robust geometric ℓp-norm feature pooling for image classification and action recognition

Teng Li¹, Zhijun Meng², Bingbing Ni³, Jianbing Shen⁴, Meng Wang⁵•Institutions (5)

Anhui University¹, Beihang University², Shanghai Jiao Tong University³, Beijing Institute of Technology⁴, Hefei University of Technology⁵

01 Nov 2016-Image and Vision Computing

TL;DR: This paper proposes to generalize previous pooling methods toward a weighted ℓp-norm spatial pooling function tailored for class-specific feature spatial distribution, and proposes a simple yet effective self-alignment step during both learning and testing to adaptively adjust the pooling weights for individual images.

Journal Article

Online unsupervised feature learning for visual tracking

Fayao Liu¹, Chunhua Shen¹, Ian Reid¹, Anton van den Hengel¹•Institutions (1)

University of Adelaide¹

01 Jul 2016-Image and Vision Computing

TL;DR: It is shown that a small dictionary, learned and updated online, is as effective as, and more efficient than, a huge dictionary learned offline, which facilitates the advantages of both feature learning and structured output prediction.
