Showing papers in "Image and Vision Computing in 2016"


Journal ArticleDOI
TL;DR: This paper proposes a semi-automatic annotation technique that was employed to re-annotate most existing facial databases under a unified protocol, and presents the 300 Faces In-The-Wild Challenge (300-W), the first facial landmark localization challenge that was organized twice, in 2013 and 2015.

672 citations


Journal ArticleDOI
TL;DR: A novel feature extraction method named Oriented VIolent Flows (OViF), which takes full advantage of the motion magnitude change information in statistical motion orientations, is proposed for practical violence detection in videos.

185 citations


Journal ArticleDOI
TL;DR: This paper constructs a 3D-based Deep Convolutional Neural Network to directly learn spatio-temporal features from raw depth sequences, then computes a joint-based feature vector named JointVector for each sequence by taking into account the simple position and angle information between skeleton joints.

145 citations


Journal ArticleDOI
TL;DR: This work gives a detailed overview of recent advances in human action representations, with a comprehensive analysis and comparison of learning-based and handcrafted action representations, so as to inspire action recognition researchers towards the study of both kinds of representation techniques.

121 citations


Journal ArticleDOI
TL;DR: This survey provides a comprehensive review of established techniques and recent developments in HFR, and offers a detailed account of datasets and benchmarks commonly used for evaluation.

114 citations


Journal ArticleDOI
TL;DR: The solution to the 300 Faces in the Wild Facial Landmark Localization Challenge is presented, and how to achieve very competitive localization performance with a simple deep learning based system is demonstrated.

107 citations


Journal ArticleDOI
TL;DR: A multi-view, multi-scale and multi-component cascade shape regression (M3CSR) model for robust face alignment is presented and a component-based shape refinement process is developed to further improve the performance of face alignment.

55 citations


Journal ArticleDOI
TL;DR: This paper extensively reviews how the notion of event is conceptualized in multimedia, the techniques for event representation and modeling, the feature representation and event inference approaches for the problems of event detection in audio, visual, and textual content, and some key event-based multimedia applications and various benchmarking activities.

50 citations


Journal ArticleDOI
TL;DR: This paper analyzes from a new perspective the recent state of the art in gesture recognition approaches that exploit both RGB and depth data (RGB-D images), pointing out which features and classifiers work best with depth data and how depth information can improve gesture recognition beyond the limits of standard approaches based solely on color images.

47 citations


Journal ArticleDOI
TL;DR: A survey of the state of the art on face recognition, starting with an analysis of the spread of facial plastic surgery and describing the key aspects of each of the most statistically relevant treatments available, summarized in a table.

45 citations


Journal ArticleDOI
TL;DR: The use of face biometric technology is discussed, and thoughts are shared on key related issues and concerns, including usability, security, robustness against spoofing attacks, and user privacy.

Journal ArticleDOI
TL;DR: A comprehensive study on local methods for human action recognition based on spatio-temporal local features, which implements these techniques and conducts comparison under unified experimental settings on three widely used benchmarks, i.e., the KTH, UCF-YouTube and HMDB51 datasets.

Journal ArticleDOI
TL;DR: This work proposes a novel dual many-to-one encoder architecture to extract generalized features by mapping raw features from source and target datasets to the same feature space, and achieves an increase of over 10% in recognition accuracy over recent work.

Journal ArticleDOI
TL;DR: An approach to ALR is proposed that acknowledges that this information is missing but assumes that it is substituted or deleted in a systematic way that can be modelled, and a system that learns such a model and then incorporates it into decoding, which is realised as a cascade of weighted finite-state transducers.

Journal ArticleDOI
TL;DR: Hatice Gunes’ work is partially supported by the EPSRC under its IDEAS Factory Sandpits call on Digital Personhood (Grant Ref: EP/L00416X/1), and Hayley Hung was partially supported by the Dutch national program COMMIT.

Journal ArticleDOI
TL;DR: A novel regression method that substitutes the commonly used Least Squares regressor and makes use of the L2,1 norm is proposed, designed to increase the robustness of the regressor to poor initialisations or partial occlusions.

Journal ArticleDOI
TL;DR: Empirical evaluation on "in the wild" images shows that the proposed detector is competitive with the state-of-the-art methods in terms of speed and accuracy yet it keeps the guarantee of finding a globally optimal estimate in contrast to other methods.

Journal ArticleDOI
TL;DR: A semi-supervised deep learning hashing (DLH) method for fast multimedia retrieval that utilizes both visual and label information to learn an optimal similarity graph that can more precisely encode the relationship among training data and then generate the hash codes based on the graph.

Journal ArticleDOI
TL;DR: A novel cross-domain learning method to handle action recognition by discovering and sharing common knowledge among different video sets captured in multiple viewpoints and applying the block-wise weighted kernel function to leverage cross-view information.

Journal ArticleDOI
TL;DR: An overview of this field is given, vocabulary formalized by the recent publication of an ISO standard for biometric “presentation attack detection” is described, and evaluating the performance of systems which incorporate methods to detect and reject presentation attacks is discussed.

Journal ArticleDOI
TL;DR: This work builds event detectors based solely on textual descriptions of the event classes, and learns event detectors from very few positive and related training samples, on a large-scale TRECVID MED video dataset.

Journal ArticleDOI
TL;DR: An algorithm for accurate localization of facial landmarks coupled with a head pose estimation from a single monocular image is proposed which outperforms several state-of-the-art landmark detectors especially for non-frontal face images.

Journal ArticleDOI
TL;DR: In this paper, the authors provide an extensive study of textual, visual, and multimodal representations for social event classification and investigate the strengths and weaknesses of the modalities and study the synergy effects between them.

Journal ArticleDOI
TL;DR: This work proposes an adaptive thresholding method to segment each local salient region, and a target selection procedure based on shape features is used to remove background and obtain the true target in IR ship target segmentation.

Journal ArticleDOI
TL;DR: An automatic 3D point cloud registration method based on RCMs is introduced and it is demonstrated that the RCM is discriminative and robust against normal errors and varying point cloud density.

Journal ArticleDOI
TL;DR: This paper presents a cross-domain action recognition framework by utilizing some labeled data from other data sets as the auxiliary source domain and obtains a graph Laplacian regularization term to enhance the discrimination of learned features.

Journal ArticleDOI
TL;DR: A new rejection criterion based on the conflict among the information sources (the classifier outputs) is proposed to enhance the recognition accuracy and outperform other state-of-the-art methods.

Journal ArticleDOI
TL;DR: An effective and efficient two-level filter-based optical flow algorithm connected by an accurate non-local matching and a refined label selection strategy that is more accurate than the usual winner-takes-all manner are presented.

Journal ArticleDOI
TL;DR: This paper proposes to generalize previous pooling methods toward a weighted ℓp-norm spatial pooling function tailored for class-specific feature spatial distribution, and proposes a simple yet effective self-alignment step during both learning and testing to adaptively adjust the pooling weights for individual images.

Journal ArticleDOI
TL;DR: It is shown that a small dictionary, learned and updated online, is as effective as and more efficient than a huge dictionary learned offline, which facilitates the advantages of both feature learning and structured output prediction.