
Asian Conference on Computer Vision 

About: The Asian Conference on Computer Vision is an academic conference. It publishes mainly in the areas of computer science and features (computer vision). Over its lifetime, the conference has published 2,847 papers, which have received 51,985 citations.


Papers
Book Chapter · DOI
01 Nov 2014
TL;DR: This work proposes A+, an improved variant of Anchored Neighborhood Regression (ANR) that combines the best qualities of ANR and Simple Functions (SF): it builds on ANR's features and anchored regressors, but instead of learning the regressors on the dictionary it learns them from the full training material, similar to SF.
Abstract: We address the problem of image upscaling in the form of single image super-resolution based on a dictionary of low- and high-resolution exemplars. Two recently proposed methods, Anchored Neighborhood Regression (ANR) and Simple Functions (SF), provide state-of-the-art quality performance. Moreover, ANR is among the fastest known super-resolution methods. ANR learns sparse dictionaries and regressors anchored to the dictionary atoms. SF relies on clusters and corresponding learned functions. We propose A+, an improved variant of ANR, which combines the best qualities of ANR and SF. A+ builds on the features and anchored regressors from ANR but instead of learning the regressors on the dictionary it uses the full training material, similar to SF. We validate our method on standard images and compare with state-of-the-art methods. We obtain improved quality (i.e. 0.2–0.7 dB PSNR better than ANR) and excellent time complexity, rendering A+ the most efficient dictionary-based super-resolution method to date.

1,418 citations
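
The anchored-regression idea summarized above can be illustrated with a minimal inference-time sketch. This is not the authors' code: the function name `upscale_patch` and the array layouts are hypothetical, and the offline step that trains one regressor per anchor from the full training pool (the key difference from ANR) is assumed to have already happened.

```python
import numpy as np

def upscale_patch(lr_feature, anchors, regressors):
    """Anchored regression step: pick the dictionary atom most correlated
    with the low-resolution patch feature and apply that atom's precomputed
    linear regressor to predict the high-resolution patch.

    lr_feature : (d,) normalized low-resolution patch feature
    anchors    : (K, d) array of unit-norm anchor atoms
    regressors : (K, d_hr, d) array of per-anchor linear maps
    """
    # Nearest anchor by correlation (atoms and feature are unit-normalized).
    k = int(np.argmax(anchors @ lr_feature))
    # Linear map from LR feature to HR patch, precomputed offline; in A+ it
    # is learned over the anchor's neighborhood in the full training pool
    # rather than over dictionary atoms only.
    return regressors[k] @ lr_feature
```

At test time the per-patch work is one nearest-anchor lookup and one matrix-vector product, which is consistent with the efficiency claim in the abstract.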

Book Chapter · DOI
05 Nov 2012
TL;DR: A framework for automatic modeling, detection, and tracking of 3D objects with a Kinect; it shows how to build the templates automatically from 3D models and how to estimate the 6-degrees-of-freedom pose accurately and in real time.
Abstract: We propose a framework for automatic modeling, detection, and tracking of 3D objects with a Kinect. The detection part is mainly based on the recent template-based LINEMOD approach [1] for object detection. We show how to build the templates automatically from 3D models, and how to estimate the 6 degrees-of-freedom pose accurately and in real-time. The pose estimation and the color information allow us to check the detection hypotheses and improve the correct detection rate by 13% with respect to the original LINEMOD. These improvements make our framework suitable for object manipulation in robotics applications. Moreover, we propose a new dataset made of 15 registered, 1100+ frame video sequences of 15 various objects for the evaluation of future competing methods.

1,114 citations
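
The hypothesize-and-verify pipeline described above (template matching, a color-based check of the hypotheses, then 6-DoF pose estimation) can be sketched roughly as follows. All names and callables here are hypothetical stand-ins, not the authors' API; the templates are assumed to have been rendered offline from the 3D model.

```python
def detect_objects(match_template, color_check, estimate_pose, templates,
                   score_thresh=0.8):
    """Match each precomputed template against the scene, discard weak
    matches and matches whose color statistics disagree with the model,
    then estimate a 6-DoF pose for the surviving hypotheses.

    match_template : callable(template) -> (score, location)
    color_check    : callable(template, location) -> bool
    estimate_pose  : callable(template, location) -> pose
    templates      : templates rendered offline from the 3D model
    """
    detections = []
    for template in templates:
        score, location = match_template(template)
        if score < score_thresh:
            continue                          # weak template match, discard
        if not color_check(template, location):
            continue                          # color hypothesis check failed
        detections.append(estimate_pose(template, location))
    return detections
```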

Book Chapter · DOI
02 Dec 2018
TL;DR: This paper uses a conditional generative adversarial network (GAN) for anomaly detection in a one-class, semi-supervised learning paradigm: an encoder-decoder-encoder sub-network maps the input image to a lower-dimensional vector, which is then used to reconstruct the generated output image.
Abstract: Anomaly detection is a classical problem in computer vision, namely the determination of the normal from the abnormal when datasets are highly biased towards one class (normal) due to the insufficient sample size of the other class (abnormal). While this can be addressed as a supervised learning problem, a significantly more challenging problem is that of detecting the unknown/unseen anomaly case that takes us instead into the space of a one-class, semi-supervised learning paradigm. We introduce such a novel anomaly detection model, by using a conditional generative adversarial network that jointly learns the generation of high-dimensional image space and the inference of latent space. Employing encoder-decoder-encoder sub-networks in the generator network enables the model to map the input image to a lower dimension vector, which is then used to reconstruct the generated output image. The use of the additional encoder network maps this generated image to its latent representation. Minimizing the distance between these images and the latent vectors during training aids in learning the data distribution for the normal samples. As a result, a larger distance metric from this learned data distribution at inference time is indicative of an outlier from that distribution—an anomaly. Experimentation over several benchmark datasets, from varying domains, shows the model efficacy and superiority over previous state-of-the-art approaches.

857 citations
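
The encoder-decoder-encoder scoring idea from the abstract above can be written in a few lines. This is only a sketch of the inference step: the function and argument names are hypothetical, and the three sub-networks are assumed to be already-trained callables.

```python
import numpy as np

def anomaly_score(x, encoder, decoder, second_encoder):
    """Score an image by the distance between the latent code of the input
    and the latent code of its reconstruction. Since training only sees
    normal samples, a large distance suggests an anomaly.

    x              : input image as a flat numpy array
    encoder        : callable mapping image -> latent vector z
    decoder        : callable mapping latent vector -> reconstructed image
    second_encoder : callable mapping reconstructed image -> latent z_hat
    """
    z = encoder(x)                  # latent code of the real input
    x_hat = decoder(z)              # generated reconstruction
    z_hat = second_encoder(x_hat)   # latent code of the reconstruction
    # L1 distance in latent space; thresholding it separates normal from anomalous.
    return float(np.abs(z - z_hat).sum())
```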

Book Chapter · DOI
08 Nov 2010
TL;DR: A novel approach to binocular stereo for fast matching of high-resolution images that builds a prior on the disparities by forming a triangulation over a set of robustly matched support points, reducing the matching ambiguities of the remaining points.
Abstract: In this paper we propose a novel approach to binocular stereo for fast matching of high-resolution images. Our approach builds a prior on the disparities by forming a triangulation on a set of support points which can be robustly matched, reducing the matching ambiguities of the remaining points. This allows for efficient exploitation of the disparity search space, yielding accurate dense reconstruction without the need for global optimization. Moreover, our method automatically determines the disparity range and can be easily parallelized. We demonstrate the effectiveness of our approach on the large-scale Middlebury benchmark, and show that state-of-the-art performance can be achieved with significant speedups. Computing the left and right disparity maps for a one Megapixel image pair takes about one second on a single CPU core.

818 citations
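
A minimal sketch of the disparity-prior construction described above, assuming the support points have already been robustly matched. It uses SciPy's Delaunay triangulation and piecewise-linear interpolation; the function name and array layouts are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

def disparity_prior(support_xy, support_disp, height, width):
    """Triangulate robustly matched support points and interpolate their
    disparities into a piecewise-planar prior over the whole image.

    support_xy   : (N, 2) array of (x, y) support point coordinates
    support_disp : (N,) disparities of the support points
    """
    tri = Delaunay(support_xy)                        # mesh over support points
    interp = LinearNDInterpolator(tri, support_disp)  # planar value per triangle
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    prior = interp(np.column_stack([xs.ravel(), ys.ravel()]))
    return prior.reshape(height, width)               # NaN outside the convex hull
```

Per-pixel matching can then be restricted to a narrow band around this prior, which is what removes the need for global optimization while keeping the search efficient.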

Book Chapter · DOI
08 Nov 2010
TL;DR: This paper proposes a new method, Cosine Similarity Metric Learning (CSML), for learning a distance metric for face verification; it achieves the highest accuracy reported in the literature on the Labeled Faces in the Wild (LFW) dataset.
Abstract: Face verification is the task of deciding by analyzing face images, whether a person is who he/she claims to be. This is very challenging due to image variations in lighting, pose, facial expression, and age. The task boils down to computing the distance between two face vectors. As such, appropriate distance metrics are essential for face verification accuracy. In this paper we propose a new method, named the Cosine Similarity Metric Learning (CSML) for learning a distance metric for facial verification. The use of cosine similarity in our method leads to an effective learning algorithm which can improve the generalization ability of any given metric. Our method is tested on the state-of-the-art dataset, the Labeled Faces in the Wild (LFW), and has achieved the highest accuracy in the literature.

626 citations
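
The quantity CSML optimizes, cosine similarity between face descriptors after a learned linear transform, is easy to write down. A minimal sketch, assuming the transform matrix A has already been learned; the helper names and the threshold value are hypothetical.

```python
import numpy as np

def cosine_similarity(x, y, A):
    """Cosine similarity between two face descriptors after the learned
    linear transform A (the quantity optimized by CSML)."""
    ax, ay = A @ x, A @ y
    return float(ax @ ay / (np.linalg.norm(ax) * np.linalg.norm(ay) + 1e-12))

def same_person(x, y, A, threshold=0.5):
    """Verification decision: accept the pair as the same identity when the
    transformed cosine similarity exceeds a threshold tuned on training pairs."""
    return cosine_similarity(x, y, A) >= threshold
```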

Performance
Metrics
No. of papers from the Conference in previous years
Year    Papers
2023    13
2022    282
2021    1
2020    266
2019    2
2018    308