Author
Michael Donoser
Other affiliations: AVL, Graz University of Technology
Bio: Michael Donoser is an academic researcher from Amazon.com. The author has contributed to research in topics: Image segmentation & Object (computer science). The author has an h-index of 21 and has co-authored 78 publications receiving 2012 citations. Previous affiliations of Michael Donoser include AVL & Graz University of Technology.
Papers
17 Jun 2006
TL;DR: It is shown that by means of MSER tracking the computational time for the detection of single MSERs can be improved by a factor of 4 to 10 and using a weighted feature vector for data association improves the tracking stability.
Abstract: This paper introduces a tracking method for the well known local MSER (Maximally Stable Extremal Region) detector. The component tree is used as an efficient data structure, which allows the calculation of MSERs in quasi-linear time. It is demonstrated that the tree is able to manage the required data for tracking. We show that by means of MSER tracking the computational time for the detection of single MSERs can be improved by a factor of 4 to 10. Using a weighted feature vector for data association improves the tracking stability. Furthermore, the component tree enables backward tracking which further improves the robustness. The novel MSER tracking algorithm is evaluated on a variety of scenes. In addition, we demonstrate three different applications, tracking of license plates, faces and fibers in paper, showing in all three scenarios improved speed and stability.
299 citations
01 Sep 2009
TL;DR: This paper introduces an unsupervised color segmentation method that segments the input image several times, each time focussing on a different salient part of the image, and subsequently merges all obtained results into one composite segmentation.
Abstract: This paper introduces an unsupervised color segmentation method. The underlying idea is to segment the input image several times, each time focussing on a different salient part of the image, and to subsequently merge all obtained results into one composite segmentation. We identify salient parts of the image by applying affinity propagation clustering to efficiently calculated local color and texture models. Each salient region then serves as an independent initialization for a figure/ground segmentation. Segmentation is done by minimizing a convex energy functional based on weighted total variation, leading to a global optimal solution. Each salient region provides an accurate figure/ground segmentation highlighting different parts of the image. These highly redundant results are combined into one composite segmentation by analyzing local segmentation certainty. Our formulation is quite general, and other salient region detection algorithms in combination with any semi-supervised figure/ground segmentation approach can be used. We demonstrate the high quality of our method on the well-known Berkeley segmentation database. Furthermore, we show that our method can be used to provide good spatial support for recognition frameworks.
262 citations
23 Jun 2013
TL;DR: This paper revisits diffusion processes on affinity graphs for capturing the intrinsic manifold structure defined by pairwise affinity matrices and derives a generic framework for diffusion processes in the scope of retrieval applications.
Abstract: In this paper we revisit diffusion processes on affinity graphs for capturing the intrinsic manifold structure defined by pairwise affinity matrices. Such diffusion processes have already proved the ability to significantly improve subsequent applications like retrieval. We give a thorough overview of the state-of-the-art in this field and discuss obvious similarities and differences. Based on our observations, we are then able to derive a generic framework for diffusion processes in the scope of retrieval applications, where the related work represents specific instances of our generic formulation. We evaluate our framework on several retrieval tasks and are able to derive algorithms that, e.g., achieve a 100% bull's eye score on the popular MPEG7 shape retrieval data set.
208 citations
23 Sep 2009
TL;DR: This paper proposes a modified mutual kNN graph as the underlying representation for shape retrieval and describes an efficient, unsupervised clustering method that uses this graph for initialization.
Abstract: This paper considers two major applications of shape matching algorithms: (a) query-by-example, i.e. retrieving the most similar shapes from a database, and (b) finding clusters of shapes, each represented by a single prototype. Our approach goes beyond pairwise shape similarity analysis by considering the underlying structure of the shape manifold, which is estimated from the shape similarity scores between all the shapes within a database. We propose a modified mutual kNN graph as the underlying representation and demonstrate its performance for the task of shape retrieval. We further describe an efficient, unsupervised clustering method which uses the modified mutual kNN graph for initialization. Experimental evaluation proves the applicability of our method, e.g. by achieving the highest ever reported retrieval score of 93.40% on the well-known MPEG-7 database.
133 citations
16 Jun 2012
TL;DR: This work provides feasible generic facade reconstruction by combining low-level classifiers with mid-level object detectors to infer an irregular lattice, which preserves the logical structure of the facade while reducing the search space to a manageable size.
Abstract: High-quality urban reconstruction requires more than multi-view reconstruction and local optimization. The structure of facades depends on the general layout, which has to be optimized globally. Shape grammars are an established method to express hierarchical spatial relationships, and are therefore suited to representing constraints for semantic facade interpretation. Usually inference uses numerical approximations or hard-coded grammar schemes. Existing methods inspired by classical grammar parsing are not applicable to real-world images due to their prohibitively high complexity. This work provides feasible generic facade reconstruction by combining low-level classifiers with mid-level object detectors to infer an irregular lattice. The irregular lattice preserves the logical structure of the facade while reducing the search space to a manageable size. We introduce a novel method for handling symmetry and repetition within the generic grammar. We show competitive results on two datasets, namely the Paris 2010 and the Graz 50. The former includes only Haussmannian architecture, while the latter includes Classicism, Biedermeier, Historicism, Art Nouveau and post-modern architectural styles.
116 citations
Cited by
01 Jun 2016
TL;DR: This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.
Abstract: Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes is comprised of a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high quality pixel-level annotations, 20 000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.
7,547 citations
TL;DR: This paper investigates two fundamental problems in computer vision: contour detection and image segmentation and presents state-of-the-art algorithms for both of these tasks.
Abstract: This paper investigates two fundamental problems in computer vision: contour detection and image segmentation. We present state-of-the-art algorithms for both of these tasks. Our contour detector combines multiple local cues into a globalization framework based on spectral clustering. Our segmentation algorithm consists of generic machinery for transforming the output of any contour detector into a hierarchical region tree. In this manner, we reduce the problem of image segmentation to that of contour detection. Extensive experimental evaluation demonstrates that both our contour detection and segmentation methods significantly outperform competing algorithms. The automatically generated hierarchical segmentations can be interactively refined by user-specified annotations. Computation at multiple image resolutions provides a means of coupling our system to recognition applications.
5,068 citations
20 Jun 2011
TL;DR: This work proposes a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence, and consistently outperformed existing saliency detection methods.
Abstract: Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.
3,653 citations
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
3,627 citations
TL;DR: It is found that the models designed specifically for salient object detection generally work better than models in closely related areas, which provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems.
Abstract: We extensively compare, qualitatively and quantitatively, 41 state-of-the-art models (29 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over seven challenging data sets for the purpose of benchmarking salient object detection and segmentation methods. From the results obtained so far, our evaluation shows a consistent rapid progress over the last few years in terms of both accuracy and running time. The top contenders in this benchmark significantly outperform the models identified as the best in the previous benchmark conducted three years ago. We find that the models designed specifically for salient object detection generally work better than models in closely related areas, which in turn provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems. In particular, we analyze the influences of center bias and scene complexity in model performance, which, along with the hard cases for the state-of-the-art models, provide useful hints toward constructing more challenging large-scale data sets and better saliency models. Finally, we propose probable solutions for tackling several open problems, such as evaluation scores and data set bias, which also suggest future research directions in the rapidly growing field of salient object detection.
1,372 citations