Author

Ville Viitaniemi

Bio: Ville Viitaniemi is an academic researcher from Aalto University. The author has contributed to research on topics including image retrieval and TRECVID. The author has an h-index of 12 and has co-authored 51 publications receiving 762 citations. Previous affiliations of Ville Viitaniemi include Helsinki University of Technology.

Papers published on a yearly basis

Papers
Book Chapter DOI
11 Apr 2005
TL;DR: The PASCAL Visual Object Classes (VOC) Challenge ran from February to March 2005; the goal was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects).
Abstract: The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). Four object classes were selected: motorbikes, bicycles, cars and people. Twelve teams entered the challenge. In this chapter we provide details of the datasets, algorithms used by the teams, evaluation criteria, and results achieved.

381 citations

Proceedings Article DOI
08 Jul 2009
TL;DR: Experiments in the VOC2007 benchmark confirm that the performance of a Bag of Visual Words (BoV) system can be greatly enhanced by taking the descriptors' spatial distribution into account; two ways of tiling images geometrically are compared: the soft tiling approach proposed here and the traditional hard tiling technique.
Abstract: The Bag of Visual Words (BoV) paradigm has successfully been applied to image content analysis tasks such as image classification and object detection. The basic BoV approach overlooks the spatial distribution of descriptors within images. Here we describe spatial extensions to BoV and experimentally compare them in the VOC2007 benchmark image category detection task. In particular, we compare two ways of tiling images geometrically: the soft tiling approach proposed here and the traditional hard tiling technique. The experiments also address two methods of fusing information from several tilings of the images: post-classifier fusion and fusion on the level of an SVM kernel. The experiments confirm that the performance of a BoV system can be greatly enhanced by taking the descriptors' spatial distribution into account. The soft tiling technique performs well even with a single tiling mask, whereas multi-mask fusion is necessary for good category detection performance in the case of hard tiling. The evaluated fusion mechanisms performed approximately equally well.

39 citations
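The hard versus soft tiling contrast described above can be illustrated with a minimal sketch. The helper names, the 2x2 grid, and the Gaussian soft-weighting rule are assumptions made here for illustration; the paper's actual tiling masks and weights may differ.

```python
import numpy as np

def hard_tiling_histograms(points, words, vocab_size, grid=(2, 2)):
    """Hard tiling: each descriptor falls into exactly one tile of a
    grid over the unit square; one visual-word histogram per tile.
    `points` are (x, y) positions in [0, 1); `words` are vocabulary indices."""
    gx, gy = grid
    hists = np.zeros((gx * gy, vocab_size))
    for (x, y), w in zip(points, words):
        tile = int(x * gx) * gy + int(y * gy)
        hists[tile, w] += 1.0
    return hists

def soft_tiling_histograms(points, words, vocab_size, centers, sigma=0.25):
    """Soft tiling: each descriptor contributes to every tile with a
    weight that decays with distance to the tile centre, so descriptors
    near tile borders are not cut off sharply."""
    hists = np.zeros((len(centers), vocab_size))
    for (x, y), w in zip(points, words):
        d2 = np.array([(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centers])
        weights = np.exp(-d2 / (2 * sigma ** 2))
        hists[:, w] += weights / weights.sum()  # unit mass per descriptor
    return hists
```

The per-tile histograms are then concatenated, or fed to separate kernels, for classification; with soft tiling each descriptor's unit mass is spread over all tiles instead of assigned to exactly one.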

Book Chapter DOI
11 Sep 2008
TL;DR: Describes the techniques used to participate in the PASCAL NoE VOC Challenge 2007 image analysis performance evaluation campaign; the method produced comparatively good classification performance, and its segmentation accuracy was the best of all submissions.
Abstract: In this paper we outline the techniques which we used to participate in the PASCAL NoE VOC Challenge 2007 image analysis performance evaluation campaign. We took part in three of the image analysis competitions: image classification, object detection and object segmentation. In the classification task of the evaluation our method produced comparatively good performance, the 4th best of 19 submissions. In contrast, our detection results were quite modest. Our method's segmentation accuracy was the best of all submissions. Our approach for the classification task is based on fused classifications by numerous global image features, including histograms of local features. The object detection combines similar classification of automatically extracted image segments and the previously obtained scene type classifications. The object segmentations are obtained in a straightforward fashion from the detection results.

34 citations

Book Chapter DOI
10 Sep 2006
TL;DR: In this article, the interaction between different semantic levels in still image scene classification and object detection problems is considered, where a neural method is used to produce a tentative higher-level semantic scene representation from low-level statistical visual features in a bottom-up fashion, which is then used to refine the lower-level object detection results.
Abstract: In this paper we consider the interaction between different semantic levels in still image scene classification and object detection problems. We present a method where a neural method is used to produce a tentative higher-level semantic scene representation from low-level statistical visual features in a bottom-up fashion. This emergent representation is then used to refine the lower-level object detection results. We evaluate the proposed method with data from the Pascal VOC Challenge 2006 image classification and object detection competition. The proposed techniques for exploiting global classification results are found to significantly improve the accuracy of local object detection.

30 citations
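The paper's core idea, letting a global scene-level classification refine local detection scores, can be sketched as a simple score fusion. The geometric weighting rule and the `alpha` parameter are illustrative assumptions made here, not the authors' actual fusion method.

```python
def refine_detections(segment_scores, scene_score, alpha=0.5):
    """Fuse a global scene-level class score into local segment-level
    detection scores by geometric weighting: detections for a class are
    boosted when the whole scene looks likely to contain that class.
    All scores are assumed to lie in [0, 1]."""
    return [s ** (1 - alpha) * scene_score ** alpha for s in segment_scores]
```

For example, a weak segment score of 0.25 in a scene strongly classified as containing the class (scene score 1.0) is lifted to 0.5, while the same segment in a scene classified as not containing the class is suppressed.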

Journal Article DOI
TL;DR: This work considers two traditional metrics for evaluating performance in automatic image annotation, the normalised score (NS) and the precision/recall (PR) statistics, particularly in connection with a de facto standard 5000 Corel image benchmark annotation task.
Abstract: In this work we consider two traditional metrics for evaluating performance in automatic image annotation, the normalised score (NS) and the precision/recall (PR) statistics, particularly in connection with a de facto standard 5000-image Corel benchmark annotation task. We also motivate and describe another performance measure, de-symmetrised termwise mutual information (DTMI), as a principled compromise between the two traditional extremes. In addition to discussing the measures theoretically, we correlate them experimentally for a family of annotation system configurations derived from the PicSOM image content analysis framework. Looking at the obtained performance figures, we notice that such a system, based on adaptive fusion of numerous global image features, clearly outperforms the methods considered in the literature.

27 citations
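The termwise precision/recall statistics discussed above are computed per keyword over the whole test set. A minimal sketch, with a toy dict-based data format assumed here for illustration:

```python
def termwise_pr(annotations, truth, word):
    """Precision/recall for one annotation keyword over a test set.
    `annotations` and `truth` map an image id to a set of keywords."""
    predicted = {im for im, kws in annotations.items() if word in kws}
    relevant = {im for im, kws in truth.items() if word in kws}
    tp = len(predicted & relevant)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall
```

The figures usually reported for the Corel task are means of these values over all keywords, often alongside the number of keywords with non-zero recall.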


Cited by
Journal Article DOI
TL;DR: Reviews the state of the art in evaluated methods for both classification and detection, analysing whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confusing.
Abstract: The Pascal Visual Object Classes (VOC) challenge is a benchmark in visual object category recognition and detection, providing the vision and machine learning communities with a standard dataset of images and annotation, and standard evaluation procedures. Organised annually from 2005 to present, the challenge and its associated dataset has become accepted as the benchmark for object detection. This paper describes the dataset and evaluation procedure. We review the state-of-the-art in evaluated methods for both classification and detection, analyse whether the methods are statistically different, what they are learning from the images (e.g. the object or its context), and what the methods find easy or confuse. The paper concludes with lessons learnt in the three year history of the challenge, and proposes directions for future improvement and extension.

15,935 citations
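The challenge's classification and detection rankings are based on average precision; the early VOC evaluations used an 11-point interpolated form, sketched here from the (recall, precision) points of a ranked output:

```python
def interpolated_ap(recalls, precisions):
    """11-point interpolated average precision: at each recall level
    t in {0.0, 0.1, ..., 1.0}, take the maximum precision achieved at
    any recall >= t, then average the eleven values."""
    ap = 0.0
    for i in range(11):
        t = i / 10
        ps = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += (max(ps) if ps else 0.0) / 11
    return ap
```

A perfect ranking (precision 1.0 at recall 1.0) scores 1.0; a method that reaches only recall 0.5 at precision 1.0 scores 6/11, since the five recall levels above 0.5 contribute zero.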

Journal Article DOI
TL;DR: In this article, a large collection of images with ground-truth labels was built for object detection and recognition research; such data is useful for supervised learning and quantitative evaluation.
Abstract: We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sharing of such annotations. Using this annotation tool, we have collected a large dataset that spans many object categories, often containing multiple instances over a wide variety of images. We quantify the contents of the dataset and compare against existing state of the art datasets used for object recognition and detection. Also, we show how to extend the dataset to automatically enhance object labels with WordNet, discover object parts, recover a depth ordering of objects in a scene, and increase the number of labels using minimal user supervision and images from the web.

3,501 citations

01 Jan 2006

3,012 citations

Proceedings Article DOI
25 Oct 2008
TL;DR: This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Abstract: Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. For all five, we show high agreement between Mechanical Turk non-expert annotations and existing gold standard labels provided by expert labelers. For the task of affect recognition, we also show that using non-expert labels for training machine learning algorithms can be as effective as using gold standard annotations from experts. We propose a technique for bias correction that significantly improves annotation quality on two tasks. We conclude that many large labeling tasks can be effectively designed and carried out in this fashion at a fraction of the usual expense.

2,237 citations
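Aggregating multiple non-expert labels of the kind evaluated above can be sketched with simple and accuracy-weighted voting. The weighted variant is a simplified stand-in for the paper's bias-correction scheme, not its actual formulation; in practice annotator accuracies would be estimated on a small gold-labelled calibration set.

```python
from collections import Counter, defaultdict

def majority_vote(labels):
    """Unweighted majority vote over one item's non-expert labels
    (ties broken by first-seen order)."""
    return Counter(labels).most_common(1)[0][0]

def weighted_vote(annotator_labels, annotator_accuracy):
    """Vote over (annotator, label) pairs, weighted by each annotator's
    estimated accuracy; unknown annotators get chance-level weight 0.5."""
    scores = defaultdict(float)
    for annotator, label in annotator_labels:
        scores[label] += annotator_accuracy.get(annotator, 0.5)
    return max(scores, key=scores.get)
```

With accuracy weighting, a single reliable annotator can outvote several unreliable ones, which is the intuition behind correcting for annotator bias.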

Journal Article DOI
17 Jun 2006
TL;DR: A large-scale evaluation of an approach that represents images as distributions of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ2 distance.
Abstract: Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ2 distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels and classifiers. We then conduct a comparative evaluation with several state-of-the-art recognition methods on 4 texture and 5 object databases. On most of these databases, our implementation exceeds the best reported results and achieves comparable performance on the rest. Finally, we investigate the influence of background correlations on recognition performance.

1,863 citations
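One common exponentiated form of the χ2 kernel used with such histogram representations can be sketched as follows; note that the scaling convention for `gamma` (and whether a factor of 1/2 appears in the distance) varies between implementations.

```python
import numpy as np

def chi2_kernel(H1, H2, gamma=1.0):
    """k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)),
    with zero-denominator bins skipped. Rows of H1 and H2 are
    non-negative histograms; the result is a kernel (Gram) matrix."""
    K = np.zeros((H1.shape[0], H2.shape[0]))
    for i, x in enumerate(H1):
        for j, y in enumerate(H2):
            denom = x + y
            mask = denom > 0
            d = np.sum((x[mask] - y[mask]) ** 2 / denom[mask])
            K[i, j] = np.exp(-gamma * d)
    return K
```

A precomputed Gram matrix of this form can be handed to an SVM trained with a precomputed-kernel option, which is the usual way of combining non-standard histogram kernels with off-the-shelf SVM solvers.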