Author
Anurag Mittal
Other affiliations: Cornell University, Princeton University, Indian Institutes of Technology
Bio: Anurag Mittal is an academic researcher from the Indian Institute of Technology Madras. The author has contributed to research in topics: Object detection & Pose. The author has an h-index of 31, has co-authored 97 publications, and has received 3,961 citations. Previous affiliations of Anurag Mittal include Cornell University & Princeton University.
Topics: Object detection, Pose, Image retrieval, Pixel, Sketch
Papers
27 Jun 2004
TL;DR: A new method is proposed for the modeling and subtraction of scenes that exhibit persistent dynamic behavior in time, rather than only the static or quasi-static structure addressed by existing work; extensive experiments demonstrate the utility and performance of the proposed approach.
Abstract: Background modeling is an important component of many vision systems. Existing work in the area has mostly addressed scenes that consist of static or quasi-static structures. When the scene exhibits a persistent dynamic behavior in time, such an assumption is violated and detection performance deteriorates. In this paper, we propose a new method for the modeling and subtraction of such scenes. Towards the modeling of the dynamic characteristics, optical flow is computed and utilized as a feature in a higher dimensional space. Inherent ambiguities in the computation of features are addressed by using a data-dependent bandwidth for density estimation using kernels. Extensive experiments demonstrate the utility and performance of the proposed approach.
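The density test at the heart of this approach can be sketched as follows. This is a simplified illustration, not the authors' implementation: the per-pixel feature layout (here intensity plus two optical-flow components), the per-sample bandwidths, and the decision threshold are all assumptions made for the sketch.

```python
import numpy as np

def kernel_density(sample, history, bandwidths):
    """Kernel density estimate of p(sample) from one pixel's feature history.

    history    : (N, d) past feature vectors (e.g. intensity + optical flow)
    bandwidths : (N, d) data-dependent bandwidth per sample and dimension
    """
    d = history.shape[1]
    diffs = (history - sample) / bandwidths                      # (N, d)
    norms = np.prod(bandwidths, axis=1) * (2 * np.pi) ** (d / 2)
    kernels = np.exp(-0.5 * np.sum(diffs ** 2, axis=1)) / norms
    return kernels.mean()

def is_foreground(sample, history, bandwidths, threshold=1e-3):
    """A pixel is foreground if its current feature vector is unlikely
    under the density estimated from that pixel's own history."""
    return kernel_density(sample, history, bandwidths) < threshold
```

A dynamic background (e.g. waving foliage) fills the history with a spread of flow vectors, so recurring motion remains well supported by the density while a genuinely new object falls below the threshold.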
648 citations
TL;DR: A system is presented that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized surveillance cameras located far from each other, using occlusion analysis to combine evidence from different camera pairs.
Abstract: When occlusion is minimal, a single camera is generally sufficient to detect and track objects. However, when the density of objects is high, the resulting occlusion and lack of visibility suggests the use of multiple cameras and collaboration between them so that an object is detected using information available from all the cameras in the scene.
In this paper, we present a system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized surveillance cameras located far from each other. The system is fully automatic, and takes decisions about object detection and tracking using evidence collected from many pairs of cameras. Innovations that help us tackle the problem include a region-based stereo algorithm capable of finding 3D points inside an object knowing only the projections of the object (as a whole) in two views, a segmentation algorithm using Bayesian classification, and the use of occlusion analysis to combine evidence from different camera pairs.
The system has been tested using different densities of people in the scene. This helps us determine the number of cameras required for a particular density of people. Experiments have also been conducted to verify and quantify the efficacy of the occlusion analysis scheme.
444 citations
01 Jun 2018
TL;DR: In this paper, a conditional variational autoencoder (CVAE) is used to generate samples from the given class attributes, and the generated samples are used for classification of the unseen classes.
Abstract: Zero-shot learning in image classification refers to the setting where images from some novel classes are absent in the training data, but other information, such as natural language descriptions or attribute vectors of the classes, is available. This setting is important in the real world, since one may not be able to obtain images of all possible classes at training time. While previous approaches have tried to model the relationship between the class attribute space and the image space via some kind of transfer function, in order to model the image space corresponding to an unseen class, we take a different approach and generate samples from the given attributes, using a conditional variational autoencoder, and use the generated samples for classification of the unseen classes. By extensive testing on four benchmark datasets, we show that our model outperforms the state of the art, particularly in the more realistic generalized setting, where the training classes can also appear at test time along with the novel classes.
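The generate-then-classify idea can be sketched as below. The decoder here is a hypothetical stand-in (a fixed random linear map followed by a tanh) for a trained CVAE decoder, and the class names and attribute vectors are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained CVAE decoder: maps a latent noise
# vector z concatenated with a class-attribute vector to an image feature.
Z_DIM, ATTR_DIM, FEAT_DIM = 8, 4, 16
W = rng.normal(size=(Z_DIM + ATTR_DIM, FEAT_DIM))

def decode(z, attr):
    return np.tanh(np.concatenate([z, attr]) @ W)

def synthesize(attr, n=100):
    """Generate pseudo-features for an unseen class from its attributes
    by sampling the latent variable and decoding."""
    return np.stack([decode(rng.normal(size=Z_DIM), attr) for _ in range(n)])

# Classify unseen-class features by nearest centroid of generated samples.
attrs = {"zebra": np.array([1.0, 0.0, 1.0, 0.0]),
         "whale": np.array([0.0, 1.0, 0.0, 1.0])}
centroids = {c: synthesize(a).mean(axis=0) for c, a in attrs.items()}

def classify(feat):
    return min(centroids, key=lambda c: np.linalg.norm(feat - centroids[c]))
```

In the paper the decoder is trained on the seen classes; any standard classifier trained on the synthesized features can replace the nearest-centroid rule used here for brevity.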
256 citations
20 Jun 2009
TL;DR: A new constraint for optic disk detection is proposed, in which the major blood vessels are first detected and their intersection is used to find the approximate location of the optic disk.
Abstract: Automated detection of lesions in retinal images can assist in early diagnosis and screening of a common disease: Diabetic Retinopathy. A robust and computationally efficient approach for the localization of the different features and lesions in a fundus retinal image is presented in this paper. Since many features have common intensity properties, geometric features and correlations are used to distinguish between them. We propose a new constraint for optic disk detection, where we first detect the major blood vessels and use their intersection to find the approximate location of the optic disk. This is further localized using color properties. We also show that many of the features, such as the blood vessels, exudates, microaneurysms and hemorrhages, can be detected quite accurately using different morphological operations applied appropriately. Extensive evaluation of the algorithm on a database of 516 images with varied contrast, illumination and disease stages yields a 97.1% success rate for optic disk localization, a sensitivity and specificity of 95.7% and 94.2% respectively for exudate detection, and 95.1% and 90.5% for microaneurysm/hemorrhage detection. These compare very favorably with existing systems and promise real deployment of these systems.
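One of the morphological building blocks implied here, a white top-hat that isolates small bright structures such as exudate candidates, can be sketched as follows. This is a generic illustration, not the paper's exact pipeline, and the structuring-element size is an assumption:

```python
import numpy as np

def erode(img, k=3):
    """Grayscale erosion with a flat k x k structuring element."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].min()
    return out

def dilate(img, k=3):
    """Grayscale dilation with a flat k x k structuring element."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

def white_tophat(img, k=3):
    """Image minus its morphological opening: keeps bright blobs smaller
    than the structuring element (candidate lesions), suppresses large
    bright regions such as the optic disk."""
    return img - dilate(erode(img, k), k)
```

Small bright blobs survive the subtraction because the opening removes them, while bright regions larger than the structuring element cancel out.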
232 citations
28 May 2002
TL;DR: A system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized cameras located far from each other, and a scheme for combining evidence gathered from different camera pairs using occlusion analysis, so as to obtain a globally optimal detection and tracking of objects.
Abstract: We present a system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized cameras located far from each other. The system improves upon existing systems in many ways, including: (1) We do not assume that a foreground connected component belongs to only one object; rather, we segment the views taking into account color models for the objects and the background. This helps us not only to separate foreground regions belonging to different objects, but also to obtain better background regions than traditional background subtraction methods (as it uses foreground color models in the algorithm). (2) It is fully automatic and does not require any manual input or initialization of any kind. (3) Instead of taking decisions about object detection and tracking from a single view or camera pair, we collect evidence from each pair and combine it to obtain a final decision. This yields much better detection and tracking than traditional systems.
Several innovations help us tackle the problem. The first is the introduction of a region-based stereo algorithm that is capable of finding 3D points inside an object if we know the regions belonging to the object in two views. No exact point matching is required. This is especially useful in wide-baseline camera systems, where exact point matching is very difficult due to self-occlusion and a substantial change in viewpoint. The second contribution is a scheme for setting priors for use in segmentation of a view using Bayesian classification. The scheme, which assumes knowledge of the approximate shape and location of objects, dynamically assigns priors for different objects at each pixel so that occlusion information is encoded in the priors.
The third contribution is a scheme for combining evidence gathered from different camera pairs using occlusion analysis, so as to obtain a globally optimal detection and tracking of objects. The system has been tested using different densities of people in the scene, which helps us determine the number of cameras required for a particular density of people.
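The per-pixel Bayesian classification with occlusion-aware priors can be sketched as below. This is a simplified one-dimensional (scalar color) illustration with Gaussian color models; the model parameters and prior values are made-up assumptions, not the paper's calibrated ones:

```python
import numpy as np

def gauss(x, mean, var):
    """Gaussian likelihood of a scalar color under one label's model."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def classify_pixel(color, models, priors):
    """Posterior over labels (background, object 1, object 2, ...) at one
    pixel: per-label Gaussian color likelihoods weighted by spatially
    varying priors that encode which object is expected to be visible
    (unoccluded) at this pixel."""
    post = np.array([p * gauss(color, m, v)
                     for p, (m, v) in zip(priors, models)])
    return post / post.sum()
```

When the color is ambiguous between models, the prior decides: a pixel whose prior strongly favors the background stays background even though its color fits both models equally well.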
226 citations
Cited by
TL;DR: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends; important issues related to tracking are also discussed, including the use of appropriate image features, selection of motion models, and detection of objects.
Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.
5,318 citations
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year-old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/.
Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.
4,146 citations
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher:
The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to build useful applications. Users learn techniques that have proven useful through first-hand experience, along with a wide range of mathematical methods. A CD-ROM included with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book covers essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image-based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. The book is also appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
3,627 citations
TL;DR: This survey reviews recent trends in video-based human capture and analysis, as well as discussing open problems for future research to achieve automatic visual analysis of human movement.
2,738 citations