Home
/
Authors
/
Navneet Dalal

Author

Navneet Dalal

Other affiliations: Siemens, French Institute for Research in Computer Science and Automation

Bio: Navneet Dalal is an academic researcher from Google. The author has contributed to research in topics: Object detection & Optical flow. The author has an hindex of 17, co-authored 31 publications receiving 32776 citations. Previous affiliations of Navneet Dalal include Siemens & French Institute for Research in Computer Science and Automation.

Papers

PDF

Open Access

More filters

Patent•

Systems and methods for person recognition data management

[...]

Akshay R. Bapat¹, George Alban Heitz¹, Rizwan Chaudhry¹, Navneet Dalal¹, James Edward Stewart¹, Jennifer Bush¹, Joe Delone Venters¹, Kara Gates¹, Timothy Butler¹, Yohannes Kifle¹, Anton Bastov¹ - Show less +7 more•Institutions (1)

Google¹

14 Aug 2017

TL;DR: In this paper, the authors present a method for recognizing persons in video streams, which includes: (1) obtaining images collected by video cameras in a smart home environment, each image including a detected person; (2) obtaining personally identifiable information of the detected person, generated from analysis of the image; (3) grouping the images into a first group of a plurality of groups based on the personal information; (4) receiving from a user a request to remove a first image from the first group; and (5) in response to the request, disassociating the

...read moreread less

Abstract: The various implementations described herein include systems and methods for recognizing persons in video streams. In one aspect, a method includes: (1) obtaining images collected by video cameras in a smart home environment, each image including a detected person; (2) for each image, obtaining personally identifiable information of the detected person, the personally identifiable information generated from analysis of the image; (3) grouping the images into a first group of a plurality of groups based on the personally identifiable information, each group of the plurality of groups representing a unique person; (4) receiving from a user a request to remove a first image from the first group; and (5) in response to the request: (a) removing the first image from the first group; and (b) disassociating the corresponding personally identifiable information from the first group.

...read moreread less

12 citations

Patent•

Systems and methods of person recognition in video streams

[...]

Google¹

14 Aug 2017

TL;DR: In this paper, the first person is not known to the computing system, and a user is asked to classify first person as a stranger, or delete the stored personally identifiable information.

...read moreread less

Abstract: The various implementations described herein include systems and methods for recognizing persons in video streams. In one aspect, a method includes: (1) obtaining a live video stream; (2) detecting person(s) in the stream; and (3) determining, from analysis of the live video stream, personally identifiable information of the detected person(s); (4) determining, based on the personally identifiable information, that the first person is not known to the computing system; (5) in accordance with the determination that the first person is not known: (a) storing the personally identifiable information; and (b) requesting a user to classify the first person; and (6) in accordance with (i) a determination that a predetermined amount of time has elapsed since the request was transmitted and a response was not received, or (ii) a determination that a response was received classifying the first person as a stranger, deleting the stored personally identifiable information.

...read moreread less

11 citations

Patent•

Photorealistic Recommendation of Clothing and Apparel Based on Detected Web Browser Input and Content Tag Analysis

[...]

Navneet Dalal¹, Salih Burak Gokturk¹, Lorant Toth¹, Munjal Shah¹•Institutions (1)

Google¹

17 Dec 2010

TL;DR: In this article, a system and method for recommending clothing or apparel to a user is presented, where a user's activity is detected in order to identify a set of items that are of interest to the user.

...read moreread less

Abstract: A system and method for recommending clothing or apparel to a user. Activity of a user is detected in order to identify a set of items that are of interest to the user. One or more recommendation parameters may be determined for the used based at least in part on the individual items of clothing/apparel that are of interest to the user. Clothing/apparel content is selected for display to the user based on the recommendation parameters.

...read moreread less

7 citations

Journal Article•DOI•

Image interpolation for virtual sports scenarios

[...]

Tomás Rodríguez, Ian Reid¹, Radu Horaud², Navneet Dalal², Marcelo Goetz - Show less +1 more•Institutions (2)

University of Oxford¹, French Institute for Research in Computer Science and Automation²

01 Sep 2005

TL;DR: The EVENTS project attempts to apply state of the art view interpolation to the field of professional sports to populate a wide scenario such as a stadium with a number of cameras and, via computer vision, to produce photo-realistic moving or static images from virtual viewpoints.

...read moreread less

Abstract: View interpolation has been explored in the scientific community as a means to avoid the complexity of full 3D in the construction of photo-realistic interactive scenarios. EVENTS project attempts to apply state of the art view interpolation to the field of professional sports. The aim is to populate a wide scenario such as a stadium with a number of cameras and, via computer vision, to produce photo-realistic moving or static images from virtual viewpoints, i.e where there is no physical camera. EVENTS proposes an innovative view interpolation scheme based on the Joint View Triangulation algorithm developed by the project participants. Joint View Triangulation is combined within the EVENTS framework with new initiatives in the field of multiple view layered representation, automatic seed matching, image-based rendering, tracking occluding layers and constrained scene analysis. The computer vision software has been implemented on top of a novel high performance computing platform with the aim to achieve real-time interpolation.

...read moreread less

6 citations

Patent•

Image recognition system for use in analysing images of objects and applications thereof

[...]

Salih Burak Gokturk, Baris Sumengen, Diem Vu, Navneet Dalal, Dan Chiao, Jacquie Phillips, Mark Moran, Vincent Vanhoucke, Azhar Khan, Xiaofan Lin, Munjal Shah, Andrew Miller, Dragomir Anguelov - Show less +9 more

07 Nov 2007

TL;DR: In this article, the authors programmatically analyze each of a plurality of images in order to determine one or more visual characteristics about an item shown in each of the images, and then a search operation is performed to identify items that have a visual characteristic that satisfies at least some of the search criteria.

...read moreread less

Abstract: Embodiments programmatically analyze each of a plurality of images in order to determine one or more visual characteristics about an item shown in each of the plurality of images. Data is stored corresponding to the one or more visual characteristics. An interface in is provided for which a user is able to specify one or more search criteria. In response to receiving the one or more search criteria, a search operation is performed to identify one or more items that have a visual characteristic that satisfies at least some of the one or more search criteria.

...read moreread less

6 citations

1
…
2
3
4
5
6
7

Collapse

Cited by

PDF

Open Access

More filters

Proceedings Article•DOI•

Going deeper with convolutions

[...]

Christian Szegedy¹, Wei Liu², Yangqing Jia¹, Pierre Sermanet¹, Scott Reed³, Dragomir Anguelov¹, Dumitru Erhan¹, Vincent Vanhoucke¹, Andrew Rabinovich - Show less +5 more•Institutions (3)

Google¹, University of North Carolina at Chapel Hill², University of Michigan³

07 Jun 2015

TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

...read moreread less

Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

...read moreread less

40,257 citations

Book Chapter•DOI•

Microsoft COCO: Common Objects in Context

[...]

Tsung-Yi Lin¹, Michael Maire², Serge Belongie¹, James Hays, Pietro Perona², Deva Ramanan³, Piotr Dollár⁴, C. Lawrence Zitnick⁴ - Show less +4 more•Institutions (4)

Cornell University¹, California Institute of Technology², University of California, Irvine³, Microsoft⁴

06 Sep 2014

TL;DR: A new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding by gathering images of complex everyday scenes containing common objects in their natural context.

...read moreread less

Abstract: We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 objects types that would be easily recognizable by a 4 year old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

...read moreread less

30,462 citations

Proceedings Article•DOI•

You Only Look Once: Unified, Real-Time Object Detection

[...]

Joseph Redmon¹, Santosh K. Divvala², Ross Girshick³, Ali Farhadi²•Institutions (3)

University of Washington¹, Allen Institute for Artificial Intelligence², Facebook³

27 Jun 2016

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

...read moreread less

Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

...read moreread less

27,256 citations

Proceedings Article•DOI•

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

[...]

Ross Girshick¹, Jeff Donahue¹, Trevor Darrell¹, Jitendra Malik¹•Institutions (1)

University of California, Berkeley¹

23 Jun 2014

TL;DR: RCNN as discussed by the authors combines CNNs with bottom-up region proposals to localize and segment objects, and when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

...read moreread less

Abstract: Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.

...read moreread less

21,729 citations

Proceedings Article•DOI•

Feature Pyramid Networks for Object Detection

[...]

Tsung-Yi Lin¹, Piotr Dollár², Ross Girshick², Kaiming He², Bharath Hariharan², Serge Belongie¹ - Show less +2 more•Institutions (2)

Cornell University¹, Facebook²

21 Jul 2017

TL;DR: This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.

...read moreread less

Abstract: Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors that are based on deep convolutional networks, partially because they are slow to compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.

...read moreread less

16,727 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse