SciSpace (formerly Typeset)
Author

Venkatesh N. Murthy

Other affiliations: Tata Consultancy Services, Siemens
Bio: Venkatesh N. Murthy is an academic researcher from the University of Massachusetts Amherst. His research focuses on deep learning and automatic image annotation. He has an h-index of 8 and has co-authored 8 publications receiving 377 citations. Previous affiliations include Tata Consultancy Services and Siemens.

Papers
Proceedings ArticleDOI
22 Jun 2015
TL;DR: It is demonstrated that word embedding vectors perform better than binary vectors as a representation of the tags associated with an image, and the CCA model is compared to a simple CNN-based linear regression model that allows the CNN layers to be trained using back-propagation.
Abstract: We propose simple and effective models for image annotation that use Convolutional Neural Network (CNN) features extracted from an image and word embedding vectors to represent its associated tags. Our first set of models is based on the Canonical Correlation Analysis (CCA) framework, which models both views of the data: visual features (CNN features) and textual features (word embedding vectors). We report results on all three variants of the CCA models, namely linear CCA, kernel CCA, and CCA with k-nearest-neighbor clustering (CCA-KNN). The best results are obtained with CCA-KNN, which outperforms previous results on the Corel-5k and ESP-Game datasets and achieves comparable results on the IAPRTC-12 dataset. Our experiments evaluate CNN features within existing models, bringing out their advantages over dozens of handcrafted features. We also demonstrate that word embedding vectors perform better than binary vectors as a representation of the tags associated with an image. In addition, we compare the CCA model to a simple CNN-based linear regression model, which allows the CNN layers to be trained using back-propagation.

128 citations
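As a toy illustration of the CCA-KNN idea, and not the paper's actual pipeline, tag propagation from nearest neighbors in a shared feature space can be sketched in a few lines. The feature vectors, tags, and inverse-distance weighting below are invented for the example:

```python
import math
from collections import Counter

def knn_annotate(query, train_feats, train_tags, k=3, n_tags=5):
    """Annotate a query image by voting over its k nearest
    training images in a (projected) feature space."""
    scored = [(math.dist(query, f), tags)
              for f, tags in zip(train_feats, train_tags)]
    scored.sort(key=lambda pair: pair[0])
    votes = Counter()
    for d, tags in scored[:k]:
        weight = 1.0 / (1e-6 + d)      # closer neighbors count more
        for tag in tags:
            votes[tag] += weight
    return [tag for tag, _ in votes.most_common(n_tags)]

# Toy data: two nearby "sky" images and one distant "car" image
tags = knn_annotate((0.05, 0.0),
                    [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)],
                    [["sky", "sea"], ["sky"], ["car"]],
                    k=2, n_tags=1)
```

In the real model the feature space would be the CCA projection shared by CNN features and word embeddings; here plain 2-D points stand in for it.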

Journal ArticleDOI
TL;DR: A multi-animal extension of the DeepLabCut pose estimation toolbox is presented, which integrates the ability to predict an animal's identity to assist tracking in case of occlusions.
Abstract: Estimating the pose of multiple animals is a challenging computer vision problem: frequent interactions cause occlusions and complicate the association of detected keypoints with the correct individuals, as do highly similar-looking animals that interact more closely than in typical multi-human scenarios. To take up this challenge, we build on DeepLabCut, an open-source pose estimation toolbox, and provide high-performance animal assembly and tracking, features required for multi-animal scenarios. Furthermore, we integrate the ability to predict an animal's identity to assist tracking in case of occlusions. We illustrate the power of this framework with four datasets varying in complexity, which we release to serve as a benchmark for future algorithm development.

105 citations
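The tracking step can be illustrated, very loosely and not as DeepLabCut's actual assembly algorithm, as an assignment problem: match this frame's detections to existing tracks so the total displacement is minimal. A brute-force sketch for a handful of animals, with invented centroid coordinates:

```python
import itertools
import math

def assign_detections(tracks, detections):
    """Match current-frame detections to existing animal tracks by
    minimizing the summed centroid distance over all permutations.
    Brute force is fine for the few animals in a typical scene."""
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(detections))):
        cost = sum(math.dist(tracks[i], detections[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return {i: j for i, j in enumerate(best_perm)}

# Two tracks, two detections: the nearer detection wins each track
mapping = assign_detections([(0.0, 0.0), (10.0, 10.0)],
                            [(9.0, 9.0), (1.0, 1.0)])
```

This assumes equal numbers of tracks and detections; occlusions, which break that assumption, are exactly where the paper's identity prediction helps.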

Patent
31 Aug 2010
TL;DR: A system for vehicle security, personalization, and driver cardiac-activity monitoring is presented, in which the driver's electrocardiogram is monitored and registered and then used to identify a person entering the vehicle and to personalize the vehicle based on user preferences, thereby acting as intruder detection for vehicle security.
Abstract: The present invention provides a system for vehicle security, personalization, and cardiac-activity monitoring of a driver, wherein the driver's electrocardiogram is monitored and registered. The registration is used to identify a person entering the vehicle and to personalize the vehicle based on user preferences, thereby acting as intruder detection for vehicle security. In addition to registration, the invention monitors the driver's cardiac activity continuously and in real time, without intruding on the driver, with the ability to generate alerts and make emergency calls.

100 citations

Proceedings ArticleDOI
01 Jun 2016
TL;DR: The proposed Deep Decision Network provides insight into the data by identifying groups of classes that are hard to classify and require more attention than others, and it can make early decisions, making it suitable for time-sensitive applications.
Abstract: In this paper, we present a novel Deep Decision Network (DDN) that provides an alternative approach to building an efficient deep learning network. During the learning phase, starting from the root network node, DDN automatically builds a network that splits the data into disjoint clusters of classes, which are then handled by subsequent expert networks. This results in a tree-structured network driven by the data. The proposed method provides insight into the data by identifying groups of classes that are hard to classify and require more attention than others. DDN can also make early decisions, making it suitable for time-sensitive applications. We validate DDN on two publicly available benchmark datasets, CIFAR-10 and CIFAR-100, and it yields state-of-the-art classification performance on both. The proposed algorithm is generic and can be applied to any classification problem.

82 citations



Cited by
Journal ArticleDOI
TL;DR: It is demonstrated that a deep neural network can significantly improve optical microscopy, enhancing its spatial resolution over a large field of view and depth of field, and can be used to design computational imagers that improve as they continue to image specimens and establish new transformations among different imaging modes.
Abstract: We demonstrate that a deep neural network can significantly improve optical microscopy, enhancing its spatial resolution over a large field of view and depth of field. After training, the only input to this network is an image acquired using a regular optical microscope, without any changes to its design. We blindly tested this deep learning approach using various tissue samples imaged with low-resolution, wide-field systems, where the network rapidly outputs an image with remarkably better resolution, matching the performance of higher-numerical-aperture lenses while significantly surpassing their limited field of view and depth of field. These results are transformative for the many fields that use microscopy tools, including the life sciences, where optical microscopy is one of the most widely used and deployed techniques. Beyond such applications, our approach is broadly applicable to other imaging modalities, spanning different parts of the electromagnetic spectrum, and can be used to design computational imagers that get better as they continue to image specimens and establish new transformations among different modes of imaging.

428 citations

Journal ArticleDOI
20 Nov 2017
TL;DR: In this paper, a deep neural network is used to improve optical microscopy, enhancing its spatial resolution over a large field of view and depth of field; the only input to the network is an image acquired using a regular optical microscope, without any changes to its design.
Abstract: We demonstrate that a deep neural network can significantly improve optical microscopy, enhancing its spatial resolution over a large field of view and depth of field. After training, the only input to this network is an image acquired using a regular optical microscope, without any changes to its design. We blindly tested this deep learning approach using various tissue samples imaged with low-resolution, wide-field systems, where the network rapidly outputs an image with better resolution, matching the performance of higher-numerical-aperture lenses and significantly surpassing their limited field of view and depth of field. These results are significant for various fields that use microscopy tools, including, e.g., the life sciences, where optical microscopy is considered one of the most widely used and deployed techniques. Beyond such applications, the presented approach might be applicable to other imaging modalities, spanning different parts of the electromagnetic spectrum, and can be used to design computational imagers that get better as they continue to image specimens and establish new transformations among different modes of imaging.

377 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide water resources scientists and hydrologists with a simple technical overview, trans-disciplinary progress update, and a source of inspiration about the relevance of deep learning to water.
Abstract: Deep learning (DL), a new generation of artificial neural network research, has transformed industries, daily lives, and various scientific disciplines in recent years. DL represents significant progress in the ability of neural networks to automatically engineer problem-relevant features and capture highly complex data distributions. I argue that DL can help address several major challenges, new and old, facing research in the water sciences, such as interdisciplinarity, data discoverability, hydrologic scaling, equifinality, and the need for parameter regionalization. This review paper is intended to provide water resources scientists, and hydrologists in particular, with a simple technical overview, a trans-disciplinary progress update, and a source of inspiration about the relevance of DL to water. The review reveals that various physical and geoscientific disciplines have used DL to address data challenges, improve efficiency, and gain scientific insights. DL is especially suited to extracting information from image-like and sequential data, and techniques and experiences from other disciplines are highly relevant to water research. Less noticed is that DL may also serve as a scientific exploratory tool. A new area termed 'AI neuroscience,' in which scientists interpret the decision processes of deep networks and derive insights, has been born. This budding sub-discipline has demonstrated methods including correlation-based analysis, inversion of network-extracted features, reduced-order approximations by interpretable models, and attribution of network decisions to inputs. Moreover, DL can also use data to condition neurons that mimic problem-specific fundamental organizing units, thus revealing emergent behaviors of these units. Vast opportunities exist for DL to propel advances in the water sciences.

260 citations

Proceedings ArticleDOI
Jianzhong He, Shiliang Zhang, Ming Yang, Yanhu Shan, Tiejun Huang
15 Jun 2019
TL;DR: In this paper, the authors propose a Bi-Directional Cascade Network (BDCN) structure, where an individual layer is supervised by labeled edges at its specific scale, rather than directly applying the same supervision to all CNN outputs.
Abstract: Exploiting multi-scale representations is critical to improving edge detection for objects at different scales. To extract edges at dramatically different scales, we propose a Bi-Directional Cascade Network (BDCN) structure in which an individual layer is supervised by labeled edges at its specific scale, rather than directly applying the same supervision to all CNN outputs. Furthermore, to enrich the multi-scale representations learned by BDCN, we introduce a Scale Enhancement Module (SEM) that uses dilated convolution to generate multi-scale features, instead of using deeper CNNs or explicitly fusing multi-scale edge maps. These new approaches encourage the learning of multi-scale representations in different layers and detect edges that are well delineated by their scales. Learning scale-dedicated layers also results in a compact network with a fraction of the parameters. We evaluate our method on three datasets, i.e., BSDS500, NYUDv2, and Multicue, and achieve an ODS F-measure of 0.828 on BSDS500, 1.3% higher than the current state-of-the-art.

204 citations
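The dilated-convolution trick behind the Scale Enhancement Module is easy to see in one dimension: spacing the kernel taps apart widens the receptive field with no extra parameters. A minimal sketch, with a made-up signal and kernel:

```python
def dilated_conv1d(x, kernel, dilation=1):
    """1-D convolution with a dilation factor: kernel taps are
    spaced `dilation` samples apart, enlarging the receptive field
    without adding parameters ('valid' padding, no flipping)."""
    k = len(kernel)
    span = (k - 1) * dilation + 1   # effective receptive field
    return [sum(kernel[t] * x[i + t * dilation] for t in range(k))
            for i in range(len(x) - span + 1)]

signal = [1, 2, 3, 4, 5]
edge_kernel = [1, 0, -1]            # crude difference filter
dense = dilated_conv1d(signal, edge_kernel, dilation=1)   # span 3
sparse = dilated_conv1d(signal, edge_kernel, dilation=2)  # span 5
```

With `dilation=2` the same three-tap kernel covers five input samples, which is the sense in which SEM generates multi-scale features from one set of weights.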

Proceedings ArticleDOI
05 Apr 2017
TL;DR: This article proposed a deep layer cascade (LC) method to improve the accuracy and speed of semantic segmentation, which treats a single deep model as a cascade of several sub-models, and progressively feed forward harder regions to the next sub-model for processing.
Abstract: We propose a novel deep layer cascade (LC) method to improve the accuracy and speed of semantic segmentation. Unlike the conventional model cascade (MC), which is composed of multiple independent models, LC treats a single deep model as a cascade of several sub-models. Earlier sub-models are trained to handle easy and confident regions, and they progressively feed forward harder regions to the next sub-model for processing. Convolutions are calculated only on these regions to reduce computation. The proposed method has several advantages. First, LC classifies most of the easy regions in the shallow stage and lets deeper stages focus on a few hard regions; such adaptive, difficulty-aware learning improves segmentation performance. Second, LC accelerates both training and testing of the deep network thanks to early decisions in the shallow stage. Third, in comparison to MC, LC is an end-to-end trainable framework, allowing joint learning of all sub-models. We evaluate our method on the PASCAL VOC and Cityscapes datasets, achieving state-of-the-art performance and fast speed.

194 citations
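The easy-first control flow of such a cascade can be sketched generically. The stages, threshold, and confidence values below are invented for illustration, and the real LC gates per-region convolutions rather than whole inputs:

```python
def cascade_predict(items, stages, threshold=0.9):
    """Run items through a cascade of stages, each mapping an item
    to (label, confidence). An item stops at the first stage whose
    confidence clears the threshold; only the hard leftovers are
    forwarded to deeper stages. The last stage decides everything
    that remains."""
    results = {}
    pending = list(enumerate(items))
    for depth, stage in enumerate(stages):
        still_hard = []
        for idx, x in pending:
            label, conf = stage(x)
            if conf >= threshold or depth == len(stages) - 1:
                results[idx] = label        # early (or final) decision
            else:
                still_hard.append((idx, x))  # defer to deeper stage
        pending = still_hard
    return [results[i] for i in range(len(items))]

# Toy stages: the shallow stage is confident only on small inputs
shallow = lambda x: ("easy", 0.95) if x < 5 else ("unsure", 0.3)
deep = lambda x: ("hard", 0.8)
labels = cascade_predict([1, 9], [shallow, deep], threshold=0.9)
```

The speedup comes from the same mechanism as in the paper: most items never reach the expensive deeper stages.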