Proceedings ArticleDOI

Deep GoogLeNet Features for Visual Object Tracking

TL;DR: This study demonstrates, for the first time, the viability of features extracted from deep layers of the GoogLeNet CNN architecture for object tracking, integrating GoogLeNet features into a discriminative correlation filter based tracking framework.
Abstract: Convolutional Neural Networks (CNNs) have recently become very popular in visual object tracking due to their strong feature representation capabilities. Almost all current CNN-based trackers use features extracted from the shallow convolutional layers of the VGGNet architecture. This paper presents an investigation of the impact of deep convolutional layer features in an object tracking framework. In this study, we demonstrate for the first time the viability of features extracted from deep layers of the GoogLeNet CNN architecture for the purpose of object tracking. We integrated GoogLeNet features into a discriminative correlation filter based tracking framework. Our experimental results show that the GoogLeNet features provide significant computational advantages over the conventionally used VGGNet features, without much compromise in tracking performance. It was observed that features obtained from the inception modules of GoogLeNet have high channel depth. Principal Component Analysis (PCA) was therefore employed to reduce the dimensionality of the extracted features, which greatly reduces the computational cost and thus improves the speed of the tracking process. Extensive evaluations have been performed on three benchmark datasets (OTB, ALOV300++ and VOT2016), with performance measured in terms of metrics such as F-score, One Pass Evaluation, robustness and accuracy.
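
To make the described pipeline concrete, below is a minimal Python sketch (not the authors' code) of the two steps the abstract outlines: pulling a deep feature map from a GoogLeNet inception module and compressing its channels with PCA before handing it to a correlation-filter tracker. The chosen layer (inception4e), the input size and the number of PCA components are illustrative assumptions, and a recent torchvision is assumed.

```python
# Minimal sketch: deep GoogLeNet features + PCA compression for a DCF-style tracker.
import torch
import torchvision
from sklearn.decomposition import PCA

model = torchvision.models.googlenet(weights="IMAGENET1K_V1").eval()

features = {}
def hook(_module, _inp, out):
    features["deep"] = out.detach()

model.inception4e.register_forward_hook(hook)   # a deep inception module (832 channels)

patch = torch.rand(1, 3, 224, 224)              # stand-in for a target image patch
with torch.no_grad():
    model(patch)

fmap = features["deep"][0]                      # (C, H, W), high channel depth
c, h, w = fmap.shape
pixels = fmap.permute(1, 2, 0).reshape(-1, c).numpy()   # one sample per spatial cell

pca = PCA(n_components=64)                      # compress channels, e.g. 832 -> 64
reduced = pca.fit_transform(pixels).reshape(h, w, 64)
print(reduced.shape)                            # compact feature map for the correlation filter
```
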
Citations
Journal ArticleDOI
01 Apr 2020-Symmetry
TL;DR: The main idea is to collect all COVID-19 images available at the time of writing and use a GAN to generate additional images, helping to detect the virus from the available X-ray images with the highest possible accuracy.
Abstract: The coronavirus (COVID-19) pandemic is putting healthcare systems across the world under unprecedented and increasing pressure, according to the World Health Organization (WHO). With advances in computer algorithms and especially Artificial Intelligence, detecting this type of virus in its early stages will help patients recover quickly and relieve pressure on healthcare systems. In this paper, a GAN with deep transfer learning for coronavirus detection in chest X-ray images is presented. The lack of COVID-19 datasets, especially of chest X-ray images, is the main motivation of this study. The main idea is to collect all COVID-19 images available at the time of writing and use a GAN to generate additional images, helping to detect the virus from the available X-ray images with the highest possible accuracy. The dataset used in this research was collected from different sources and is available for researchers to download and use. The collected dataset contains 307 images across four classes: COVID-19, normal, pneumonia bacterial, and pneumonia virus. Three deep transfer models are selected for investigation: AlexNet, GoogLeNet, and ResNet18. These models were chosen because their architectures contain a relatively small number of layers, which reduces complexity, memory consumption and execution time for the proposed model. Three scenarios are tested in the paper: the first includes four classes from the dataset, the second includes three classes, and the third includes two classes. All scenarios include the COVID-19 class, as its detection is the main target of this research. In the first scenario, GoogLeNet is selected as the main deep transfer model, achieving 80.6% testing accuracy. In the second scenario, AlexNet is selected as the main deep transfer model, achieving 85.2% testing accuracy, while in the third scenario, which includes two classes (COVID-19 and normal), GoogLeNet is selected as the main deep transfer model, achieving 100% testing accuracy and 99.9% validation accuracy. All the performance measurements strengthen the results obtained in this research.
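
As a rough illustration of the deep-transfer-learning step described above (not the authors' released code), the sketch below replaces the classification head of a pretrained GoogLeNet for the four chest X-ray classes. The dataset path, preprocessing and hyperparameters are assumptions, and a recent torchvision is assumed.

```python
# Hedged sketch: transfer learning with GoogLeNet for 4-class chest X-ray classification.
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms

num_classes = 4          # COVID-19, normal, pneumonia bacterial, pneumonia virus
model = torchvision.models.googlenet(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, num_classes)   # new classification head

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("covid_xray/train", transform=tfm)  # hypothetical path
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:                 # one illustrative epoch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```
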

391 citations


Cites methods from "Deep GoogLeNet Features for Visual ..."

  • ...The used deep transfer learning CNN models investigated in this research are Alexnet [29], Resnet18 [39], Googlenet [60], The mentioned CNN models had a few numbers of layers if it is compared to large CNN models such as Xception [40], Densenet [42], and Inceptionresnet [61] which consist of 71, 201 and 164 layers accordingly....

Journal ArticleDOI
TL;DR: In this paper, cluster-based one-shot learning is introduced for detecting COVID-19 from chest X-ray images; it has the advantage of learning from only a few samples, in contrast to the many samples required by deep learning architectures.
Abstract: Coronavirus disease (COVID-19) had infected more than 28.3 million people around the globe and killed 913K people worldwide as of 11 September 2020. To combat the spread of COVID-19 in this pandemic, effective testing methodologies and immediate medical treatment are much needed. Chest X-rays are a widely available modality for immediate diagnosis of COVID-19. Hence, automating the detection of COVID-19 from chest X-ray images using machine learning approaches is in great demand. A model for detecting COVID-19 from chest X-ray images is proposed in this paper. A novel concept of cluster-based one-shot learning is introduced in this work. The introduced concept has the advantage of learning from a few samples, in contrast to the many samples required by deep learning architectures. The proposed model is a multi-class classification model, as it classifies images of four classes, viz., pneumonia bacterial, pneumonia virus, normal, and COVID-19. The proposed model is based on an ensemble of Generalized Regression Neural Network (GRNN) and Probabilistic Neural Network (PNN) classifiers at the decision level. The effectiveness of the proposed model has been demonstrated through extensive experimentation on a publicly available dataset consisting of 306 images. The proposed cluster-based one-shot learning has been found to be more effective with the GRNN and PNN ensemble model in distinguishing COVID-19 images from those of the other three classes. It has also been experimentally observed that the model outperforms contemporary deep learning architectures. The concept of cluster-based one-shot learning is the first of its kind in the literature and is expected to open up several new dimensions in the field of machine learning that warrant further research for various applications.
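
The decision-level ensemble can be pictured with the short numpy sketch below. It is one plausible reading of the GRNN/PNN fusion, not the authors' implementation; the kernel widths, feature vectors and score-averaging fusion rule are assumptions made for illustration.

```python
# Simplified sketch of a decision-level GRNN + PNN ensemble over class prototypes.
import numpy as np

def class_scores(x, prototypes, labels, num_classes, sigma=1.0):
    """Gaussian-kernel class scores, the common core of PNN and one-hot GRNN."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))               # kernel response to each prototype
    scores = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        scores[c] = k[mask].mean() if mask.any() else 0.0
    return scores / (scores.sum() + 1e-12)

def ensemble_predict(x, prototypes, labels, num_classes, sigma_pnn=0.5, sigma_grnn=2.0):
    """Fuse the two classifiers at decision level by averaging their score vectors."""
    p_pnn = class_scores(x, prototypes, labels, num_classes, sigma_pnn)
    p_grnn = class_scores(x, prototypes, labels, num_classes, sigma_grnn)
    return int(np.argmax(0.5 * (p_pnn + p_grnn)))

# Toy usage: one prototype ("shot") per class, as in one-shot learning.
protos = np.random.rand(4, 128)                        # 4 classes, 128-D image features
y = np.array([0, 1, 2, 3])
print(ensemble_predict(np.random.rand(128), protos, y, num_classes=4))
```
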

40 citations

Posted ContentDOI
27 Jul 2020
TL;DR: Experiments conducted on publicly available chest X-ray images demonstrate that the proposed cluster-based one-shot approach detects COVID-19 with high precision and outperforms many existing convolutional neural network based methods proposed in the literature.
Abstract: Coronavirus disease (COVID-19) had infected more than 28.3 million people around the globe and killed 913K people worldwide as of 11 September 2020. To combat the spread of COVID-19 in this pandemic, effective testing methodologies and immediate medical treatment are much needed. Chest X-rays are a widely available modality for immediate diagnosis of COVID-19. Hence, automating the detection of COVID-19 from chest X-ray images using machine learning approaches is in great demand. A model for detecting COVID-19 from chest X-ray images is proposed in this paper. A novel concept of cluster-based one-shot learning is introduced in this work. The introduced concept has the advantage of learning from a few samples, in contrast to the many samples required by deep learning architectures. The proposed model is a multi-class classification model, as it classifies images of four classes, viz., pneumonia bacterial, pneumonia virus, normal, and COVID-19. The proposed model is based on an ensemble of Generalized Regression Neural Network (GRNN) and Probabilistic Neural Network (PNN) classifiers at the decision level. The effectiveness of the proposed model has been demonstrated through extensive experimentation on a publicly available dataset consisting of 306 images. The proposed cluster-based one-shot learning has been found to be more effective with the GRNN and PNN ensemble model in distinguishing COVID-19 images from those of the other three classes. It has also been experimentally observed that the model outperforms contemporary deep learning architectures. The concept of cluster-based one-shot learning is the first of its kind in the literature and is expected to open up several new dimensions in the field of machine learning that warrant further research for various applications.

31 citations


Cites result from "Deep GoogLeNet Features for Visual ..."

  • ...From the Table 1 it is evident that the proposed idea with only 2 samples (one from each class) we were able to achieve 100% detection rate and the same has been compared with well-known deep learning models such as AlexNet [24], GoogLeNet [25] and ResNet [26]....

Journal ArticleDOI
TL;DR: YOLO, YOLO-conv, GoogLeNet and ResNet18 are computationally efficient detectors that require less processing time and are suitable for real-time detection, whereas ResNet50 is computationally expensive.
Abstract: Parking has been a common problem for several years in many cities around the globe. The search for parking space leads to congestion, frustration and increased air pollution. Information on vacant parking spaces would help reduce congestion and the resulting air pollution. The aim of this study is therefore to determine vehicle occupancy in an open parking lot using deep learning. A thermal camera was used to collect videos under varying environmental conditions, and frames from these videos were extracted to prepare the dataset. The frames in the dataset were manually labelled, as no pre-labelled thermal images were available. Vehicle detection with deep learning algorithms was implemented to perform multi-object detection. Multiple deep learning networks, such as YOLO, YOLO-conv, GoogLeNet, ResNet18 and ResNet50, with varying layers and architectures, were evaluated on vehicle detection. ResNet18 performed better than the other detectors, with an average precision of 96.16 and a log-average miss rate of 19.40. The detected results were compared with a template of parking spaces to identify vehicle occupancy information. YOLO, YOLO-conv, GoogLeNet and ResNet18 are computationally efficient detectors that require less processing time and are suitable for real-time detection, whereas ResNet50 is computationally expensive.
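
The final occupancy step, matching detected vehicle boxes against a fixed template of parking spaces, can be sketched as below. The IoU threshold and box coordinates are illustrative and not taken from the paper.

```python
# Sketch: mark parking spaces as occupied when a detected vehicle box overlaps them.
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def occupancy(spaces, detections, thr=0.3):
    """A space is occupied if any detected vehicle overlaps it enough."""
    return {i: any(iou(s, d) >= thr for d in detections) for i, s in enumerate(spaces)}

# Toy usage with two parking spaces and one detected vehicle.
spaces = [(0, 0, 50, 100), (60, 0, 110, 100)]
detections = [(5, 10, 48, 95)]
print(occupancy(spaces, detections))   # {0: True, 1: False}
```
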

18 citations

Journal ArticleDOI
TL;DR: Experiments based on OSM data from Tianjin city, China, revealed that compared with state‐of‐the‐art methods, the proposed method effectively identified more types of complex junctions and achieved a significantly higher identification accuracy.

11 citations

References
Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
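
The core building block the abstract refers to can be sketched compactly. The snippet below is an illustrative PyTorch rendering of one Inception module, with parallel 1x1, 3x3, 5x5 and pooled branches concatenated along the channel axis; channel counts follow the paper's inception(3a) block, but this is not the reference implementation.

```python
# Illustrative Inception module: four parallel branches concatenated channel-wise.
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.ReLU(inplace=True))
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, c3_red, 1), nn.ReLU(inplace=True),
                                nn.Conv2d(c3_red, c3, 3, padding=1), nn.ReLU(inplace=True))
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c5_red, 1), nn.ReLU(inplace=True),
                                nn.Conv2d(c5_red, c5, 5, padding=2), nn.ReLU(inplace=True))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU(inplace=True))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

# inception(3a): 192 input channels -> 64 + 128 + 32 + 32 = 256 output channels
block = InceptionModule(192, 64, 96, 128, 16, 32, 32)
print(block(torch.rand(1, 192, 28, 28)).shape)   # torch.Size([1, 256, 28, 28])
```
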

40,257 citations

Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Posted Content
TL;DR: This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
Abstract: Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012---achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also compare R-CNN to OverFeat, a recently proposed sliding-window detector based on a similar CNN architecture. We find that R-CNN outperforms OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset. Source code for the complete system is available at this http URL.
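
As a rough sketch of the pipeline this abstract describes (region proposals warped to a fixed size, then passed through a CNN to obtain fixed-length features), the snippet below uses a pretrained AlexNet backbone as a stand-in. Proposal generation (selective search in the paper) and the downstream per-class linear SVMs are omitted, and all names are illustrative assumptions rather than the released R-CNN code.

```python
# Sketch of the R-CNN feature stage: warp each proposal and extract CNN features.
import torch
import torchvision
from torchvision import transforms

backbone = torchvision.models.alexnet(weights="IMAGENET1K_V1").eval()
featurizer = torch.nn.Sequential(backbone.features, backbone.avgpool, torch.nn.Flatten())

warp = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def rcnn_features(image, proposals):
    """image: PIL.Image; proposals: list of (x1, y1, x2, y2) region boxes."""
    crops = [warp(image.crop(box)) for box in proposals]   # warp each region to a fixed size
    with torch.no_grad():
        return featurizer(torch.stack(crops))              # (num_proposals, 9216) feature vectors

# The per-class scoring stage (linear SVMs in the paper) would then operate on
# these fixed-length feature vectors.
```
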

13,081 citations


"Deep GoogLeNet Features for Visual ..." refers background in this paper

  • ...Deep convolutional neural networks have clearly shown excellent performance in object recognition and object detection problems [6], [7], [21], and are therefore of interest for visual object tracking....

Journal ArticleDOI
TL;DR: The goal of this article is to review state-of-the-art tracking methods, classify them into different categories, identify new trends, and discuss important issues related to tracking, including the use of appropriate image features, selection of motion models, and detection of objects.
Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

5,318 citations

Journal ArticleDOI
TL;DR: A new kernelized correlation filter is derived, that unlike other kernel algorithms has the exact same complexity as its linear counterpart, which is called dual correlation filter (DCF), which outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite being implemented in a few lines of code.
Abstract: The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies—any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the discrete Fourier transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new kernelized correlation filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call dual correlation filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite running at hundreds of frames-per-second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.
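
The Fourier-domain trick at the heart of this reference can be condensed into a few numpy lines. The sketch below covers only the single-channel, linear-kernel case: a Gaussian label, no windowing, and a regularisation value chosen for illustration rather than taken from the paper.

```python
# Sketch of correlation-filter training/detection done element-wise in the Fourier domain.
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Desired response: a Gaussian peak, rolled so the peak sits at (0, 0)."""
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
    return np.roll(np.roll(g, -h // 2, axis=0), -w // 2, axis=1)

def train(x, y, lam=1e-4):
    X, Y = np.fft.fft2(x), np.fft.fft2(y)
    kxx = np.conj(X) * X / x.size              # linear-kernel auto-correlation (Fourier domain)
    return Y / (kxx + lam)                     # filter coefficients alpha_hat

def detect(alpha_hat, x, z):
    kxz = np.conj(np.fft.fft2(x)) * np.fft.fft2(z) / x.size
    return np.real(np.fft.ifft2(alpha_hat * kxz))   # response map; argmax gives the shift

# Toy usage on a random grayscale patch.
patch = np.random.rand(64, 64)
alpha = train(patch, gaussian_label(64, 64))
shift = np.unravel_index(np.argmax(detect(alpha, patch, np.roll(patch, 3, axis=1))), (64, 64))
print(shift)   # approximately (0, 3): the detected translation
```
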

4,994 citations


"Deep GoogLeNet Features for Visual ..." refers methods in this paper

  • ...Some of the popularly used features are Histogram of Oriented gradients (HOG) [16], Color names [13] and CNN features [8]–[10]....

  • ...Feature representations such as HOG [16], Color Names etc. [13], [18] have been successfully employed in DCF based tracking frameworks....

  • ...Till 2015, most of the trackers used the hand-crafted appearance features, such as HOG and color names for modelling the target object....