Proceedings ArticleDOI

Object Detection Techniques: Overview and Performance Comparison

TL;DR: This paper discusses both methods of object detection and compares them in terms of accuracy, complexity, and practicality, showing the advantages and limitations of each method as well as possibilities for improvement.
Abstract: Object detection algorithms are improving by the minute. There are many common libraries and application program interfaces (APIs) to choose from. The two most common are Microsoft Azure Cloud object detection and Google Tensorflow object detection. The first is an online, network-based API, while the second is an offline, machine-based API. Both have their advantages and disadvantages. A direct comparison between the most common object detection methods helps in finding the best solution for advanced system integration. This paper discusses both methods and compares them in terms of accuracy, complexity, and practicality. It shows the advantages and limitations of each method, as well as possibilities for improvement.
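An accuracy comparison like the one described here is typically scored by matching each detection against ground-truth boxes using intersection-over-union (IoU). The sketch below is a generic illustration of that metric, not code from the paper:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A detection is conventionally counted as correct when IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

With a fixed IoU threshold, counting matched detections across a test set gives the raw accuracy figures that such API comparisons report.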


Citations
Journal ArticleDOI
TL;DR: The performance comparison with two existing systems shows that the proposed system performs very close to the benchmarks, with the advantages of portability, ease of use, and no requirement for cloud services.
Abstract: This article presents an indoor assistive system that addresses the challenges faced by visually impaired individuals. The proposed system helps visually impaired individuals to move indoors and...

5 citations


Cites background or methods from "Object Detection Techniques: Overvi..."

  • ...Our prior work presented in (Noman et al., 2019), reports a detailed comparison of the performance of two existing systems, Azure Cloud and Microsoft Kinect, used as benchmark in this study....


  • ...A detailed performance study of these two existing systems can be found in (Noman et al., 2019)....


Proceedings ArticleDOI
11 Aug 2021
TL;DR: In this article, a Tensorflow neural network is used as the object detector; it processes images in real time and reports human objects detected on the ground. Once the drone detects signs of human presence in real-time aerial video, it autonomously navigates toward the suspicious location to get a better view and verify the presence.
Abstract: Drones can be very useful in search and rescue missions, primarily due to their aerial imaging capability. Drones can assist ground teams searching for a missing person, and law enforcement officers in crowd control. But there are instances where the observer at the ground control station misses subtle information in the video feed coming from the drone. In fact, it is very difficult for the observer to stay vigilant while viewing many images looking for a sign of a human. This paper presents a drone-based human identification system to make search and rescue missions more effective. Once the drone detects signs of human presence in real-time aerial video, it autonomously navigates toward the suspicious location to get a better view and verify the presence. The drone is fully autonomous, with operator override as an option. It processes images and sends selected frames to the operator for verification and issuance of instructions for follow-up actions. A custom-built Tensorflow neural network is used as the object detector; it processes images in real time and reports human objects detected on the ground. A single-board computer, while running the real-time image processing, also reads the GPS location and generates flight commands to the flight controller for autonomous navigation.
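The navigation step described above, steering from the drone's GPS position toward a detection, reduces at its core to computing an initial heading between two coordinates. A generic sketch of that calculation, assuming WGS-84 latitude/longitude in degrees; this is an illustration of the geometry involved, not the authors' flight code:

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 toward point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(x, y)) % 360.0

# Due east of the current position -> heading 90 degrees.
print(bearing_deg(0.0, 0.0, 0.0, 1.0))
```

A flight controller would receive this heading (plus a distance or waypoint) as its navigation command.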

4 citations

Proceedings ArticleDOI
Vivek K K1, Sidharth R1, Rohit P1, Vishagan S1, Peeyush K. P1 
04 Aug 2021
TL;DR: In this paper, an autonomous robot uses a deep learning algorithm to distinguish weeds from crops, so that pesticides can be applied wisely and weeds removed with less human effort; the robot is connected through the cloud (internet) to keep it light and simple.
Abstract: How can we create a future in which both people and nature survive? This is the biggest question of our time! In the next few decades, we need to act and find a sustainable coexistence on the earth. Over the past years, we have grown with nature and learned to tame the wild; in the process, our population boomed, and so did our demands. So, our ecological footprint tends to rise. This can be addressed by upgrading to efficient food production and reducing our consumption of meat. We would require far less land and resources for ourselves, leaving more land for grasslands and reducing deforestation. So, we decided to take a small step toward efficient food production in agriculture by helping farmers reduce the usage of pesticides and improve the efficiency of production. Hence, we designed an autonomous robot that uses a deep learning algorithm to tell the difference between weed and plant, so that pesticides can be used wisely and weeds removed properly with less human effort; we connected this robot through the cloud (internet) to make it light and simple. The algorithm helps to identify the pests and the crops precisely, and the process is actuated via the internet.

1 citation

Journal ArticleDOI
TL;DR: The ability to record detailed distance data with cell phones facilitates the production of high-quality three-dimensional scans in a discreet manner, which directly threatens the security of private compounds; it therefore behooves the organizations in charge of private compounds to detect LiDAR activity.
Abstract: Secured compounds often safeguard physical layout details of both internal and external facilities, but these details are at risk due to the growing inclusion of Light Detection and Ranging (LiDAR) sensors in consumer off-the-shelf (COTS) technology such as cell phones. The ability to record detailed distance data with cell phones facilitates the production of high-quality three-dimensional scans in a discreet manner, which directly threatens the security of private compounds. Therefore, it behooves the organizations in charge of private compounds to detect LiDAR activity. Many security cameras already detect LiDAR sources as generic light sources in specific conditions, but further analysis must identify these light sources as LiDAR sources in order to alert an organization of a potential security incident. Testing confirms the feasibility of identifying some LiDAR sources based on the color and intensity of light shined directly into a camera sensor, but this analysis proves inadequate for cell phone LiDAR. However, the unique intensity and pattern characteristics of cell phone LiDAR reflected off a surface can potentially be identified by an object identification machine learning model. In order to train a model to identify a LiDAR object, we must first produce a training dataset containing marked and labelled LiDAR objects. To do this, we apply an image thresholding algorithm to isolate the LiDAR object in an image to calculate its bounding box. The image thresholding algorithm directly affects the bounding box accuracy, so we test two different algorithms and find that Otsu's image thresholding algorithm performs best, resulting in 99.5% accurate bounding boxes.
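The threshold-then-box step described above can be sketched in a few lines: Otsu's method picks the grayscale threshold that maximizes between-class variance, and the bounding box is then taken from the extreme foreground pixel coordinates. This is an illustrative reimplementation on a synthetic image, not the authors' code:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: the threshold maximizing between-class variance of a uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    mu_total = np.dot(np.arange(256), hist) / total
    best_t, best_var = 0, -1.0
    cum_w = cum_mu = 0.0
    for t in range(256):
        cum_w += hist[t]
        cum_mu += t * hist[t]
        w0 = cum_w / total
        if w0 in (0.0, 1.0):  # one class is empty; skip
            continue
        mu0 = cum_mu / cum_w
        mu1 = (mu_total * total - cum_mu) / (total - cum_w)
        var = w0 * (1.0 - w0) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def bounding_box(mask):
    """Tight (x1, y1, x2, y2) box around True pixels of a non-empty boolean mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic example: a bright "LiDAR spot" on a dark background.
img = np.zeros((64, 64), dtype=np.uint8)
img[20:30, 40:50] = 200
t = otsu_threshold(img)
print(bounding_box(img > t))  # (40, 20, 49, 29)
```

In practice the same split is available as `cv2.threshold(..., cv2.THRESH_BINARY + cv2.THRESH_OTSU)` in OpenCV; the accuracy figure quoted above measures how well the resulting boxes match hand-labelled ones.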
Journal ArticleDOI
TL;DR: In this paper, two different neural network architectures, based on Fully Convolutional Regression Networks (FCRN) and U-Shaped CNN for Image Segmentation (U-Net), were used to detect the number of undigested grains in dropping images.
Abstract: Simple Summary This study employs Fully Convolutional Regression Networks (FCRN) and U-Shaped Convolutional Network for Image Segmentation (U-Net) architectures tailored to a dataset of dropping images of dairy cows collected from three different private dairy farms in Nigde. The main purpose of this study is to detect the number of undigested grains in dropping images in order to give useful feedback to the raiser. It is a novel study that uses two different regression neural networks for object counting in dropping images. To our knowledge, it is the first study that counts objects in dropping images and provides information on how effectively dairy cows digest their daily rations. Abstract Deep learning algorithms can now be used to identify, locate, and count items in an image thanks to advancements in image processing technology. The successful application of image processing technology in different fields has attracted much attention in the field of agriculture in recent years. This research was done to ascertain the number of indigestible cereal grains in animal feces using an image processing method. In this study, a regression-based way of counting objects was used to predict the number of cereal grains in the feces. For this purpose, we developed two different neural network architectures based upon Fully Convolutional Regression Networks (FCRN) and U-Net. The images used in the study were obtained from three different dairy cow enterprises operating in Nigde Province. The dataset consists of 277 distinct dropping images of dairy cows on the farms. According to the findings of the study, both models yielded quite acceptable prediction accuracy, with U-Net providing slightly better predictions with an MAE value of 16.69 in the best case, compared to the 23.65 MAE value of FCRN with the same batch.
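The MAE figure used to compare the two counting networks is the mean absolute error between the true and predicted per-image counts. A minimal sketch with hypothetical grain counts (illustrative numbers only, not the paper's data):

```python
def mean_absolute_error(y_true, y_pred):
    """MAE between true and predicted object counts per image."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical per-image grain counts, for illustration only.
true_counts = [12, 30, 7, 45]
pred_counts = [10, 33, 7, 40]
print(mean_absolute_error(true_counts, pred_counts))  # 2.5
```

An MAE of 16.69 thus means the U-Net's predicted grain count was off by about 17 grains per image on average.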
References
Book ChapterDOI
06 Sep 2014
TL;DR: A new dataset with the goal of advancing the state of the art in object recognition by placing object recognition in the context of the broader question of scene understanding, achieved by gathering images of complex everyday scenes containing common objects in their natural context.
Abstract: We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

30,462 citations


"Object Detection Techniques: Overvi..." refers background in this paper

  • ...Common Objects in Context (COCO) [8] SSD-based model is one of them....


Proceedings ArticleDOI
27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
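YOLO's framing of detection as regression can be illustrated by the grid encoding it uses: each box center falls into one cell of an S×S grid, and the network regresses cell-relative center offsets plus box sizes normalized by the image dimensions. A simplified encoder/decoder, assuming S=7 and coordinates normalized to [0, 1); this is an illustration of the coordinate scheme, not the paper's code:

```python
S = 7  # grid size used by the base YOLO model

def encode(cx, cy, w, h):
    """Map a normalized box (center cx, cy in [0, 1)) to (cell, offsets, size)."""
    col, row = int(cx * S), int(cy * S)
    return (row, col), (cx * S - col, cy * S - row), (w, h)

def decode(cell, offset, size):
    """Invert encode(): recover the normalized box from grid-relative values."""
    row, col = cell
    ox, oy = offset
    return ((col + ox) / S, (row + oy) / S, size[0], size[1])

cell, off, size = encode(0.5, 0.5, 0.2, 0.3)
print(cell, decode(cell, off, size))  # center box lands in cell (3, 3)
```

Because every cell regresses its boxes and class probabilities in one pass, the whole image is processed in a single network evaluation, which is what makes the 45 fps figure possible.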

27,256 citations


"Object Detection Techniques: Overvi..." refers methods in this paper

  • ...To overcome these issues, two architectures were recently proposed: YOLO (You Only Look Once) [4] and SSD (Single Shot Multi-Box Detector) [5]....


Journal ArticleDOI
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features; using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
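The RPN's "anchors at each position" can be sketched concretely: at every feature-map location, candidate boxes of several scales and aspect ratios are centered on the corresponding image position. A generic illustration, assuming the paper's 3 scales × 3 ratios (9 anchors per position); this is a sketch of the anchor geometry, not the released code:

```python
import itertools

def anchors_at(x, y, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate (x1, y1, x2, y2) anchor boxes centered at image position (x, y)."""
    boxes = []
    for s, r in itertools.product(scales, ratios):
        # Keep the area near s*s constant while setting the aspect ratio w/h = r.
        w = s * r ** 0.5
        h = s / r ** 0.5
        boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2))
    return boxes

# One such set per feature-map cell; the RPN scores and refines each anchor.
print(len(anchors_at(8, 8)))  # 9 anchors per position
```

Sliding this anchor set across the whole feature map is what lets the RPN propose boxes everywhere in one convolutional pass.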

26,458 citations


"Object Detection Techniques: Overvi..." refers methods in this paper

  • ...Faster-R-CNN has the highest raw accuracy, and YOLO has worse accuracy compared to Faster-R-CNN and SSD, while SSD is very close to R-CNN in accuracy....


  • ...Some more recent improvements of the algorithm include Fast-R-CNN [2] and Faster-R-CNN [3]....


Posted Content
TL;DR: Faster R-CNN as discussed by the authors proposes a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features---using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.

23,183 citations

Proceedings ArticleDOI
23 Jun 2014
TL;DR: R-CNN as discussed by the authors combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Abstract: Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
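The mAP metric quoted here is the mean over classes of average precision (AP), computed from a confidence-ranked list of detections matched to ground truth. One common (non-interpolated) formulation of AP can be sketched as follows; this is an illustration of the metric, not the paper's evaluation code:

```python
def average_precision(ranked_hits, num_gt):
    """Non-interpolated AP for a confidence-ranked list of detections.

    ranked_hits: booleans, True where the detection matched a ground-truth object.
    num_gt: total number of ground-truth objects for this class.
    """
    ap, tp = 0.0, 0
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit:
            tp += 1
            ap += tp / rank  # precision at each new recall point
    return ap / num_gt if num_gt else 0.0

# Two ground-truth objects; hits at ranks 1 and 3.
print(average_precision([True, False, True], num_gt=2))  # (1.0 + 2/3) / 2 ≈ 0.833
```

PASCAL VOC historically used an 11-point interpolated variant of this quantity; either way, mAP averages the per-class AP values into a single number like the 53.3% reported above.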

21,729 citations
