Proceedings ArticleDOI

Object Detection Techniques: Overview and Performance Comparison

TL;DR: This paper discusses both methods of object detection and compares them in terms of accuracy, complexity, and practicality, showing the advantages and limitations of each method as well as possibilities for improvement.
Abstract: Object detection algorithms are improving by the minute. There are many common libraries and application program interfaces (APIs) to choose from. The two most common are Microsoft Azure Cloud object detection and Google Tensorflow object detection. The first is an online, network-based API, while the second is an offline, machine-based API. Both have their advantages and disadvantages. A direct comparison between the most common object detection methods helps in finding the best solution for advanced system integration. This paper discusses both methods and compares them in terms of accuracy, complexity, and practicality. It shows the advantages and limitations of each method, as well as possibilities for improvement.
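An accuracy comparison like the one described here is typically scored by matching each detection against ground-truth boxes using intersection-over-union (IoU). The sketch below is a generic illustration of that metric, not code from the paper:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A detection is conventionally counted as correct when IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

With a fixed IoU threshold, counting matched detections across a test set gives the raw accuracy figures that such API comparisons report.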


Citations
Journal ArticleDOI
TL;DR: The performance comparison with two existing systems shows that the proposed system performs very close to the benchmarks, with the advantages of portability, ease of use, and no requirement for cloud services.
Abstract: This article presents an indoor assistive system that addresses the challenges faced by visually impaired individuals. The proposed system helps visually impaired individuals to move indoors and...

5 citations


Cites background or methods from "Object Detection Techniques: Overvi..."

  • ...Our prior work presented in (Noman et al., 2019), reports a detailed comparison of the performance of two existing systems, Azure Cloud and Microsoft Kinect, used as benchmark in this study....


  • ...A detailed performance study of these two existing systems can be found in (Noman et al., 2019)....


Proceedings ArticleDOI
11 Aug 2021
TL;DR: In this article, a Tensorflow neural network is used as the object detector; it processes images in real time and reports human objects detected on the ground. Once the drone detects signs of human presence in real-time aerial video, it autonomously navigates toward the suspicious location to get a better view and verify the presence.
Abstract: Drones can be very useful in search and rescue missions, primarily due to their aerial imaging capability. Drones can assist ground teams searching for a missing person, and law enforcement officers in crowd control. But there are instances where the observer at the ground control station misses subtle information in the video feed coming from the drone. In fact, it is very difficult for the observer to stay vigilant while viewing many images looking for a sign of a human. This paper presents a drone-based human identification system to make search and rescue missions more effective. Once the drone detects signs of human presence in real-time aerial video, it autonomously navigates toward the suspicious location to get a better view and verify the presence. The drone is fully autonomous, with operator override as an option. It processes images and sends selected frames to the operator for verification and issuance of instructions for follow-up actions. A custom-built Tensorflow neural network is used as the object detector; it processes images in real time and reports human objects detected on the ground. A single-board computer, while running the real-time image processing, also reads the GPS location and generates flight commands to the flight controller for autonomous navigation.
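The navigation step described above, steering from the drone's GPS position toward a detection, reduces at its core to computing an initial heading between two coordinates. A generic sketch of that calculation, assuming WGS-84 latitude/longitude in degrees; this is an illustration of the geometry involved, not the authors' flight code:

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 toward point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(x, y)) % 360.0

# Due east of the current position -> heading 90 degrees.
print(bearing_deg(0.0, 0.0, 0.0, 1.0))
```

A flight controller would receive this heading (plus a distance or waypoint) as its navigation command.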

4 citations

Proceedings ArticleDOI
Vivek K K1, Sidharth R1, Rohit P1, Vishagan S1, Peeyush K. P1 
04 Aug 2021
TL;DR: In this paper, an autonomous robot uses a deep learning algorithm to distinguish weeds from crops, so that pesticides can be applied wisely and weeds removed with less human effort; the robot is connected through the cloud (internet) to keep it light and simple.
Abstract: How can we create a future in which both people and nature survive? This is the biggest question of our time! In the next few decades, we need to act and find a sustainable coexistence on the earth. Over the past years, we have grown with nature and learned to tame the wild; in the process, our population boomed, and so did our demands. So, our ecological footprint tends to rise. This can be addressed by upgrading to efficient food production and reducing our consumption of meat. We would require far less land and resources for ourselves, leaving more land for grasslands and reducing deforestation. So, we decided to take a small step toward efficient food production in agriculture by helping farmers reduce the usage of pesticides and improve the efficiency of production. Hence, we designed an autonomous robot that uses a deep learning algorithm to tell the difference between weed and plant, so that pesticides can be used wisely and weeds removed properly with less human effort; we connected this robot through the cloud (internet) to make it light and simple. The algorithm helps to identify the pests and the crops precisely, and the process is actuated via the internet.

1 citation

Journal ArticleDOI
TL;DR: The ability to record detailed distance data with cell phones facilitates the production of high-quality three-dimensional scans in a discreet manner, which directly threatens the security of private compounds; it therefore behooves the organizations in charge of private compounds to detect LiDAR activity.
Abstract: Secured compounds often safeguard physical layout details of both internal and external facilities, but these details are at risk due to the growing inclusion of Light Detection and Ranging (LiDAR) sensors in consumer off-the-shelf (COTS) technology such as cell phones. The ability to record detailed distance data with cell phones facilitates the production of high-quality three-dimensional scans in a discreet manner, which directly threatens the security of private compounds. Therefore, it behooves the organizations in charge of private compounds to detect LiDAR activity. Many security cameras already detect LiDAR sources as generic light sources in specific conditions, but further analysis must identify these light sources as LiDAR sources in order to alert an organization of a potential security incident. Testing confirms the feasibility of identifying some LiDAR sources based on the color and intensity of light shined directly into a camera sensor, but this analysis proves inadequate for cell phone LiDAR. However, the unique intensity and pattern characteristics of cell phone LiDAR reflected off a surface can potentially be identified by an object identification machine learning model. In order to train a model to identify a LiDAR object, we must first produce a training dataset containing marked and labelled LiDAR objects. To do this, we apply an image thresholding algorithm to isolate the LiDAR object in an image to calculate its bounding box. The image thresholding algorithm directly affects the bounding box accuracy, so we test two different algorithms and find that Otsu's image thresholding algorithm performs best, resulting in 99.5% accurate bounding boxes.
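The threshold-then-box step described above can be sketched in a few lines: Otsu's method picks the grayscale threshold that maximizes between-class variance, and the bounding box is then taken from the extreme foreground pixel coordinates. This is an illustrative reimplementation on a synthetic image, not the authors' code:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: the threshold maximizing between-class variance of a uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    mu_total = np.dot(np.arange(256), hist) / total
    best_t, best_var = 0, -1.0
    cum_w = cum_mu = 0.0
    for t in range(256):
        cum_w += hist[t]
        cum_mu += t * hist[t]
        w0 = cum_w / total
        if w0 in (0.0, 1.0):  # one class is empty; skip
            continue
        mu0 = cum_mu / cum_w
        mu1 = (mu_total * total - cum_mu) / (total - cum_w)
        var = w0 * (1.0 - w0) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def bounding_box(mask):
    """Tight (x1, y1, x2, y2) box around True pixels of a non-empty boolean mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic example: a bright "LiDAR spot" on a dark background.
img = np.zeros((64, 64), dtype=np.uint8)
img[20:30, 40:50] = 200
t = otsu_threshold(img)
print(bounding_box(img > t))  # (40, 20, 49, 29)
```

In practice the same split is available as `cv2.threshold(..., cv2.THRESH_BINARY + cv2.THRESH_OTSU)` in OpenCV; the accuracy figure quoted above measures how well the resulting boxes match hand-labelled ones.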
Journal ArticleDOI
TL;DR: In this paper, two different neural network architectures, based on Fully Convolutional Regression Networks (FCRN) and U-Shaped CNN for Image Segmentation (U-Net), were used to detect the number of undigested grains in dropping images.
Abstract: Simple Summary This study employs Fully Convolutional Regression Networks (FCRN) and U-Shaped Convolutional Network for Image Segmentation (U-Net) architectures tailored to a dataset of dropping images of dairy cows collected from three different private dairy farms in Nigde. The main purpose of this study is to detect the number of undigested grains in dropping images in order to give useful feedback to the raiser. It is a novel study that uses two different regression neural networks for object counting in dropping images. To our knowledge, it is the first study that counts objects in dropping images and provides information on how effectively dairy cows digest their daily rations. Abstract Deep learning algorithms can now be used to identify, locate, and count items in an image thanks to advancements in image processing technology. The successful application of image processing technology in different fields has attracted much attention in the field of agriculture in recent years. This research was done to ascertain the number of indigestible cereal grains in animal feces using an image processing method. In this study, a regression-based way of counting objects was used to predict the number of cereal grains in the feces. For this purpose, we developed two different neural network architectures based upon Fully Convolutional Regression Networks (FCRN) and U-Net. The images used in the study were obtained from three different dairy cow enterprises operating in Nigde Province. The dataset consists of 277 distinct dropping images of dairy cows on the farms. According to the findings of the study, both models yielded quite acceptable prediction accuracy, with U-Net providing slightly better predictions with an MAE value of 16.69 in the best case, compared to the 23.65 MAE value of FCRN with the same batch.
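The MAE figure used to compare the two counting networks is the mean absolute error between the true and predicted per-image counts. A minimal sketch with hypothetical grain counts (illustrative numbers only, not the paper's data):

```python
def mean_absolute_error(y_true, y_pred):
    """MAE between true and predicted object counts per image."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical per-image grain counts, for illustration only.
true_counts = [12, 30, 7, 45]
pred_counts = [10, 33, 7, 40]
print(mean_absolute_error(true_counts, pred_counts))  # 2.5
```

An MAE of 16.69 thus means the U-Net's predicted grain count was off by about 17 grains per image on average.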
References
Book ChapterDOI
06 Sep 2014
TL;DR: A new dataset with the goal of advancing the state of the art in object recognition by placing object recognition in the context of the broader question of scene understanding, achieved by gathering images of complex everyday scenes containing common objects in their natural context.
Abstract: We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

30,462 citations


"Object Detection Techniques: Overvi..." refers background in this paper

  • ...Common Objects in Context (COCO) [8] SSD-based model is one of them....


Proceedings ArticleDOI
27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
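YOLO's framing of detection as regression can be illustrated by the grid encoding it uses: each box center falls into one cell of an S×S grid, and the network regresses cell-relative center offsets plus box sizes normalized by the image dimensions. A simplified encoder/decoder, assuming S=7 and coordinates normalized to [0, 1); this is an illustration of the coordinate scheme, not the paper's code:

```python
S = 7  # grid size used by the base YOLO model

def encode(cx, cy, w, h):
    """Map a normalized box (center cx, cy in [0, 1)) to (cell, offsets, size)."""
    col, row = int(cx * S), int(cy * S)
    return (row, col), (cx * S - col, cy * S - row), (w, h)

def decode(cell, offset, size):
    """Invert encode(): recover the normalized box from grid-relative values."""
    row, col = cell
    ox, oy = offset
    return ((col + ox) / S, (row + oy) / S, size[0], size[1])

cell, off, size = encode(0.5, 0.5, 0.2, 0.3)
print(cell, decode(cell, off, size))  # center box lands in cell (3, 3)
```

Because every cell regresses its boxes and class probabilities in one pass, the whole image is processed in a single network evaluation, which is what makes the 45 fps figure possible.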

27,256 citations


"Object Detection Techniques: Overvi..." refers methods in this paper

  • ...To overcome these issues, two architectures were recently proposed: YOLO (You Only Look Once) [4] and SSD (Single Shot Multi-Box Detector) [5]....


Journal ArticleDOI
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features; using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
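The RPN's "anchors at each position" can be sketched concretely: at every feature-map location, candidate boxes of several scales and aspect ratios are centered on the corresponding image position. A generic illustration, assuming the paper's 3 scales × 3 ratios (9 anchors per position); this is a sketch of the anchor geometry, not the released code:

```python
import itertools

def anchors_at(x, y, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate (x1, y1, x2, y2) anchor boxes centered at image position (x, y)."""
    boxes = []
    for s, r in itertools.product(scales, ratios):
        # Keep the area near s*s constant while setting the aspect ratio w/h = r.
        w = s * r ** 0.5
        h = s / r ** 0.5
        boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2))
    return boxes

# One such set per feature-map cell; the RPN scores and refines each anchor.
print(len(anchors_at(8, 8)))  # 9 anchors per position
```

Sliding this anchor set across the whole feature map is what lets the RPN propose boxes everywhere in one convolutional pass.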

26,458 citations


"Object Detection Techniques: Overvi..." refers methods in this paper

  • ...Faster-R-CNN has the highest raw accuracy, and YOLO has worse accuracy compared to Faster-R-CNN and SSD, while SSD is very close to R-CNN in accuracy....


  • ...Some more recent improvements of the algorithm include Fast-R-CNN [2] and Faster-R-CNN [3]....


Posted Content
TL;DR: Faster R-CNN as discussed by the authors proposes a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features---using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.

23,183 citations

Proceedings ArticleDOI
23 Jun 2014
TL;DR: R-CNN as discussed by the authors combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Abstract: Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
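The mAP metric quoted here is the mean over classes of average precision (AP), computed from a confidence-ranked list of detections matched to ground truth. One common (non-interpolated) formulation of AP can be sketched as follows; this is an illustration of the metric, not the paper's evaluation code:

```python
def average_precision(ranked_hits, num_gt):
    """Non-interpolated AP for a confidence-ranked list of detections.

    ranked_hits: booleans, True where the detection matched a ground-truth object.
    num_gt: total number of ground-truth objects for this class.
    """
    ap, tp = 0.0, 0
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit:
            tp += 1
            ap += tp / rank  # precision at each new recall point
    return ap / num_gt if num_gt else 0.0

# Two ground-truth objects; hits at ranks 1 and 3.
print(average_precision([True, False, True], num_gt=2))  # (1.0 + 2/3) / 2 ≈ 0.833
```

PASCAL VOC historically used an 11-point interpolated variant of this quantity; either way, mAP averages the per-class AP values into a single number like the 53.3% reported above.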

21,729 citations
