Journal ArticleDOI

GPSD: generative parking spot detection using multi-clue recovery model

TL;DR: Wang et al. propose a multi-clue recovery model to reconstruct parking spots. Compared with several existing algorithms, experimental results show that it achieves higher accuracy, reaching more than 80% in most test cases.
Abstract: Due to various complex environmental factors and parking scenes, automatic parking faces more stringent requirements than manual parking. Existing auto-parking technology works in either the space or the plane dimension: the former usually ignores the ground parking spot lines, which may cause parking at a wrong position, while the latter often spends a lot of time on object classification, which may decrease the algorithm's applicability. In this paper, we propose a Generative Parking Spot Detection (GPSD) algorithm that uses a multi-clue recovery model to reconstruct parking spots. In the proposed method, we first decompose the parking spot geometrically to mark the locations of its corners, and then use a micro-target recognition network to find those corners in the ground image taken by car cameras. After that, we use the multi-clue model to correct the full pairing map so that the true parking spot can be recovered reliably. The proposed algorithm is compared with several existing algorithms, and the experimental results show that it achieves higher accuracy than the others, reaching more than 80% in most test cases.
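To make the recovery idea concrete, here is a minimal sketch of the geometric pairing step: detected corner points are paired whenever their spacing matches a standard spot entrance width. The width, tolerance, and pairing rule are illustrative assumptions, not the paper's actual multi-clue model.

```python
# Hypothetical corner-pairing sketch; not the authors' code.
import numpy as np

SPOT_WIDTH = 2.5   # assumed standard spot entrance width in metres
TOLERANCE = 0.3    # assumed matching tolerance in metres

def pair_corners(corners):
    """Pair detected corners whose spacing matches a spot entrance width."""
    pairs = []
    for i in range(len(corners)):
        for j in range(i + 1, len(corners)):
            d = np.linalg.norm(corners[i] - corners[j])
            if abs(d - SPOT_WIDTH) < TOLERANCE:
                pairs.append((i, j))
    return pairs

corners = np.array([[0.0, 0.0], [2.5, 0.1], [5.1, 0.0]])
print(pair_corners(corners))   # -> [(0, 1), (1, 2)]
```

In the full method, such candidate pairs would then be filtered by the multi-clue model before the spot rectangle is reconstructed.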
Citations
Journal ArticleDOI
TL;DR: In this paper, a strategy for generating the shortest parking path based on a bidirectional breadth-first search algorithm combined with a modified Bellman-Ford algorithm is proposed for automatic parking systems.
Abstract: The development of the automobile industry and the increase in car ownership have brought great traffic pressure to cities, where the difficulty of parking has become a serious problem for the majority of drivers. An automatic parking system can help drivers complete parking operations or perform the parking task automatically, and a decision control system is an important part of such a system. In this paper, a strategy for generating the shortest parking path, based on a bidirectional breadth-first search algorithm combined with a modified Bellman–Ford algorithm, is proposed for automatic parking systems. Experimental results show that this scheme can improve the performance of an automatic parking system, especially in a complex environment.
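As a rough illustration of the search strategy named above, the sketch below runs breadth-first search from both endpoints of a 4-connected grid map and stops when the two frontiers meet. The grid encoding and the frontier-balancing rule are assumptions for illustration; the paper's modified Bellman–Ford refinement is omitted.

```python
# Minimal bidirectional BFS sketch on a 0/1 occupancy grid (0 = free).
from collections import deque

def bidirectional_bfs(grid, start, goal):
    """Return the shortest path length from start to goal, or -1."""
    if start == goal:
        return 0
    rows, cols = len(grid), len(grid[0])

    def neighbors(p):
        r, c = p
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                yield (nr, nc)

    dist_s, dist_g = {start: 0}, {goal: 0}
    front_s, front_g = deque([start]), deque([goal])
    while front_s and front_g:
        # Expand the smaller frontier first to keep the search balanced.
        if len(front_s) > len(front_g):
            front_s, front_g = front_g, front_s
            dist_s, dist_g = dist_g, dist_s
        for _ in range(len(front_s)):
            node = front_s.popleft()
            for nxt in neighbors(node):
                if nxt in dist_g:                  # frontiers meet
                    return dist_s[node] + 1 + dist_g[nxt]
                if nxt not in dist_s:
                    dist_s[nxt] = dist_s[node] + 1
                    front_s.append(nxt)
    return -1                                      # unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(bidirectional_bfs(grid, (0, 0), (2, 0)))     # -> 6
```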

3 citations

Book ChapterDOI
TL;DR: Wang et al. propose a power line detection method based on a feature-fusion deep learning network, which makes full use of fusion features by combining the inherent features and auxiliary information of aerial power line images.
Abstract: Nowadays, the network of transmission lines is gradually spreading all over the world. With the popularization of UAV and helicopter applications, it is of great significance for the safety of low-altitude aircraft to detect power lines in advance and implement obstacle avoidance. Power Line Detection (PLD) in a complex background environment is particularly important. To solve the problem of false detection of power lines caused by complex background images, a PLD method based on a feature-fusion deep learning network is proposed in this paper. Firstly, in view of the low accuracy and poor generalization of traditional PLD in complex background environments, a rough extraction module that makes full use of fusion features is constructed, combining the inherent features and auxiliary information of aerial power line images. Secondly, an output fusion module is constructed, whose weights are learned during network training. Finally, the fusion module fuses the decisions of different depths for output. The experimental results show that the proposed method can effectively improve the accuracy of power line detection. Keywords: Deep learning, Feature fusion, Power Line Detection, Auxiliary information.
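A minimal numpy sketch of the output-fusion idea: prediction maps from several network depths are combined with softmax-normalised weights that, in the real model, would be learned during training. The shapes and the normalisation are assumptions for illustration, not the paper's architecture.

```python
# Hypothetical weighted fusion of per-depth prediction maps.
import numpy as np

def fuse_outputs(maps, logits):
    """Fuse per-depth probability maps with softmax-normalised weights."""
    w = np.exp(logits - logits.max())
    w /= w.sum()                             # weights sum to 1
    return sum(wi * m for wi, m in zip(w, maps))

# Three hypothetical per-depth line-probability maps (H x W).
maps = [np.random.rand(4, 4) for _ in range(3)]
logits = np.array([0.2, 1.5, -0.3])          # learned during training in the real model
fused = fuse_outputs(maps, logits)
print(fused.shape)                           # -> (4, 4)
```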
Journal ArticleDOI
TL;DR: Experimental results show that the improved "feature bag" and "spatial pyramid matching" algorithms, built on a 3D feature extraction algorithm, can effectively utilize the spatial information of three-dimensional images and achieve satisfactory results in the classification and recognition of human magnetic resonance images.
Abstract: Image classification and recognition has a very wide range of applications in computer vision, spanning many fields such as image retrieval, image analysis, and robot positioning. With the rise of brain science and cognitive science research, as well as the increasing diversity of imaging methods, three-dimensional image data, mainly magnetic resonance images, plays an increasingly important role in image classification and recognition, especially for medical images. However, the high dimensionality of human magnetic resonance images reduces their human readability, so the classification and recognition of three-dimensional images remains a challenge. In order to better extract local features from images and effectively use their spatial information, this paper improves the "feature bag" and "spatial pyramid matching" algorithms on the basis of a 3D feature extraction algorithm and proposes an image classification framework built on it. Firstly, the multiresolution "3D spatial pyramid" algorithm, the multiscale image segmentation and image representation method, and the SVM classifier and feature fusion method are described. Secondly, the gender information contained in the magnetic resonance images is classified and recognized on the three databases selected for the experiment. Experimental results show that this method can effectively utilize the spatial information of three-dimensional images and achieve satisfactory results in the classification and recognition of human magnetic resonance images.
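For intuition, here is a rough sketch of a 3D spatial pyramid: visual-word histograms are computed over progressively finer subvolumes and concatenated, extending the 2D spatial-pyramid-matching idea to volumes. The codebook size and pyramid depth are assumptions; the paper's segmentation and SVM stages are omitted.

```python
# Hypothetical 3D spatial-pyramid feature, assuming voxels are already
# quantized to visual-word indices.
import numpy as np

def spatial_pyramid_3d(words, levels=2, vocab=8):
    """words: (D, H, W) array of visual-word indices per voxel."""
    feats = []
    for lv in range(levels + 1):
        cells = 2 ** lv                      # cells per axis at this level
        for zs in np.array_split(words, cells, axis=0):
            for ys in np.array_split(zs, cells, axis=1):
                for xs in np.array_split(ys, cells, axis=2):
                    hist = np.bincount(xs.ravel(), minlength=vocab)
                    feats.append(hist / max(xs.size, 1))   # normalised histogram
    return np.concatenate(feats)

words = np.random.randint(0, 8, size=(8, 8, 8))
# (1 + 8 + 64) cells x 8 words -> a 584-dimensional descriptor
print(spatial_pyramid_3d(words).shape)       # -> (584,)
```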
References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
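The regression framing can be illustrated by how a YOLO-style grid output is decoded: each of the S × S cells predicts B boxes relative to the cell, plus class probabilities. The tensor layout and threshold below are assumptions for demonstration, not the released implementation.

```python
# Hypothetical decoding of a YOLO-style output tensor.
import numpy as np

S, B, C = 7, 2, 20   # grid size, boxes per cell, classes (VOC-style defaults)

def decode(pred, conf_thresh=0.5):
    """pred: (S, S, B*5 + C); each box is (x, y, w, h, confidence)."""
    boxes = []
    for row in range(S):
        for col in range(S):
            cell = pred[row, col]
            cls = int(np.argmax(cell[B * 5:]))          # class probabilities
            for b in range(B):
                x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
                if conf < conf_thresh:
                    continue
                cx = (col + x) / S                      # cell offset -> image fraction
                cy = (row + y) / S
                boxes.append((cx, cy, w, h, conf, cls))
    return boxes

pred = np.random.rand(S, S, B * 5 + C)
print(len(decode(pred, conf_thresh=0.9)))
```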

27,256 citations

Posted Content
TL;DR: Faster R-CNN proposes a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features---using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
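The core of an RPN is a dense grid of reference boxes. The sketch below generates such anchors at every feature-map position using common default scales and aspect ratios; these values are illustrative, not necessarily the paper's exact configuration.

```python
# Hypothetical anchor generation for an RPN-style detector.
import numpy as np

def make_anchors(fm_h, fm_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (N, 4) anchors as (x1, y1, x2, y2); r is the width:height ratio."""
    anchors = []
    for i in range(fm_h):
        for j in range(fm_w):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)   # area stays ~s^2
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return np.array(anchors)

# 9 anchors per position, as in the common 3-scale x 3-ratio setup.
print(make_anchors(2, 2).shape)   # -> (36, 4)
```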

23,183 citations

Book ChapterDOI
08 Oct 2016
TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
Abstract: We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. SSD is simple relative to methods that require object proposals because it completely eliminates proposal generation and subsequent pixel or feature resampling stages and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, COCO, and ILSVRC datasets confirm that SSD has competitive accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. For 300 × 300 input, SSD achieves 74.3% mAP on VOC2007 test at 59 FPS on an Nvidia Titan X, and for 512 × 512 input, SSD achieves 76.9% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Compared to other single-stage methods, SSD has much better accuracy even with a smaller input image size. Code is available at https://github.com/weiliu89/caffe/tree/ssd.
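The scale schedule for SSD's default boxes follows a linear rule: feature map k of m gets a scale interpolated between s_min and s_max. The feature-map sizes below are the usual SSD300 values; the aspect-ratio set is simplified for illustration.

```python
# Sketch of SSD default-box scales and the resulting box count.
def ssd_scales(m, s_min=0.2, s_max=0.9):
    """Linear scale per feature map, as a fraction of the input size."""
    return [s_min + (s_max - s_min) * k / (m - 1) for k in range(m)]

fmap_sizes = [38, 19, 10, 5, 3, 1]        # common SSD300 feature-map sizes
ratios = [1.0, 2.0, 0.5]                  # simplified aspect-ratio set
scales = ssd_scales(len(fmap_sizes))
total = sum(f * f * len(ratios) for f in fmap_sizes)
print([round(s, 2) for s in scales])      # -> [0.2, 0.34, 0.48, 0.62, 0.76, 0.9]
print(total)                              # default boxes with 3 ratios per location
```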

19,543 citations

Proceedings ArticleDOI
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
07 Aug 2017
TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
Abstract: The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.
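The reshaped loss itself is compact enough to state directly. The sketch below implements the binary focal loss FL(p_t) = -α_t (1 - p_t)^γ log(p_t), with the γ = 2, α = 0.25 defaults reported for RetinaNet.

```python
# Binary focal loss in numpy; shown per-example for clarity.
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """p = predicted foreground probability, y in {0, 1}."""
    p_t = np.where(y == 1, p, 1 - p)              # prob of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)  # class-balance weight
    return -alpha_t * (1 - p_t) ** gamma * np.log(np.clip(p_t, 1e-7, 1.0))

p = np.array([0.9, 0.6, 0.1])
y = np.array([1, 1, 0])
# The well-classified example (p = 0.9) contributes far less than p = 0.6.
print(focal_loss(p, y))
```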

12,161 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: YOLO9000 is a state-of-the-art, real-time object detection system that can detect over 9000 object categories; a novel multi-scale training method lets the same model run at varying sizes, offering an easy tradeoff between speed and accuracy.
Abstract: We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. Using a novel multi-scale training method, the same YOLOv2 model can run at varying sizes, offering an easy tradeoff between speed and accuracy. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster R-CNN with ResNet and SSD while still running significantly faster. Finally, we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don't have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. YOLO9000 predicts detections for more than 9000 different object categories, all in real-time.
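The multi-scale training trick can be sketched in a few lines: every 10 batches the input resolution is redrawn from multiples of 32 (320 to 608 in the paper), so a single fully convolutional model learns to run at several sizes. The nearest-neighbour resize and the loop below are illustrative stand-ins for a real training loop.

```python
# Sketch of YOLOv2-style multi-scale input resizing, not the released code.
import numpy as np

def pick_size(rng):
    return int(rng.choice(np.arange(320, 609, 32)))   # 320..608, step 32

def resize_nn(img, size):
    """Nearest-neighbour resize of an (H, W, C) image to (size, size, C)."""
    h, w = img.shape[:2]
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return img[ys][:, xs]

rng = np.random.default_rng(0)
img = np.random.rand(416, 416, 3)
for batch in range(30):
    if batch % 10 == 0:               # the paper redraws the size every 10 batches
        size = pick_size(rng)
    out = resize_nn(img, size)
print(out.shape)
```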

9,132 citations