Journal ArticleDOI

A Real-Time Vehicle Counting, Speed Estimation, and Classification System Based on Virtual Detection Zone and YOLO

TL;DR: In this article, a real-time traffic monitoring system based on a virtual detection zone, Gaussian mixture model (GMM), and You Only Look Once (YOLO) is presented.
Abstract: In recent years, vehicle detection and classification have become essential tasks of intelligent transportation systems, and real-time, accurate vehicle detection from image and video data for traffic monitoring remains challenging. The most noteworthy challenges are operating in real time to accurately locate and classify vehicles in traffic flows, and working around total occlusions that hinder vehicle tracking. Traffic surveillance videos have attracted significant attention from traffic management departments, and digitally processing and analyzing them in real time is crucial for extracting reliable data on traffic flow. Therefore, this study presents a real-time traffic monitoring system based on a virtual detection zone, a Gaussian mixture model (GMM), and You Only Look Once (YOLO) convolutional neural networks to increase vehicle counting and classification efficiency. The GMM and the virtual detection zone are used for vehicle counting, YOLO is used to classify vehicles, and the distance and time traveled by a vehicle are used to estimate its speed. The Montevideo Audio and Video Dataset (MAVD), the GARM Road-Traffic Monitoring data set (GRAM-RTM), and our own collected data sets are used to verify the proposed method. Experimental results indicate that the proposed method with YOLOv4 achieved the highest classification accuracy, 98.91% on MAVD and 99.5% on GRAM-RTM, and reached 99.1%, 98.6%, and 98% in daytime, nighttime, and rainy conditions, respectively. In addition, the average absolute percentage error of vehicle speed estimation with the proposed method is about 7.6%.
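The counting-plus-speed pipeline the abstract describes can be pictured with a short sketch. The following is a minimal illustration, assuming OpenCV's MOG2 background subtractor as the GMM stage; the zone position, blob-size threshold, and distance calibration are invented for illustration and are not values from the paper, and a real system would also track vehicles so each is counted once rather than on every frame.

```python
# Minimal sketch: GMM foreground extraction + virtual detection zone for
# counting, and speed from distance travelled over elapsed time.
# All numeric values below are illustrative assumptions, not paper values.
import cv2

cap = cv2.VideoCapture("traffic.mp4")        # assumed input video
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
gmm = cv2.createBackgroundSubtractorMOG2()   # GMM-based foreground model
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

ZONE_Y = 400          # assumed row of the virtual detection line (pixels)
count = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg = gmm.apply(frame)                                # foreground mask
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)    # remove speckle noise
    contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h > 1500 and y <= ZONE_Y <= y + h:        # blob touches the zone
            count += 1          # in practice, deduplicate via tracking

# Speed from distance over time (v = d / t), converted to km/h:
ZONE_GAP_M = 10.0      # assumed real-world distance between two zones
frames_elapsed = 45    # frames a vehicle took to cross that gap
speed_kmh = ZONE_GAP_M / (frames_elapsed / fps) * 3.6
print(count, round(speed_kmh, 1))
```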


Citations
Journal ArticleDOI
TL;DR: In this paper, the performance parameters of a permanent magnet synchronous motor meeting the requirements of vehicle operation are estimated by analyzing the parameters of the electric vehicle, and the motor is modeled in the motor design software Motor-CAD.
Abstract: With the growing popularity of electric vehicles, they have become a hot research topic, and the performance of the permanent magnet synchronous motor has become increasingly prominent in recent years. Applying permanent magnet synchronous motors in electric vehicles has therefore become feasible. Building on an analysis of existing permanent magnet synchronous motor technology for electric vehicles, this paper starts from the design of the motor body. The performance parameters of a permanent magnet synchronous motor meeting the requirements of vehicle operation are estimated by analyzing the parameters of the electric vehicle, and the motor is modeled in the motor design software Motor-CAD. The parametric analysis module in the software is used to optimize the permanent magnet, the stator slot width, the air gap flux density, and the air gap length. Finally, the OptiSLang software is used to simulate and analyze the permanent magnet synchronous motor. The optimization improves the driving performance of the motor and reduces energy consumption, so that the permanent magnet synchronous motor can better meet the requirements of electric vehicles. The experimental results show that the optimal range of the motor split ratio is large, in the range of 0.85~0.9.

1 citation
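As a purely illustrative aside, the split-ratio optimization reported above can be framed as a one-dimensional parameter sweep. The sketch below does not use Motor-CAD's actual automation API; evaluate_design is a hypothetical stand-in for a simulation-backed objective, with a placeholder curve peaking inside the 0.85~0.9 band the abstract reports as optimal.

```python
# Illustrative sweep over the split ratio (rotor OD / stator OD).
# evaluate_design is a hypothetical stand-in for a finite-element or
# analytical evaluation (e.g. torque per unit loss) at each ratio.
import numpy as np

def evaluate_design(split_ratio: float) -> float:
    """Hypothetical objective: higher is better."""
    return -((split_ratio - 0.875) ** 2)   # placeholder, not a real motor model

ratios = np.linspace(0.80, 0.95, 31)       # candidate split ratios
best = max(ratios, key=evaluate_design)
print(f"best split ratio in sweep: {best:.3f}")
```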

Proceedings ArticleDOI
28 Apr 2023
TL;DR: In this paper, the authors propose a methodology for video processing of vehicles using YOLO, which provides useful data on traffic flow during peak hours and can be used for counting and detecting vehicles.
Abstract: The growth in the number of vehicles on the road is a matter of concern for management authorities, as it calls for faster and more reliable ways to manage traffic and its data. Detecting and counting vehicles is imperative in traffic studies. Conventional sensors such as ultrasonic sensors and loop detectors may damage the road surface, and installing them in urban areas increases the cost of the work. Surveillance video cameras are widely used for traffic monitoring; they provide video that can be used for counting and detecting vehicles. The vehicle counting process delivers useful data on traffic flow during peak hours on the road. This project proposes a methodology for video processing of vehicles using YOLO.
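A minimal sketch of the kind of YOLO detection loop such a methodology relies on, assuming the ultralytics Python package and COCO-pretrained weights; the project's exact YOLO version, weights, and counting logic are not specified in the abstract.

```python
# Per-frame vehicle detection with a pretrained YOLO model; counting
# (line crossing, tracking) would consume these boxes downstream.
import cv2
from ultralytics import YOLO

VEHICLE_CLASSES = {2, 3, 5, 7}       # COCO ids: car, motorcycle, bus, truck
model = YOLO("yolov8n.pt")           # assumed pretrained weights

cap = cv2.VideoCapture("road.mp4")   # assumed camera/video source
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]
    boxes = [b for b in result.boxes if int(b.cls) in VEHICLE_CLASSES]
    print(f"vehicles in frame: {len(boxes)}")
```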
Journal ArticleDOI
TL;DR: In this article, a spotted hyena optimizer with deep learning-enabled vehicle counting and classification (SHODL-VCC) model was proposed for ITSs to achieve accurate vehicle detection and counting from traffic videos.
Abstract: Traffic surveillance systems are utilized to collect and monitor the traffic condition data of road networks. This data plays a crucial role in a variety of applications of Intelligent Transportation Systems (ITSs). In traffic surveillance, it is challenging to achieve accurate vehicle detection and to count the vehicles in traffic videos. The most notable difficulties include real-time system operation for precise classification, identification of the vehicles' locations in traffic flows, and functioning around total occlusions that hamper the vehicle tracking process. Conventional video-based vehicle detection techniques such as optical flow, background subtraction, and frame differencing have certain limitations in terms of efficiency or accuracy. Therefore, the current study proposes the spotted hyena optimizer with deep learning-enabled vehicle counting and classification (SHODL-VCC) model for ITSs. The aim of the proposed SHODL-VCC technique lies in accurate counting and classification of the vehicles in traffic surveillance. To achieve this, the proposed SHODL-VCC technique follows a two-stage process of vehicle detection and vehicle classification. First, the presented SHODL-VCC technique employs the RetinaNet object detector to identify the vehicles. Next, the detected vehicles are classified into different class labels using the deep wavelet auto-encoder model. To enhance the vehicle detection performance, the spotted hyena optimizer algorithm is exploited as a hyperparameter optimizer, which considerably enhances the vehicle detection rate. The proposed SHODL-VCC technique was experimentally validated using different databases. The comparative outcomes demonstrate the promising vehicle classification performance of the SHODL-VCC technique in comparison with recent deep learning approaches.
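To make the outer loop concrete, the sketch below shows a generic hyperparameter search of the shape SHODL-VCC describes, with plain random search standing in for the spotted hyena optimizer (whose update rules are not reproduced here); score_detector is a hypothetical placeholder for training and validating the RetinaNet detector once per candidate.

```python
# Generic hyperparameter search loop (random search stand-in, NOT the
# spotted hyena optimizer). score_detector is a hypothetical placeholder
# for one train/validate cycle returning, e.g., validation mAP.
import random

def score_detector(lr: float, nms: float) -> float:
    """Hypothetical stand-in for evaluating RetinaNet once."""
    return -((lr - 1e-3) ** 2) - ((nms - 0.5) ** 2)   # placeholder surface

best, best_score = None, float("-inf")
for _ in range(50):                           # candidate configurations
    lr = 10 ** random.uniform(-5, -2)         # log-uniform learning rate
    nms = random.uniform(0.3, 0.7)            # NMS IoU threshold
    s = score_detector(lr, nms)
    if s > best_score:
        best, best_score = (lr, nms), s       # keep the best candidate
print(best)
```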
Journal ArticleDOI
20 Feb 2023-Data
TL;DR: In this paper, a dataset of public objects in an uncontrolled environment is created for navigation aiding. The dataset contains three classes of objects that commonly exist on city pavements, and it was verified to be of high quality for object detection and distance estimation.
Abstract: Computer vision is a new approach to navigation aiding that assists visually impaired people in traveling independently. A deep learning-based solution implemented on a portable device that uses a monocular camera to capture public objects could be a low-cost and handy navigation aid. By recognizing public objects in the street and estimating their distance from the user, visually impaired people are able to avoid obstacles in the outdoor environment and walk safely. In this paper, we created a dataset of public objects in an uncontrolled environment for navigation aiding. The dataset contains three classes of objects that commonly exist on city pavements. It was verified that the dataset is of high quality for object detection and distance estimation, and it was ultimately utilized in a navigation aid solution.
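The abstract does not detail the distance estimator, but a common monocular heuristic for objects of known physical height is the pinhole relation distance ≈ f · H / h, with f the focal length in pixels, H the object's real height, and h the detection box height in pixels. A minimal sketch with an assumed focal length:

```python
# Pinhole-model range estimate for an object of known physical height.
# The focal length and the example values are illustrative assumptions.
def estimate_distance(real_height_m: float, bbox_height_px: float,
                      focal_length_px: float = 700.0) -> float:
    """distance = focal_length_px * real_height_m / bbox_height_px"""
    return focal_length_px * real_height_m / bbox_height_px

# e.g. a 0.8 m high bollard whose detection box spans 120 pixels:
print(round(estimate_distance(0.8, 120.0), 2), "m")
```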
Proceedings ArticleDOI
01 Jul 2022
TL;DR: In this paper, the authors identify the fuel stations nearest to the user, group them by proximity, and arrange them in ascending order of vehicular density to reduce long waiting times and avoid congestion.
Abstract: This paper identifies the fuel stations nearest to the user, groups them by proximity, and arranges them in ascending order of vehicular density to reduce long waiting times and avoid congestion. A list of the services and fuels available at each station is also displayed. Vehicular density is calculated using the YOLO (You Only Look Once) algorithm: a region of interest is cropped from the station's live video feed, and the number of vehicles at the station is updated in a real-time database. The user accesses the model through a mobile app that is in sync with the real-time database; a filtered list is analyzed for available facilities and displayed to the user, identifying the optimal nearby station and thereby avoiding the hassle of queues.
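The ranking step can be sketched in a few lines: group stations into proximity bands, then order each band by the live vehicle count. The field names, band width, and sample data below are assumptions for illustration; in the described system the counts would come from the YOLO-updated real-time database.

```python
# Rank fuel stations: nearer proximity bands first, then fewer queued
# vehicles within a band. All names and values are illustrative.
from dataclasses import dataclass

@dataclass
class Station:
    name: str
    distance_km: float     # from the user's location
    vehicle_count: int     # live count from the station's video feed

def rank_stations(stations: list[Station], band_km: float = 2.0) -> list[Station]:
    # Sort key: (proximity band index, live vehicle count).
    return sorted(stations, key=lambda s: (int(s.distance_km // band_km),
                                           s.vehicle_count))

stations = [Station("A", 1.2, 9), Station("B", 1.8, 3), Station("C", 3.5, 1)]
print([s.name for s in rank_stations(stations)])   # ['B', 'A', 'C']
```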
References
Journal ArticleDOI
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features: using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on the PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In the ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.

26,458 citations
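The anchor grid at the core of an RPN is easy to reproduce: one set of boxes per feature-map cell, spanning several scales and aspect ratios (3 scales x 3 ratios = 9 anchors per position, as in the paper). A small sketch with typical, not prescribed, stride and scale values:

```python
# Generate RPN-style anchors over a feature map: for each cell, 9 boxes
# covering 3 scales and 3 aspect ratios, in image coordinates.
import numpy as np

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride  # cell centre
            for s in scales:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)    # area ~ s^2, w/h = r
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)

print(make_anchors(2, 2).shape)   # (2 * 2 * 9, 4)
```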

Book ChapterDOI
08 Oct 2016
TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
Abstract: We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. SSD is simple relative to methods that require object proposals because it completely eliminates proposal generation and subsequent pixel or feature resampling stages and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, COCO, and ILSVRC datasets confirm that SSD has competitive accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. For \(300 \times 300\) input, SSD achieves 74.3% mAP on the VOC2007 test set at 59 FPS on an Nvidia Titan X, and for \(512 \times 512\) input, SSD achieves 76.9% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Compared to other single-stage methods, SSD has much better accuracy even with a smaller input image size. Code is available at https://github.com/weiliu89/caffe/tree/ssd.

19,543 citations
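The "scales per feature map location" idea is captured by SSD's scale rule, which spaces default-box scales linearly across the m feature maps: s_k = s_min + (s_max - s_min)(k - 1)/(m - 1). A one-function sketch:

```python
# SSD default-box scales: linearly spaced between s_min and s_max across
# the m prediction feature maps (fractions of the input image size).
def ssd_scales(m: int = 6, s_min: float = 0.2, s_max: float = 0.9) -> list[float]:
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]

print([round(s, 3) for s in ssd_scales()])  # [0.2, 0.34, 0.48, 0.62, 0.76, 0.9]
```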

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
Abstract: Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors that are based on deep convolutional networks, partially because they are slow to compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.

16,727 citations
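The top-down pathway with lateral connections can be rendered compactly in PyTorch. This is a schematic reading of the architecture, not the authors' code; the channel widths below assume a ResNet-style C3-C5 tail.

```python
# Minimal FPN: 1x1 lateral convs project backbone features to a common
# width, upsampled top-down maps are added in, and a 3x3 conv smooths
# each merged output (P3-P5).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniFPN(nn.Module):
    def __init__(self, in_channels=(512, 1024, 2048), width=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, width, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(width, width, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):                       # feats: fine -> coarse (C3..C5)
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):  # merge coarse into fine
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [s(x) for s, x in zip(self.smooth, laterals)]  # P3..P5

fpn = MiniFPN()
c3, c4, c5 = (torch.randn(1, c, s, s) for c, s in [(512, 64), (1024, 32), (2048, 16)])
print([p.shape for p in fpn([c3, c4, c5])])
```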

Journal ArticleDOI
TL;DR: In this article, the authors systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving and provide an overview of on-board sensors on test vehicles, open datasets, and background information for object detection.
Abstract: Recent advancements in perception for autonomous driving are driven by deep learning. In order to achieve robust and accurate scene understanding, autonomous vehicles are usually equipped with different sensors (e.g. cameras, LiDARs, Radars), and multiple sensing modalities can be fused to exploit their complementary properties. In this context, many methods have been proposed for deep multi-modal perception problems. However, there is no general guideline for network architecture design, and questions of “what to fuse”, “when to fuse”, and “how to fuse” remain open. This review paper attempts to systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving. To this end, we first provide an overview of on-board sensors on test vehicles, open datasets, and background information for object detection and semantic segmentation in autonomous driving research. We then summarize the fusion methodologies and discuss challenges and open questions. In the appendix, we provide tables that summarize topics and methods. We also provide an interactive online platform to navigate each reference: https://boschresearch.github.io/multimodalperception/ .

674 citations

Journal ArticleDOI
TL;DR: The fusion algorithm improves the robustness of the environment perception system and provides accurate environment perception information for the decision-making system and control system of autonomous vehicles.
Abstract: Radar and camera information fusion sensing methods are used to overcome the inherent shortcomings of a single sensor in severe weather. Our fusion scheme uses radar as the main hardware and the camera as the auxiliary hardware. The Mahalanobis distance is used to match the observed values of the target sequence, and data fusion is based on the joint probability function method. The algorithm was tested using actual sensor data collected from a vehicle performing real-time environment perception. The test results show that the radar-camera fusion algorithm performs better than single-sensor environmental perception in severe weather and can effectively reduce the missed detection rate of autonomous vehicle environment perception. The fusion algorithm improves the robustness of the environment perception system and provides accurate environment perception information for the decision-making and control systems of autonomous vehicles.

102 citations
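The Mahalanobis matching step the abstract mentions is standard enough to sketch: the distance from a camera measurement to a radar track's prediction, weighted by the innovation covariance S, gates whether the two are associated. The measurement vectors and covariance below are illustrative assumptions.

```python
# Mahalanobis distance between a camera measurement and a radar track
# prediction, used as an association gate. Values are illustrative.
import numpy as np

def mahalanobis(z: np.ndarray, z_pred: np.ndarray, S: np.ndarray) -> float:
    """d = sqrt((z - z_pred)^T S^{-1} (z - z_pred))"""
    r = z - z_pred
    return float(np.sqrt(r @ np.linalg.solve(S, r)))

z = np.array([12.3, 0.8])                 # camera measurement (range, bearing)
z_pred = np.array([12.0, 0.75])           # radar track prediction
S = np.array([[0.5, 0.0], [0.0, 0.01]])   # innovation covariance (assumed)
print(mahalanobis(z, z_pred, S) < 3.0)    # inside the gate -> associate
```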