Journal ArticleDOI

Vehicle detection method with low-carbon technology in haze weather based on deep neural network

01 Jan 2022 - International Journal of Low-Carbon Technologies, Vol. 17, pp 1151-1157
TL;DR: Zhang et al. as discussed by the authors improved the YOLOv3 algorithm by adding an attention module in the feature extraction and fusion stage to better focus on potential information and improve the detection accuracy in haze weather.
Abstract: Vehicle detection based on deep learning achieves excellent results in normal environments, but it remains challenging to detect objects in the low-quality pictures obtained in hazy weather. Existing methods tend to ignore favorable latent information and struggle to balance speed and accuracy. Therefore, existing deep neural networks are studied, and the YOLOv3 algorithm is improved based on ResNet. To address the low utilization of shallow features, DenseNet is added in the feature extraction stage to reduce feature loss and increase feature reuse. An attention module is added in the feature extraction and fusion stages to better focus on potential information and improve detection accuracy in haze weather. Because difficult samples dominate vehicle detection in haze, focal loss is introduced to give greater weight to difficult samples, balancing the contributions of difficult and easy samples and improving detection accuracy. The experimental results show that the recognition accuracy of the improved network for vehicles reaches 75%, which demonstrates the effectiveness of the method.
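The abstract does not detail the attention module's architecture; below is a minimal PyTorch sketch, assuming an SE-style channel-attention block, of how such a module can re-weight fused feature maps before the detection head. The class name, reduction ratio, and tensor sizes are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a channel-attention block of the kind the abstract describes
# adding at the feature extraction/fusion stage (SE-style; assumed design).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # per-channel gates in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # emphasize informative channels, suppress noisy ones

# Usage: re-weight a fused feature map before the detection head.
feats = torch.randn(2, 256, 52, 52)
print(ChannelAttention(256)(feats).shape)  # torch.Size([2, 256, 52, 52])
```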


Citations
Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed an improved You Only Look Once version 5 (YOLOv5) algorithm to reduce carbon emissions by adding an adaptive attention module to the Neck and a spatial pyramid pooling module before its P3 and P4 outputs.
Abstract: Aiming at the decrease in traffic sign recognition accuracy caused by dim light in nighttime environments, this paper proposes an improved You Only Look Once version 5 (YOLOv5) algorithm to reduce carbon emissions. An improved adaptive histogram equalization method is designed to adjust the brightness and contrast of the image and highlight the detail information of traffic signs. To meet the higher processing-speed requirements that driving assistance systems place on the recognition model, the model is lightweighted: the standard convolutions of the backbone network are redesigned as depthwise-separable convolutions, which greatly reduces the number of model parameters. To address feature loss during model learning, an improved feature pyramid, the AAM-SPPF path aggregation network (AS-PAN), is proposed to enhance the learning capability of the model by adding an adaptive attention module to the Neck and a spatial pyramid pooling module before its P3 and P4 outputs. Finally, traditional non-maximum suppression (NMS) is replaced with weighted boxes fusion (WBF) for generating prediction boxes, so that all candidate target boxes are fused rather than discarded. Experiments demonstrate that, compared with the original YOLOv5 on a self-built nighttime environment dataset, the improved algorithm achieves higher detection accuracy, lower processing time per image and low carbon emissions in the traffic sign recognition process.
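As a sketch of the depthwise-separable substitution described above: a standard KxK convolution is factored into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution, cutting parameters by roughly a factor of K*K. The layer sizes below are illustrative, not taken from the paper.

```python
# Depthwise-separable replacement for a standard convolution (PyTorch).
import torch
import torch.nn as nn

def depthwise_separable(in_ch: int, out_ch: int, k: int = 3) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch, bias=False),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1, bias=False),                               # pointwise
        nn.BatchNorm2d(out_ch),
        nn.SiLU(inplace=True),  # YOLOv5 uses SiLU activations
    )

std = nn.Conv2d(128, 256, 3, padding=1, bias=False)
sep = depthwise_separable(128, 256)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(std), count(sep))  # ~295k vs. ~34k parameters: roughly 8.6x fewer
```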
Journal ArticleDOI
TL;DR: In this article, a comparative analysis of different You Only Look Once (YOLO) methodologies, including YOLOv5, YOLOv6, and YOLOv7, for object detection in mixed traffic under degraded hazy conditions is presented.
Abstract: Vehicle detection in degraded hazy conditions poses significant challenges in computer vision. It is difficult to detect objects accurately under hazy conditions because visibility is reduced and color and texture information is distorted. This research paper presents a comparative analysis of different YOLO (You Only Look Once) methodologies, including YOLOv5, YOLOv6, and YOLOv7, for object detection in mixed traffic under degraded hazy conditions. Because hazy weather can significantly impact the accuracy of object detection algorithms, creating reliable models is critical. An open-source dataset of footage obtained from security cameras installed at traffic signals is used in this study to evaluate the performance of these algorithms. The dataset includes various traffic objects under varying haze levels, providing a diverse range of atmospheric conditions encountered in real-world scenarios. The experiments illustrate that the YOLO-based techniques are effective at detecting objects in degraded hazy conditions and provide comparative information about how well each performs. The findings help object detection models operate more accurately and consistently under adverse weather conditions.
Journal ArticleDOI
TL;DR: In this article, the authors investigated the You Only Look Once (YOLO) algorithm and proposed an enhanced YOLOv4 for real-time target detection in inclement weather conditions.
Abstract: As a crucial component of the autonomous driving task, the vehicle target detection algorithm directly impacts driving safety, particularly in inclement weather, where detection precision and speed decrease significantly. This paper investigates the You Only Look Once (YOLO) algorithm and proposes an enhanced YOLOv4 for real-time target detection in inclement weather conditions. The algorithm uses an anchor-free approach to tackle the poor fit of YOLO's preset anchor boxes, better adapting to the sizes of detected targets and making it suitable for multi-scale target identification. The improved FPN network transmits feature maps to the anchor-free branches to expand the model's receptive field and maximize the utilization of model feature data, and a decoupled detection head increases the precision of target category and location prediction. The experimental dataset BDD-IW was created by extracting specific labeled photos from the BDD100K dataset and adding synthetic fog to some of them, in order to test the proposed method's detection precision and speed in inclement weather. The proposed method is compared with advanced target detection algorithms on this dataset. Experimental results indicate that the proposed method achieves a mean average precision of 60.3%, which is 5.8 percentage points higher than the original YOLOv4; the inference speed is enhanced by 4.5 fps over the original, reaching a real-time detection speed of 69.44 fps. Robustness tests indicate that the proposed model considerably improves the capacity to recognize targets in inclement weather while maintaining high precision in real-time detection.
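A minimal sketch of the decoupled-head idea mentioned above: separate convolutional branches predict classification and box regression instead of a single shared output convolution. Channel widths and branch depths are assumptions for illustration; the paper's exact head is not reproduced here.

```python
# Decoupled detection head: one branch for class logits, one for boxes.
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    def __init__(self, in_ch: int, num_classes: int):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, in_ch, 1)
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, num_classes, 1),  # class logits per location
        )
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, 4 + 1, 1),        # box offsets + objectness
        )

    def forward(self, x):
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)

cls, reg = DecoupledHead(256, 10)(torch.randn(1, 256, 20, 20))
print(cls.shape, reg.shape)  # (1, 10, 20, 20) and (1, 5, 20, 20)
```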
References
Journal ArticleDOI
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
TL;DR: Focal loss as discussed by the authors focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training, which improves the accuracy of one-stage detectors.
Abstract: The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron .
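The loss itself is compact: FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), which down-weights well-classified examples. A minimal binary PyTorch version follows; alpha = 0.25 and gamma = 2 are the defaults reported in the paper, and the toy inputs are illustrative.

```python
# Binary focal loss: modulated cross entropy that focuses on hard examples.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha: float = 0.25, gamma: float = 2.0):
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()       # (1 - p_t)^gamma down-weights easy examples

logits = torch.tensor([4.0, -3.0, 0.1])   # confident pos, confident neg, hard example
targets = torch.tensor([1.0, 0.0, 1.0])
print(focal_loss(logits, targets))        # the hard example dominates the loss
```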

5,734 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a comprehensive study and evaluation of existing single-image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single-Image DEhazing (RESIDE).
Abstract: We present a comprehensive study and evaluation of existing single-image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single-Image DEhazing (RESIDE). RESIDE highlights diverse data sources and image contents, and is divided into five subsets, each serving different training or evaluation purposes. We further provide a rich variety of criteria for dehazing algorithm evaluation, ranging from full-reference metrics to no-reference metrics and to subjective evaluation, and the novel task-driven evaluation. Experiments on RESIDE shed light on the comparisons and limitations of the state-of-the-art dehazing algorithms, and suggest promising future directions.
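As an example of the full-reference criteria mentioned above, which compare a dehazed output against its clean ground truth, here is a minimal NumPy PSNR implementation. The metric choice is illustrative; RESIDE also reports SSIM and other criteria, and the toy data below is synthetic.

```python
# PSNR: a standard full-reference image-quality metric.
import numpy as np

def psnr(ref: np.ndarray, out: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((ref.astype(np.float64) - out.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64, 3))                       # "clean" image
out = np.clip(ref + np.random.randint(-5, 6, ref.shape), 0, 255)   # "dehazed" output
print(psnr(ref, out))  # higher is better; small noise gives a high PSNR
```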

922 citations

Book ChapterDOI
23 Aug 2020
TL;DR: Dynamic R-CNN as discussed by the authors adjusts the label assignment criteria (IoU threshold) and the shape of regression loss function (parameters of SmoothL1 Loss) automatically based on the statistics of proposals during training.
Abstract: Although two-stage object detectors have continuously advanced the state-of-the-art performance in recent years, the training process itself is far from transparent. In this work, we first point out the inconsistency between fixed network settings and the dynamic training procedure, which greatly affects performance. For example, a fixed label assignment strategy and regression loss function cannot fit the changing distribution of proposals and are thus harmful to training high-quality detectors. Consequently, we propose Dynamic R-CNN to adjust the label assignment criterion (IoU threshold) and the shape of the regression loss function (the parameters of SmoothL1 loss) automatically, based on the statistics of proposals during training. This dynamic design makes better use of the training samples and pushes the detector to fit more high-quality samples. Specifically, our method improves upon the ResNet-50-FPN baseline by 1.9% AP and 5.5% AP90 on the MS COCO dataset with no extra overhead. Codes and models are available at https://github.com/hkzhang95/DynamicRCNN.
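A hedged sketch of the dynamic scheme: both the label-assignment IoU threshold and the SmoothL1 beta are recomputed from running statistics of the proposals rather than fixed. The constants below are illustrative stand-ins for the paper's K_I and K_beta hyper-parameters, and the paper additionally smooths these statistics across iterations.

```python
# Dynamic label assignment and loss shaping from proposal statistics (sketch).
import numpy as np

def update_iou_threshold(proposal_ious: np.ndarray, top_k: int = 75) -> float:
    # Raise the positive/negative IoU split as proposals improve:
    # take the mean of the current top-K proposal IoUs.
    return float(np.sort(proposal_ious)[::-1][:top_k].mean())

def update_smoothl1_beta(reg_errors: np.ndarray, k: int = 10) -> float:
    # Shrink SmoothL1's beta toward the smallest regression errors so the
    # loss keeps a strong gradient on increasingly accurate proposals.
    return float(np.sort(np.abs(reg_errors))[:k].mean())

ious = np.random.beta(5, 2, size=512)  # toy proposal IoUs, mid-to-late training
errs = np.random.rand(512)             # toy regression errors
print(update_iou_threshold(ious), update_smoothl1_beta(errs))
```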

281 citations

Proceedings ArticleDOI
06 Jun 2021
TL;DR: Zhang et al. as discussed by the authors proposed an efficient shuffle attention (SA) module, which adopts Shuffle Units to combine two types of attention mechanisms effectively, i.e., spatial attention and channel attention.
Abstract: Attention mechanisms, which enable a neural network to focus accurately on all the relevant elements of the input, have become an essential component for improving the performance of deep neural networks. Two attention mechanisms are widely used in computer vision studies, spatial attention and channel attention, which capture pixel-level pairwise relationships and channel dependencies, respectively. Although fusing them together may achieve better performance than either alone, it inevitably increases the computational overhead. In this paper, we propose an efficient Shuffle Attention (SA) module to address this issue, which adopts Shuffle Units to combine the two types of attention mechanisms effectively. Specifically, SA first groups the channel dimension into multiple sub-features before processing them in parallel. Then, for each sub-feature, SA utilizes a Shuffle Unit to depict feature dependencies in both the spatial and channel dimensions. Afterwards, all sub-features are aggregated and a "channel shuffle" operator is adopted to enable information communication between different sub-features. The proposed SA module is efficient yet effective: for example, against a ResNet50 backbone, SA's parameters and computations are 300 vs. 25.56M and 2.76e-3 GFLOPs vs. 4.12 GFLOPs, respectively, with a performance boost of more than 1.34% in Top-1 accuracy. Extensive experimental results on commonly used benchmarks, including ImageNet-1k for classification and MS COCO for object detection and instance segmentation, demonstrate that the proposed SA significantly outperforms current SOTA methods by achieving higher accuracy with lower model complexity.
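The "channel shuffle" operator that lets information flow between SA's parallel sub-feature groups can be stated precisely; below is a minimal PyTorch version (the group count and tensor sizes are illustrative).

```python
# Channel shuffle: interleave channels across groups so grouped branches
# can exchange information.
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    b, c, h, w = x.shape
    # (b, groups, c/groups, h, w) -> swap group and channel axes -> flatten back
    return (x.view(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))

x = torch.arange(8.0).view(1, 8, 1, 1)
print(channel_shuffle(x, 2).flatten())  # 0,4,1,5,2,6,3,7: groups interleaved
```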

228 citations

Journal ArticleDOI
TL;DR: A deep learning method based on neighbors for travel time estimation (TTE), called the Nei-TTE method, which captures the characteristics of each segment and utilizes the trajectory characteristics of adjacent segments, since road network topology and speed interact.
Abstract: With the development of the Internet of Things and big data technology, the intelligent transportation system is becoming the main direction of future transportation systems. The time required for a given trajectory in a transportation system can be estimated from the trajectory data of a city's taxis, but doing so accurately is very challenging. Although historical data have been used in existing research, excessive use of trajectory information in historical data, or inaccurate neighbor trajectory information, prevents better prediction accuracy for the query trajectory. In this article, we propose a deep learning method based on neighbors for travel time estimation (TTE), called the Nei-TTE method. We divide the entire trajectory into multiple disjoint segments and use historical trajectory data approximated at the time level. Our model captures the characteristics of each segment and utilizes the trajectory characteristics of adjacent segments, since road network topology and speed interact. We use velocity features to effectively represent adjacent segment structures. Experiments on the Porto dataset show that our model performs significantly better than existing models.
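As a heavily simplified illustration of the segment decomposition described above, travel time can be accumulated per road segment from neighbor-informed speed estimates. This sketch shows the decomposition only; it is not the Nei-TTE network, and all names and numbers are hypothetical.

```python
# Segment-wise travel time: sum of (segment length / estimated speed),
# with each speed estimated from neighboring historical traversals.
def estimate_travel_time(segment_lengths_m, neighbor_speeds_mps):
    total = 0.0
    for length, speeds in zip(segment_lengths_m, neighbor_speeds_mps):
        speed = sum(speeds) / len(speeds)  # neighbor-informed speed estimate
        total += length / speed            # seconds spent on this segment
    return total

# Two segments: 300 m with ~9 m/s neighbors, 500 m with ~13 m/s neighbors.
print(estimate_travel_time([300.0, 500.0], [[8.0, 10.0], [12.0, 14.0]]))
```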

128 citations