Journal ArticleDOI

Driving Lane Detection on Smartphones using Deep Neural Networks

TL;DR: DeepLane is presented, a system that leverages the back camera of a windshield-mounted smartphone to provide an accurate estimate of the vehicle’s current lane; it detects a vehicle’s lane position with an accuracy of over 90% and has been implemented as an Android app.
Abstract: Current smartphone-based navigation applications fail to provide lane-level information due to poor GPS accuracy. Detecting and tracking a vehicle’s lane position on the road assists in lane-level navigation. For instance, it would be important to know whether a vehicle is in the correct lane for safely making a turn, or whether the vehicle’s speed is compliant with a lane-specific speed limit. Recent efforts have used road network information and inertial sensors to estimate lane position. While inertial sensors can detect lane shifts over short windows, they suffer from error accumulation over time. In this article, we present DeepLane, a system that leverages the back camera of a windshield-mounted smartphone to provide an accurate estimate of the vehicle’s current lane. We employ a deep learning-based technique to classify the vehicle’s lane position. DeepLane does not depend on any infrastructure support such as lane markings and works even when there are no lane markings, a characteristic of many roads in developing regions. We perform extensive evaluation of DeepLane on real-world datasets collected in developed and developing regions. DeepLane can detect a vehicle’s lane position with an accuracy of over 90%, and we have implemented DeepLane as an Android app.
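
A minimal sketch of the kind of classifier the abstract describes, written in PyTorch: a small CNN that maps a windshield-camera frame to a coarse lane-position class. This is not the authors' released model; the three-class label set, the 128x128 input size, and all layer widths are illustrative assumptions.

    # Minimal sketch (not the authors' code): a small CNN that classifies a
    # windshield-camera frame into coarse lane positions. The three classes and
    # the 128x128 input size are illustrative assumptions, not values from the paper.
    import torch
    import torch.nn as nn

    NUM_LANE_CLASSES = 3  # e.g. left / middle / right (assumed label set)

    class LaneClassifier(nn.Module):
        def __init__(self, num_classes=NUM_LANE_CLASSES):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(64 * 32 * 32, 128), nn.ReLU(),
                nn.Linear(128, num_classes),
            )

        def forward(self, x):          # x: (batch, 3, 128, 128) dashcam frames
            return self.classifier(self.features(x))

    logits = LaneClassifier()(torch.randn(1, 3, 128, 128))
    lane = logits.argmax(dim=1)        # predicted lane index for the frame
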
Citations
Proceedings ArticleDOI
15 Nov 2021
TL;DR: Li et al. propose a novel adversarial attack framework with which the attacker can easily fool LiDAR semantic segmentation by placing simple objects (e.g., cardboard and road signs) at certain locations in the physical space.
Abstract: Today, most autonomous vehicles (AVs) rely on LiDAR (Light Detection and Ranging) perception to acquire accurate information about their immediate surroundings. In LiDAR-based perception systems, semantic segmentation plays a critical role as it can divide LiDAR point clouds into meaningful regions according to human perception and provide AVs with semantic understanding of the driving environments. However, an implicit assumption for existing semantic segmentation models is that they are performed in a reliable and secure environment, which may not be true in practice. In this paper, we investigate adversarial attacks against LiDAR semantic segmentation in autonomous driving. Specifically, we propose a novel adversarial attack framework based on which the attacker can easily fool LiDAR semantic segmentation by placing some simple objects (e.g., cardboard and road signs) at some locations in the physical space. We conduct extensive real-world experiments to evaluate the performance of our proposed attack framework. The experimental results show that our attack can achieve more than 90% success rate in real-world driving environments. To the best of our knowledge, this is the first study on physically realizable adversarial attacks against LiDAR point cloud semantic segmentation with real-world evaluations.

17 citations

Journal ArticleDOI
TL;DR: This paper proposes a road health monitoring system that learns deep learning-based classifiers, which run on resource-constrained devices such as smartphones, to identify the type of road from sensor data.
Abstract: Road health monitoring is a prominent area of research in the transportation system to ensure a safer and smoother flow of traffic. It helps to determine the type of road, including smooth, rough, zigzag, etc. Though there exists a significant amount of work towards road health monitoring, most of it requires high-end machines or cloud support to perform road classification. In this paper, we propose a road health monitoring system using sensors. The system learns deep learning-based classifiers, which run on resource-constrained devices such as smartphones, to identify the type of road. The system optimizes the deep neural network model based on the available resources of the resource-constrained device. The optimization is solved to select the appropriate model version that matches the current resource availability of the device. We also evaluate the proposed system on the collected sensory dataset to study the impact of background services and residual energy of the device on accuracy and inference time.
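
The resource-aware model selection the abstract describes can be pictured with a simple rule: pick the most accurate model variant whose footprint fits the device's current budget. The sketch below is only an illustration of that idea; the variant names, memory profiles, and battery threshold are invented, not taken from the paper.

    # Illustrative sketch (not the paper's optimization): choose the model variant
    # whose resource profile fits the device's current memory and battery budget.
    # The variant names, sizes, and thresholds are made-up example values.
    MODEL_VARIANTS = [
        # (name, peak memory in MB, approximate accuracy) -- assumed profiles
        ("tiny",   25, 0.86),
        ("small",  60, 0.90),
        ("full",  140, 0.93),
    ]

    def select_variant(free_memory_mb, battery_pct, min_battery_pct=20):
        # Prefer the most accurate variant that fits in memory; fall back to the
        # smallest one when memory is tight or the battery is low.
        if battery_pct < min_battery_pct:
            return MODEL_VARIANTS[0][0]
        feasible = [v for v in MODEL_VARIANTS if v[1] <= free_memory_mb]
        if not feasible:
            return MODEL_VARIANTS[0][0]
        return max(feasible, key=lambda v: v[2])[0]

    print(select_variant(free_memory_mb=80, battery_pct=55))   # -> "small"
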

15 citations

Journal ArticleDOI
TL;DR: Experimental results show that the proposed real-time road perception method, combined with the 5G-V2X framework, has a faster processing speed and can sense road conditions robustly under various complex real-world conditions.
Abstract: The Internet of Vehicles and information security are key components of a smart city. Real-time road perception is one of the most difficult tasks. Traditional detection methods require manual adjustment of parameters, which is difficult, and is susceptible to interference from object occlusion, light changes, and road wear. Designing a robust road perception algorithm is still challenging. On this basis, we combine artificial intelligence algorithms and the 5G-V2X framework to propose a real-time road perception method. First, an improved model based on Mask R-CNN is implemented to improve the accuracy of detecting lane line features. Then, the linear and polynomial fitting methods of feature points in different fields of view are combined. Finally, the optimal parameter equation of the lane line can be obtained. We tested our method in complex road scenes. Experimental results show that, combined with 5G-V2X, this method ultimately has a faster processing speed and can sense road conditions robustly under various complex actual conditions.
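
The lane-fitting step described above (combining linear and polynomial fits of feature points) can be illustrated with NumPy; this sketch is not the paper's formulation, and the degree-2 polynomial and the residual-based selection rule are assumptions.

    # Rough sketch (not the paper's implementation): fit lane-line feature points,
    # e.g. pixel coordinates produced by a segmentation model, with both a straight
    # line and a quadratic, then keep the fit with the smaller residual.
    import numpy as np

    def fit_lane(xs, ys):
        # Fit x as a function of y (image rows), the usual convention for
        # near-vertical lane lines in a forward-facing camera view.
        line_coeffs, line_res, *_ = np.polyfit(ys, xs, deg=1, full=True)
        poly_coeffs, poly_res, *_ = np.polyfit(ys, xs, deg=2, full=True)
        if poly_res.size and line_res.size and poly_res[0] < line_res[0]:
            return np.poly1d(poly_coeffs)      # curved lane
        return np.poly1d(line_coeffs)          # straight lane

    ys = np.linspace(0, 100, 50)
    xs = 0.002 * ys**2 + 0.5 * ys + 10         # synthetic curved-lane feature points
    lane_model = fit_lane(xs, ys)
    print(lane_model(50))                      # predicted lane x-position at row y = 50
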

10 citations

Journal ArticleDOI
01 Jan 2021
TL;DR: An integrated method for lane detection and vehicle detection is proposed, which performs real-time analysis of vehicle video to identify lanes and to detect and track forward vehicles.
Abstract: In this paper, we propose an integrated method for lane detection and vehicle detection, which tries to make a real-time analysis of vehicle video for identifying lanes and detecting and tracking forward vehicles. In lane detection, image preprocessing, line detection based on an improved Hough transform, and straight-line model reconstruction are used. For vehicle detection, preprocessing, vehicle shadow merging based on an improved search algorithm, regions of interest (ROI) demarcation, lane determination, and vehicle tracking are used. The experimental results show that the time it takes to process an image is about 25 ms. Additionally, the lane detection rate of vehicles driving on a structured road is approximately 98%, and the vehicle detection rate of the closest forward vehicle is approximately 81%.
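
A classic Hough-transform lane detector of the kind the abstract builds on can be sketched in a few lines of OpenCV; the thresholds and the lower-half region of interest below are typical example values, not the parameters used in the paper.

    # Illustrative sketch (not the paper's pipeline): edge detection plus a
    # probabilistic Hough transform inside a simple region of interest.
    import cv2
    import numpy as np

    def detect_lane_lines(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)

        # Keep only the lower half of the image, a crude ROI in front of the vehicle.
        h, w = edges.shape
        mask = np.zeros_like(edges)
        mask[h // 2:, :] = 255
        roi_edges = cv2.bitwise_and(edges, mask)

        # Returns line segments as (x1, y1, x2, y2), or None if nothing is found.
        return cv2.HoughLinesP(roi_edges, rho=1, theta=np.pi / 180, threshold=50,
                               minLineLength=40, maxLineGap=20)

    # lines = detect_lane_lines(cv2.imread("road_frame.jpg"))
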

7 citations


Cites methods from "Driving Lane Detection on Smartphon..."

  • ...[4-6] use deep neural networks to train the data set and extract lane features....


Journal ArticleDOI
TL;DR: A novel decoupled two-step efficient generative model (EGM) is presented, which contains a conditional reference generator and a conditional adversarial transformer, together with a new strategy, augmentation of adversarial labels, that produces dynamic target labels and enhances the exploration ability of EGM.
Abstract: Unrestricted adversarial examples allow the attacker to start attacks without given clean samples, which are quite aggressive and threatening. However, existing works for generating unrestricted adversarial examples are quite inefficient and cannot achieve a high success rate. In this article, we explore an end-to-end and effective solution for unrestricted adversarial example generation. To stabilize the training process and make our generative model converge to satisfactory results, we design a novel decoupled two-step efficient generative model (EGM), which contains a conditional reference generator and a conditional adversarial transformer. The former is responsible for generating reference samples from noises and source classes. The latter is responsible for converting the reference sample into adversarial examples corresponding to target classes. To improve the success rate, we design a new strategy, augmentation of adversarial labels, to produce dynamic target labels and enhance the exploration ability of EGM. Such a strategy can also be applied to existing attacks to improve their attack success rates, which is of independent interest. We conduct extensive experiments to evaluate our proposed model and demonstrate the necessity of decoupling the generation process in EGM. Experimental results show our EGM is much faster and achieves a higher success rate than the state-of-the-art attacks.
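
The decoupled two-step structure described in the abstract can be pictured as two conditional modules chained together. The skeleton below only mirrors that structure; the MLP internals, dimensions, and one-hot conditioning are placeholders, not the authors' architecture.

    # Bare structural skeleton (not the authors' EGM): a conditional reference
    # generator maps noise plus a source class to a reference sample, and a
    # conditional adversarial transformer maps that sample plus a target class
    # to an adversarial example.
    import torch
    import torch.nn as nn

    class ReferenceGenerator(nn.Module):
        def __init__(self, noise_dim=64, num_classes=10, sample_dim=784):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(noise_dim + num_classes, 256), nn.ReLU(),
                                     nn.Linear(256, sample_dim), nn.Tanh())

        def forward(self, z, source_onehot):
            return self.net(torch.cat([z, source_onehot], dim=1))

    class AdversarialTransformer(nn.Module):
        def __init__(self, num_classes=10, sample_dim=784):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(sample_dim + num_classes, 256), nn.ReLU(),
                                     nn.Linear(256, sample_dim), nn.Tanh())

        def forward(self, reference, target_onehot):
            return self.net(torch.cat([reference, target_onehot], dim=1))

    g, t = ReferenceGenerator(), AdversarialTransformer()
    z = torch.randn(4, 64)
    src = torch.eye(10)[torch.tensor([0, 1, 2, 3])]   # source-class one-hot vectors
    tgt = torch.eye(10)[torch.tensor([5, 5, 5, 5])]   # target-class one-hot vectors
    adv = t(g(z, src), tgt)                           # unrestricted adversarial examples
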

5 citations

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
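
The residual reformulation the abstract describes amounts to letting each block learn a correction F(x) and outputting F(x) + x, so that the identity mapping is the easy default. A minimal basic block, simplified from the ImageNet architecture:

    # Simplified residual block: the skip connection adds the block input back
    # to the learned residual before the final activation.
    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)   # identity shortcut

    y = ResidualBlock(64)(torch.randn(1, 64, 32, 32))
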

123,388 citations

Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, achieves state-of-the-art performance on ImageNet classification as discussed by the authors.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
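
The dropout regularization mentioned in the abstract is applied in the fully-connected layers; a toy illustration (not the five-convolutional-layer network from the paper):

    # Toy convolutional classifier with dropout in the fully-connected layers.
    import torch
    import torch.nn as nn

    toy_net = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Dropout(p=0.5),               # randomly zero half the activations during training
        nn.Linear(16 * 16 * 16, 256), nn.ReLU(),
        nn.Dropout(p=0.5),
        nn.Linear(256, 10),              # class scores; softmax is folded into the loss
    )

    logits = toy_net(torch.randn(8, 3, 32, 32))   # batch of eight 32x32 toy images
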

73,978 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
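
The multi-scale intuition behind Inception can be sketched as a module that runs 1x1, 3x3, and 5x5 convolutions plus a pooled branch in parallel and concatenates them along the channel dimension. The channel counts below are illustrative, not GoogLeNet's.

    # Simplified Inception-style module: parallel branches concatenated channel-wise.
    import torch
    import torch.nn as nn

    class InceptionModule(nn.Module):
        def __init__(self, in_ch):
            super().__init__()
            self.branch1 = nn.Conv2d(in_ch, 16, 1)
            self.branch3 = nn.Sequential(nn.Conv2d(in_ch, 16, 1), nn.ReLU(),
                                         nn.Conv2d(16, 24, 3, padding=1))
            self.branch5 = nn.Sequential(nn.Conv2d(in_ch, 8, 1), nn.ReLU(),
                                         nn.Conv2d(8, 8, 5, padding=2))
            self.branch_pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                             nn.Conv2d(in_ch, 16, 1))

        def forward(self, x):
            return torch.cat([self.branch1(x), self.branch3(x),
                              self.branch5(x), self.branch_pool(x)], dim=1)

    out = InceptionModule(64)(torch.randn(1, 64, 28, 28))   # -> (1, 64, 28, 28)
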

40,257 citations

Journal ArticleDOI
TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and is the most memory-efficient at inference compared to other architectures, including FCN and DeconvNet.
Abstract: We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well known DeepLab-LargeFOV [3] and DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory-wise as compared to other architectures. We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet/.
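
SegNet's distinguishing mechanism, reusing the encoder's max-pooling indices for non-linear upsampling in the decoder, can be shown with a single encoder/decoder pair; a real SegNet stacks many such blocks and follows the VGG16 layer layout.

    # Minimal encoder/decoder pair illustrating pooling-index reuse for upsampling.
    import torch
    import torch.nn as nn

    class TinyEncoderDecoder(nn.Module):
        def __init__(self, num_classes=12):
            super().__init__()
            self.enc_conv = nn.Conv2d(3, 16, 3, padding=1)
            self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)   # keep indices
            self.unpool = nn.MaxUnpool2d(2, stride=2)                    # reuse them
            self.dec_conv = nn.Conv2d(16, num_classes, 3, padding=1)

        def forward(self, x):
            feat = torch.relu(self.enc_conv(x))
            pooled, indices = self.pool(feat)
            upsampled = self.unpool(pooled, indices, output_size=feat.size())
            return self.dec_conv(upsampled)   # per-pixel class scores

    scores = TinyEncoderDecoder()(torch.randn(1, 3, 64, 64))   # -> (1, 12, 64, 64)
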

13,468 citations


"Driving Lane Detection on Smartphon..." refers background or methods in this paper

  • ...The encoder network in SegNet compresses the input image into a low-resolution feature map....


  • ...In this work, we employ SegNet [14], which is a popular pixel-wise image segmentation technique based on a deep encoder-decoder architecture....


  • ...On these classes SegNet performs significantly better than identifying pixels for vehicles in a scene....


  • ...SegNet is trained on a dataset that has pixel-wise annotations for various objects such as buildings, vehicles, roads, pavement, poles, and traffic signs....


  • ...Recent advances in deep learning have enabled pixel-wise segmentation of an image, not just to identify different objects but also their precise contours [14]....


Posted Content
TL;DR: This paper quantifies the generality versus specificity of neurons in each layer of a deep convolutional neural network and reports a few surprising results, including that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
Abstract: Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
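
The transfer-learning recipe this paper analyzes, initializing a network with features trained on one task and fine-tuning on another, is commonly applied as follows; the choice of ResNet-18, the frozen layers, and the 3-class target head are illustrative assumptions, not the paper's experimental setup.

    # Typical transfer-learning sketch with torchvision: reuse ImageNet features,
    # replace the classification head, and (optionally) freeze the transferred layers.
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

    for param in model.parameters():        # freeze transferred feature layers
        param.requires_grad = False

    model.fc = nn.Linear(model.fc.in_features, 3)   # new head for a 3-class target task

    # Only the new head's parameters are passed to the optimizer:
    trainable = [p for p in model.parameters() if p.requires_grad]
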

4,663 citations


"Driving Lane Detection on Smartphon..." refers background or methods in this paper

  • ...Recent works have shown that pre-trained models have a strong ability to generalize to images outside the ImageNet dataset via transfer learning [22, 36]....


  • ...To circumvent this, we employ transfer learning [29, 31, 36], wherein models trained for one task capture relations in the data that can be reused for different problems in the same domain....
