Author

Hu Cao

Other affiliations: Hunan University
Bio: Hu Cao is an academic researcher from Technische Universität München. The author has contributed to research in topics: Neuromorphic engineering & Convolutional neural network. The author has an h-index of 5 and has co-authored 11 publications receiving 120 citations. Previous affiliations of Hu Cao include Hunan University.

Papers
Journal ArticleDOI
TL;DR: It is expected that this article will serve as a starting point for new researchers and engineers in the autonomous driving field and provide a bird's-eye view to both neuromorphic vision and autonomous driving research communities.
Abstract: As a bio-inspired and emerging sensor, the event-based neuromorphic vision sensor has a different working principle from standard frame-based cameras, which leads to promising properties of low energy consumption, low latency, high dynamic range (HDR), and high temporal resolution. It poses a paradigm shift in sensing and perceiving the environment by capturing local pixel-level light intensity changes and producing asynchronous event streams. Advanced technologies for the visual sensing system of autonomous vehicles, from standard computer vision to event-based neuromorphic vision, have been developed. In this tutorial-like article, a comprehensive review of the emerging technology is given. First, the course of the development of the neuromorphic vision sensor, which is derived from the understanding of the biological retina, is introduced. The signal processing techniques for event noise processing and event data representation are then discussed. Next, the signal processing algorithms and applications for event-based neuromorphic vision in autonomous driving and various assistance systems are reviewed. Finally, challenges and future research directions are pointed out. It is expected that this article will serve as a starting point for new researchers and engineers in the autonomous driving field and provide a bird's-eye view to both the neuromorphic vision and autonomous driving research communities.
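To make the event-stream model described in this abstract concrete, here is a minimal Python/NumPy sketch (not from the article; the (x, y, timestamp, polarity) array layout and the function name are illustrative assumptions) that accumulates asynchronous events into a frame-like image:

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate asynchronous events into a signed event-count image.

    events: (N, 4) array of (x, y, timestamp_us, polarity), polarity in {-1, +1}.
    Returns an (height, width) float image: positive values mark brightness
    increases, negative values mark decreases.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    pol = events[:, 3]
    # np.add.at accumulates correctly even when a pixel fires multiple events.
    np.add.at(frame, (y, x), pol)
    return frame

# Example: three synthetic events on a 4x4 sensor.
ev = np.array([[1, 2, 10, +1], [1, 2, 15, +1], [3, 0, 20, -1]], dtype=np.float32)
print(events_to_frame(ev, 4, 4))
```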

162 citations

Journal ArticleDOI
TL;DR: The first neuromorphic vision-based multivehicle detection and tracking system in ITS is proposed, and the performance of the system is evaluated with a dataset recorded by a neuromorphic vision sensor mounted on a highway bridge.
Abstract: The neuromorphic vision sensor is a new passive sensing modality and a frameless sensor with a number of advantages over traditional cameras. Instead of wastefully sending entire images at a fixed frame rate, a neuromorphic vision sensor only transmits the local pixel-level changes caused by movement in a scene at the time they occur. This results in advantageous characteristics in terms of low energy consumption, high dynamic range, sparse event stream, and low response latency, which can be very useful in intelligent perception systems for the modern intelligent transportation system (ITS), which requires efficient wireless data communication and low-power embedded computing resources. In this paper, we propose the first neuromorphic vision-based multivehicle detection and tracking system in ITS. The performance of the system is evaluated with a dataset recorded by a neuromorphic vision sensor mounted on a highway bridge. We performed a preliminary multivehicle tracking-by-clustering study using three classical clustering approaches and four tracking approaches. Our experimental results indicate that, by making full use of the low latency and the sparse event stream, we can easily integrate an online tracking-by-clustering system running at a high frame rate that far exceeds the real-time capabilities of traditional frame-based cameras. If accuracy is prioritized, the tracking task can also be performed robustly at a relatively high rate with different combinations of algorithms. We also provide our dataset and evaluation approaches, which serve as the first neuromorphic benchmark in ITS and will hopefully motivate further research on neuromorphic vision sensors for ITS solutions.
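As a hedged illustration of the tracking-by-clustering idea, the sketch below clusters one time slice of events with DBSCAN from scikit-learn and turns clusters into centroid detections. The paper evaluates three classical clustering approaches that it does not name here, so DBSCAN and the parameter values are only plausible stand-ins:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_events(xy, eps=5.0, min_samples=20):
    """Cluster the (x, y) event coordinates of one time slice into candidate
    vehicles. Returns one label per event; -1 marks noise. eps/min_samples
    are illustrative, not the paper's settings."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(xy)

def cluster_centroids(xy, labels):
    """Centroid of each cluster; these become the detections handed to a
    downstream tracker."""
    return {k: xy[labels == k].mean(axis=0) for k in set(labels) if k != -1}

# Example: two synthetic event blobs.
xy = np.vstack([np.random.normal(20, 1, (50, 2)), np.random.normal(80, 1, (50, 2))])
labels = cluster_events(xy, eps=3.0, min_samples=10)
print(cluster_centroids(xy, labels))
```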

42 citations

Posted Content
TL;DR: Wang et al. propose Swin-Unet, a pure Transformer-based U-shaped Encoder-Decoder architecture with skip-connections for local-global semantic feature learning in medical image segmentation.
Abstract: In the past few years, convolutional neural networks (CNNs) have achieved milestones in medical image analysis. In particular, deep neural networks based on U-shaped architectures and skip-connections have been widely applied in a variety of medical image tasks. However, although the CNN has achieved excellent performance, it cannot learn global and long-range semantic information interaction well due to the locality of the convolution operation. In this paper, we propose Swin-Unet, which is a Unet-like pure Transformer for medical image segmentation. The tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture with skip-connections for local-global semantic feature learning. Specifically, we use a hierarchical Swin Transformer with shifted windows as the encoder to extract context features, and a symmetric Swin Transformer-based decoder with a patch expanding layer is designed to perform the up-sampling operation and restore the spatial resolution of the feature maps. Under direct down-sampling and up-sampling of the inputs and outputs by 4x, experiments on multi-organ and cardiac segmentation tasks demonstrate that the pure Transformer-based U-shaped Encoder-Decoder network outperforms methods based on full convolution or on combinations of transformer and convolution. The code and trained models will be publicly available at this https URL.
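The patch expanding layer mentioned above can be pictured as channel expansion followed by pixel rearrangement: a linear layer doubles the channels, and the enlarged channel dimension is then carved into a 2x2 block of output pixels. The PyTorch sketch below is one reading of that idea; the class name, channels-last tensor layout, and normalization placement are our assumptions, and the official code at the linked URL is authoritative:

```python
import torch
import torch.nn as nn

class PatchExpanding(nn.Module):
    """2x spatial up-sampling via channel expansion + pixel rearrangement."""

    def __init__(self, dim):
        super().__init__()
        self.expand = nn.Linear(dim, 2 * dim, bias=False)  # C -> 2C
        self.norm = nn.LayerNorm(dim // 2)

    def forward(self, x):
        # x: (B, H, W, C) feature map, channels-last.
        x = self.expand(x)                                   # (B, H, W, 2C)
        B, H, W, C2 = x.shape
        x = x.view(B, H, W, 2, 2, C2 // 4)                   # 2x2 pixel block per token
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(B, 2 * H, 2 * W, C2 // 4)
        return self.norm(x)                                  # (B, 2H, 2W, C/2)

# Usage: double 7x7 resolution while halving channels, e.g. 96 -> 48.
x = torch.randn(1, 7, 7, 96)
print(PatchExpanding(96)(x).shape)  # torch.Size([1, 14, 14, 48])
```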

34 citations

Journal ArticleDOI
TL;DR: This work proposes to develop pedestrian detectors that unlock the potential of the event data by leveraging multi-cue information and different fusion strategies, and introduces three different event-stream encoding methods based on Frequency, Surface of Active Events (SAE), and Leaky Integrate-and-Fire (LIF).
Abstract: Neuromorphic vision sensors are bio-inspired cameras that naturally capture the dynamics of a scene with ultra-low latency, filtering out redundant information with low power consumption. Few works address object detection with this sensor. In this work, we propose to develop pedestrian detectors that unlock the potential of the event data by leveraging multi-cue information and different fusion strategies. To make the best of the event data, we introduce three different event-stream encoding methods based on Frequency, Surface of Active Events (SAE), and Leaky Integrate-and-Fire (LIF). We further integrate them into state-of-the-art neural network architectures with two fusion approaches: channel-level fusion of the raw feature space and decision-level fusion with probability assignments. We present a qualitative and quantitative explanation of why the different encoding methods were chosen for evaluating pedestrian detection and which method performs best. We demonstrate the advantages of decision-level fusion by leveraging multi-cue event information and show that our approach performs well on a self-annotated event-based pedestrian dataset with 8,736 event frames. This work paves the way for more fascinating perception applications with neuromorphic vision sensors.
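To illustrate the three encoding families named in the abstract, here is a simplified Python/NumPy sketch. The exact formulations in the paper may differ; the time constant, normalization, and event layout are illustrative assumptions:

```python
import numpy as np

def encode_frequency(events, height, width):
    """Frequency encoding: per-pixel event count over the time window."""
    img = np.zeros((height, width), dtype=np.float32)
    np.add.at(img, (events[:, 1].astype(int), events[:, 0].astype(int)), 1.0)
    return img

def encode_sae(events, height, width):
    """Surface of Active Events: each pixel stores the (normalized) timestamp
    of its most recent event, so moving edges leave a temporal gradient."""
    sae = np.zeros((height, width), dtype=np.float32)
    for x, y, t, _p in events:          # events assumed time-ordered
        sae[int(y), int(x)] = t
    t_max = events[:, 2].max()
    return sae / t_max if t_max > 0 else sae

def encode_lif(events, height, width, tau=50e3):
    """LIF-style encoding: each event charges its pixel, and the stored value
    decays exponentially between events (a simplified reading of the paper's
    Leaky Integrate-and-Fire encoding; tau is illustrative)."""
    v = np.zeros((height, width), dtype=np.float32)
    last_t = np.zeros((height, width), dtype=np.float32)
    for x, y, t, _p in events:
        xi, yi = int(x), int(y)
        v[yi, xi] = v[yi, xi] * np.exp(-(t - last_t[yi, xi]) / tau) + 1.0
        last_t[yi, xi] = t
    return v
```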

27 citations

Journal ArticleDOI
TL;DR: A parking slot detection method is proposed that uses directional entrance line regression and classification based on a deep convolutional neural network (DCNN) to make detection robust and simple; it achieves a real-time detection speed of 13 ms per frame on a Titan Xp.
Abstract: Due to the complex visual environment and the incomplete display of parking slots in around-view images, vision-based parking slot detection is a major challenge. Previous studies in this field mostly apply existing models to the problem, which involves cumbersome steps. In this paper, we propose a parking slot detection method that uses directional entrance line regression and classification based on a deep convolutional neural network (DCNN) to make detection robust and simple. For parking slots of different shapes observed from different angles, we represent the parking slot as a directional entrance line. Subsequently, we design a DCNN detector to simultaneously obtain the type, position, length, and direction of the entrance line. After that, the complete parking slot can be easily inferred from the detection results and prior geometric information. To verify our method, we conduct experiments on the public ps2.0 dataset and a self-annotated parking slot dataset with 2,135 images. The results show that our method not only outperforms state-of-the-art competitors with a precision rate of 99.68% and a recall rate of 99.41% on the ps2.0 dataset but also generalizes well to the self-annotated dataset. Moreover, it achieves a real-time detection speed of 13 ms per frame on a Titan Xp. By converting the parking slot into a directional entrance line, the specially designed DCNN detector can quickly and effectively detect various types of parking slots.
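To show how a complete slot can be inferred from a directional entrance line plus prior geometry, here is a small geometric sketch in Python. The side convention (slot interior to the left of the start-to-end direction, in an x-right/y-up frame) and the `depth` prior are illustrative assumptions, not the paper's specification:

```python
import numpy as np

def slot_corners_from_entrance(p_start, p_end, depth):
    """Infer the four corners of a rectangular parking slot from its
    directional entrance line and a prior slot depth.

    p_start, p_end: (x, y) endpoints of the entrance line, ordered so the
    slot interior lies to the left of the start->end direction (assumed).
    depth: prior slot depth in the same units as the endpoints.
    """
    p0, p1 = np.asarray(p_start, float), np.asarray(p_end, float)
    d = p1 - p0
    d /= np.linalg.norm(d)
    n = np.array([-d[1], d[0]])        # left-hand normal of the entrance line
    return np.stack([p0, p1, p1 + depth * n, p0 + depth * n])

# Example: a 2.5 m wide entrance with a 5.0 m deep slot.
print(slot_corners_from_entrance((0.0, 0.0), (2.5, 0.0), 5.0))
```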

18 citations


Cited by

Journal ArticleDOI
TL;DR: In this article, a two-compartment leaky integrate-and-fire neuron with partially segregated dendrites is used to solve the credit assignment problem, and a dynamic fixed-point representation method and piecewise linear approximation approach are presented.
Abstract: A critical challenge in neuromorphic computing is to devise computationally efficient learning algorithms. When implementing gradient-based learning, error information must be routed through the network so that each neuron knows its contribution to the output, and thus how to adjust its weights. This is known as the credit assignment problem. Exactly implementing a solution like backpropagation involves weight sharing, which requires additional bandwidth and computation in a neuromorphic system. Instead, models of learning from neuroscience can provide inspiration for how to communicate error information efficiently without weight sharing. Here we present a novel dendritic event-based processing (DEP) algorithm, using a two-compartment leaky integrate-and-fire neuron with partially segregated dendrites, that effectively solves the credit assignment problem. To optimize the proposed algorithm, a dynamic fixed-point representation method and a piecewise linear approximation approach are presented, while the synaptic events are binarized during learning. This optimization makes the proposed DEP algorithm very suitable for implementation in digital or mixed-signal neuromorphic hardware. The experimental results show that spiking representations can be learned rapidly, achieving high performance with the proposed DEP algorithm. We find that the learning capability is affected by the degree of dendritic segregation and by the form of the synaptic feedback connections. This study provides a bridge between biological learning and neuromorphic learning, and is relevant for real-time applications in the field of artificial intelligence.
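As a rough illustration of the two-compartment neuron underlying DEP, the following Python sketch takes one Euler step of a simplified two-compartment LIF model, where the dendrite integrates feedback (error) current separately from the soma's feedforward input. All constants, the coupling form, and the reset rule are illustrative assumptions, not the paper's equations:

```python
def step_two_compartment_lif(v_soma, v_dend, i_ff, i_fb, dt=1.0,
                             tau_s=10.0, tau_d=20.0, g_c=0.5, v_th=1.0):
    """One Euler step of a simplified two-compartment LIF neuron.

    The dendrite integrates the feedback current i_fb; the soma integrates the
    feedforward current i_ff plus a coupling current from the dendrite. This
    partial segregation is what lets feedback carry credit-assignment
    information without weight sharing.
    """
    v_dend = v_dend + dt * (-v_dend + i_fb) / tau_d
    v_soma = v_soma + dt * (-v_soma + i_ff + g_c * (v_dend - v_soma)) / tau_s
    spike = v_soma >= v_th
    if spike:
        v_soma = 0.0                    # reset after a spike
    return v_soma, v_dend, spike

# Example: drive the soma with constant input until it spikes.
v_s, v_d = 0.0, 0.0
for t in range(30):
    v_s, v_d, fired = step_two_compartment_lif(v_s, v_d, i_ff=1.5, i_fb=0.2)
    if fired:
        print(f"spike at step {t}")
```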

114 citations

Journal ArticleDOI
Wei Liu, Xin Xia, Lu Xiong, Lu Yishi, Gao Letian, Zhuoping Yu
TL;DR: In this article, a kinematic model-based VSA estimation method is proposed by fusing information from a global navigation satellite system (GNSS) and an inertial measurement unit (IMU).
Abstract: Vehicle slip angle (VSA) estimation is of paramount importance for connected automated vehicle dynamic control, especially in critical lateral driving scenarios. In this paper, a novel kinematic-model-based VSA estimation method is proposed by fusing information from a global navigation satellite system (GNSS) and an inertial measurement unit (IMU). First, to reject the gravity components induced by vehicle roll and pitch, a vehicle attitude angle observer based on the square-root cubature Kalman filter (SCKF) is designed to estimate the roll and pitch. A novel feedback mechanism for the pitch and roll based on the vehicle's intrinsic information (the steering angle and wheel speeds) is designed. Then, an integration of reverse smoothing and grey prediction is adopted to compensate for the cumulative velocity errors caused by the relatively low sampling rate of the GNSS. Moreover, the GNSS signal delay is addressed by an estimation-prediction integrated framework. Finally, the results confirm that the proposed method can estimate the VSA under both slalom and double lane change (DLC) scenarios.
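The core kinematic relation behind GNSS/IMU slip-angle estimation can be sketched in a few lines of Python: rotate the GNSS ground-frame velocity into the vehicle body frame using the yaw angle and take the arctangent of the lateral over the longitudinal component. The frame conventions below are assumptions, and the paper's full method additionally includes the SCKF attitude observer, smoothing, and delay compensation:

```python
import numpy as np

def slip_angle(v_north, v_east, yaw):
    """Kinematic vehicle slip angle beta = atan2(v_y, v_x).

    v_north, v_east: GNSS velocity components in the ground frame (m/s).
    yaw: heading angle measured from north (rad), e.g. from the attitude
    observer. Sign conventions depend on the chosen frame definitions.
    """
    c, s = np.cos(yaw), np.sin(yaw)
    v_x = c * v_north + s * v_east     # longitudinal (body-frame) velocity
    v_y = -s * v_north + c * v_east    # lateral (body-frame) velocity
    return np.arctan2(v_y, v_x)

# Example: driving 20 m/s nearly north while heading slightly east of north.
print(np.degrees(slip_angle(20.0, 0.5, np.radians(5.0))))
```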

85 citations

Posted Content
TL;DR: This article provides a comprehensive overview of applying deep learning methods to various medical image analysis tasks, highlighting the latest progress and contributions of state-of-the-art unsupervised and semi-supervised deep learning in medical imaging, summarized by application scenario.
Abstract: Deep learning has become the mainstream technology in computer vision, and it has received extensive research interest in developing new medical image processing algorithms to support disease detection and diagnosis. Compared to conventional machine learning technologies, the major advantage of deep learning is that models can automatically identify and recognize representative features through the hierarchical model architecture, avoiding the laborious development of hand-crafted features. In this paper, we review and summarize more than 200 recently published papers to provide a comprehensive overview of applying deep learning methods to various medical image analysis tasks. In particular, we emphasize the latest progress and contributions of state-of-the-art unsupervised and semi-supervised deep learning in medical imaging, summarized based on different application scenarios, including lesion classification, segmentation, detection, and image registration. Additionally, we discuss the major technical challenges and suggest possible solutions for future research efforts.

78 citations

Journal ArticleDOI
TL;DR: In this article, the authors examine claims that emerging autonomous driving technologies create new opportunities for realizing smart and sustainable urban mobility initiatives, noting that some studies have identified shared features among autonomous vehicles.

78 citations