Journal ArticleDOI

A real-time person tracking system based on SiamMask network for intelligent video surveillance

28 Jul 2021-Journal of Real-time Image Processing (Springer Berlin Heidelberg)-Vol. 18, Iss: 5, pp 1803-1814
TL;DR: This work introduces a real-time person tracking and segmentation system that uses an overhead camera perspective. The SiamMask algorithm delivers good results, with a tracking accuracy of 95%, and is compared with other tracking algorithms.
Abstract: Real-time video surveillance systems are widely deployed in various environments, including public areas, commercial buildings, and public infrastructure. Person detection is a key task in video surveillance applications such as person segmentation and tracking. Researchers have presented different image processing and artificial intelligence-based approaches (including machine and deep learning) for person detection and tracking, but these mainly rely on a frontal-view camera perspective. This work introduces a real-time person tracking and segmentation system that uses an overhead camera perspective. The system applies a deep learning-based algorithm, SiamMask, which is simple, versatile, and fast, and which surpasses other real-time tracking algorithms. The algorithm also segments the target person by adding a mask branch to the fully convolutional twin neural network used for tracking. First, person video sequences are obtained from an overhead perspective, and then additional training is performed with the help of transfer learning. Finally, a comparison is performed with other tracking algorithms. The SiamMask algorithm delivers good results, with a tracking accuracy of 95%.
Citations
Journal ArticleDOI
TL;DR: This paper introduces a smart and sustainable conceptual framework that leverages cloud computing, IoT devices, and artificial intelligence to process and obtain necessary information. The framework provides digital analytics and saves results in decentralized cloud repositories through blockchain technology to promote various applications.
Abstract: Advancements in digital technologies, such as the Internet of Things (IoT), fog/edge/cloud computing, and cyber-physical systems, have revolutionized a broad spectrum of smart city applications. Advanced artificial intelligence-based technologies and approaches, such as machine learning and deep learning, which are applied to extract accurate information from extensive data, play an important role in IoT applications. Moreover, the fast adoption of blockchain technology also contributes significantly to the development of the new digital smart city ecosystem. Thus, the convergence of artificial intelligence and blockchain technology is revolutionizing smart city infrastructures to establish sustainable ecosystems for IoT applications. Nevertheless, these advancements and technological improvements bring both opportunities and challenges for developing sustainable IoT applications. This paper aims to examine the convergence of blockchain technology and artificial intelligence, a unique driver of technological transformation in intelligent and sustainable IoT applications. We mainly discuss the advantages of blockchain technology that might promote the advancement and development of sustainable IoT applications. On the basis of this discussion, we introduce a smart and sustainable conceptual framework that leverages cloud computing, IoT devices, and artificial intelligence to process and obtain necessary information. The system provides digital analytics and saves results in decentralized cloud repositories through blockchain technology to promote various applications. Moreover, the layer-based architecture allows a sustainable incentive structure, which may help secure and protect smart city applications. We review the enhanced solutions, summing up the key points that can be applied to build various artificial intelligence and blockchain-based systems. We also discuss the issues that remain open and our future research goals, which can suggest new ideas and future guidelines for sustainable IoT applications.

27 citations

Journal ArticleDOI
TL;DR: This paper presents a real-time, efficient system in which a deep learning-based model, U-Net, is explored for multiple object segmentation in aerial drone images. The experimental results demonstrate that data augmentation improves the model's performance, achieving segmentation accuracies of 92%, 93%, and 95% with the base architectures VGG-16, ResNet-50, and MobileNet, respectively.
Abstract: Real-time object detection and segmentation are considered fundamental but challenging problems in remote sensing and surveillance applications (including satellite and aerial). They play a crucial role in various management and monitoring applications and have received notable attention in recent years. This paper aims to present a real-time, efficient system in which a deep learning-based model, U-Net, is explored for multiple object segmentation in aerial drone images. We perform data augmentation and apply transfer learning to enhance the model efficiency. We experiment with the U-Net segmentation model using different base architectures, including VGG-16, ResNet-50, and MobileNet, compare their performance, and conclude that U-Net (MobileNet) achieves good results. The experimental results demonstrate that data augmentation improves the model's performance, achieving segmentation accuracies of 92%, 93%, and 95% with the base architectures VGG-16, ResNet-50, and MobileNet, respectively.

8 citations

Journal ArticleDOI
TL;DR: The convergence of machine learning with image processing has led to new research opportunities and is useful in a variety of security applications, such as homeland security, surveillance, and identity authentication.
Abstract: The advent of machine learning techniques and image processing techniques has led to new research opportunities in this area. Machine learning has enabled automatic extraction and analysis of information from images. The convergence of machine learning with image processing is useful in a variety of security applications. Image processing plays a significant role in physical as well as digital security. Physical security applications include homeland security, surveillance applications, identity authentication, and so on. Digital security implies protecting digital data. Techniques like digital watermarking, network security, and steganography enable digital security.

4 citations

Journal ArticleDOI
TL;DR: In this paper, a multi-stage extension of the Recurrent Convolution Neural Network (RCNN) model was used to detect COVID-19 infection. The system achieved a mean Average Precision (mAP) of 0.94.

2 citations

References
Book ChapterDOI
08 Oct 2016
TL;DR: A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.
Abstract: The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object’s appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolutional networks. However, when the object to track is not known beforehand, it is necessary to perform Stochastic Gradient Descent online to adapt the weights of the network, severely compromising the speed of the system. In this paper we equip a basic tracking algorithm with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video. Our tracker operates at frame-rates beyond real-time and, despite its extreme simplicity, achieves state-of-the-art performance in multiple benchmarks.

2,936 citations
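
The core operation behind the fully-convolutional Siamese tracker described above is a cross-correlation between features of a template image and features of a larger search image, with the peak of the response map indicating the target position. The following is a minimal sketch of that idea only, assuming single-channel 2-D maps stored as plain Python lists. The names `cross_correlate` and `locate_peak` and the toy data are hypothetical and are not taken from the paper or its code.

```python
def cross_correlate(template, search):
    """Slide a 2-D template over a larger 2-D search map.

    Both inputs are lists of rows. Each entry of the returned response map
    is the dot product of the template with one window of the search map.
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            score = sum(
                template[a][b] * search[i + a][j + b]
                for a in range(th)
                for b in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def locate_peak(response):
    """Return (row, col) of the largest value in the response map."""
    best = max(
        (value, i, j)
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )
    return best[1], best[2]


# Toy data (assumption): an all-zero search map containing one strictly
# positive patch whose top-left corner sits at row 2, column 3.
patch = [[1.0, 2.0], [3.0, 4.0]]
search = [[0.0] * 6 for _ in range(5)]
for a in range(2):
    for b in range(2):
        search[2 + a][3 + b] = patch[a][b]

response = cross_correlate(patch, search)
peak = locate_peak(response)  # top-left corner of the best-matching window
```

In a real tracker the inputs would be multi-channel feature maps produced by the shared convolutional network rather than raw values, and the correlation would typically run as a batched convolution on a GPU.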

Proceedings ArticleDOI
18 Jun 2018
TL;DR: The Siamese region proposal network (Siamese-RPN) is proposed for visual object tracking. It is trained end-to-end off-line with large-scale image pairs and consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork including the classification branch and regression branch.
Abstract: Visual object tracking has been a fundamental topic in recent years, and many deep learning based trackers have achieved state-of-the-art performance on multiple benchmarks. However, most of these trackers can hardly achieve top performance at real-time speed. In this paper, we propose the Siamese region proposal network (Siamese-RPN), which is trained end-to-end off-line with large-scale image pairs. Specifically, it consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork including the classification branch and regression branch. In the inference phase, the proposed framework is formulated as a local one-shot detection task. We can pre-compute the template branch of the Siamese subnetwork and formulate the correlation layers as trivial convolution layers to perform online tracking. Thanks to the proposal refinement, the traditional multi-scale test and online fine-tuning can be discarded. The Siamese-RPN runs at 160 FPS while achieving leading performance in VOT2015, VOT2016 and VOT2017 real-time challenges.

2,016 citations

Posted Content
TL;DR: In this paper, a fully-convolutional Siamese network is trained end-to-end on the ILSVRC15 dataset for object detection in video, which achieves state-of-the-art performance.
Abstract: The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object's appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolutional networks. However, when the object to track is not known beforehand, it is necessary to perform Stochastic Gradient Descent online to adapt the weights of the network, severely compromising the speed of the system. In this paper we equip a basic tracking algorithm with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video. Our tracker operates at frame-rates beyond real-time and, despite its extreme simplicity, achieves state-of-the-art performance in multiple benchmarks.

1,613 citations

Proceedings ArticleDOI
01 Jun 2019
TL;DR: This method improves the offline training procedure of popular fully-convolutional Siamese approaches for object tracking by augmenting their loss with a binary segmentation task, and operates online, producing class-agnostic object segmentation masks and rotated bounding boxes at 55 frames per second.
Abstract: In this paper we illustrate how to perform both visual object tracking and semi-supervised video object segmentation, in real-time, with a single simple approach. Our method, dubbed SiamMask, improves the offline training procedure of popular fully-convolutional Siamese approaches for object tracking by augmenting their loss with a binary segmentation task. Once trained, SiamMask solely relies on a single bounding box initialisation and operates online, producing class-agnostic object segmentation masks and rotated bounding boxes at 55 frames per second. Despite its simplicity, versatility and fast speed, our strategy allows us to establish a new state-of-the-art among real-time trackers on VOT-2018, while at the same time demonstrating competitive performance and the best speed for the semi-supervised video object segmentation task on DAVIS-2016 and DAVIS-2017.

1,162 citations
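
The abstract above notes that SiamMask outputs a binary segmentation mask together with a bounding box for the target. As a simplified illustration of how a box can be derived from a mask, the sketch below computes an axis-aligned box that encloses all foreground pixels. This is an assumption made for clarity: the paper reports rotated bounding boxes, which this sketch does not attempt to reproduce, and the function name `mask_to_box` and the toy mask are hypothetical.

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) enclosing all truthy pixels.

    `mask` is a list of rows of 0/1 values. Returns None for an empty mask.
    """
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


# Toy mask (assumption): a 6 x 8 grid with foreground in rows 2-4, columns 3-6.
mask = [[0] * 8 for _ in range(6)]
for y in range(2, 5):
    for x in range(3, 7):
        mask[y][x] = 1

box = mask_to_box(mask)
```

Here `box` holds inclusive pixel coordinates, so the width of the region is `x_max - x_min + 1`.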

Journal ArticleDOI
TL;DR: The background of deep visual tracking is introduced, including the fundamental concepts of visual tracking and related deep learning algorithms, and the existing deep-learning-based trackers are categorized into three classes according to network structure, network function, and network training.

473 citations