Author

Aijun Shi

Bio: Aijun Shi is an academic researcher from Shandong University. The author has contributed to research in topics: Object detection & Analytics. The author has an h-index of 1 and has co-authored 1 publication receiving 6 citations.

Papers
Proceedings ArticleDOI
01 Feb 2019
TL;DR: Deep learning methods are applied, using the state-of-the-art instance segmentation framework Mask R-CNN, to fine-tune a network on custom datasets; the resulting model efficiently detects objects in video frames while simultaneously generating a high-quality segmentation mask for each instance.
Abstract: With the rapid development of information technology, video surveillance systems have become a key part of the security and protection systems of modern cities. Especially in prisons, surveillance cameras can be found almost everywhere. However, with the continuous expansion of the surveillance network, surveillance cameras not only bring convenience but also produce a massive amount of monitoring data, which poses huge challenges for storage, analytics and retrieval. A smart monitoring system equipped with intelligent video analytics technology can monitor as well as pre-alarm abnormal events or behaviours, which is a hot research direction in the field of surveillance. This paper applies deep learning methods, using the state-of-the-art instance segmentation framework Mask R-CNN, to fine-tune a network on our datasets that can efficiently detect objects in a video image while simultaneously generating a high-quality segmentation mask for each instance. The experiments show that our network is simple to train, generalizes easily to other datasets, and reaches a mask average precision of nearly 98.5% on our own datasets.
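The abstract describes fine-tuning Mask R-CNN on a custom surveillance dataset. The paper does not specify its implementation; the following is a minimal sketch of that workflow using torchvision, where the class count and input frame are assumptions.

```python
# Minimal sketch of fine-tuning Mask R-CNN (not the paper's actual code).
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 2  # assumption: background + one surveillance-relevant class

# Start from a COCO-pretrained Mask R-CNN and replace both prediction heads
# so the model can be fine-tuned on a custom dataset.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, NUM_CLASSES)

# Inference on a single video frame: the output contains boxes, labels,
# scores and a per-instance segmentation mask, as described in the abstract.
model.eval()
frame = torch.rand(3, 480, 640)  # placeholder for a decoded video frame
with torch.no_grad():
    prediction = model([frame])[0]
print(prediction["boxes"].shape, prediction["masks"].shape)
```

After replacing the heads, the same model object is trained in the usual torchvision detection loop (images and targets passed together return a loss dictionary).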

11 citations


Cited by
Journal ArticleDOI
TL;DR: In this paper, an AI-powered threat detector for smart surveillance cameras, called Hawk-Eye, is presented, which can be deployed on centralized servers hosted in the cloud and locally on the surveillance cameras at the network edge.
Abstract: With recent advances in both AI and IoT capabilities, it is now more feasible than ever to implement surveillance systems that can automatically identify, in real time, people who might represent a potential security threat to the public. Imagine a surveillance camera system that can detect various on-body weapons, masked faces, suspicious objects and traffic. Such a system could transform surveillance cameras from passive sentries into active observers and help prevent a possible mass shooting in a school, stadium or mall. In this paper, we present a prototype implementation of such a system, Hawk-Eye, an AI-powered threat detector for smart surveillance cameras. Hawk-Eye can be deployed on centralized servers hosted in the cloud, as well as locally on the surveillance cameras at the network edge. Deploying AI-enabled surveillance applications at the edge enables the initial analysis of the captured images to take place on-site, which reduces communication overheads and enables swift security actions. At the cloud side, we built a Mask R-CNN model that can detect suspicious objects in an image captured by a camera at the edge. The model can generate a high-quality segmentation mask for each object instance in the image, along with the confidence percentage and classification time. The camera side uses a Raspberry Pi 3 device, an Intel Neural Compute Stick 2 (NCS 2), and a Logitech C920 webcam. At the camera side, we built a CNN model that consumes a stream of images directly from an on-site webcam, classifies them, and displays the results to the user via a user-friendly GUI. A motion detection module is developed to capture images automatically from the video when new motion is detected. Finally, we evaluated our system using performance metrics such as classification time and accuracy. Our experimental results show an average overall prediction accuracy of 94% on our dataset.
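The abstract mentions a motion detection module that captures frames when new motion appears. The paper does not give its implementation; below is a minimal sketch of such a module using OpenCV background subtraction, where the camera index, threshold and output paths are assumptions.

```python
# Sketch of a motion-triggered capture step (not the Hawk-Eye implementation).
import cv2

cap = cv2.VideoCapture(0)                       # assumed on-site webcam index
back_sub = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=25)
MOTION_PIXELS = 5000                            # assumed sensitivity threshold
frame_id = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = back_sub.apply(frame)             # foreground = moving pixels
    if cv2.countNonZero(fg_mask) > MOTION_PIXELS:
        # New motion detected: save the frame so it can be handed to the
        # classifier (the cloud Mask R-CNN or the edge CNN on the NCS 2).
        cv2.imwrite(f"capture_{frame_id:06d}.jpg", frame)
        frame_id += 1

cap.release()
```

Capturing only on motion keeps the edge device from streaming every frame to the classifier, which matches the paper's goal of reducing communication overheads.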

27 citations

Book ChapterDOI
01 Jan 2021
TL;DR: In this article, the authors use an Nvidia Jetson Nano board to run a convolutional neural network for the facial recognition process, enabling the system to detect an intruder and inform security within seconds.
Abstract: Facial recognition systems are widely used to identify and verify a person's face from an image or video source. With the continuous expansion of surveillance systems, surveillance cameras not only bring convenience but also produce a massive amount of monitoring data, which poses huge challenges for storage, analytics, and retrieval. A smart monitoring system equipped with intelligent video analytics technology can monitor as well as pre-alarm abnormal events or behaviors. Here, the proposed system detects an intruder and informs security within seconds. An Nvidia Jetson Nano board runs the convolutional neural network algorithm for the facial recognition process. The basic idea is to maintain a database of known faces: the system takes the feed from the surveillance camera, runs a facial recognition algorithm on it, matches each detected face against the ones already stored in the database, and, if it finds a new face, sends an alert to security personnel. This helps increase security in places where many people gather at a time, for example schools, colleges and universities.
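The chapter does not name a specific library for the face-matching step. The sketch below illustrates the described workflow, matching camera faces against a stored database and alerting on unknown faces, using the open-source face_recognition package as an assumption; the image paths and send_alert() hook are hypothetical.

```python
# Sketch of the known-face database check described above (illustrative only).
import cv2
import face_recognition

# Database of known faces (paths are placeholders).
known_encodings = []
for path in ["staff_01.jpg", "staff_02.jpg"]:
    image = face_recognition.load_image_file(path)
    known_encodings.append(face_recognition.face_encodings(image)[0])

def send_alert(frame):
    # Hypothetical notification hook: e.g. forward the frame to security staff.
    cv2.imwrite("intruder.jpg", frame)

cap = cv2.VideoCapture(0)  # surveillance camera stream
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    for encoding in face_recognition.face_encodings(rgb):
        matches = face_recognition.compare_faces(known_encodings, encoding)
        if not any(matches):          # face not in the database -> possible intruder
            send_alert(frame)
cap.release()
```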

3 citations

Proceedings ArticleDOI
23 Sep 2022
TL;DR: In this paper, a comparison of existing frameworks and datasets which are related to video-type datasets only is presented, and the pros and cons of current existing works and further research directions based on existing works are provided.
Abstract: Video-based human behaviour recognition is currently a foundational task in computer vision. Conventional recognition frameworks are built on images only; with the wide use of surveillance video, and because human behaviours are increasingly tied to temporal information, video-based behaviour recognition has been widely studied. This paper addresses multiple current research works and compares the variants of these algorithms. The methodologies are separated into traditional recognition methods and deep learning based methods, which have reached the highest accuracy in current research. Our paper compares existing frameworks and datasets related to video-type datasets only. We investigate multiple types of neural networks utilized for behaviour information extraction, as well as the challenges facing conventional and current methods. Deep learning based methodologies for extracting spatial-temporal information involve plenty of frameworks. We compare the pros and cons of existing works and provide further research directions based on them.
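For context on the spatial-temporal extraction the survey refers to, the sketch below is a deliberately tiny 3D CNN over a clip of frames; it is not any specific framework from the paper, and all layer sizes and the clip shape are assumptions.

```python
# Illustrative-only 3D CNN: one example of joint spatial-temporal feature extraction.
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),   # joint space-time convolution
            nn.ReLU(inplace=True),
            nn.MaxPool3d((1, 2, 2)),                      # pool space, keep temporal length
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),                      # collapse time and space
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, clip):                              # clip: (N, 3, T, H, W)
        x = self.features(clip).flatten(1)
        return self.classifier(x)

clip = torch.rand(2, 3, 16, 112, 112)    # 2 clips of 16 RGB frames each
logits = Tiny3DCNN()(clip)
print(logits.shape)                       # torch.Size([2, 10])
```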

1 citation
