
Showing papers on "Smart camera published in 2021"


Journal ArticleDOI
TL;DR: A collaborative robotics framework that can assist in the detection and tracking of multiple objects in top-view surveillance is presented, and its generalization performance is investigated by testing the models on various sequences of a top-view dataset.
Abstract: Collaborative robotics is one of the high-interest research topics in academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve several object detection and tracking tasks. These tasks are considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variation are commonplace. To show the importance of top-view surveillance, a collaborative robotics framework is presented that can assist in the detection and tracking of multiple objects in top-view surveillance. The framework consists of a smart robotic camera embedded with a visual processing unit. The existing pre-trained deep learning models SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING, which help to track and predict the trajectories of detected objects. Because pre-trained models are employed, the generalization performance is also investigated by testing the models on various sequences of a top-view dataset. The detection models achieved a maximum True Detection Rate of 90% to 93% with a maximum False Detection Rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed along with future guidelines.
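
A minimal sketch of the detect-then-track handoff described above, using OpenCV's DNN module and the legacy tracker API (requires opencv-contrib-python). The model files, video path, and thresholds are placeholders, not the paper's actual assets.

```python
# Minimal detect-then-track handoff (illustrative sketch, hypothetical assets).
import cv2

# Pre-trained SSD loaded via OpenCV's DNN module (hypothetical file names).
net = cv2.dnn.readNetFromCaffe("ssd_deploy.prototxt", "ssd_weights.caffemodel")

# Any of the trackers compared in the paper can be swapped in here.
TRACKERS = {
    "KCF": cv2.legacy.TrackerKCF_create,
    "MIL": cv2.legacy.TrackerMIL_create,
    "MEDIANFLOW": cv2.legacy.TrackerMedianFlow_create,
    "BOOSTING": cv2.legacy.TrackerBoosting_create,
}

cap = cv2.VideoCapture("top_view_sequence.mp4")
ok, frame = cap.read()

# 1) Detect once with the SSD to initialise tracker boxes.
blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()

trackers = []
h, w = frame.shape[:2]
for i in range(detections.shape[2]):
    if detections[0, 0, i, 2] > 0.5:  # confidence threshold
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
        t = TRACKERS["KCF"]()
        t.init(frame, (int(x1), int(y1), int(x2 - x1), int(y2 - y1)))
        trackers.append(t)

# 2) Track on subsequent frames (far cheaper than re-detecting every frame).
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for t in trackers:
        ok, box = t.update(frame)
        if ok:
            x, y, bw, bh = map(int, box)
            cv2.rectangle(frame, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
```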

47 citations


Proceedings ArticleDOI
Yiman Zhang1, Hanting Chen1, Xinghao Chen1, Yiping Deng1, Chunjing Xu1, Yunhe Wang1 
20 Jun 2021
TL;DR: Zhang et al. propose a data-free compression approach for the single image super-resolution (SISR) task, which analyzes the relationship between the outputs and inputs of the pre-trained network and explores a generator with a series of loss functions to maximally capture useful information.
Abstract: Convolutional network compression methods require training data to achieve acceptable results, but training data is routinely unavailable due to privacy and transmission limitations. Therefore, recent works focus on learning efficient networks without the original training data, i.e., data-free model compression. Most existing algorithms are developed for image recognition or segmentation tasks. In this paper, we study a data-free compression approach for the single image super-resolution (SISR) task, which is widely used in mobile phones and smart cameras. Specifically, we analyze the relationship between the outputs and inputs of the pre-trained network and explore a generator with a series of loss functions for maximally capturing useful information. The generator is then trained to synthesize training samples whose distribution is similar to that of the original data. To further alleviate the difficulty of training the student network using only synthetic data, we introduce a progressive distillation scheme. Experiments on various datasets and architectures demonstrate that the proposed method can effectively learn portable student networks without the original data, e.g., with a 0.16 dB PSNR drop on Set5 for ×2 super-resolution. Code will be available at https://github.com/huawei-noah/Data-Efficient-Model-Compression.
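
A simplified sketch of a data-free distillation loop under stated assumptions: the generator is trained to produce samples on which student and teacher disagree, and the student is trained to mimic the teacher on those samples. `Generator`, `teacher`, and `student` are stand-in modules; the paper's actual loss family and progressive schedule are richer than this.

```python
# Simplified data-free distillation step (PyTorch sketch; modules are stand-ins).
import torch
import torch.nn.functional as F

def distill_step(generator, teacher, student, g_opt, s_opt, batch=16):
    # 1) Generator synthesises fake low-resolution inputs from noise.
    z = torch.randn(batch, 64)
    fake_lr = generator(z)                      # (B, 3, H, W) synthetic images

    # 2) Train the generator to produce "hard" samples: maximise the
    #    student-teacher gap so the synthesised data carries information.
    with torch.no_grad():
        t_out = teacher(fake_lr)
    g_loss = -F.l1_loss(student(fake_lr), t_out)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # 3) Train the student to mimic the teacher on the synthetic batch.
    fake_lr = fake_lr.detach()
    with torch.no_grad():
        t_out = teacher(fake_lr)
    s_opt.zero_grad()
    s_loss = F.l1_loss(student(fake_lr), t_out)
    s_loss.backward(); s_opt.step()
    return g_loss.item(), s_loss.item()
```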

39 citations


Journal ArticleDOI
TL;DR: A survey of methods suitable for porting deep neural networks to resource-limited devices, especially smart cameras, covering methods to enhance network structures as well as neural architecture search techniques.
Abstract: Over the past years, deep neural networks have proved to be an essential element for developing intelligent solutions. They have achieved remarkable performance, at the cost of deeper layers and millions of parameters. Therefore, utilising these networks on limited-resource platforms for smart cameras is a challenging task. In this context, models need to be (i) accelerated and (ii) memory efficient without significantly compromising performance. Numerous works aim to obtain smaller, faster, and more accurate models. This paper presents a survey of methods suitable for porting deep neural networks to resource-limited devices, especially smart cameras. These methods can be roughly divided into two main parts. In the first part, we present compression techniques, categorized into knowledge distillation, pruning, quantization, hashing, reduction of numerical precision, and binarization. In the second part, we focus on architecture optimization: we introduce methods to enhance network structures as well as neural architecture search techniques. In each part, we describe the different methods and analyse them. Finally, we conclude with a discussion of these methods.
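
Two of the surveyed compression families are easy to illustrate concretely. The sketch below applies magnitude pruning and post-training dynamic quantization to a toy PyTorch model; it is generic, not any specific network from the survey.

```python
# Magnitude pruning + dynamic quantization on a toy model (PyTorch sketch).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Pruning: zero out the 50% smallest-magnitude weights of each Linear layer.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the sparsity permanent

# Quantization: store and compute Linear weights in int8 at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
print(quantized)
```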

32 citations


Journal ArticleDOI
TL;DR: A novel application of the image-to-world homography that gives the monocular vision system the ability to count vehicles by lane and estimate vehicle length and speed in real-world units.
Abstract: Cameras have been widely used in traffic operations. While many technologically smart camera solutions on the market can be integrated into Intelligent Transport Systems (ITS) for automated detection, monitoring, and data generation, many Network Operations (a.k.a. Traffic Control) Centres still use legacy camera systems as manual surveillance devices. In this paper, we demonstrate effective use of these older assets by applying computer vision techniques to extract traffic data from videos captured by legacy cameras. In our proposed vision-based pipeline, we adopt recent state-of-the-art object detectors and transfer learning to detect vehicles, pedestrians, and cyclists in monocular videos. By weakly calibrating the camera, we demonstrate a novel application of the image-to-world homography, which gives our monocular vision system the ability to count vehicles by lane and estimate vehicle length and speed in real-world units. Our pipeline also includes a module that combines a convolutional neural network (CNN) classifier with projective geometry information to classify vehicles. We have tested the pipeline on videos captured at several sites with different traffic flow conditions and compared the results with data collected by piezoelectric sensors. Our experimental results show that the proposed pipeline can process 60 frames per second for pre-recorded videos and yields high-quality metadata for further traffic analysis.
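
A minimal sketch of the image-to-world homography idea: four point correspondences between pixels and road-plane metres calibrate the plane, after which centroid displacement gives speed in real-world units. All coordinates below are invented for illustration.

```python
# Pixel-to-world mapping via a weakly calibrated homography (sketch).
import cv2
import numpy as np

# Image points of, e.g., lane markings (pixels) and their road-plane
# coordinates (metres); hypothetical values.
img_pts = np.array([[320, 400], [960, 410], [880, 700], [400, 690]], np.float32)
world_pts = np.array([[0, 0], [3.5, 0], [3.5, 20], [0, 20]], np.float32)

H, _ = cv2.findHomography(img_pts, world_pts)

def to_world(u, v):
    """Map an image point (u, v) onto the road plane in metres."""
    p = cv2.perspectiveTransform(np.array([[[u, v]]], np.float32), H)
    return p[0, 0]  # (x_m, y_m)

# Speed from two vehicle centroids observed dt seconds apart.
p1, p2 = to_world(512, 520), to_world(530, 470)
dt = 0.5
speed_kmh = np.linalg.norm(p2 - p1) / dt * 3.6
print(f"estimated speed: {speed_kmh:.1f} km/h")
```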

30 citations


Proceedings ArticleDOI
15 Nov 2021
TL;DR: In this article, a decentralized federated learning framework for the neighborhood is proposed, with a gradient selection mechanism that reduces the number of aggregated gradients and the frequency of gradient broadcasting to achieve communication-efficient training and high prediction accuracy.
Abstract: The fast-growing Internet of Things (IoT) has provided its users with opportunities to improve the user experience, such as voice assistants, smart cameras, and home energy management systems. Such smart home applications often require large amounts of diverse training data to build a robust model. As a single user may not have enough data to train such a model, users intend to train collaboratively on their collected data to achieve better performance, which raises the concern of data privacy protection. Existing approaches to collaborative training need to aggregate data or intermediate model updates in the cloud to perform load forecasting, which could directly or indirectly cause personal data leakage, along with significant communication bandwidth and extra cloud service monetary cost. In this paper, to ensure the performance of smart home applications as well as the protection of user data privacy, we introduce a decentralized federated learning framework for the neighborhood and study residential building load forecasting as an example application. We present PriResi, a privacy-preserving, communication-efficient, and cloud-service-free load forecasting system that solves the above problems in a residential building. We first introduce a decentralized federated learning framework, which allows residents to process all collected data locally on the edge by broadcasting model updates between the smart home agents in each residence. Second, we propose a gradient selection mechanism to reduce the number of aggregated gradients and the frequency of gradient broadcasting, achieving communication efficiency and high prediction accuracy. Experiments on a real-world dataset show that our method can achieve 97% load forecasting accuracy while preserving residents' privacy. We believe that our proposed decentralized federated learning framework can be widely used in other smart home applications as well.
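
One plausible reading of the gradient selection step, sketched below: keep only the top-k gradient entries by magnitude before broadcasting, which cuts communication between home agents. The paper's exact selection rule may differ.

```python
# Top-k gradient sparsification before broadcasting (PyTorch sketch).
import torch

def select_gradients(model, keep_ratio=0.1):
    """Return a sparse update dict with only the largest gradients kept."""
    update = {}
    for name, param in model.named_parameters():
        if param.grad is None:
            continue
        g = param.grad.flatten()
        k = max(1, int(keep_ratio * g.numel()))
        _, idx = torch.topk(g.abs(), k)
        sparse = torch.zeros_like(g)
        sparse[idx] = g[idx]
        update[name] = sparse.view_as(param.grad)
    return update  # broadcast this to neighbouring home agents
```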

25 citations


Journal ArticleDOI
TL;DR: An artificial intelligence network-based smart camera system prototype that tracks social distancing from a bird’s-eye perspective and can be integrated into public spaces within the scope of “sustainable smart cities,” a transformation the world is on the verge of.

23 citations


Journal ArticleDOI
TL;DR: In this paper, a smart thermography camera is designed and its application in the diagnosis of electrical equipment is investigated; the existing defect assessment indicators are mainly the hot spot temperature and the relative temperature difference, and the analysis used to calculate them is usually performed off-site.
Abstract: The thermography camera is widely used to inspect electrical equipment. The existing defect assessment indicators are mainly the hot spot temperature and the relative temperature difference (RTD), and the analysis used to calculate these indicators is usually performed off-site. A smart thermography camera is designed and its application in the diagnosis of electrical equipment is investigated in this article. For the camera, the regional RTD is calculated automatically based on equipment detection and image registration algorithms, and defects can be judged according to the existing criteria. Firstly, the regional RTD and its implementation method are proposed to reduce the error when the hot spot is unmeasurable. Then the object detection method, based on a convolutional neural network (CNN), is modified specifically for infrared (IR) images. Over 12 000 historical images and 18 000 labels are used for training and testing. The modified model can identify 16 classes of substation equipment with 86.4% mAP. With low latency, the inference speed of this on-site smart camera reaches 30 frames/s. The test results show that the similarity of diagnosis results between our method and the criteria is 92.2%.
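
For reference, the relative temperature difference as commonly defined in IR diagnosis guides is sketched below; the paper computes a regional variant of this quantity automatically from detected equipment regions.

```python
# Relative temperature difference for IR diagnosis of electrical equipment
# (sketch of the commonly used definition, with illustrative temperatures).
def relative_temperature_difference(t_hot, t_normal, t_ambient):
    """RTD (%) = (T1 - T2) / (T1 - T0) * 100, with T1 the hotter region,
    T2 the corresponding normal region, and T0 the ambient temperature."""
    return (t_hot - t_normal) / (t_hot - t_ambient) * 100.0

# Example: a joint at 58 C, its counterpart at 35 C, ambient 25 C.
print(f"RTD = {relative_temperature_difference(58, 35, 25):.1f} %")  # ~69.7 %
```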

23 citations


Proceedings ArticleDOI
13 Feb 2021
TL;DR: Building a smart camera from a conventional CMOS Image Sensor (CIS), which only outputs the raw data of the captured image, requires additional ICs for signal and CNN processing, resulting in a system solution that is larger, higher power, and more costly.
Abstract: Within Internet of Things (IoT) markets such as retail and smart cities, the need for camera products with Artificial Intelligence (AI) processing capabilities is growing. AI processing capability on such edge devices solves some issues of cloud-only computing systems, such as latency, cloud communication, processing cost, and privacy concerns. The market demands for smart cameras with AI processing capabilities include small size, low cost, low power, and ease of installation. However, conventional CMOS Image Sensors (CIS) only output the raw data of the captured image. Therefore, when developing a smart camera with AI processing capabilities, it is necessary to utilize ICs that include an image signal processor (ISP), CNN processing, DRAM, and so on. Unfortunately, this results in a system solution that is larger, higher power, and more costly.

20 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a novel system for remotely detecting COVID-19 patients based on artificial intelligence technology and the Internet of Things (IoT), in order to stop the virus from spreading at an early stage.
Abstract: In this paper, we propose a novel system for remotely detecting COVID-19 patients based on artificial intelligence technology and the Internet of Things (IoT), in order to stop the virus from spreading at an early stage. We focus on connecting several sensors to work together as a system that can discover people infected with the coronavirus remotely, which will reduce the spread of the disease. The proposed system consists of several smart medical sensors: pulse, thermal monitoring, and blood sensors. The system works sequentially, starting with the pulse sensor and ending with the blood sensor, and includes an algorithm to manage the data coming from the sensors. The pulse sensor acquires high-quality data using a smartphone equipped with a mobile dermatoscope with 20× magnification. The processing uses the RGB color system and a moving window to segment regions of interest (ROIs) as inputs to the heart rate estimation algorithm. The heart rate (HR) estimate is then given by computing the dominant frequency, identified as the most prominent peak of the discrete Fourier transform (DFT). Thermal monitoring for fever detection uses a smart camera, which can provide an optimal solution: the infrared sensor can quickly measure surface temperature without making any contact with a person's skin. A blood sensor measures the volume percentages of white and red blood cells (WBCs, RBCs) and platelets non-invasively using bioimpedance analysis and independent component analysis (ICA). The proposed sensor consists of two electrodes that send a current to the earlobe and measure the produced voltage. A mathematical model was modified to describe the impedance of the earlobe at different frequencies (i.e., low, medium, and high), and a COMSOL model is used to simulate blood electrical properties across frequencies to measure WBC, RBC, and platelet volumes. These devices are combined to work automatically, without user interaction, for remote checking of coronavirus patients. The proposed system is evaluated on six examples to demonstrate its applicability and efficiency.
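
A minimal sketch of the DFT-based heart-rate step: take the dominant frequency of the ROI brightness trace within a plausible heart-rate band. The frame rate and signal below are synthetic assumptions.

```python
# Dominant-frequency heart-rate estimation from an ROI trace (NumPy sketch).
import numpy as np

fs = 30.0                                  # camera frame rate (Hz), assumed
t = np.arange(0, 10, 1 / fs)
signal = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)  # ~72 bpm

# DFT of the de-meaned trace; search only the plausible HR band 0.7-3 Hz.
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, 1 / fs)
band = (freqs >= 0.7) & (freqs <= 3.0)     # 42-180 bpm
hr_hz = freqs[band][np.argmax(spectrum[band])]
print(f"estimated heart rate: {hr_hz * 60:.0f} bpm")
```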

17 citations


Journal ArticleDOI
TL;DR: The experimental results show that elite-instance-based matching can reduce, by up to 70% on average, the number of source samples that need to be transmitted for initial model training, and it further improves the accuracy of existing one-to-many and many-to-one e2e transfer learning.
Abstract: Due to advances in edge computing, image recognition via smart cameras (hereafter referred to as edge cameras) has facilitated the development of unmanned stores. However, it is very time-consuming and costly to collect labeled data for training initial models for an edge camera. Although existing transfer learning can speed up model training, it depends on a powerful centralized server with considerable human intervention, hindering the development of autonomous and collaborative edge learning. To address this issue, our study proposes direct edge-to-edge (e2e) collaborative transfer learning with three key technologies. The first is elite-instance-based matching, which utilizes and transmits only representative images for transfer learning to decrease network cost among edge cameras. The second is one-to-many e2e transfer learning, which increases the knowledge reusability of a single source camera to build multiple target models. The last is many-to-one e2e transfer learning, which enables a target edge camera to reuse knowledge from multiple sources to further decrease the effort of labeled data collection. The experimental results show that elite-instance-based matching can reduce, by up to 70% on average, the number of source samples that need to be transmitted for initial model training, and it further improves the accuracy of existing one-to-many and many-to-one e2e transfer learning.
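
One way to realise "elite" representative instances, sketched under the assumption that images are embedded as feature vectors: pick the samples nearest to k-means centroids. This is an illustrative stand-in; the paper's actual matching criterion is its own.

```python
# Representative-instance selection via k-means centroids (sketch).
import numpy as np
from sklearn.cluster import KMeans

def elite_instances(features, n_elites=50):
    """features: (N, D) image embeddings; returns indices of representatives."""
    km = KMeans(n_clusters=n_elites, n_init=10).fit(features)
    elites = []
    for c in km.cluster_centers_:
        elites.append(int(np.argmin(np.linalg.norm(features - c, axis=1))))
    return elites  # transmit only these images to the target edge camera

idx = elite_instances(np.random.rand(1000, 128), n_elites=20)
```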

15 citations


Journal ArticleDOI
TL;DR: In this paper, Zhang et al. propose a hybrid fuzzy convolutional neural network (HFCNN) for defect detection in solar photovoltaic (PV) cells, which integrates fuzzy logic and convolution operations at the microscopic level.
Abstract: In the intelligent manufacturing process of solar photovoltaic (PV) cells, automatic defect detection systems using Industrial Internet of Things (IIoT) smart cameras and sensors have become a promising solution. Many works have addressed defect detection of PV cells in a data-driven way. However, because of the subjectivity and fuzziness of human annotation, the data contains a large amount of noise and unpredictable uncertainty, which creates great difficulties for automatic defect detection. To address this problem, we propose a novel architecture named fuzzy convolution, which integrates fuzzy logic and convolution operations at the microscopic level. Combining the proposed fuzzy convolution with regular convolution, we build a network called the Hybrid Fuzzy Convolutional Neural Network (HFCNN). Compared with convolutional neural networks (CNNs), HFCNN can handle the uncertainty of PV cell data and improve accuracy with fewer parameters, making it possible to apply our method in smart cameras. Experimental results on a public dataset show the superiority of our proposed method compared with CNNs.

Journal ArticleDOI
23 Apr 2021-Sensors
TL;DR: In this article, the authors present a smart video surveillance system executing AI algorithms in low-power embedded devices; the selected architecture for the edge node is based on an UpSquared2 device that includes a vision processing unit (VPU) capable of accelerating AI CNN inference.
Abstract: New processing methods based on artificial intelligence (AI) and deep learning are replacing traditional computer vision algorithms. The more advanced systems can process huge amounts of data in large computing facilities. In contrast, this paper presents a smart video surveillance system executing AI algorithms on low-power embedded devices. The computer vision algorithm, typical of surveillance applications, aims to detect, count, and track people's movements in the area, an application that requires a distributed smart camera system. The proposed AI application detects people in the surveillance area using a MobileNet-SSD architecture. In addition, using a robust bank of Kalman filters, the algorithm can keep track of people in the video and provide people-counting information. The detection results are excellent considering the constraints imposed on the process. The selected architecture for the edge node is based on an UpSquared2 device that includes a vision processing unit (VPU) capable of accelerating AI CNN inference. The results section provides information about the image processing time when multiple video cameras are connected to the same edge node, people detection precision and recall curves, and the energy consumption of the system. The discussion of the results shows the usefulness of deploying this smart camera node throughout a distributed surveillance system.
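
A sketch of the per-person constant-velocity Kalman filter of the kind a tracker bank would use; the noise covariances are illustrative, and detections are assumed to come from the MobileNet-SSD stage.

```python
# Constant-velocity Kalman filter per tracked person (OpenCV sketch).
import cv2
import numpy as np

def make_person_filter(cx, cy):
    """State: [x, y, vx, vy]; measurement: detected centroid [x, y]."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array(
        [[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    kf.statePost = np.array([[cx], [cy], [0], [0]], np.float32)
    return kf

kf = make_person_filter(100, 200)
pred = kf.predict()                               # position even when a detection drops out
kf.correct(np.array([[104], [203]], np.float32))  # fuse the next detection
```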

Journal ArticleDOI
TL;DR: In this article, a machine learning-based technique using a variational dynamic Bayesian algorithm, which yields an HMM whose parameters and model states are optimized for the prediction of DDoS attacks.
Abstract: Smart home devices, which provide users with security, convenience, and energy efficiency, have seen significant adoption in recent years. For instance, a secured smart camera detects the movement of unauthorized objects, while fire accidents can be detected by smoke sensors. However, these devices also open up a surface for new cyber threats: recent examples show distributed denial of service (DDoS) attacks performed by misusing hacked smart devices in violation of privacy rules. In this work, the application of machine learning in a smart home environment is explored so that anomalous activities can be identified. Network-level sensor data is used to train a Markov chain-based model, a hidden Markov model (HMM), created using smart devices and multiple sensors from a testbed. The HMM achieves 97% accuracy in detecting potential anomalies that indicate attacks. In this approach, we construct the model and compare the analysed results with existing techniques. Intrusion detection is the starting step for securing IoT networks; the next step is intruder prediction, which provides an active defence against incoming attacks. The model employs an algorithm that does not depend on specific domain knowledge. The work achieves an improvement in prediction accuracy of 5% for an alert category over current variable-length Markov chain intrusion prediction methods, as it provides more information for a possible defence. A DDoS attack is a coordinated, typically large-scale attack on the availability of the target system's resources or services. Thus, we also describe a novel machine learning-based technique using a variational dynamic Bayesian algorithm to obtain an HMM whose number of parameters and model states are optimized for DDoS attack prediction; this procedure also improves on the convergence speed of a conventional combination-HMM approach.
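
A minimal sketch of HMM-based anomaly flagging with the hmmlearn library; the feature choice, placeholder data, and threshold are assumptions, not the paper's setup.

```python
# Anomaly flagging with a Gaussian HMM over network-level features (sketch).
import numpy as np
from hmmlearn import hmm

# Train on windows of normal smart-home traffic features, e.g.
# [packet_rate, mean_size, distinct_destinations] per time step.
normal = np.random.rand(500, 3)               # placeholder training data
model = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
model.fit(normal)

def is_anomalous(window, threshold=-10.0):
    """Flag a window whose per-step log-likelihood falls below threshold."""
    score = model.score(window) / len(window)  # average log-likelihood
    return score < threshold

print(is_anomalous(np.random.rand(20, 3) * 50))  # out-of-distribution burst
```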

Journal ArticleDOI
05 Jan 2021
TL;DR: In this article, a new fitness function is developed for the genetic algorithm that considers constraint conditions in terms of obstacle avoidance, path length, path smoothness, and the visibility of the object of interest during camera roaming.
Abstract: The main goal of intelligent camera path planning is to determine an optimal pathway from the starting position to the target position under several constraint conditions in a given environment. Genetic algorithm-based methods have recently found wide application in path optimization problems in the intelligent camera community. Because roaming environments are very complex, the planned path of an intelligent camera should meet other constraint conditions in addition to the path length and obstacle-free constraints. In this study, a new fitness function was developed for the genetic algorithm that considers constraint conditions in terms of obstacle avoidance, path length, path smoothness, and the visibility of the object of interest during camera roaming. In addition, a new evolution operator was introduced into the genetic algorithm, so that the number of iterations can be significantly reduced and the efficiency of the genetic algorithm improved. Experimental results show that the proposed genetic algorithm can obtain a high-quality path under multi-constraint conditions for an intelligent camera with fewer iterations than several conventional methods.
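
A sketch of a weighted multi-constraint fitness of the kind described, for a path given as 2D waypoints; the terms and weights are illustrative stand-ins for the paper's actual formulation.

```python
# Multi-constraint GA fitness for a camera path (illustrative sketch).
import numpy as np

def fitness(path, obstacles, target, w=(1.0, 0.5, 10.0, 2.0)):
    """path: (N, 2) waypoints; obstacles: list of (x, y, r) arrays.
    Lower combined cost -> higher fitness."""
    seg = np.diff(path, axis=0)
    length = np.linalg.norm(seg, axis=1).sum()

    # Smoothness: sum of turning angles between consecutive segments.
    cos = (seg[:-1] * seg[1:]).sum(1) / (
        np.linalg.norm(seg[:-1], axis=1) * np.linalg.norm(seg[1:], axis=1) + 1e-9)
    smoothness = np.arccos(np.clip(cos, -1, 1)).sum()

    # Obstacle term: penalise waypoints falling inside any obstacle disc.
    collisions = sum(
        np.linalg.norm(path - o[:2], axis=1).min() < o[2] for o in obstacles)

    # Visibility term: mean distance to the object of interest, standing in
    # for a proper line-of-sight check.
    visibility = np.linalg.norm(path - target, axis=1).mean()

    cost = w[0]*length + w[1]*smoothness + w[2]*collisions + w[3]*visibility
    return 1.0 / (1.0 + cost)   # the GA maximises this fitness

f = fitness(np.array([[0., 0.], [1., 1.], [2., 1.], [3., 2.]]),
            [np.array([1.5, 0.0, 0.5])], np.array([3.0, 2.0]))
```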

Journal ArticleDOI
TL;DR: Experimental results show that the use of a multi-level edge computing architecture helps in managing the generated data, while the machine learning algorithms help to address issues like data redundancy and privacy preservation in end-to-end communication.
Abstract: Roadside cameras in an Intelligent Transportation System (ITS) are used for various purposes, e.g., monitoring the speed of vehicles, detecting violations of laws, and spotting suspicious activities in parking lots, streets, and side roads. These cameras generate big multimedia data, and as a result the ITS faces challenges such as data management, redundancy, and privacy breaches in end-to-end communication. To solve these challenges, we propose a framework called SPEED, based on a multi-level edge computing architecture and machine learning algorithms. In this framework, data captured by end-devices, e.g., smart cameras, is distributed among multiple Level-One Edge Devices (LOEDs) to deal with the data management issue and minimize packet drops due to buffer overflow on end-devices and LOEDs. The data is forwarded from LOEDs to Level-Two Edge Devices (LTEDs) in a compressed sensed format. The LTEDs use an online Least-Squares Support-Vector Machine (LS-SVM) model to determine the distribution characteristics and index values of the compressed sensed data, preserving its privacy during transmission between LTEDs and High-Level Edge Devices (HLEDs). The HLEDs estimate the redundancy in the forwarded data using a deep learning architecture, i.e., a Convolutional Neural Network (CNN), which detects the presence of moving objects. If movement is detected, the data is forwarded to cloud servers for further analysis; otherwise it is discarded. Experimental results show that the multi-level edge computing architecture helps in managing the generated data, while the machine learning algorithms help to address issues like data redundancy and privacy preservation in end-to-end communication.
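
A minimal sketch of the compressed-sensed forwarding step: each frame is projected with a random measurement matrix so a LOED transmits m measurements instead of n raw pixels. The resolution and compression ratio below are placeholders.

```python
# Compressed sensing of a frame before forwarding (NumPy sketch).
import numpy as np

rng = np.random.default_rng(0)
n = 64 * 64                 # flattened frame size (placeholder resolution)
m = n // 8                  # 8x compression
Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # measurement matrix

def compress(frame):
    """frame: (64, 64) grayscale; returns m compressed measurements."""
    return Phi @ frame.reshape(-1)

y = compress(rng.random((64, 64)))   # forward y (not raw pixels) to the LTED
print(y.shape)                       # (512,)
```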

Journal ArticleDOI
TL;DR: The fog computing paradigm is used in the proposed model to decrease the makespan; the proposed mechanism is tested and compared with different existing systems, and the results show that the methodology is effective.
Abstract: The goal of the Internet of Things (IoT) is to connect “things” (wearable devices, smart cameras, sensors, and smart home appliances) to the internet. Large storage is required for the huge volume of data that is generated, and data processing must be carried out between IoT devices and a massive number of applications. This can be done effectively with the help of cloud computing technology: resources can be utilized efficiently via the cloud, and IoT plays a significant role in managing the tasks that are offloaded to it. Application performance is enhanced by providing Quality of Service (QoS) and is evaluated in terms of QoS parameters such as power utilization, makespan, and execution time. Tasks are allocated based on priority. The fog computing paradigm is used in the proposed model to decrease the makespan. The proposed mechanism is tested and compared with different existing systems, and the results show that the proposed methodology is effective.

Journal ArticleDOI
TL;DR: The article considers the effects of treating the face as what Harun Farocki calls an “operative image”: not a representation, but part of a sequence of operations that deprive the face of its distinctive character to facilitate the automated governance of space.
Abstract: Concerns about the impending implementation of facial recognition technology in public and shared spaces go beyond privacy to include the changing relationship between space and power. This article...

Proceedings ArticleDOI
06 Jul 2021
Abstract: The behavior of individuals in crowds in public places has gained enormously in importance over the last year, for example through distancing requirements. However, automatically detecting pedestrians in real-world uncooperative scenarios remains a very challenging task. Crowded areas in surveillance footage are especially challenging, not only for automatic vision systems but also for human operators. Furthermore, complex detection models do not scale easily and are not traditionally designed for on-device processing in resource-constrained smart cameras, which are becoming more and more popular due to technical and privacy issues at large events. In this work, we propose a new Fast Pedestrian Detector (FPD) based on RetinaNet, a fast and efficient architecture for embedded platforms. The proposed FPD provides near real-time and real-time detection of hundreds of pedestrians on embedded platforms, outperforming popular YOLO-based approaches traditionally tuned for speed. Furthermore, by evaluating our approach on several different Jetson platforms in terms of speed and energy profiles, we highlight the challenges related to deploying a deep learning-based pedestrian detector on embedded platforms for smart surveillance cameras.

Proceedings ArticleDOI
15 Nov 2021
TL;DR: In this paper, a battery-free smart camera that exploits aggressive power management and energy harvesting to achieve face recognition in an energy-neutral fashion is presented; a novel hardware accelerator for Convolutional Neural Networks is employed to speed up the inference of the Tiny Machine Learning algorithm.
Abstract: In this demo we present a battery-free smart camera that exploits aggressive power management and energy harvesting to achieve face recognition in an energy-neutral fashion. A novel hardware accelerator for Convolutional Neural Networks is employed to speed up the inference of the Tiny Machine Learning algorithm. The recognized face, and not the entire image, is sent via LoRa in a sensor-network-like scenario. Experimental results demonstrate the capability of the developed sensor node to start up and work perpetually with only a small photovoltaic panel array.

Journal ArticleDOI
Jie Bai, Sen Li, Han Zhang, Libo Huang, Ping Wang
05 Feb 2021-Sensors
TL;DR: In this paper, a stable perception system based on a millimeter-wave radar and camera is designed to address the problem of target occlusion and external environmental interference in traffic management and traffic safety.
Abstract: Intelligent transportation systems (ITSs) play an increasingly important role in traffic management and traffic safety. Smart cameras are the most widely used sensors in ITSs. However, cameras suffer from reduced detection and positioning accuracy due to target occlusion and external environmental interference, which has become a bottleneck restricting ITS development. This work designs a stable perception system based on a millimeter-wave radar and a camera to address these problems. Radar has better ranging accuracy and weather robustness, making it a good complement to camera perception. Based on an improved Gaussian mixture probability hypothesis density (GM-PHD) filter, we also propose an optimal attribute fusion algorithm for target detection and tracking. The algorithm selects the sensors’ optimal measurement attributes to improve localization accuracy, while introducing an adaptive attenuation function and loss tags to ensure the continuity of the target trajectory. Verification experiments on the algorithm and the perception system demonstrate that our scheme can steadily output classification and high-precision localization information for targets. The proposed framework could guide the design of safer and more efficient ITSs at low cost.

Proceedings ArticleDOI
15 Nov 2021
TL;DR: In this article, a Lightweight Environmental Fingerprint Consensus based detection of compromised smart cameras in edge surveillance systems (LEFC) is proposed, which is a partial decentralized authentication mechanism that leverages Electrical Network Frequency (ENF) as an environmental fingerprint and distributed ledger technology (DLT).
Abstract: Rapid advances in Internet of Video Things (IoVT) deployments in modern smart cities have enabled secure infrastructures with minimal human intervention. However, attacks on audio-video inputs affect the reliability of large-scale multimedia surveillance systems, as attackers are able to manipulate the perception of live events. For example, Deepfake audio/video attacks and frame duplication attacks can cause significant security breaches. This paper proposes Lightweight Environmental Fingerprint Consensus based detection of compromised smart cameras in edge surveillance systems (LEFC). LEFC is a partially decentralized authentication mechanism that leverages the Electrical Network Frequency (ENF) as an environmental fingerprint along with distributed ledger technology (DLT). An ENF signal carries randomly fluctuating spatio-temporal signatures, which enable digital media authentication. With the proposed DLT consensus mechanism, named Proof-of-ENF (PoENF), as a backbone, LEFC can estimate and authenticate the media recording and detect byzantine nodes controlled by a perpetrator. The experimental evaluation shows the feasibility and effectiveness of the proposed LEFC scheme in a distributed byzantine network environment.
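
A sketch of the ENF-extraction idea underlying such fingerprints: track the instantaneous mains frequency near its nominal value over time. Real pipelines also exploit harmonics and finer frequency interpolation; the signal below is synthetic.

```python
# Extracting an ENF trace from a recording (SciPy sketch, synthetic signal).
import numpy as np
from scipy.signal import stft

def enf_trace(audio, fs, nominal=50.0, half_band=1.0):
    """Return per-window dominant frequency inside nominal +/- half_band Hz."""
    f, t, Z = stft(audio, fs=fs, nperseg=int(fs * 2))   # 2-second windows
    band = (f >= nominal - half_band) & (f <= nominal + half_band)
    return t, f[band][np.argmax(np.abs(Z[band, :]), axis=0)]

# Two devices in the same grid region should yield highly correlated traces,
# which is what a PoENF-style consensus can check.
fs = 1000
sig = np.sin(2 * np.pi * 50.02 * np.arange(0, 60, 1 / fs))
t, trace = enf_trace(sig, fs)
```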

Proceedings ArticleDOI
01 Oct 2021
TL;DR: In this article, a blockchain-independent smart contract infrastructure for resource-constrained IoT devices is proposed, which is capable of executing time-sensitive business logic, and an end-to-end application is described.
Abstract: Due to the proliferation of IoT and the popularity of smart contracts mediated by blockchain, smart home systems have become capable of providing privacy and security to their occupants. In blockchain-based home automation systems, business logic is handled securely by smart contracts. However, a blockchain-based solution is inherently resource-intensive, making it unsuitable for resource-constrained IoT devices. Moreover, time-sensitive actions are complex to perform in a blockchain-based solution due to the time required to mine a block. In this work, we propose a blockchain-independent smart contract infrastructure suitable for resource-constrained IoT devices. Our proposed method is also capable of executing time-sensitive business logic. As an example of an end-to-end application, we describe a smart camera system using our proposed method, compare this system with an existing blockchain-based solution, and present an empirical evaluation of their performance.

Journal ArticleDOI
09 Nov 2021-Sensors
TL;DR: In this article, the OpenCV AI Kit (OAK-D) was adapted for autonomous drone racing (ADR) running entirely on board a drone, enabling it to perform computationally intensive tasks such as gate detection, drone localisation, and state estimation.
Abstract: Recent advances have shown for the first time that it is possible to beat a human with an autonomous drone in a drone race. However, this solution relies heavily on external sensors, specifically on the use of a motion capture system. Thus, a truly autonomous solution demands performing computationally intensive tasks such as gate detection, drone localisation, and state estimation on board. To this end, other solutions rely on specialised hardware such as graphics processing units (GPUs), whose onboard versions are not as powerful as those available for desktop and server computers. An alternative is to combine specialised hardware with smart sensors capable of processing specific tasks on the chip, alleviating the need for the onboard processor to perform these computations. Motivated by this, we present the initial results of adapting a novel smart camera, known as the OpenCV AI Kit or OAK-D, as part of a solution for autonomous drone racing (ADR) running entirely on board. This smart camera performs neural inference on the chip without using a GPU. It can also perform depth estimation with a stereo rig and run neural network models using images from a 4K colour camera as the input. Additionally, seeking to limit the payload to 200 g, we present a new 3D-printed design of the camera’s back case, reducing the original weight by 40%, thus enabling the drone to carry it in tandem with a host onboard computer, the Intel Compute Stick, where we run a controller based on gate detection. The latter is performed with a neural model running on the OAK-D at an operating frequency of 40 Hz, enabling the drone to fly at a speed of 2 m/s. We deem these initial results promising toward the development of a truly autonomous solution that runs intensive computational tasks fully on board.

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate the simultaneous dual-image capture capability of the FDMA-CDMA mode using silicon (Si) and germanium (Ge) large-area point photodetectors, allowing the capture of the ultraviolet-near-infrared 350-1,800 nm full spectrum.
Abstract: For the first time, to the best of our knowledge, the hybrid triple-coding-empowered frequency division multiple access (FDMA)-code division multiple access (CDMA) mode of the coded access optical sensor (CAOS) camera is demonstrated. Compared to the independent FDMA and CDMA modes, the FDMA-CDMA mode has a novel high-security space-time-frequency triple signal encoding design for robust, faster, linear irradiance extraction at a moderately high dynamic range (HDR). Specifically, this hybrid mode simultaneously combines the linear HDR strength of the FDMA-mode fast Fourier transform (FFT) digital signal processing (DSP)-based spectrum analysis with the high signal-to-noise ratio (SNR) provided by the many simultaneous CAOS pixels' photodetection of the CDMA mode. In particular, the demonstrated FDMA-CDMA mode with P FDMA channels provides P-times-faster camera operation versus the equivalent linear HDR frequency modulation (FM)-CDMA mode. Visible-band imaging experiments using a digital-micromirror-device-based CAOS camera operating in its passive light mode demonstrate a P=4-channel FDMA-CDMA mode, illustrating high-quality image recovery of a calibrated 64 dB six-patch HDR target versus the CDMA and FM-CDMA CAOS modes, which limit dynamic range and speed, respectively. For the first time to our knowledge, we demonstrate the simultaneous dual-image capture capability of the FDMA-CDMA mode using silicon (Si) and germanium (Ge) large-area point photodetectors, allowing capture of the ultraviolet-near-infrared 350-1,800 nm full spectrum. Active FDMA-CDMA mode CAOS camera operation is also demonstrated using P=3 LED light sources, each with its unique optical spectral content driven by its independent FDMA frequency. This illuminated-target spectral-signature-matched active CAOS mode allows simultaneous capture of P images without the use of P time-multiplexed slot operations with tunable optical filters. Applications for such an FDMA-CDMA camera range from controlled-illumination food inspection to bright-light-exposure security systems.

Journal ArticleDOI
TL;DR: Centipede, a cooperative video data storage system that distributes video content across geographically dispersed surveillance cameras, is the first solution that can fundamentally reduce the cloud workload and defend against geo-range attacks.
Abstract: Surveillance cameras have been extensively used in smart cities and high-security zones. However, with the exploding deployment of smart cameras, the rapid growth of cloud workloads from vision-based IoT applications is becoming a huge burden for all cloud service providers. Some researchers have proposed mechanisms such as compression and deduplication to reduce the video traffic size, but these methods cannot offset the enormous growth of data volume. Most surveillance video data does not need to be processed in real time, so by making use of the IoT cameras' onboard resources to store the data, cloud workloads can be fundamentally reduced. However, recent incidents have revealed a new, powerful geo-range attack, in which the attacker may compromise a group of surveillance cameras located within an area. Existing simple onboard solutions cannot offer a secure defense against such geo-range attacks. To tackle the problem, we develop Centipede, a cooperative video data storage system that distributes video content across geographically dispersed surveillance cameras. It generates secure copies of the video content and enhances data security by judiciously distributing erasure-coded video blocks across optimally chosen surveillance cameras. In this article, we implement Centipede and evaluate its performance. Centipede is the first solution that can fundamentally reduce the cloud workload and defend against geo-range attacks.
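
A sketch of one plausible placement policy for the erasure-coded blocks: greedily choose cameras that maximise the minimum pairwise distance, so a single geo-range attack cannot destroy enough blocks to lose the video. This is an illustrative reading, not Centipede's actual algorithm.

```python
# Greedy geo-dispersed placement of erasure-coded blocks (sketch).
import numpy as np

def place_blocks(camera_xy, n_blocks):
    """camera_xy: (N, 2) camera coordinates; returns camera indices for blocks."""
    # Seed with the camera farthest from the centroid.
    chosen = [int(np.argmax(np.linalg.norm(camera_xy - camera_xy.mean(0), axis=1)))]
    while len(chosen) < n_blocks:
        # Distance from each camera to its nearest already-chosen camera.
        d = np.min(
            [np.linalg.norm(camera_xy - camera_xy[c], axis=1) for c in chosen],
            axis=0)
        d[chosen] = -1               # never reuse a camera
        chosen.append(int(np.argmax(d)))
    return chosen

# e.g. a (6, 4) erasure code: any 4 of the 6 blocks reconstruct the video.
sites = place_blocks(np.random.rand(50, 2) * 1000, n_blocks=6)
```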

Proceedings ArticleDOI
08 Sep 2021
TL;DR: In this paper, the authors present DOHMO, an embedded computer vision system where multiple sensors, including intelligent cameras, are connected to actuators that regulate illumination and doors, in accordance with privacy design principles.
Abstract: This paper presents DOHMO, an embedded computer vision system in which multiple sensors, including intelligent cameras, are connected to actuators that regulate illumination and doors. The system aims at assisting elderly and impaired people in co-housing scenarios, in accordance with privacy design principles. The paper provides details of two core elements of the system. The first is the BOX-IO controller, a fully scalable and customizable hardware and software IoT ecosystem that can collect, control, and monitor data, operational flows, and business scenarios, whether indoor or outdoor. The second is the embedded 3DEverywhere intelligent camera, a device composed of an embedded system that receives input data from a 3D/2D camera, analyzes it, and returns the metadata of this analysis. We illustrate how they can be connected and how simple decision mechanisms can be implemented in such a framework. In particular, illumination can be triggered on and off by the detected presence of people, overcoming the limitations of typical sensors, while doors can be opened or closed intelligently based on person trajectories. To substantiate the proposed system, numerous experiments are performed in a lab and in a co-housing scenario.

Proceedings ArticleDOI
05 Jan 2021
TL;DR: In this paper, a smart robot that can provide quality security at a fraction of the price, when considering long-term effects, was designed and implemented; it takes advantage of modern technology to roam autonomously without the assistance of humans.
Abstract: Security is a necessity for every human being, and with an increase in population the demand for security has increased. Due to a lack of resources, however, proper security cannot always be arranged: it requires a heavy payment that not everyone can afford. The aim of this research was to provide a solution by designing a smart robot that can provide quality security at a fraction of the price when long-term effects are considered. The idea was to design and implement a rover that takes advantage of modern technology to roam autonomously without the assistance of humans, scouring the area and alerting the control room once any abnormality is detected. The robot has built-in GPS navigation for determining routes through its workplace, and it is equipped with a smart camera that captures live video and images of an intruder when needed.

Book ChapterDOI
01 Jan 2021
TL;DR: This chapter presents a review of the leading deep learning techniques used by robotic vision systems in recent years and how well they have performed on different benchmark datasets from around the world.
Abstract: Pedestrian detection has become more and more important in the field of robotic vision, with applications in autonomous driving, automated surveillance, smart homes, and mobile robots (or mobots). With the help of smart cameras, mobile robots have been able to detect, localize, and recognize pedestrians in a scene. In recent years, researchers from all around the world have developed robust deep learning-based systems for detecting pedestrians with impressive results. In this chapter, we present a review of the leading deep learning techniques used by robotic vision systems in recent years and how well they have performed on different benchmark datasets from around the world. The techniques differ in two major respects: firstly the architecture of the system, and secondly the preprocessing, where input data is used in different capacities. The field of robotic vision is still under constant development, and the day isn’t far when full automation will be a reality.

Journal ArticleDOI
04 Mar 2021-Sensors
TL;DR: In this paper, a hardware architecture for smart cameras that identifies the salient regions of an image frame and then performs high-level inference computation for sensor-level information creation, instead of transporting raw pixels.
Abstract: Cameras are widely adopted for their high image quality thanks to the rapid advancement of complementary metal-oxide-semiconductor (CMOS) image sensors, while the computation of vision applications is offloaded to the cloud. This raises concern for time-critical applications such as autonomous driving, surveillance, and defense systems, since moving pixels from the sensor’s focal plane is expensive. This paper presents a hardware architecture for smart cameras that identifies the salient regions of an image frame and then performs high-level inference computation for sensor-level information creation, instead of transporting raw pixels. A visual attention-oriented computational strategy filters out a significant amount of the redundant spatiotemporal data collected at the focal plane, and a computationally expensive learning model is then applied only to the interesting regions of the image. The hierarchical processing in the pixel data path demonstrates a bottom-up architecture with massive parallelism and gives high throughput by exploiting the large bandwidth available at the image source. We prototype the model on a field-programmable gate array (FPGA) and as an application-specific integrated circuit (ASIC) for integration with a pixel-parallel image sensor. The experimental results show that our approach achieves significant speedup and, in certain conditions, up to 45% higher energy efficiency with attention-oriented processing. Although attention-oriented processing incurs an area overhead, the achieved performance in energy consumption, latency, and memory utilization outweighs that limitation.
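
A toy sketch of the attention-gating idea in software: cheap frame differencing selects salient tiles, and only those would be forwarded to the expensive inference stage. The paper implements this hierarchy in hardware at the focal plane; the tile size and threshold here are assumptions.

```python
# Attention-gated ROI selection via frame differencing (NumPy sketch).
import numpy as np

def salient_rois(prev, curr, tile=32, threshold=12.0):
    """Return (row, col) tiles whose mean absolute change exceeds threshold."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    h, w = diff.shape
    rois = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            if diff[r:r + tile, c:c + tile].mean() > threshold:
                rois.append((r, c))
    return rois

# Only these tiles would be forwarded to the heavy inference stage.
prev = np.zeros((256, 256), np.uint8)
curr = prev.copy(); curr[64:128, 64:128] = 200   # synthetic moving object
print(salient_rois(prev, curr))
```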

Journal ArticleDOI
TL;DR: The proposed ELNet comprises three convolutional modules that perform inference with fewer computations, enabling implementation on resource-constrained hardware, and it effectively lowers the computational complexity and number of parameters in comparison with other CNN architectures.
Abstract: Deep learning has accomplished huge success in computer vision applications such as self-driving vehicles, facial recognition, and robot control. A growing need to deploy systems in resource-limited or resource-constrained environments such as smart cameras, autonomous vehicles, robots, smartphones, and smart wearable devices drives one of the current mainstream developments of convolutional neural networks: reducing model complexity while maintaining accuracy. In this study, the proposed efficient light convolutional neural network (ELNet) comprises three convolutional modules and performs inference with fewer computations, making it implementable on resource-constrained hardware. Classification tasks on the CIFAR-10 and CIFAR-100 datasets were used to verify model performance. According to the experimental results, ELNet reached 92.3% and 69% accuracy on CIFAR-10 and CIFAR-100, respectively; moreover, ELNet effectively lowered the computational complexity and number of parameters in comparison with other CNN architectures.
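
ELNet's modules are not specified here, but the standard building block for cutting convolution cost, on which light modules of this kind typically build, is the depthwise-separable convolution sketched below (PyTorch; illustrative, not ELNet's actual block).

```python
# Depthwise-separable convolution block (PyTorch sketch).
import torch.nn as nn

class LightConvBlock(nn.Module):
    """3x3 depthwise + 1x1 pointwise convolution with BN/ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

# Cost ratio vs a plain 3x3 convolution: roughly (1/out_ch + 1/9), i.e.
# about 8-9x fewer multiply-accumulates for typical channel counts.
```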