
Showing papers on "Smart camera published in 2018"


Journal ArticleDOI
TL;DR: Experimental results reveal that the proposed approach can reliably pre-alarm security risk events, substantially reduce storage space of recorded video and significantly speed up the evidence video retrieval associated with specific suspects.
Abstract: Video surveillance systems have become a critical part of the security and protection systems of modern cities, since smart monitoring cameras equipped with intelligent video analytics techniques can monitor and pre-alarm abnormal behaviors or events. However, with the expansion of surveillance networks, massive surveillance video data poses huge challenges to analytics, storage and retrieval in the Big Data era. This paper presents a novel intelligent processing and utilization solution for big surveillance video data based on event detection and alarming messages from front-end smart cameras. The method comprises three parts: intelligent pre-alarming for abnormal events, smart storage for surveillance video, and rapid retrieval of evidence videos, and it fully exploits temporal-spatial association analysis of abnormal events across different monitoring sites. Experimental results reveal that our proposed approach can reliably pre-alarm security risk events, substantially reduce the storage space of recorded video and significantly speed up the retrieval of evidence video associated with specific suspects.

100 citations
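
As a rough sketch of the "smart storage" idea described above, the snippet below keeps full-quality video only for segments near an alarm event and marks everything else for reduced-rate storage. The segment length, retention window and function are illustrative assumptions, not the paper's implementation.

```python
def select_storage_quality(segment_starts, alarm_times, window_s=30.0, segment_s=10.0):
    """Map each recorded segment start time to 'full' or 'reduced' storage quality."""
    plan = {}
    for start in segment_starts:
        end = start + segment_s
        # Keep full quality if any alarm falls within the retention window around the segment.
        near_alarm = any(start - window_s <= t <= end + window_s for t in alarm_times)
        plan[start] = "full" if near_alarm else "reduced"
    return plan

segments = [i * 10.0 for i in range(12)]   # two minutes of 10-second segments
alarms = [35.0, 90.0]                      # alarm timestamps reported by front-end cameras
print(select_storage_quality(segments, alarms))
```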


Journal ArticleDOI
TL;DR: This proposal describes a CNN architecture which is able to infer the noise pattern of mobile camera sensors (also known as camera fingerprint) with the aim at detecting and identifying not only the mobile device used to capture an image, but also from which embedded camera the image was captured.

87 citations


Book ChapterDOI
TL;DR: This chapter presents a review of the more developed paradigms aimed to bring computational, storage and control capabilities closer to where data is generated in the IoT: fog and edge computing, contrasted with the cloud computing paradigm.
Abstract: The main postulate of the Internet of Things (IoT) is that everything can be connected to the Internet, at any time, anywhere. This means a plethora of objects (e.g. smart cameras, wearables, environmental sensors, home appliances, and vehicles) are ‘connected’ and generating massive amounts of data. The collection, integration, processing and analytics of these data enable the realisation of smart cities, infrastructures and services that enhance the quality of life of humans. Nowadays, existing IoT architectures are highly centralised and rely heavily on transferring data processing, analytics, and decision-making processes to cloud solutions. This approach of managing and processing data in the cloud may lead to inefficiencies in terms of latency, network traffic management, computational processing, and power consumption. Furthermore, in many applications, such as health monitoring and emergency response services, which require low latency, the delay caused by transferring data to the cloud and then back to the application can seriously impact their performance. Allowing data processing closer to where data is generated, with techniques such as data fusion, trending of data, and some decision making, can help reduce the amount of data sent to the cloud, reducing network traffic, bandwidth and energy consumption. A more agile response, closer to real time, is also achieved, which is necessary in applications such as smart health, security and traffic control for smart cities. Therefore, this chapter presents a review of the most developed paradigms aimed at bringing computational, storage and control capabilities closer to where data is generated in the IoT: fog and edge computing, contrasted with the cloud computing paradigm. An overview of some practical use cases is also presented to exemplify each of these paradigms and their main differences.

70 citations


Journal ArticleDOI
TL;DR: In this paper, a continuous-time representation of the event camera pose is used to perform visual-inertial odometry with an event camera, which can deal with the high temporal resolution and asynchronous nature of this sensor in a principled way.
Abstract: Event cameras are bioinspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency in the order of microseconds. However, due to the fundamentally different structure of the sensor's output, new algorithms that exploit the high temporal resolution and the asynchronous nature of the sensor are required. Recent work has shown that a continuous-time representation of the event camera pose can deal with the high temporal resolution and asynchronous nature of this sensor in a principled way. In this paper, we leverage such a continuous-time representation to perform visual-inertial odometry with an event camera. This representation allows direct integration of the asynchronous events with microsecond accuracy and the inertial measurements at high frequency. The event camera trajectory is approximated by a smooth curve in the space of rigid-body motions using cubic splines. This formulation significantly reduces the number of variables in trajectory estimation problems. We evaluate our method on real data from several scenes and compare the results against ground truth from a motion-capture system. We show that our method provides improved accuracy over the result of a state-of-the-art visual odometry method for event cameras. We also show that both the map orientation and scale can be recovered accurately by fusing events and inertial data. To the best of our knowledge, this is the first work on visual-inertial fusion with event cameras using a continuous-time framework.

57 citations
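
A much-simplified illustration of the continuous-time idea, assuming only the camera position is interpolated (the paper uses cumulative cubic B-splines on SE(3)): the trajectory and its derivatives can be queried at the exact timestamp of each asynchronous event or IMU sample.

```python
import numpy as np
from scipy.interpolate import CubicSpline

knot_times = np.linspace(0.0, 1.0, 6)      # knot timestamps (s)
knot_positions = np.random.rand(6, 3)      # hypothetical 3D camera positions at the knots

trajectory = CubicSpline(knot_times, knot_positions, axis=0)

event_t = 0.1234567                        # asynchronous event timestamp (microsecond precision)
position = trajectory(event_t)             # position at that exact time
velocity = trajectory(event_t, 1)          # first derivative: linear velocity
acceleration = trajectory(event_t, 2)      # second derivative, comparable against IMU data
print(position, velocity, acceleration)
```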


Proceedings ArticleDOI
01 Oct 2018
TL;DR: This paper proposes an edge computing framework for real-time monitoring, which moves the computation away from the centralized cloud to the near-device edge servers and proposes an efficient heuristic algorithm based on the simulated annealing strategy.
Abstract: Due to the ever-growing demands of modern cities, unreliable and inefficient power transportation has become a critical issue in today's power grid. This makes power grid monitoring one of the key modules in the power grid system, playing an important role in preventing severe safety accidents. However, traditional manual inspection cannot efficiently achieve this goal due to its low efficiency and high cost. The smart grid, as a new generation of the power grid, sheds new light on constructing an intelligent, reliable and efficient power grid with advanced information technology. In the smart grid, automated monitoring can be realized by applying advanced deep learning algorithms on powerful cloud computing platforms together with IoT (Internet of Things) devices such as smart cameras. The performance of cloud monitoring, however, can still be unsatisfactory, since a large amount of data transmission over the Internet leads to high delay and low frame rate. In this paper, we note that the edge computing paradigm can well complement the cloud and significantly reduce the delay to improve overall performance. To this end, we propose an edge computing framework for real-time monitoring, which moves the computation away from the centralized cloud to near-device edge servers. To maximize the benefits, we formulate a scheduling problem to further optimize the framework and propose an efficient heuristic algorithm based on the simulated annealing strategy. Both real-world experiments and simulation results show that our framework can increase the monitoring frame rate by up to 10 times and reduce the detection delay by up to 85% compared to the cloud monitoring solution.

56 citations
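
The sketch below shows how a simulated-annealing scheduler for assigning camera streams to edge servers might look; the cost model, capacity and annealing parameters are invented for illustration and do not reproduce the paper's formulation.

```python
import math
import random

def cost(assignment, capacity=2):
    """Illustrative cost: per-server load plus a quadratic penalty for overloaded servers."""
    load = {}
    for cam, srv in assignment.items():
        load[srv] = load.get(srv, 0) + 1
    return sum(l + max(0, l - capacity) ** 2 for l in load.values())

def anneal(cameras, servers, steps=5000, t0=5.0, alpha=0.999):
    assignment = {c: random.choice(servers) for c in cameras}
    best, best_cost, temperature = dict(assignment), cost(assignment), t0
    for _ in range(steps):
        candidate = dict(assignment)
        candidate[random.choice(cameras)] = random.choice(servers)  # perturb one camera
        delta = cost(candidate) - cost(assignment)
        # Accept improvements always, and worse moves with a temperature-dependent probability.
        if delta < 0 or random.random() < math.exp(-delta / temperature):
            assignment = candidate
            if cost(assignment) < best_cost:
                best, best_cost = dict(assignment), cost(assignment)
        temperature *= alpha
    return best, best_cost

print(anneal([f"cam{i}" for i in range(6)], ["edge1", "edge2", "edge3"]))
```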


Journal ArticleDOI
TL;DR: An extensive power characterization of the system’s operation is presented and an open-source wireless camera network that can adapt to address the requirements of future outdoor video monitoring applications is provided.

47 citations


Journal ArticleDOI
TL;DR: This paper proposes a simple, low-cost and high rate method for state estimation enabling autonomous flight of micro aerial vehicles, which presents a low computational burden and investigates the performances of two Kalman filters, in the extended and error-state flavors, alongside with a large number of algorithm modifications defended in earlier literature on visual-inertial odometry.
Abstract: The combination of visual and inertial sensors for state estimation has recently attracted wide interest in the robotics community, especially in the aerial robotics field, due to the light weight and complementary characteristics of these sensors. However, most state estimation systems based on visual-inertial sensing suffer from severe processor requirements, which in many cases make them impractical. In this paper, we propose a simple, low-cost and high-rate method for state estimation enabling autonomous flight of micro aerial vehicles, which presents a low computational burden. The proposed state estimator fuses observations from an inertial measurement unit, an optical flow smart camera and a time-of-flight range sensor. The smart camera provides optical flow measurements at rates of up to 200 Hz, avoiding the computational bottleneck that image processing would otherwise impose on the main processor. To the best of our knowledge, this is the first example of extending the use of these smart cameras from hovering-like motions to odometry estimation, producing estimates that are usable during flight times of several minutes. In order to validate and defend the simplest algorithmic solution, we investigate the performance of two Kalman filters, in the extended and error-state flavors, along with a large number of algorithm modifications advocated in earlier literature on visual-inertial odometry, showing that their impact on filter performance is minimal. To close the control loop, a non-linear controller operating in the special Euclidean group SE(3) is able to drive, based on the estimated vehicle's state, a quadrotor platform in 3D space, guaranteeing the asymptotic stability of 3D position and heading. All the estimation and control tasks are solved on board and in real time on a limited computational unit. The proposed approach is validated through simulations and experimental results, which include comparisons with ground-truth data provided by a motion capture system. For the benefit of the community, we make the source code public.

41 citations
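
A minimal sketch, assuming a 1D constant-acceleration model, of the kind of Kalman-filter fusion described here: IMU readings drive the prediction step, while the time-of-flight range and optical-flow velocity provide measurement updates. It is not the paper's filter; all noise values are placeholders.

```python
import numpy as np

x = np.zeros(2)                       # state: [height (m), vertical velocity (m/s)]
P = np.eye(2)
Q = np.diag([1e-4, 1e-3])             # process noise (illustrative)
R_tof, R_flow = 1e-2, 5e-2            # measurement noise for range and flow velocity

def predict(x, P, accel, dt):
    F = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([0.5 * dt * dt, dt])
    return F @ x + B * accel, F @ P @ F.T + Q

def update(x, P, z, H, R):
    S = H @ P @ H.T + R
    K = P @ H.T / S
    return x + K * (z - H @ x), (np.eye(2) - np.outer(K, H)) @ P

x, P = predict(x, P, accel=0.2, dt=0.005)                       # IMU at high rate
x, P = update(x, P, z=0.31, H=np.array([1.0, 0.0]), R=R_tof)    # time-of-flight range
x, P = update(x, P, z=0.05, H=np.array([0.0, 1.0]), R=R_flow)   # optical-flow velocity
print(x)
```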


Proceedings ArticleDOI
10 Jun 2018
TL;DR: The following image processing algorithms are proposed: vehicle detection and counting algorithm, road marking detection algorithm, which are designed to process images obtained from a stationary camera.
Abstract: In this paper we consider road-situation analysis tasks for traffic control and safety assurance. The following image processing algorithms are proposed: a vehicle detection and counting algorithm and a road marking detection algorithm. The algorithms are designed to process images obtained from a stationary camera. The developed vehicle detection and counting algorithm was also implemented and tested on an embedded smart-camera platform. Results of an experimental evaluation of the proposed algorithms are presented.

36 citations
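
A rough illustration of the classic stationary-camera pipeline such an algorithm might build on: background subtraction to find moving blobs, followed by counting centroids that cross a virtual line. The thresholds, video path and matching rule are placeholders, not the authors' algorithm.

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")                  # placeholder input video
bg = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
line_y, count, prev_centroids = 300, 0, []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)   # drop shadow pixels
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        if cv2.contourArea(c) < 500:                             # ignore small blobs
            continue
        x, y, w, h = cv2.boundingRect(c)
        centroids.append((x + w // 2, y + h // 2))
    # Count a vehicle when a centroid moves from above to below the virtual line.
    for (cx, cy) in centroids:
        for (px, py) in prev_centroids:
            if abs(cx - px) < 40 and py < line_y <= cy:
                count += 1
    prev_centroids = centroids

print("vehicles counted:", count)
```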


Journal ArticleDOI
16 Jan 2018
TL;DR: A dead-reckoning method implementing a particle filter to estimate a set of likely map-matched hypotheses containing the correct solution with a high probability, and an integrity monitoring method for assessing the coherence of the set of hypotheses, using the fix of a global navigation satellite system receiver.
Abstract: Navigation maps provide critical information for advanced driving assistance systems and autonomous vehicles. When these maps are refined to lane-level, ambiguities may occur during the map-matching process, particularly when positioning estimates are inaccurate. This paper presents a dead-reckoning method implementing a particle filter to estimate a set of likely map-matched hypotheses containing the correct solution with a high probability. Our method uses lane-level maps that feature dedicated attributes such as connectedness and adjacency. The vehicle position is essentially estimated by dead-reckoning sensors and lane detection using an intelligent camera. We also describe an integrity monitoring method for assessing the coherence of the set of hypotheses, using the fix of a global navigation satellite system receiver. The method provides in real-time a “Use/Don’t Use” characterization of the vehicle positioning information that is transmitted to safety functions, where integrity is fundamental. The performance of the proposed map-aided dead-reckoning method with integrity monitoring is evaluated using data acquired by an experimental car on suburban public roads. The results obtained validate the approach.

30 citations
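
A generic particle-filter skeleton, not the paper's implementation, illustrating the propagate-weight-resample loop: particles carrying a pose hypothesis are advanced with noisy dead-reckoning and weighted against a lane-detection-style lateral measurement. Map-matching would additionally snap each hypothesis to lane segments; all values below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500
particles = rng.normal([0.0, 0.0], [1.0, 0.5], size=(N, 2))   # [along-track, lateral] (m)

def predict(particles, ds, sigma=(0.1, 0.05)):
    """Propagate with odometry plus dead-reckoning noise."""
    noise = rng.normal(0.0, sigma, size=particles.shape)
    return particles + np.array([ds, 0.0]) + noise

def update(particles, measured_lateral, sigma_meas=0.2):
    """Weight by the camera's lateral-offset measurement, then resample."""
    w = np.exp(-0.5 * ((particles[:, 1] - measured_lateral) / sigma_meas) ** 2)
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

for step in range(100):
    particles = predict(particles, ds=1.0)                 # ~1 m travelled per step
    particles = update(particles, measured_lateral=0.1)    # lane-camera measurement

print("estimate:", particles.mean(axis=0))
```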


Journal ArticleDOI
TL;DR: A systematic methodology to perform joint approximations across different subsystems, leading to significant energy benefits compared to approximating individual subsystems in isolation is proposed.
Abstract: The intrinsic error resilience exhibited by emerging application domains enables new avenues for energy optimization of computing systems, namely, the introduction of a small amount of approximation during system operation in exchange for substantial energy savings. Prior work in the area of approximate computing has focused on individual subsystems of a computing system, for example, the computational subsystem or the memory subsystem. Since they focus only on individual subsystems, these techniques are unable to exploit the large energy-saving opportunities that stem from adopting a full-system perspective and approximating multiple subsystems of a computing platform simultaneously in a coordinated manner. This paper proposes a systematic methodology to perform joint approximations across different subsystems, leading to significant energy benefits compared to approximating individual subsystems in isolation. We use the example of a smart camera system that executes various computer vision and image processing applications to illustrate how the sensing, memory, processing, and communication subsystems can all be approximated synergistically. We demonstrate our proposed methodology using two variants of a smart camera system: 1) a compute-intensive smart camera system, AxSYS_comp, where the error-resilient application executes locally within the camera and produces the final application output, and 2) a communication-intensive smart camera system, AxSYS_comm, that sends the captured image to a remote cloud server, where the error-resilient application is executed and the final output is generated. We have implemented such an approximate smart camera system using an Altera Stratix IV GX FPGA development board, a Terasic TRDB-D5M 5-Megapixel camera module, a Terasic RFS WiFi module, and a 1-GB DDR3 dynamic random access memory small outline dual in-line memory module (SODIMM). Experimental results obtained using six application benchmarks demonstrate significant energy savings from full-system approximation (around 7.5× for AxSYS_comp and 4× on average for AxSYS_comm), compared to 3.5×–5.5× (for AxSYS_comp) and 1.8×–3.7× (for AxSYS_comm) on average when subsystems are approximated in isolation, for a minimal (<1%) application-level quality loss.

29 citations


Proceedings ArticleDOI
19 Mar 2018
TL;DR: A global vulnerability assessment is performed using the Shodan search engine and the Common Vulnerabilities and Exposures database to detect smart connected cameras exposed on the Internet alongside their sensitive, potentially private, data being broadcasted.
Abstract: The Internet of Things is enabling innovative services promising added convenience and value in various domains such as the smart home. Increasingly, households, office environments and cities are being fitted with smart camera systems aimed to enhance the security of citizens. At the same time, several systems being deployed suffer from weak security implementations. Recognizing this, and to understand the extent of this situation, in this study we perform a global vulnerability assessment using the Shodan search engine and the Common Vulnerabilities and Exposures database. This is done to detect smart connected cameras exposed on the Internet alongside their sensitive, potentially private, data being broadcasted. Furthermore, we discuss whether the discovered data can be used to compromise the safety and privacy of individuals, and identify some mitigations that can be adopted. The results indicate that a significant number of smart cameras are indeed prone to diverse security and privacy vulnerabilities.
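
For illustration, the kind of discovery step described here can be sketched with the official Shodan Python client; the API key and query string below are placeholders, and such scans should only be run against assets one is authorised to assess.

```python
import shodan

API_KEY = "YOUR_SHODAN_API_KEY"        # placeholder credential
api = shodan.Shodan(API_KEY)

try:
    # Example query for exposed camera-like services (query string is illustrative).
    results = api.search('product:"IP camera" port:554')
    print("total exposed hosts:", results["total"])
    for match in results["matches"][:5]:
        print(match["ip_str"], match.get("port"), match.get("org"))
except shodan.APIError as exc:
    print("Shodan error:", exc)
```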

Journal ArticleDOI
TL;DR: EclipseIoT, an adaptive hub which uses dynamically loadable add-on modules to communicate with diverse IoT devices, provides policy-based access control, limits exposure of local IoT devices through cloaking, and offers a canary-function based capability to monitor attack behaviours is introduced.

Journal ArticleDOI
TL;DR: The article demonstrates the usefulness of heterogeneous System on Chip (SoC) devices in smart cameras used in intelligent transportation systems (ITS) and other advanced embedded image processing, analysis and recognition applications.
Abstract: The article demonstrates the usefulness of heterogeneous System on Chip (SoC) devices in smart cameras used in intelligent transportation systems (ITS). In a compact, energy-efficient system the following exemplary algorithms were implemented: vehicle queue length estimation, vehicle detection, vehicle counting and speed estimation (using multiple virtual detection lines), as well as vehicle type (local binary features and an SVM classifier) and colour (k-means classifier and YCbCr colourspace analysis) recognition. The solution exploits a hardware–software architecture, i.e. the combination of reconfigurable resources and an efficient ARM processor. Most of the modules were implemented in hardware, using Verilog HDL, taking full advantage of the possible parallelization and pipelining, which made real-time image processing attainable. The ARM processor is responsible for executing some parts of the algorithm, i.e. high-level image processing and analysis, as well as for communication with external systems (e.g. traffic light controllers). The demonstrated results indicate that modern SoC systems are a very interesting platform for advanced ITS and other advanced embedded image processing, analysis and recognition applications.
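
A small sketch of the colour-recognition step under stated assumptions: the chroma (Cr, Cb) pixels of a vehicle crop are clustered with k-means and the dominant cluster is labelled by its nearest reference colour. The reference chroma values are rough placeholders, and achromatic colours would also need the luma channel.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

# Rough (Cr, Cb) reference values; a real system would calibrate these.
REFERENCE = {"red": (200.0, 90.0), "blue": (110.0, 200.0), "grey": (128.0, 128.0)}

def dominant_colour(bgr_crop, k=3):
    ycrcb = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2YCrCb)
    chroma = ycrcb[:, :, 1:3].reshape(-1, 2).astype(np.float32)       # (Cr, Cb) per pixel
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(chroma)
    centre = km.cluster_centers_[np.argmax(np.bincount(km.labels_))]  # dominant cluster
    return min(REFERENCE, key=lambda name: np.linalg.norm(centre - REFERENCE[name]))

crop = cv2.imread("vehicle_crop.png")      # placeholder vehicle crop
if crop is not None:
    print(dominant_colour(crop))
```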

Proceedings ArticleDOI
25 Jun 2018
TL;DR: This work explores a smart camera surveillance system aimed at tracking all vehicles in real time, and proposes novel techniques, namely, forward and backward propagation that reduces the latency for the operations and the communication overhead.
Abstract: To fully exploit the capabilities of sensors in real life, especially cameras, smart camera surveillance requires the cooperation from both domain experts in computer vision and systems. Existing alert-based smart surveillance is only capable of tracking a limited number of suspicious objects, while in most real-life applications, we often do not know the perpetrator ahead of time for tracking their activities in advance. In this work, we propose a radically different approach to smart surveillance for vehicle tracking. Specifically, we explore a smart camera surveillance system aimed at tracking all vehicles in real time. The insight is not to store the raw videos, but to store the space-time trajectories of the vehicles. Since vehicle tracking is a continuous and geo-distributed task, we assume a geo-distributed Fog computing infrastructure as the execution platform for our system. To bound the storage space for storing the trajectories on each Fog node (serving the computational needs of a camera), we focus on the activities of vehicles in the vicinity of a given camera in a specific geographic region instead of the time dimension, and the fact that every vehicle has a "finite" lifetime. To bound the computational and network communication requirements for detection, re-identification, and inter-node communication, we propose novel techniques, namely, forward and backward propagation that reduces the latency for the operations and the communication overhead. STTR is a system for smart surveillance that we have built embodying these ideas. For evaluation, we develop a toolkit upon SUMO to emulate camera detections from traffic flow and adopt MaxiNet to emulate the fog computing infrastructure on Microsoft Azure.

Journal ArticleDOI
26 Nov 2018
TL;DR: It is demonstrated that a scalable wireless sensor network with CO2-based estimation is a viable alternative, and a prototype is presented and used to train machine learning-based occupancy estimation systems.
Abstract: Many applications, such as smart buildings, crowd flow, action recognition, and assisted living, rely on occupancy information. Although the use of smart cameras and computer vision can assist with these tasks and provide accurate occupancy information, it can be cost prohibitive, invasive, and difficult to scale or generalise to different environments. An alternative solution should bring similar accuracy while minimising the listed problems. This work demonstrates that a scalable wireless sensor network with CO2-based estimation is a viable alternative. To support many applications, a solution must be transferable and must handle not knowing the physical system model; instead, it must learn to model CO2 dynamics. This work presents a viable prototype and uses the captured data to train machine learning-based occupancy estimation systems. Models are trained under varying conditions to assess the consequences of design decisions on performance. Four different learning models were compared: gradient boosting, k-nearest neighbours (KNN), linear discriminant analysis, and random forests. With sufficient labelled data, the KNN model produced peak results with a root-mean-square error value of 1.021.
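
A minimal illustration of the KNN estimator on synthetic CO2-derived features; the feature choice and values are made up and merely stand in for the paper's sensor data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
occupancy = rng.integers(0, 8, size=500)                         # ground-truth head counts
co2_level = 420 + 60 * occupancy + rng.normal(0, 25, size=500)   # ppm (synthetic)
co2_slope = 2.0 * occupancy + rng.normal(0, 1.0, size=500)       # ppm/min (synthetic)
X = np.column_stack([co2_level, co2_slope])

X_tr, X_te, y_tr, y_te = train_test_split(X, occupancy, random_state=0)
model = KNeighborsRegressor(n_neighbors=5).fit(X_tr, y_tr)
rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))
print("RMSE:", round(float(rmse), 3))
```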

Proceedings ArticleDOI
01 Dec 2018
TL;DR: A Wireless Sensor Network (WSN) is presented, which is intended to provide a scalable solution for active cooperative monitoring of wide geographical areas and is designed to use different smart-camera prototypes.
Abstract: In this paper we present a Wireless Sensor Network (WSN), which is intended to provide a scalable solution for active cooperative monitoring of wide geographical areas. The system is designed to use different smart-camera prototypes: where a connection to the power grid is available, a powerful embedded platform implements a Deep Neural Network; otherwise, a fully autonomous energy-harvesting node based on a low-energy custom board employs lightweight image analysis algorithms. Parking lot occupancy monitoring in the historical city of Lucca (Italy) is the application for which the implemented smart cameras have been deployed. Traffic monitoring and surveillance are possible new scenarios for the system.

Proceedings ArticleDOI
14 Jun 2018
TL;DR: The aim of this paper is to demonstrate the kinds of vulnerabilities that exist in home monitoring smart cameras and to demonstrate their effects on users' security and privacy, by proposing a threat model and a security and privacy analysis framework.
Abstract: The significant increase in the number of applications that depend on the Internet of Things concept is becoming ever more evident. It has been deployed in many areas, including smart homes, smart cities and health monitoring applications. The means to secure these applications are developing more slowly than our growing dependence on them. The aim of this paper is to demonstrate the kinds of vulnerabilities that exist in home monitoring smart cameras and their effects on users' security and privacy, by proposing a threat model and a security and privacy analysis framework. The framework covers five major components of the smart camera system with a set of designed test cases. The framework is applied to five commodity smart cameras. A range of vulnerabilities is discovered with respect to the framework. The vulnerabilities discovered indicate that IoT devices continue to be shipped by vendors without enough effort being put into their security and with insufficient regard for the implications they have for users' privacy. The work reported here has been part of the first author's MSc thesis [1].

Journal ArticleDOI
TL;DR: This paper proposes a solution to automate camera placement for motion capture applications in order to assist a human operator through the use of a guided genetic algorithm to optimize camera network placement with an appropriate number of cameras.
Abstract: In multi-camera motion capture systems, determining the optimal camera configuration (camera positions and orientations) is still an unresolved problem. At present, configurations are primarily guided by a human operator’s intuition, which requires expertise and experience, especially with complex, cluttered scenes. In this paper, we propose a solution to automate camera placement for motion capture applications in order to assist a human operator. Our solution is based on the use of a guided genetic algorithm to optimize camera network placement with an appropriate number of cameras. In order to improve the performance of the genetic algorithm (GA), two techniques are described. The first is a distribution and estimation technique, which reduces the search space and generates camera positions for the initial GA population. The second technique is an error metric, which is integrated at GA evaluation level as an optimization function to evaluate the quality of the camera placement in a camera network. Simulation experiments show that our approach is more efficient than other approaches in terms of computation time and quality of the final camera network.
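
A compact, illustrative genetic algorithm for placing a fixed number of cameras to cover sample points in a 2D room; the fitness function, field-of-view model and GA operators are simplified placeholders for the guided GA described in the paper.

```python
import math
import random

ROOM = (10.0, 10.0)
POINTS = [(random.uniform(0, ROOM[0]), random.uniform(0, ROOM[1])) for _ in range(200)]
N_CAM, FOV, RANGE = 3, math.radians(60), 6.0

def covered(cam, pt):
    """True if the point lies within the camera's range and angular field of view."""
    x, y, theta = cam
    dx, dy = pt[0] - x, pt[1] - y
    ang = abs((math.atan2(dy, dx) - theta + math.pi) % (2 * math.pi) - math.pi)
    return math.hypot(dx, dy) <= RANGE and ang <= FOV / 2

def fitness(individual):
    return sum(any(covered(cam, p) for cam in individual) for p in POINTS) / len(POINTS)

def random_individual():
    return [(random.uniform(0, ROOM[0]), random.uniform(0, ROOM[1]),
             random.uniform(-math.pi, math.pi)) for _ in range(N_CAM)]

def crossover(a, b):
    cut = random.randint(1, N_CAM - 1)           # one-point crossover over the camera list
    return a[:cut] + b[cut:]

def mutate(ind, sigma=0.5):
    return [(x + random.gauss(0, sigma), y + random.gauss(0, sigma),
             t + random.gauss(0, 0.3)) for x, y, t in ind]

population = [random_individual() for _ in range(30)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                    # elitist selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(20)]
    population = parents + children

print("coverage:", round(fitness(max(population, key=fitness)), 3))
```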

Journal ArticleDOI
TL;DR: A new approach to develop an "all-seeing" smart building, where the global system is the first step to attempt to provide Artificial Intelligence to a building by using ontologies and semantic web technologies.
Abstract: The contextual information in the built environment is highly heterogeneous, ranging from static information (e.g., information about the building structure) to dynamic information (e.g., users' space-time information, sensor detections and events that have occurred). This paper proposes to semantically fuse the building's contextual information with data extracted from a smart camera network by using ontologies and semantic web technologies. The developed ontology allows interoperability between the different contextual data and enables real-time event detection and system reconfiguration to be performed without human interaction. The use of semantic knowledge in multi-camera monitoring systems guarantees the protection of the users' privacy by neither sending nor saving any image, just extracting the knowledge from them. This paper presents a new approach to developing an "all-seeing" smart building, where the global system is a first step toward providing Artificial Intelligence (AI) to a building.

Book ChapterDOI
01 Jan 2018
TL;DR: A comparative study of the most used and popular low-level local feature extractors: SIFT, SURF, ORB, PHOG, WGCH, Haralick and A-KAZE is provided, to discuss their behavior and robustness in terms of invariance with respect to the most important critical factors.
Abstract: Local feature detectors and descriptors (hereinafter extractors) play a key role in modern computer vision. Their purpose is to extract, from any image, a set of discriminative patterns (hereinafter keypoints) present on some parts of the background and/or foreground elements of the image itself. A prerequisite of a wide range of practical applications (e.g., vehicle tracking, person re-identification) is the design and development of algorithms able to detect, recognize and track the same keypoints within a video sequence. Smart cameras can acquire images and videos of a scenario of interest according to different intrinsic (e.g., focus, iris) and extrinsic (e.g., pan, tilt, zoom) parameters. These parameters can make the recognition of the same keypoint between consecutive images a hard task when critical factors such as scale, rotation and translation are present. The aim of this chapter is to provide a comparative study of the most used and popular low-level local feature extractors: SIFT, SURF, ORB, PHOG, WGCH, Haralick and A-KAZE. At first, the chapter provides an overview of the different extractors, referenced in a concrete case study to show their potential and usage. Afterwards, a comparison of the extractors is performed on the Freiburg-Berkeley Motion Segmentation (FBMS-59) dataset, a well-known video data collection widely used by the computer vision community. Starting from a default setting of the local feature extractors, the aim of the comparison is to discuss their behavior and robustness in terms of invariance with respect to the most important critical factors. The chapter also reports comparative considerations about one of the basic steps based on the feature extractors: the matching process. Finally, the chapter points out key considerations about the use of the discussed extractors in real application domains.
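
A small example of one of the compared extractors (ORB) together with the matching step, using OpenCV defaults; the image paths are placeholders for two consecutive frames.

```python
import cv2

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # placeholder frames
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
if img1 is None or img2 is None:
    raise SystemExit("provide two consecutive frames")

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Binary descriptors are matched with Hamming distance; keep matches passing Lowe's ratio test.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.75 * n.distance]
print(f"{len(kp1)} / {len(kp2)} keypoints, {len(good)} matches after ratio test")
```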

Journal ArticleDOI
23 Dec 2018-Sensors
TL;DR: A hardware architecture for depth from motion that consists of a flow/depth transformation and a new optical flow algorithm that delivers dense maps with motion and depth information on all image pixels, with a processing speed up to 128 times faster than that of previous work, making it possible to achieve high performance in the context of embedded applications.
Abstract: Applications such as autonomous navigation, robot vision, and autonomous flying require depth map information of a scene. Depth can be estimated using a single moving camera (depth from motion). However, traditional depth-from-motion algorithms have low processing speeds and high hardware requirements that limit their embedded capabilities. In this work, we propose a hardware architecture for depth from motion that consists of a flow/depth transformation and a new optical flow algorithm. Our optical flow formulation is an extension of the stereo matching problem. We propose a pixel-parallel/window-parallel approach in which a correlation function based on the sum of absolute differences (SAD) computes the optical flow. Further, to improve the SAD, we propose using the curl of the intensity gradient as a preprocessing step. Experimental results demonstrate that it is possible to reach higher accuracy (90% accuracy) compared with previous Field Programmable Gate Array (FPGA)-based optical flow algorithms. For depth estimation, our algorithm delivers dense maps with motion and depth information for all image pixels, with a processing speed up to 128 times faster than that of previous work, making it possible to achieve high performance in the context of embedded applications.
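
A toy version of the SAD block matching that the architecture parallelises: for one pixel, search a small window in the next frame and keep the displacement with the lowest sum of absolute differences. Window sizes are illustrative and the curl-based preprocessing is omitted.

```python
import numpy as np

def sad_flow_at(prev, curr, y, x, block=3, search=4):
    """Return the (dx, dy) displacement minimising the SAD over the search window."""
    ref = prev[y - block:y + block + 1, x - block:x + block + 1].astype(np.int32)
    best, best_uv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = curr[y + dy - block:y + dy + block + 1,
                        x + dx - block:x + dx + block + 1].astype(np.int32)
            sad = np.abs(ref - cand).sum()
            if best is None or sad < best:
                best, best_uv = sad, (dx, dy)
    return best_uv

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (64, 64), dtype=np.uint8)
curr = np.roll(prev, shift=(1, 2), axis=(0, 1))      # synthetic motion of dx=2, dy=1
print(sad_flow_at(prev, curr, y=32, x=32))           # expected (2, 1)
```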

Journal ArticleDOI
TL;DR: A flexible uncertainty model that can be used to characterize the detection behavior in a camera network is introduced, and it is shown how to utilize the model to formulate detection-aware optimization algorithms that can be used to reconfigure the network in order to improve the overall detection efficiency and thus increase the effective number of detected targets.
Abstract: Networks of smart cameras, equipped with on-board processing and communication infrastructure, are increasingly being deployed in a variety of different application fields, such as security and surveillance, traffic monitoring, industrial monitoring, and critical infrastructure protection. The task(s) that a network of smart cameras executes in these applications, e.g., activity monitoring and object identification, can be severely degraded due to errors in the detection module. However, in most cases, higher level tasks and decision making processes in smart camera networks (SCNs) assume ideal detection capabilities for the cameras, which is often not the case due to the probabilistic nature of the detection process, especially for low-cost cameras with limited capabilities. Realizing that it is necessary to introduce robustness in the decision process, this paper presents results toward uncertainty-aware SCNs. Specifically, we introduce a flexible uncertainty model that can be used to characterize the detection behavior in a camera network. We also show how to utilize the model to formulate detection-aware optimization algorithms that can be used to reconfigure the network in order to improve the overall detection efficiency and thus increase the effective number of detected targets. We evaluate our proposed model and algorithms using a network of Raspberry-Pi-based smart cameras that reconfigure in order to improve the detection performance based on the position of targets in the area. The experimental results in the laboratory as well as in a human monitoring application and extensive simulation results indicate that the proposed solutions are able to improve the robustness and reliability of SCNs.

Journal ArticleDOI
TL;DR: Owing to its fully self-powered operation, the proposed system can find wide applications in “always-on” vision systems, such as in surveillance, robotics, and consumer electronics with touch-less operation.
Abstract: This paper presents an ultralow power smart camera with gesture detection. Low power is achieved by directly extracting gesture features from the compressed measurements, which are the block averages and linear combinations of the image sensor's pixel values. We present two classifier techniques that allow low computational and storage requirements. The system has been implemented on an Analog Devices Blackfin ULP vision processor. Thanks to its ultralow energy consumption, the system can be powered by ambient light harvested through photovoltaic cells, whose output is regulated by TI's dc–dc buck converter with maximum power point tracking. Measured data reveal that with only 400 compressed measurements (768× compression ratio) per frame, the system is able to recognize key wake-up gestures with greater than 80% accuracy and only 95 mJ of energy per frame. Owing to its fully self-powered operation, the proposed system can find wide applications in "always-on" vision systems, such as surveillance, robotics, and consumer electronics with touch-less operation.
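
An illustration of the compressed measurements mentioned here, assuming simple per-block pixel averages; the block and frame sizes are placeholders, and the classifier stage is omitted.

```python
import numpy as np

def block_averages(frame, block=16):
    """Average each block x block tile of the frame into one measurement."""
    h, w = frame.shape
    return frame.reshape(h // block, block, w // block, block).mean(axis=(1, 3)).ravel()

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, (240, 320), dtype=np.uint8)   # stand-in for a sensor frame
features = block_averages(frame)
print(features.shape)        # 15 * 20 = 300 measurements for a 240x320 frame
```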

Journal ArticleDOI
TL;DR: A compressed-domain smashed-filter-based object recognition system and measurements from a 130 nm mixed-signal test chip are proposed to demonstrate reconfigurable and ultra-low power operation.
Abstract: To avoid data deluge at the back end, it is imperative that advanced sensor nodes perform “in-sensor” processing to extract relevant features from data. We propose a compressed-domain smashed-filter-based object recognition system and measurements from a 130 nm mixed-signal test chip to demonstrate reconfigurable and ultra-low power operation. We measure greater than 90% accuracy in object localization with 165 nJ energy per frame on image data sets.

Patent
10 Apr 2018
TL;DR: In this paper, a human face detection and head posture angle assessment combined system adopting a small-scale hardware convolutional neural network (CNN) module in an embedded system is presented.
Abstract: The invention provides embodiments of a joint human face detection and head posture angle assessment system adopting a small-scale hardware convolutional neural network (CNN) module in an embedded system. The small-scale hardware CNN module is, for example, the embedded CNN module in the HiSilicon Hi3519 chip. In some embodiments, the joint face detection and head posture angle assessment system disclosed in the present application is used to jointly perform multiple tasks: detecting most or all faces in a sequence of video frames, generating a posture angle assessment for each detected face, tracking the detected face of the same person across the sequence of video frames, and generating a "best posture" assessment of the tracked person. The joint face detection and posture assessment system can be implemented in resource-limited embedded systems, such as smart camera systems that integrate only one or more small-scale CNN modules. The system proposed by the present application, combined with a sub-image-based technique, enables a small-scale, low-cost CNN module to perform multiple face detection and face recognition tasks on high-resolution input images.

Proceedings ArticleDOI
04 Jun 2018
TL;DR: In this paper, the authors presented aggregate-signcryption which extends the EC-based signcryption approach to a cluster-based multi-camera setup, where the signcrypted data from the smart cameras within a cluster is aggregated on a specific node called cluster head.
Abstract: Smart cameras are considered as key sensors in Internet of Things (IoT) applications ranging from home to city scales. Since these cameras often capture highly sensitive information, security is a major concern. An elliptic curve (EC) based signcryption achieves resource-efficiency by performing data encryption and signing in a single step. In this work, we present aggregate-signcryption which extends the EC-based signcryption approach to a cluster-based multi-camera setup. The signcrypted data from the smart cameras within a cluster is aggregated on a specific node called the cluster head. Aggregate-signcryption reduces the communication overhead and requires fewer steps for the unsigncryption as compared to individual signcryption.

Proceedings ArticleDOI
12 Oct 2018
TL;DR: This paper presents an approach that allows building an intrusion detection system, based on face recognition, running on embedded devices, that relies on deep learning techniques and does not exploit the GPUs.
Abstract: With the advent of deep learning based methods, facial recognition algorithms have become more effective and efficient. However, these algorithms usually have the disadvantage of requiring dedicated hardware, such as graphical processing units (GPUs), which poses restrictions on their usage on embedded devices with limited computational power. In this paper, we present an approach for building an intrusion detection system, based on face recognition, that runs on embedded devices. It relies on deep learning techniques but does not require GPUs. Face recognition is performed using a k-NN classifier on features extracted from a 50-layer Residual Network (ResNet-50) trained on the VGGFace2 dataset. In our experiments, we determined the optimal confidence threshold that allows distinguishing legitimate users from intruders. In order to validate the proposed system, we created a ground truth composed of 15,393 face images and 44 identities, captured by two smart cameras placed in two different offices over a test period of six months. We show that the obtained results are good from both the efficiency and the effectiveness perspective.
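
A sketch of the classification stage only, assuming precomputed embeddings: a k-NN over face features with a distance threshold separating enrolled users from intruders. The embeddings below are random placeholders (in the paper they come from a ResNet-50 trained on VGGFace2), and the threshold is illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
enrolled_embeddings = rng.normal(size=(200, 128))          # placeholder 128-D face features
enrolled_labels = rng.integers(0, 10, size=200)            # 10 known identities

knn = KNeighborsClassifier(n_neighbors=5).fit(enrolled_embeddings, enrolled_labels)

def identify(embedding, threshold=1.5):
    """Return an identity if the neighbours are close enough, otherwise flag an intruder."""
    distances, _ = knn.kneighbors([embedding], n_neighbors=5)
    if distances.mean() > threshold:                       # threshold value is illustrative
        return "intruder"
    return int(knn.predict([embedding])[0])

print(identify(rng.normal(size=128)))
```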

Journal ArticleDOI
09 Jan 2018
TL;DR: The Smart Building Suite is presented, in which independent and different technologies are developed in order to realize a multimodal surveillance system.
Abstract: The main goal of a surveillance system is to collect information in a sensing environment and notify unexpected behavior. Information provided by single sensor and surveillance technology may not be sufficient to understand the whole context of the monitored environment. On the other hand, by combining information coming from different sources, the overall performance of a surveillance system can be improved. In this paper, we present the Smart Building Suite, in which independent and different technologies are developed in order to realize a multimodal surveillance system.

Patent
30 Apr 2018
TL;DR: In this paper, an apparatus comprises a communication interface and a processor, where the communication interface is to communicate with a plurality of cameras and the processor is to obtain metadata associated with an initial state of an object, wherein the object is captured by a first camera in a first video stream at a first point in time, and wherein the metadata is obtained based on the first video streams.
Abstract: In one embodiment, an apparatus comprises a communication interface and a processor. The communication interface is to communicate with a plurality of cameras. The processor is to obtain metadata associated with an initial state of an object, wherein the object is captured by a first camera in a first video stream at a first point in time, and wherein the metadata is obtained based on the first video stream. The processor is further to predict, based on the metadata, a future state of the object at a second point in time, and identify a second camera for capturing the object at the second point in time. The processor is further to configure the second camera to capture the object in a second video stream at the second point in time, wherein the second camera is configured to capture the object based on the future state of the object.

Proceedings ArticleDOI
01 Nov 2018
TL;DR: This framework generates approximate nanoscale memristor crossbar designs that can perform edge detection on images that agrees with the exact computation on 96.6% of the input space while using a considerably smaller crossbar than an exact design.
Abstract: Smart cameras and other smart sensors require the capability to perform edge computing using energy-efficient and compact nanoscale systems. Many soft applications such as image processing can benefit from approximately correct computations if they reduce the energy and space requirements of the edge computing device. Our framework generates approximate nanoscale memristor crossbar designs that can perform edge detection on images. Our approximate design agrees with the exact computation on 96.6% of the input space while using a considerably smaller crossbar than an exact design.