
Showing papers on "Smart camera published in 2020"


Proceedings ArticleDOI
16 Nov 2020
TL;DR: This work presents Distream, a distributed live video analytics system based on the smart camera-edge cluster architecture that is able to adapt to workload dynamics to achieve low-latency, high-throughput, and scalable live video analytics.
Abstract: Video cameras have been deployed at scale today. Driven by the breakthrough in deep learning (DL), organizations that have deployed these cameras have started to use DL-based techniques for live video analytics. Although existing systems aim to optimize live video analytics from a variety of perspectives, they are agnostic to the workload dynamics in real-world deployments. In this work, we present Distream, a distributed live video analytics system based on the smart camera-edge cluster architecture, that is able to adapt to the workload dynamics to achieve low-latency, high-throughput, and scalable live video analytics. The key idea behind the design of Distream is to adaptively balance the workloads across smart cameras and partition the workloads between cameras and the edge cluster. In doing so, Distream is able to fully utilize the compute resources at both ends to achieve optimized system performance. We evaluated Distream with 500 hours of distributed video streams from two real-world video datasets on a testbed consisting of 24 cameras and a 4-GPU edge cluster. Our results show that Distream consistently outperforms the status quo in terms of throughput, latency, and latency service level objective (SLO) miss rate.
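The paper's key mechanism is adaptive workload balancing between cameras and the edge cluster. As a rough illustration of the idea (not the authors' actual algorithm), the hypothetical sketch below greedily shifts queued frames from overloaded cameras to idle peers and offloads the remainder to the edge; all names and the scheduling-epoch model are assumptions.

```python
def balance(queues, capacities, edge_capacity):
    """Greedy cross-camera balancing sketch (illustrative only).

    queues[i]     -- frames currently waiting at camera i
    capacities[i] -- frames camera i can process this scheduling epoch
    edge_capacity -- frames the edge cluster can absorb this epoch
    """
    spare = [max(0, c - q) for q, c in zip(queues, capacities)]
    overflow = [max(0, q - c) for q, c in zip(queues, capacities)]
    moved = [[0] * len(queues) for _ in queues]   # moved[i][j]: frames i -> j

    # First, balance across cameras: push overflow into peers' spare slots.
    for i, extra in enumerate(overflow):
        for j in range(len(spare)):
            if extra == 0:
                break
            take = min(extra, spare[j])
            moved[i][j] += take
            spare[j] -= take
            extra -= take
        overflow[i] = extra

    # Then partition: whatever no camera can absorb goes to the edge cluster.
    to_edge = []
    for extra in overflow:
        take = min(extra, edge_capacity)
        to_edge.append(take)
        edge_capacity -= take
    return moved, to_edge
```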

66 citations


Posted Content
TL;DR: The state of the art of deep learning in camera operations is reviewed and the impact of AI on the physical design of cameras is considered.
Abstract: We review camera architecture in the age of artificial intelligence. Modern cameras use physical components and software to capture, compress and display image data. Over the past 5 years, deep learning solutions have become superior to traditional algorithms for each of these functions. Deep learning enables 10-100x reduction in electrical sensor power per pixel, 10x improvement in depth of field and dynamic range and 10-100x improvement in image pixel count. Deep learning enables multiframe and multiaperture solutions that fundamentally shift the goals of physical camera design. Here we review the state of the art of deep learning in camera operations and consider the impact of AI on the physical design of cameras.

59 citations


Journal ArticleDOI
TL;DR: The design and implementation of a comprehensive multi-camera-based testbed for 3-D tracking and control of UAVs, which employs smart cameras with field-programmable gate array modules to allow for real-time computation at a frame rate of 100 Hz is presented.
Abstract: Flight testbeds with multiple unmanned aerial vehicles (UAVs) are especially important to support research on multi-vehicle-related algorithms. Existing platforms usually lack a generic and complete solution allowing for software and hardware design. For such a purpose, this paper presents the design and implementation of a comprehensive multi-camera-based testbed for 3-D tracking and control of UAVs. First, the testbed software consists of a multi-camera system and a ground control system, which together perform image processing, camera calibration, 3-D reconstruction, pose estimation, and motion control. In the multi-camera system, the positions and orientations of UAVs are first reconstructed by using epipolar geometric constraints and triangulation methods and then filtered by an extended Kalman filter (EKF). In the ground control system, a classical proportional-derivative (PD) controller is designed to receive the navigation data from the multi-camera system and then generate control commands for the target vehicles. Then, the testbed hardware employs smart cameras with field-programmable gate array (FPGA) modules to allow for real-time computation at a frame rate of 100 Hz. Lightweight Parrot Bebop quadcopters are chosen as the target UAVs and require no modification to their hardware. Artificial infrared reflective markers are asymmetrically mounted on the target vehicles and observed by multiple infrared cameras located around the flight region. Finally, extensive experiments demonstrate that the proposed testbed is a comprehensive and complete platform with good scalability, applicable to research on a variety of advanced guidance, navigation, and control algorithms.
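To make the control loop concrete, here is a minimal PD position controller of the kind the ground control system described above would run; the gains are illustrative rather than the paper's tuned values, and dt matches the stated 100 Hz camera rate.

```python
import numpy as np

class PDController:
    """Minimal PD controller sketch: position error in, velocity command out."""

    def __init__(self, kp=1.2, kd=0.4, dt=0.01):   # dt = 1 / 100 Hz frame rate
        self.kp, self.kd, self.dt = kp, kd, dt
        self.prev_error = None

    def command(self, target_pos, measured_pos):
        """measured_pos would come from the EKF-filtered camera system."""
        error = np.asarray(target_pos, float) - np.asarray(measured_pos, float)
        if self.prev_error is None:
            derivative = np.zeros_like(error)       # no history on first frame
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.kd * derivative

# Usage: one call per camera frame, e.g. at 100 Hz.
pd = PDController()
cmd = pd.command(target_pos=[1.0, 0.0, 1.5], measured_pos=[0.8, 0.1, 1.4])
```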

35 citations


Journal ArticleDOI
TL;DR: This work proposes a compact DCNN architecture for gender recognition from face images that achieves approximately state-of-the-art accuracy at a highly reduced computational cost (almost five times lower).
Abstract: Gender recognition has been among the most investigated problems in recent years; although several contributions have been proposed, gender recognition in unconstrained environments is still a challenging problem and a definitive solution has not been found yet. Furthermore, Deep Convolutional Neural Networks (DCNNs) achieve very interesting performance, but they typically require a huge amount of computational resources (CPU, GPU, RAM, storage) that are not always available in real systems, due to their cost or to specific application constraints (for instance, when the application needs to be installed directly on board low-power smart cameras, e.g. for digital signage). In recent years the machine learning community has developed an interest in optimizing the efficiency of deep learning solutions, in order to make them portable and widespread. In this work we propose a compact DCNN architecture for gender recognition from face images that achieves approximately state-of-the-art accuracy at a highly reduced computational cost (almost five times lower). We also perform a sensitivity analysis to show how changes in the network architecture influence the tradeoff between accuracy and speed. In addition, we compare our optimized architecture with popular efficient CNNs on various common benchmark datasets widely adopted in the scientific community, namely LFW, MIVIA-Gender, IMDB-WIKI and Adience, demonstrating the effectiveness of the proposed solution.

31 citations


Proceedings ArticleDOI
11 May 2020
TL;DR: In this paper, the authors present a framework for performance optimization in serverless edge-cloud platforms using dynamic task placement, which allows the user to specify cost and latency requirements for each application task, and for each input, it determines whether to execute the task on the edge device or in the cloud.
Abstract: We present a framework for performance optimization in serverless edge-cloud platforms using dynamic task placement. We focus on applications for smart edge devices, for example, smart cameras or speakers, that need to perform processing tasks on input data in real to near-real time. Our framework allows the user to specify cost and latency requirements for each application task, and for each input, it determines whether to execute the task on the edge device or in the cloud. Further, for cloud executions, the framework identifies the container resource configuration needed to satisfy the performance goals. We have evaluated our framework in simulation using measurements collected from serverless applications in AWS Lambda and AWS Greengrass. In addition, we have implemented a prototype of our framework that runs in these same platforms. In experiments with our prototype, our models can predict average end-to-end latency with less than 6% error, and we obtain almost three orders of magnitude reduction in end-to-end latency compared to edge-only execution.
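The core runtime decision the framework makes is per-input placement. A hedged sketch of how such a decision rule might look; the cost model, names, and fallback behaviour are all assumptions, not the paper's implementation:

```python
def place_task(input_size, edge_latency, cloud_configs, latency_slo, cost_budget):
    """Decide where one input should run (illustrative sketch).

    edge_latency  -- callable estimating on-device latency for this input size
    cloud_configs -- list of (memory_mb, predicted_latency_s, dollar_cost)
                     candidate serverless container configurations
    """
    # Prefer the edge when it already satisfies the latency requirement.
    if edge_latency(input_size) <= latency_slo:
        return "edge", None

    # Otherwise pick the cheapest cloud configuration meeting both goals.
    feasible = [(cost, mem) for mem, lat, cost in cloud_configs
                if lat <= latency_slo and cost <= cost_budget]
    if not feasible:
        return "edge", None       # no feasible config: run locally, miss SLO
    cost, mem = min(feasible)
    return "cloud", mem
```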

29 citations


Journal ArticleDOI
TL;DR: A novel sensor fusion method called visual model-predictive localization (VML), which approximates the error between the model-predicted position and the visual measurements as a linear function, so that outliers can be handled and the vision delay compensated efficiently.
Abstract: Drone racing is becoming a popular e-sport all over the world, and beating the best human drone race pilots has quickly become a new major challenge for artificial intelligence and robotics. In this paper, we propose a novel sensor fusion method called visual model-predictive localization (VML). Within a small time window, VML approximates the error between the model-predicted position and the visual measurements as a linear function. Once the parameters of the function are estimated by the RANSAC algorithm, this error model can be used to correct future predictions. In this way, outliers are handled and the vision delay compensated efficiently. Theoretical analysis and simulation results show a clear advantage over Kalman filtering when dealing with the occasional large outliers and vision delays that occur in fast drone racing. Flight tests are performed on a tiny racing quadrotor named “Trashcan,” equipped with a Jevois smart camera, for a total weight of 72 g. An average speed of 2 m/s is achieved, with a maximum speed of 2.6 m/s. To the best of our knowledge, this flying platform is currently the smallest autonomous racing drone in the world, while still being one of the fastest autonomous racing drones.
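The central trick, fitting a linear error model with RANSAC inside a sliding window and using it to correct a delayed prediction, can be sketched in a few lines of numpy. The window size, tolerance, and iteration count below are made-up values, not the paper's.

```python
import numpy as np

def ransac_line(t, err, iters=100, tol=0.05, seed=0):
    """Robustly fit err ~= a*t + b; returns (a, b). Illustrative sketch."""
    t, err = np.asarray(t, float), np.asarray(err, float)
    rng = np.random.default_rng(seed)
    best_count, best = 0, (0.0, float(np.mean(err)))
    for _ in range(iters):
        i, j = rng.choice(len(t), size=2, replace=False)
        if t[i] == t[j]:
            continue
        a = (err[j] - err[i]) / (t[j] - t[i])
        b = err[i] - a * t[i]
        inliers = np.abs(err - (a * t + b)) < tol
        if inliers.sum() > best_count:
            best_count = inliers.sum()
            best = tuple(np.polyfit(t[inliers], err[inliers], 1))  # refine fit
    return best

def corrected_prediction(pred_pos, t_now, t_window, err_window):
    """Compensate the model prediction using the fitted error trend."""
    a, b = ransac_line(t_window, err_window)
    return pred_pos - (a * t_now + b)
```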

28 citations


Journal ArticleDOI
TL;DR: This paper surveys the available literature on multi-camera systems’ physical arrangements, calibrations, algorithms, and their advantages and disadvantages, and reviews four application areas: surveillance, sports, education, and mobile phones.
Abstract: A multi-camera system combines features from different cameras viewing a scene or event to increase the output image quality. The combination of two or more cameras requires prior settings in terms of calibration and architecture. Therefore, this paper surveys the available literature on multi-camera systems’ physical arrangements, calibrations, algorithms, and their advantages and disadvantages. We also survey recent developments and advancements in four areas of multi-camera system applications: surveillance, sports, education, and mobile phones. In surveillance, the combination of multiple heterogeneous cameras and the advent of Pan-Tilt-Zoom (PTZ) and smart cameras have brought tremendous achievements in multi-camera control and coordination, and different approaches have been proposed to facilitate effective collaboration and monitoring within a camera network. The application of multi-cameras in sports has made games more interesting in terms of analysis and transparency. In education, multi-camera systems have taken teaching beyond the four walls of the classroom: teaching methods, student attendance enrollment, students’ attention, and teacher and student assessment can now be determined with ease, and all forms of proxy and manipulation can be reduced. Finally, the number of cameras featured on smartphones is growing noticeably, with individual cameras serving different purposes such as zoom, telephoto, and wider Field of View (FOV); future smartphones can therefore be expected to carry even more cameras or to develop in a different direction.

24 citations


Journal ArticleDOI
TL;DR: The development of artificial neural networks and their recent intersection with computational imaging is reviewed, and in more detail how deep learning impacts the primary strategies of computational photography: focal plane modulation, lens design, and robotic control is considered.
Abstract: We review the impact of deep-learning technologies on camera architecture. The function of a camera is first to capture visual information and second to form an image. Conventionally, both functions are implemented in physical optics. Throughout the digital age, however, joint design of physical sampling and electronic processing, e.g., computational imaging, has been increasingly applied to improve these functions. Over the past five years, deep learning has radically improved the capacity of computational imaging. Here we briefly review the development of artificial neural networks and their recent intersection with computational imaging. We then consider in more detail how deep learning impacts the primary strategies of computational photography: focal plane modulation, lens design, and robotic control. With focal plane modulation, we show that deep learning improves signal inference to enable faster hyperspectral, polarization, and video capture while reducing the power per pixel by 10−100×. With lens design, deep learning improves multiple aperture image fusion to enable task-specific array cameras. With control, deep learning enables dynamic scene-specific control that may ultimately enable cameras that capture the entire optical data cube (the “light field”), rather than just a focal slice. Finally, we discuss how these three strategies impact the physical camera design as we seek to balance physical compactness and simplicity, information capacity, computational complexity, and visual fidelity.

23 citations


Proceedings ArticleDOI
21 Apr 2020
TL;DR: Drawing analogies between smart cameras and electric lighting, this work highlights and extrapolates design trends towards always-on sensing in intimate contexts and the functional expansion of smart cameras into general-purpose, multi-functional devices.
Abstract: Drawing analogies between smart cameras and electric lighting, we highlight and extrapolate design trends towards always-on sensing in intimate contexts, and the functional expansion of smart cameras as general-purpose and multi-functional devices. Employing a research through design (RtD) approach, we extrapolate these trends using speculative scenarios, materialize the scenarios by designing and constructing lighting-inspired smart camera fixtures, and self-experiment with these fixtures to introduce and exacerbate privacy and security issues, and inspire creative workarounds and design opportunities for sensor-level regulation. We synthesize our insights by presenting 8 smart camera sensing design qualities for addressing privacy, security, and related social and ethical issues.

22 citations


Journal ArticleDOI
21 May 2020
TL;DR: A new optimization method is designed that integrates black-box optimization with Neural Processes (NPs) as a system performance approximator and allows the black-box optimizer to query the NPs instead of the real system.
Abstract: Widely deployed smart cameras are generating a large amount of video data and are capable of processing frames on the device. Empowered by edge computing, the video data can also be offloaded to edge servers for processing. By leveraging on-device processing and computation offloading, we propose a federated video analytics system named FedVision to efficiently provision video analytics across devices and servers. The challenge in designing FedVision is to optimally use the computing and networking resources for video analytics. Since there is no closed-form expression of the system performance, black-box optimization is employed to optimize it. However, using black-box optimization directly incurs excessive system queries, which leads to very poor system performance. To solve this problem, we design a new optimization method that integrates black-box optimization with Neural Processes (NPs) as a system performance approximator. This method allows the black-box optimizer to query the NPs instead of the real system. We validate the performance of FedVision and the new optimization method using both numerical results and experiments on a testbed.
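The pattern of replacing expensive system queries with a learned surrogate can be illustrated generically. In this hypothetical sketch a ridge regressor stands in for the Neural Process and random search stands in for the black-box optimizer; both substitutions, and all names, are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def surrogate_optimize(query_real_system, dim,
                       real_budget=10, surrogate_trials=1000, seed=0):
    """Spend a few expensive real queries to train a cheap surrogate,
    then let the optimizer search against the surrogate instead."""
    rng = np.random.default_rng(seed)

    # Phase 1: a small number of costly queries to the real system.
    X = rng.uniform(0.0, 1.0, size=(real_budget, dim))
    y = np.array([query_real_system(x) for x in X])

    # Phase 2: fit the performance approximator (NP stand-in).
    surrogate = Ridge().fit(X, y)

    # Phase 3: the optimizer queries only the surrogate from here on.
    candidates = rng.uniform(0.0, 1.0, size=(surrogate_trials, dim))
    scores = surrogate.predict(candidates)
    return candidates[np.argmax(scores)]     # best predicted configuration
```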

21 citations


Posted Content
TL;DR: A framework for performance optimization in serverless edge-cloud platforms using dynamic task placement for smart edge devices that need to perform processing tasks on input data in real to near-real time is presented.
Abstract: We present a framework for performance optimization in serverless edge-cloud platforms using dynamic task placement. We focus on applications for smart edge devices, for example, smart cameras or speakers, that need to perform processing tasks on input data in real to near-real time. Our framework allows the user to specify cost and latency requirements for each application task, and for each input, it determines whether to execute the task on the edge device or in the cloud. Further, for cloud executions, the framework identifies the container resource configuration needed to satisfy the performance goals. We have evaluated our framework in simulation using measurements collected from serverless applications in AWS Lambda and AWS Greengrass. In addition, we have implemented a prototype of our framework that runs in these same platforms. In experiments with our prototype, our models can predict average end-to-end latency with less than 6% error, and we obtain almost three orders of magnitude reduction in end-to-end latency compared to edge-only execution.

Journal ArticleDOI
01 Sep 2020
TL;DR: In this article, a flexible, adaptive and phase-free distributed traffic-control algorithm that uses the information provided by distributed smart cameras to efficiently control traffic signals is presented, which improves the routing of emergency vehicles through a congested intersection area.
Abstract: Smart and decentralized control systems have recently been proposed to handle the growing traffic congestion in urban cities. Proposed smart traffic light solutions based on Wireless Sensor Networks and Vehicular Ad-hoc NETworks are either unreliable and inflexible or complex and costly. Furthermore, the handling of special vehicles such as emergency vehicles is still not viable, especially during busy hours. Inspired by the emergence of distributed smart cameras, we present a novel fuzzy logic approach to traffic control at intersections. Our approach uses smart cameras at intersections along with image understanding for real-time traffic monitoring and assessment. Besides understanding the traffic flow, the cameras can detect and track special vehicles and help prioritize emergency cases. Traffic violations can be identified as well, and traffic statistics collected. In this paper, we introduce a flexible, adaptive and phase-free distributed traffic-control algorithm that uses the information provided by distributed smart cameras to efficiently control traffic signals. Experimental results show that our collision-free approach outperforms the state of the art in terms of the average user’s waiting time in the queue and improves the routing of emergency vehicles through a congested intersection area.
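As a flavour of how camera-derived quantities can drive a fuzzy controller, the sketch below maps a measured queue length (and an emergency-vehicle flag) to a green-phase extension. The membership functions, rule outputs, and Sugeno-style defuzzification are invented for illustration, not taken from the paper.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def ramp(x, a, b):
    """0 below a, 1 above b, linear in between."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def green_extension(queue_len, emergency_near):
    """Fuzzy rules: longer camera-measured queues earn longer green time."""
    short = 1.0 - ramp(queue_len, 0.0, 10.0)
    medium = tri(queue_len, 5.0, 15.0, 25.0)
    long_q = ramp(queue_len, 20.0, 40.0)

    # Sugeno-style weighted average of rule outputs (seconds of extension).
    total = short + medium + long_q
    ext = (short * 0.0 + medium * 5.0 + long_q * 12.0) / total if total else 0.0

    # Emergency vehicles detected by the cameras get extra priority.
    return ext + (15.0 if emergency_near else 0.0)

print(green_extension(queue_len=30, emergency_near=False))  # 12.0 seconds
```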

Proceedings ArticleDOI
16 Nov 2020
TL;DR: A battery-free smart camera that combines tiny machine learning, long-range communication, power management, and energy harvesting; the smart sensor node has been implemented and evaluated in the field, demonstrating both battery-less operation with a small photovoltaic panel and the energy efficiency of the proposed solution.
Abstract: This paper presents a battery-free smart camera that combines tiny machine learning, long-range communication, power management, and energy harvesting. The smart sensor node has been implemented and evaluated in the field, demonstrating both battery-less operation with a small photovoltaic panel and the energy efficiency of the proposed solution. We evaluated two different ARM Cortex-M4F microcontrollers: the Ambiq Apollo 3, an energy-efficient microcontroller, and a Microchip SAMD51, which can operate in high-radiation environments but draws more power in active mode. Finally, a low-power LoRa module provides long-range wireless transmission capability. The tiny machine learning algorithm for face recognition has been optimized in terms of accuracy versus energy, achieving up to 97% accuracy in recognizing five different faces. Experimental results demonstrate that the developed sensor node can cold-start within 1 minute at a very low luminosity of 350 lux using a centimetre-size flexible photovoltaic panel and then operate perpetually.

Book ChapterDOI
01 Jan 2020
TL;DR: Emerging technologies that will transform mobile imaging to enable unique and useful applications, which will fuel market growth and broaden the adoption of imaging technology across the globe are discussed.
Abstract: This chapter discusses the application of complementary metal-oxide-semiconductor (CMOS) image sensors in the mobile market. The chapter first reviews the history and how innovation has overcome the limitations of core technologies to enable mobile imaging. A review of the CMOS image sensor architectures and product considerations provides insight into how the basic image sensor evolves to a smart camera imaging solution, featuring key functions beyond picture taking into full “imaging for information.” This includes the recent adoption of depth imaging solutions to quickly capture the scene's contextual information for use by computer vision applications within mobile devices. The chapter then discusses emerging technologies that will transform mobile imaging to enable unique and useful applications, which will fuel market growth and broaden the adoption of imaging technology across the globe.

Journal ArticleDOI
TL;DR: XNOR-Net, as presented in this paper, approximates convolutions using binary operations, resulting in 58x faster convolutional operations and 32x memory savings compared to full-precision networks such as AlexNet.
Abstract: In recent years we have seen a growing number of edge devices adopted by consumers, in their homes (e.g., smart cameras and doorbells), in their cars (e.g., driver assisted systems), and even on their persons (e.g., smart watches and rings). Similar growth is reported in industries including aerospace, agriculture, healthcare, transport, and manufacturing. At the same time that devices are getting smaller, the Deep Neural Networks (DNN) that power most forms of artificial intelligence are getting larger, requiring more compute power, memory, and bandwidth. This creates a growing disconnect between advances in artificial intelligence and the ability to develop smart devices at the edge. In this paper, we present a novel approach to running state-of-the-art AI algorithms at the edge. We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks (BWN) and XNOR-Networks. In BWN, the filters are approximated with binary values, resulting in 32x memory savings. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58x faster convolutional operations (in terms of the number of high-precision operations) and 32x memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a BWN version of AlexNet is the same as the full-precision AlexNet. Our code is available at: http://allenai.org/plato/xnornet.
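The arithmetic behind the claimed speedup is that a dot product of ±1 vectors collapses to an XNOR followed by a popcount. A toy numpy illustration; the per-filter scaling factor α used in BWN/XNOR-Nets and the bit-packing into machine words (which is where the real speedup comes from) are omitted here:

```python
import numpy as np

def binarize(v):
    """Sign binarization to +1/-1, as in binary-weight networks."""
    return np.where(v >= 0, 1, -1).astype(np.int8)

def xnor_dot(a_bin, b_bin):
    """Dot product of +/-1 vectors via bit logic.

    Encoding +1 as bit 1 and -1 as bit 0:
    dot = (#matches) - (#mismatches) = 2 * popcount(XNOR(a, b)) - n
    """
    a_bits, b_bits = a_bin > 0, b_bin > 0
    matches = np.count_nonzero(~(a_bits ^ b_bits))   # XNOR, then popcount
    return 2 * matches - len(a_bin)

a, b = np.random.randn(64), np.random.randn(64)
assert xnor_dot(binarize(a), binarize(b)) == int(
    binarize(a).astype(int) @ binarize(b).astype(int))
```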

Journal ArticleDOI
TL;DR: This paper proposes a method for gender recognition on video sequences, specifically designed to be suited to smart cameras; since the algorithm uses very limited resources, it is able to run on the smart cameras available today.
Abstract: In recent years we have witnessed a growing interest in embedded vision, due to the availability of low-cost hardware systems that are energy efficient and compact, at the cost of limited computing resources compared to servers. Their use is boosted by the simplicity of their positioning in places where energy or network bandwidth is limited. Smart cameras are digital cameras embedding computer systems able to host video applications; owing to their cost and performance, they are progressively gaining popularity and conquering a large share of the market. Smart cameras are now able to host video applications on board, even though this imposes a heavy reformulation of the algorithms and of the software design to make them compliant with the limited CPUs and the small RAM and flash memory (typically a few megabytes). In this paper we propose a method for gender recognition on video sequences, specifically designed to be suited to smart cameras; although the algorithm uses very limited resources (in terms of RAM and CPU), it is able to run on the smart cameras available today, while presenting a high accuracy on unrestricted videos taken in real environments (malls, shops, etc.).

Proceedings ArticleDOI
13 Oct 2020
TL;DR: In this paper, the authors examined attack datasets from the Hacking and Countermeasure Research Lab (HCRL) collected from real-life IoT devices that include smart cameras, laptops, and smartphones, and presented a model using Random Forest, Logistic Regression, and Decision Tree.
Abstract: The Internet of Things (IoT) is growing with the advancement of technology. Many vendors are creating IoT devices to improve the quality of life of consumers. These devices include smart grids, smart homes, smart health care systems, smart transportation, and many other applications. IoT devices interact with the environment and each other using sensors and actuators. However, the widespread proliferation of IoT devices poses many cybersecurity threats. The interconnection of IoT devices opens the door to attackers who try to gain unauthorized access to them. For many IT networks, establishing trust and security during device operation is challenging. Further, devices may also leak vital information, which is a significant concern in cybersecurity. Prior research has shown that security breaches have increased by 67% over the past five years, and 95% of HTTPS servers are vulnerable to man-in-the-middle (MIM) attacks. This paper examines attack datasets from the Hacking and Countermeasure Research Lab (HCRL) collected from real-life IoT devices that include smart cameras, laptops, and smartphones [1]. We present a model using Random Forest, Logistic Regression, and Decision Tree. Results indicate that the overall detection accuracy is 98-100%, which is more promising than traditional Intrusion Detection Systems (IDS).
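A hedged scikit-learn sketch of the kind of three-classifier comparison the paper reports; the feature matrix, split, and hyperparameters are illustrative assumptions, and the HCRL data itself is not reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def compare_models(X, y):
    """X: feature matrix from network traces; y: attack/benign labels."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y)
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(),
    }
    return {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
            for name, m in models.items()}
```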

Journal ArticleDOI
TL;DR: The experience is described, along with how the adopted semantic trust model mitigates the effects of weaknesses and the risks related to smart-home cyber-attacks.
Abstract: Pepper is a humanoid robot that embeds only the few computational resources needed to control its sensors and actuators; it is not capable of handling large amounts of data or performing complicated tasks in parallel. Aiming at enriching its functionalities and its interaction with the environment, the robot has been put in communication with a plethora of satellite smart objects and services, ranging from simple environmental sensors up to deep-learning-enhanced smart cameras. The addition of biometric, emotional, social, machine learning and other capabilities to Pepper, while enabling advanced functionalities and additional instruments for controlling users and the environment, raises security and, obviously, privacy concerns. The robot itself, its interaction with the environment, and every weakness exposed by the smart objects involved in its eco-system may represent an exploit point for attacking the smart home and threatening security and privacy. Aiming at preventing attacks and strengthening security, each action in the system is evaluated against the entire context, as detected by the whole eco-system of smart objects. This paper describes and analyses the experience and how the adopted semantic trust model mitigates the effects of weaknesses and the risks related to smart-home cyber-attacks.

Journal ArticleDOI
TL;DR: The proposed system predicts pet posture using smart camera networks powered by the artificial intelligence of things; it can determine from an image whether a detection target is present and generate a contour mask based on Mask R-CNN technology.
Abstract: In today’s society, the number of people rearing pets has increased, as has awareness of the need to protect pets’ health. Pet posture behaviour analysis and prediction can assist in the medical treatment of pets. Hence, the demand for pet skeleton drawing applications has risen dramatically. Our proposed system predicts pet posture using smart camera networks powered by the artificial intelligence of things. The system is built on a platform using a Raspberry Pi embedded system. It can determine from an image whether a detection target is present and generate a contour mask based on Mask R-CNN technology. From the detected objects, poses and key body parts can be identified to predict and draw pet skeletons. Simultaneously, a pet’s behavioural action can be determined from continuous skeleton data, and the system then actively informs the owner for subsequent processing.

Journal ArticleDOI
TL;DR: This paper prototypes and evaluates a multi-view smart vision system for object recognition that exploits state-of-the-art deep learning optimization methods, such as parameter removal and data quantization, and demonstrates that the accuracy drops caused by these optimizations can be compensated by the multi-view nature of the captured information.

Journal ArticleDOI
TL;DR: This paper discusses the design and functions of a smart camera and a simple visual inspection algorithm that easily surpassed manual grading, which constantly faces the challenges of human fatigue and other distractions.
Abstract: Due to the increasing consumption of food products and the demand for food quality and safety, most food processing facilities in the United States utilize machines to automate their processes, such as cleaning, inspection and grading, packing, storing, and shipping. Machine vision technology has been a proven solution for inspection and grading of food products since the late 1980s. The remaining challenges, especially for small to midsize facilities, include system and operating costs, the demand for high-skilled workers for complicated configuration and operation and, in some cases, unsatisfactory results. This paper focuses on the development of an embedded solution with learning capability to alleviate these challenges. Three simple application cases are included to demonstrate the operation of this unique solution. Two datasets of more challenging cases were created to analyze and demonstrate the performance of our visual inspection algorithm. One dataset includes infrared images of Medjool dates at four levels of skin delamination for surface quality grading. The other consists of grayscale images of oysters with varying shapes for shape quality evaluation. Our algorithm achieved a grading accuracy of 95.0% on the date dataset and 98.6% on the oyster dataset, both of which easily surpassed manual grading, which constantly faces the challenges of human fatigue and other distractions. Details of the design and functions of our smart camera and our simple visual inspection algorithm are discussed in this paper.

Book ChapterDOI
06 May 2020
TL;DR: The results demonstrate that the multiple-vehicle application model achieved better performance in maintaining low latency; in terms of resource usage, the model shows an increase in network usage, RAM used, and energy consumption due to the high volume of vehicles being targeted.
Abstract: iFogSim is a toolkit to model, simulate, and evaluate networks for Fog computing, Edge computing, and the Internet of Things (IoT). The framework provides the capability to analyze and evaluate the performance of applications and resource management policies in Fog/IoT environments, based on which designers can model and test their applications. This paper reports on the performance evaluation of a traffic surveillance vehicular network application that uses smart cameras, where the scenario of multiple-vehicle tracking is considered. The effectiveness of the proposed application model is assessed and validated by simulation experiments using a modified application model inherited from a case study of intelligent surveillance through distributed camera networks introduced in (Gupta H, Vahid Dastjerdi A, Ghosh SK, Buyya R, Softw Pract Exp 47(9):1275–1296, 2017). Simulations were conducted using the iFogSim tool. A comparison between one-vehicle and multiple-vehicle tracking was performed, and the results demonstrate that the multiple-vehicle application model achieved better performance in maintaining low latency. The new model shows inconsistency in data transfer rate as the workload increases and, in terms of resource usage, an increase in network usage, RAM used, and energy consumption due to the high volume of vehicles being targeted.

Proceedings ArticleDOI
25 Oct 2020
TL;DR: An ultra-low-power smart camera capable of detecting and recognizing the pest in the field is presented; it implements a machine learning approach based on neural networks on the camera board and can transmit a wireless alarm over a long distance.
Abstract: The apple is one of the most produced fruits in the world because it is easy to grow, store, and transport. The most significant threat to this crop is the attack of the codling moth, a small insect capable of damaging whole orchards in a few days. To prevent this parasite and to plan effective countermeasures, we present an ultra-low-power smart camera capable of detecting and recognizing the pest in the field, so that a wireless alarm can be transmitted over a long distance. The system implements a machine learning approach based on neural networks on the camera board. The sensor is also provided with long-range radio capability and an energy harvester, allowing it to operate indefinitely thanks to its positive energy balance when deployed in the field. Experimental tests on the proposed energy-neutral smart camera demonstrate a validation accuracy of 93% and only 3.5 mJ required for image analysis and classification.

Journal ArticleDOI
TL;DR: A novel FPGA architecture for a high dynamic range (HDR) video processing pipeline based on capturing a sequence of differently exposed images; it achieves real-time performance on full HD HDR video, outperforming state-of-the-art solutions that use local tone mapping and deghosting algorithms.
Abstract: This paper presents a novel FPGA architecture for a high dynamic range (HDR) video processing pipeline, based on capturing a sequence of differently exposed images. An acquisition process enabling multi-exposure HDR, as well as a fast implementation of a local tone mapping operator involving bilateral filtering, is proposed. The HDR acquisition process is enhanced by the application of a novel deghosting method, dedicated to hardware implementation and proposed in this paper. The hardware processing pipeline is designed with regard to efficiency and performance, and the calculations are performed in fixed-point arithmetic. The pipeline is suitable for implementation in programmable hardware (FPGA, Field Programmable Gate Arrays) and achieves real-time performance on full HD HDR video, outperforming state-of-the-art solutions that use local tone mapping and deghosting algorithms.
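For readers unfamiliar with multi-exposure HDR, the core merge step that such a pipeline implements in fixed point can be sketched in floating-point numpy. The hat-shaped weighting is a standard choice, not necessarily the paper's, and deghosting and tone mapping are omitted.

```python
import numpy as np

def merge_hdr(images, exposure_times):
    """Merge differently exposed frames into a radiance map (sketch).

    images         -- list of float arrays scaled to [0, 1], one per exposure
    exposure_times -- exposure time of each frame, in seconds
    """
    acc = np.zeros_like(images[0], dtype=np.float64)
    wsum = np.zeros_like(acc)
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)   # trust mid-range pixels most
        acc += w * (img / t)                # back-project to scene radiance
        wsum += w
    return acc / np.maximum(wsum, 1e-6)     # avoid division by zero
```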

Journal ArticleDOI
TL;DR: A novel fast flying-dot projector prototype is built using a high speed camera and a scanning MEMS (Micro-electro-mechanical system) mirror, and new methods for overcoming the effects of MEMS mirror resonance are developed.
Abstract: The light transport captures a scene's visual complexity. Acquiring light transport for dynamic scenes is difficult, since any change in viewpoint, materials, illumination or geometry also varies the transport. One strategy to capture dynamic light transport is to use a fast “flying-dot” projector; i.e., where an impulse light-probe is quickly scanned across the scene. We have built a novel fast flying-dot projector prototype using a high speed camera and a scanning MEMS (Micro-electro-mechanical system) mirror. Our contributions are calibration strategies that enable dynamic light transport acquisition at near video rates with such a system. We develop new methods for overcoming the effects of MEMS mirror resonance. We utilize new algorithms for denoising impulse scanning at high frame rates and compare the trade-offs in visual quality between frame rate and illumination power. Finally, we show the utility of our calibrated setup by demonstrating graphics applications such as video relighting, direct/global separation, and dual videography for dynamic scenes such as fog, water, and glass. Please see our accompanying video for dynamic scene results.
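The payoff of acquiring the transport is that, once the matrix T is captured (one column per flying-dot position), applications such as relighting reduce to a matrix-vector product. A tiny numpy illustration with made-up sizes and random data standing in for a captured transport:

```python
import numpy as np

h, w = 64, 64                         # output image size (illustrative)
n_probe = 32 * 32                     # number of flying-dot positions
T = np.random.rand(h * w, n_probe)    # stand-in for a captured transport matrix

def relight(T, illumination):
    """Render the scene under a virtual illumination pattern."""
    return (T @ illumination).reshape(h, w)

virtual_light = np.zeros(n_probe)
virtual_light[100] = 1.0              # a single virtual point light
image = relight(T, virtual_light)     # relit frame, no re-capture needed
```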

Journal ArticleDOI
TL;DR: In this paper, a computationally efficient architecture based on separable convolutions that integrates dense connections across layers and multi-scale feature fusion was proposed to improve representational capacity while decreasing the number of parameters and operations.
Abstract: Deep-learning-based pedestrian detectors can enhance the capabilities of smart camera systems in a wide spectrum of machine vision applications including video surveillance, autonomous driving, robots and drones, smart factories, and health monitoring. However, such complex paradigms do not scale easily and are not traditionally implemented in resource-constrained smart cameras for on-device processing, which offers significant advantages in situations where real-time monitoring and privacy are vital. This work addresses the challenge of achieving a good trade-off between accuracy and speed for efficient deep-learning-based pedestrian detection in smart camera applications. The contributions of this work are the following: 1) a computationally efficient architecture based on separable convolutions that integrates dense connections across layers and multi-scale feature fusion to improve representational capacity while decreasing the number of parameters and operations, 2) a more elaborate loss function for improved localization, and 3) an anchor-less approach for detection. The proposed approach, referred to as YOLOpeds, is evaluated using the PETS2009 surveillance dataset on 320 × 320 images. A real-system implementation is presented using the Jetson TX2 embedded platform. YOLOpeds provides real-time sustained operation of over 30 frames per second with detection rates in the range of 86%, outperforming existing deep learning models.
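The separable-convolution building block at the heart of such architectures is easy to write down. A PyTorch sketch follows; the channel counts and the exact block layout in YOLOpeds are not given in the abstract, so this is a generic depthwise-separable block, not the paper's.

```python
import torch.nn as nn

class SeparableConv(nn.Module):
    """Depthwise 3x3 conv followed by pointwise 1x1 conv.

    Cuts multiply-accumulates roughly by a factor of k^2 (here 9) versus a
    standard k x k convolution when the output channel count is large."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```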

Proceedings ArticleDOI
03 Jun 2020
TL;DR: This paper proposes a system which implements gender recognition from face images using a smart camera mounted above the monitor, dedicated to real-time gender recognition, and a component that dynamically modifies the content projected on the screen according to the gender of the audience.
Abstract: Digital signage is a new advertising strategy using smart multimedia screens, which carries out dynamic customization of the promotional content according to the customers who are looking at the monitor. Gender recognition from face images is among the most popular applications for digital signage, since it allows advertising spots customized for males or females to be selected in real time. In this paper, we propose a system which implements this solution using a smart camera mounted above the monitor, dedicated to real-time gender recognition, and a component that dynamically modifies the content projected on the screen according to the gender of the audience. The computer vision algorithm is designed to be as fast as it is effective, since the whole processing chain must run in real time in order to avoid missing people passing in front of the screen. We evaluated the performance of the proposed solution on a standard dataset for gender recognition in the wild and at a real fair, obtaining gender recognition accuracies of 94.99% and 92.70%, respectively, which is very relevant in such unconstrained scenarios. In addition, the method is able to process 5 fps on a smart camera and, thus, can be used in a digital signage application.

Journal ArticleDOI
TL;DR: This paper manages plan and usage of intelligent child support framework which is extraordinary blessing to guardians in this century and a baby bed with intelligent system was designed and implemented.
Abstract: Step by step the innovation likewise becomes exceptionally quick and the human makes it. Thus, it is imperative to take care of the people to come, a unique consideration ought to be appeared to them particularly indulges. This paper manages plan and usage of intelligent child support framework which is extraordinary blessing to guardians in this century In this work a baby bed with intelligent system was be designed and implemented . many sensors where be used to monitor the baby behavior . the component of this project consist of a smart camera , moisture sensor , sensitive Dc Motor and WiFi system.

Journal ArticleDOI
20 Aug 2020-Sensors
TL;DR: This work proposes a new decentralised approach for network reconfiguration, where each camera dynamically adapts its parameters and position to optimise scene coverage; the approach is evaluated in a simulated environment monitored with fixed, PTZ and UAV-based cameras.
Abstract: Crowd surveillance plays a key role in ensuring safety and security in public areas. Surveillance systems traditionally rely on fixed camera networks, which suffer from limitations such as coverage of the monitored area, video resolution and analytic performance. On the other hand, a smart camera network provides the ability to reconfigure the sensing infrastructure by incorporating active devices such as pan-tilt-zoom (PTZ) cameras and UAV-based cameras, thus enabling the network to adapt over time to changes in the scene. We propose a new decentralised approach for network reconfiguration, where each camera dynamically adapts its parameters and position to optimise scene coverage. Two policies for decentralised camera reconfiguration are presented: a greedy approach and a reinforcement learning approach. In both cases, cameras are able to locally control the state of their neighbourhood and dynamically adjust their position and PTZ parameters. When crowds are present, the network balances between global coverage of the entire scene and high resolution for the crowded areas. We evaluate our approach in a simulated environment monitored with fixed, PTZ and UAV-based cameras.
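The greedy policy lends itself to a compact sketch: each camera scores candidate poses by how much crowd-weighted, not-yet-covered area they would add given what its neighbours already see. The grid representation, the footprint method, and the scoring are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def greedy_reconfigure(camera, candidate_poses, neighbour_cover, crowd_map):
    """Pick the pose adding the most crowd-weighted uncovered cells.

    neighbour_cover -- boolean grid of cells already seen by neighbours
    crowd_map       -- per-cell crowd density estimate (same shape)
    camera.footprint(pose) is assumed to return the boolean coverage grid
    the camera would see from that pose (hypothetical helper).
    """
    best_pose, best_gain = None, -1.0
    for pose in candidate_poses:
        cover = camera.footprint(pose)
        new_cells = cover & ~neighbour_cover        # only what no one else sees
        gain = float((crowd_map * new_cells).sum()) # favour crowded areas
        if gain > best_gain:
            best_pose, best_gain = pose, gain
    return best_pose
```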

Journal ArticleDOI
TL;DR: The focal point of this work is to develop an intelligent camera surveillance system which encompasses the key functionalities of existing surveillance systems and integrates a novel and advanced object displacement detection feature to provide more security by determining whether an object has been displaced by an intruder.
Abstract: The focal point of this work is to develop an intelligent camera surveillance system which encompasses the key functionalities of existing surveillance systems. Beyond regular functionalities such as motion detection, object detection, face recognition and people counting, it also integrates a novel and advanced object displacement detection feature to provide more security by determining whether an object has been displaced by an intruder. When people are detected, a counting module displays the number of persons present in the surveillance area. A face recognition module distinguishes between authorised and unauthorised users; this biometric functionality reduces false alarms, which makes the system more robust. An object detection module detects certain valuable objects such as handbags, laptops and smartphones. Also, images and short video recordings are stored in the cloud. Furthermore, the system introduces innovative real-time notification approaches for surveillance systems, such as WhatsApp messages and phone calls, in addition to SMS and emails. Thus, the system is reliable and meets the aim of a modern intelligent surveillance system by combining multiple approaches to detect intrusions and inform users effectively.