
Showing papers on "Smart camera published in 2016"


Proceedings ArticleDOI
01 Jun 2016
TL;DR: This work proposes, to the best of the authors' knowledge, the first algorithm to simultaneously recover the motion field and brightness image while the camera undergoes a generic motion through any scene, within a sliding-window time interval.
Abstract: Event cameras are bio-inspired vision sensors which mimic retinas to measure per-pixel intensity change rather than outputting an actual intensity image. This proposed paradigm shift away from traditional frame cameras offers significant potential advantages: namely avoiding high data rates, dynamic range limitations and motion blur. Unfortunately, however, established computer vision algorithms may not at all be applied directly to event cameras. Methods proposed so far to reconstruct images, estimate optical flow, track a camera and reconstruct a scene come with severe restrictions on the environment or on the motion of the camera, e.g. allowing only rotation. Here, we propose, to the best of our knowledge, the first algorithm to simultaneously recover the motion field and brightness image, while the camera undergoes a generic motion through any scene. Our approach employs minimisation of a cost function that contains the asynchronous event data as well as spatial and temporal regularisation within a sliding window time interval. Our implementation relies on GPU optimisation and runs in near real-time. In a series of examples, we demonstrate the successful operation of our framework, including in situations where conventional cameras suffer from dynamic range limitations and motion blur.

247 citations


Proceedings ArticleDOI
27 Jun 2016
TL;DR: This paper presents an approach for real-time car parking occupancy detection that uses a Convolutional Neural Network (CNN) classifier running on board a smart camera with limited resources; the technique is effective and robust to light condition changes, presence of shadows, and partial occlusions.
Abstract: This paper presents an approach for real-time car parking occupancy detection that uses a Convolutional Neural Network (CNN) classifier running on board a smart camera with limited resources. Experiments show that our technique is very effective and robust to light condition changes, presence of shadows, and partial occlusions. The detection is reliable, even when tests are performed using images captured from a viewpoint different from the viewpoint used for training. In addition, it also demonstrates its robustness when training and tests are executed on different parking lots. We have tested and compared our solution against state-of-the-art techniques, using a reference benchmark for parking occupancy detection. We have also produced and made publicly available an additional dataset that contains images of the parking lot taken from different viewpoints and on different days with different light conditions. The dataset captures occlusions and shadows that might disturb the classification of the parking space status.
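As a rough illustration of this kind of on-camera classifier (not the authors' exact network), the sketch below classifies cropped parking-space patches as free or occupied with a small PyTorch CNN; the architecture, patch size and class labels are illustrative assumptions.

```python
# Minimal sketch of a patch-level occupancy classifier for a resource-limited
# device; layer sizes and the 64x64 patch size are assumptions, not the paper's model.
import torch
import torch.nn as nn

class ParkingPatchNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 6 * 6, 64), nn.ReLU(),
            nn.Linear(64, 2),            # 0 = free, 1 = occupied
        )

    def forward(self, x):                # x: (N, 3, 64, 64) normalized patches
        return self.classifier(self.features(x))

model = ParkingPatchNet().eval()
patch = torch.rand(1, 3, 64, 64)         # one cropped parking-space patch
occupied = model(patch).argmax(dim=1).item()
```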

149 citations


Journal ArticleDOI
TL;DR: This paper designs and implements an automatic system, using computer vision, which runs on a computationally limited embedded smart camera platform to detect yawning, using a significantly modified implementation of the Viola-Jones algorithm for face and mouth detection.
Abstract: Yawning detection has a variety of important applications in driver fatigue detection, well-being assessment of humans, driving behavior monitoring, operator attentiveness detection, and understanding the intentions of a person with a tongue disability. In all of the above applications, automatic detection of yawning is one important system component. In this paper, we design and implement such an automatic system, using computer vision, which runs on a computationally limited embedded smart camera platform to detect yawning. We use a significantly modified implementation of the Viola-Jones algorithm for face and mouth detection and then use backprojection theory to measure both the rate and the amount of change in the mouth, in order to detect yawning. As a proof of concept, we have also implemented and tested our system on top of an actual smart camera embedded platform, called APEX, from CogniVue Corporation. In our design and implementation, we took into consideration the practical aspects that many existing works ignore, such as the real-time requirements of the system, as well as the limited processing power, memory, and computing capabilities of the embedded platform. Comparisons with existing methods show significant improvements in the correct yawning detection rate obtained by our proposed method.
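A hedged sketch of the general Viola-Jones-plus-backprojection idea using stock OpenCV components follows; the paper relies on a heavily modified Viola-Jones implementation on the APEX platform, which is not reproduced here. The smile cascade stands in for a dedicated mouth detector, and the change metric is purely illustrative.

```python
# Sketch only: detect the face, locate the mouth region, and track how much the
# mouth region's histogram deviates from the previous frame via back-projection.
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
mouth_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_smile.xml")

def mouth_change(prev_hist, frame):
    """Return (change score, updated mouth histogram) for one frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        lower_face = gray[y + h // 2 : y + h, x : x + w]      # search mouth in lower half of face
        for (mx, my, mw, mh) in mouth_cascade.detectMultiScale(lower_face, 1.5, 10):
            roi = lower_face[my : my + mh, mx : mx + mw]
            hist = cv2.calcHist([roi], [0], None, [32], [0, 256])
            cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
            if prev_hist is None:
                return 0.0, hist
            back = cv2.calcBackProject([roi], [0], prev_hist, [0, 256], scale=1)
            # fraction of mouth pixels that no longer match the previous histogram
            change = 1.0 - float(np.count_nonzero(back)) / back.size
            return change, hist
    return 0.0, prev_hist

# A yawn could then be flagged when `change` stays above a threshold
# for several consecutive frames.
```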

93 citations


Patent
12 Feb 2016
TL;DR: A vehicle vision system includes at least one camera disposed at a vehicle and having a field of view exterior of the vehicle, and includes a control having an image processor for processing image data captured by the camera as discussed by the authors.
Abstract: A vehicle vision system includes at least one camera disposed at a vehicle and having a field of view exterior of the vehicle, and includes a control having an image processor for processing image data captured by the at least one camera. The control, responsive at least in part to image processing of captured image data by the image processor, is operable to carry out one or more actions based on the detection of selected elements in the image data captured by the at least one camera. The control may control the operation of an additional component in the vehicle aside from the camera. The vision system may include a forward facing camera, a rearward facing camera and/or a sideward facing camera.

76 citations


Journal ArticleDOI
TL;DR: The experimental results confirm the effectiveness of the proposed vision-based method for counting the number of persons who cross a virtual line, especially when combining RGB and depth information, and the possibility of deploying the method both on high-end servers, for processing a large number of video streams in parallel, and on low-power CPUs such as those embedded in commercial smart cameras.

72 citations


Journal ArticleDOI
TL;DR: The opportunities, challenges, and design principles of industrial wireless communication over the mmWave spectrum are discussed and performance analysis of mmWave industrial systems with respect to channel capacity is conducted using a realistic physical-statistical-based channel model.
Abstract: The large bandwidth available at the mmWave spectrum can open the way for a wide variety of new industrial automation capabilities. With the use of wireless smart cameras and vision technologies, applications such as remote visual monitoring and surveillance, intelligent logistics product tracking, image guided automated assembly, and fault detection can be realized. Vision capabilities can enable robots, machines, and other industrial automation systems to meaningfully interact with objects and safely navigate through their surroundings. Allowing them to adapt to changing manufacturing-line conditions opens a wide range of new industrial automation applications. In this article, we discuss the opportunities, challenges, and design principles of industrial wireless communication over the mmWave spectrum. Open research issues are identified and discussed. In addition, performance analysis of mmWave industrial systems with respect to channel capacity is conducted using a realistic physical-statistical-based channel model.

69 citations


Patent
26 Oct 2016
TL;DR: In this patent, the size of the display area of a mobile device is maximized through alternative camera placements, such as a retractable camera that pops out only when activated or a camera integrated into the display as a camera icon.
Abstract: The technology disclosed here maximizes the size of the display area associated with the mobile device by various camera placement. In one embodiment, the camera is placed inside the mobile device, and can pop outside the mobile device when the camera is activated. When the camera is inactive the camera retracts inside the mobile device, and becomes unnoticeable to the user. In another embodiment, the camera is integrated into the mobile device display as a camera icon. The integrated camera serves two purposes: to record pictures, and to act as a camera icon, that when selected activates the camera. By removing the camera from the front side of the mobile device, or by integrating the camera into the display screen of the mobile device, the size of the mobile device display screen can be increased.

64 citations


Patent
Igor Karp1, Lev Stesin1
04 Jan 2016
TL;DR: In this paper, the authors present a camera system application program interface (API) for third-party integrations, in which a camera device captures images as a video stream and communicates the video stream to a cloud-based service.
Abstract: In embodiments of a camera system application program interface (API) for third-party integrations, a camera device captures images as a video stream and communicates the video stream to a cloud-based service. The cloud-based service implements a service application that processes video data received as the video stream. The cloud-based service exposes the camera system API that can be invoked by a third-party application running on a client device to request the video data and camera data that is associated with the camera device. The API permits access by the third-party application to the video data and the camera data from the cloud-based service. The API is exposed for the third-party application to communicate with the cloud-based service via a network connection, and the camera device communicates with the cloud-based service via a secure connection to provide the requested camera data and communicate the video stream to the cloud-based service.

55 citations


Journal ArticleDOI
TL;DR: This paper describes a complete FPGA-based smart camera architecture named HDR-ARtiSt (High Dynamic Range Adaptive Real-time Smart camera) which produces a real-time high dynamic range (HDR) live video stream from multiple captures.
Abstract: This paper describes a complete FPGA-based smart camera architecture named HDR-ARtiSt (High Dynamic Range Adaptive Real-time Smart camera) which produces a real-time high dynamic range (HDR) live video stream from multiple captures. A specific memory management unit has been defined to adjust the number of acquisitions to improve HDR quality. This smart camera is built around a standard B&W CMOS image sensor and a Xilinx FPGA. It embeds multiple captures, HDR processing, data display and transfer, which is an original contribution compared to the state of the art. The proposed architecture enables a real-time HDR video flow for a full sensor resolution (1.3 megapixels) at 60 frames per second.
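For readers unfamiliar with multi-exposure HDR, the sketch below shows an equivalent software pipeline using OpenCV's Debevec calibration, merge and tonemapping operators; the FPGA memory management and real-time constraints of HDR-ARtiSt are not modelled, and the exposure times and file names are placeholders.

```python
# Software analogue of a multi-exposure HDR pipeline (placeholder inputs).
import cv2
import numpy as np

exposure_times = np.array([1/250, 1/60, 1/15], dtype=np.float32)
ldr = [cv2.imread(f) for f in ("capture_short.png", "capture_mid.png", "capture_long.png")]

calibrate = cv2.createCalibrateDebevec()
response = calibrate.process(ldr, exposure_times)           # estimate camera response curve

merge = cv2.createMergeDebevec()
hdr = merge.process(ldr, exposure_times, response)          # 32-bit float radiance map

tonemap = cv2.createTonemap(gamma=2.2)
ldr_out = np.clip(tonemap.process(hdr) * 255, 0, 255).astype(np.uint8)
cv2.imwrite("hdr_frame.png", ldr_out)
```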

54 citations


Journal ArticleDOI
TL;DR: In this article, the authors examine most, if not all, of the recent approaches (post 2000) addressing camera placement in a structured manner, and provide a complete study of relevant formulation strategies along with brief introductions to the optimization techniques most commonly used by researchers.
Abstract: With recent advances in consumer electronics and the increasingly urgent need for public security, camera networks have evolved from their early role of providing simple and static monitoring to current complex systems capable of obtaining extensive video information for intelligent processing, such as target localization, identification, and tracking. In all cases, it is of vital importance that the optimal camera configuration (i.e., optimal location, orientation, etc.) is determined before cameras are deployed, as a suboptimal placement solution will adversely affect intelligent video surveillance and video analytics algorithms. The optimal configuration may also provide substantial savings on the total number of cameras required to achieve the same level of utility. In this article, we examine most, if not all, of the recent approaches (post 2000) addressing camera placement in a structured manner. We believe that our work can serve as a first point of entry for readers wishing to start researching this area or engineers who need to design a camera system in practice. To this end, we attempt to provide a complete study of relevant formulation strategies and brief introductions to the optimization techniques most commonly used by researchers in this field. We hope our work will inspire new ideas in the field.

45 citations


Journal ArticleDOI
12 Jan 2016
TL;DR: The authors design a two-stage architecture that outperforms the standard detectors they compare against: the first stage detects candidate regions of interest using LBP-AdaBoost, which offers robustness to false positives under real-time conditions, and the second stage verifies them with support vector machine classifiers trained on HOG features.
Abstract: Detecting large animals on roadways using automated systems such as robots or vehicles is a vital task. This can be achieved using conventional tools such as ultrasonic sensors, or with innovative technology based on smart cameras. In this paper, we investigate a vision-based solution. We begin the paper by performing a comparative study between three detectors: 1) Haar-AdaBoost; 2) histogram of oriented gradient (HOG)-AdaBoost; and 3) local binary pattern (LBP)-AdaBoost, which were initially developed to detect humans and their faces. These detectors are implemented, evaluated, and compared to each other in terms of accuracy and processing time. Based on our evaluation and comparison results, we design a two-stage architecture which outperforms the aforementioned detectors. The proposed architecture detects candidate regions of interest using LBP-AdaBoost in the first stage, which offers robustness to false positives in real-time conditions. The second stage is based on support vector machine classifiers that were trained using HOG features. The training data are generated from our novel dataset called large animal dataset, which contains common and thermographic images of large road-animals. We emphasize that no such public dataset currently exists.
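The following sketch mirrors the two-stage structure described above (an LBP cascade for candidate regions, then HOG + SVM verification) using generic OpenCV and scikit-learn components; the cascade file and the trained SVM are hypothetical models, not the authors' large-animal detectors.

```python
# Two-stage detection sketch: stage 1 proposes regions with an LBP cascade,
# stage 2 verifies them with a HOG feature vector fed to a pre-trained SVM.
# "lbp_animal_cascade.xml" and "hog_svm_animal.joblib" are hypothetical files
# (e.g. produced with opencv_traincascade and scikit-learn).
import cv2
import joblib

animal_cascade = cv2.CascadeClassifier("lbp_animal_cascade.xml")
svm = joblib.load("hog_svm_animal.joblib")
hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def detect_animals(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    candidates = animal_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
    confirmed = []
    for (x, y, w, h) in candidates:                           # stage 2: HOG + SVM check
        patch = cv2.resize(gray[y:y + h, x:x + w], (64, 64))
        feat = hog.compute(patch).reshape(1, -1)
        if svm.predict(feat)[0] == 1:
            confirmed.append((x, y, w, h))
    return confirmed
```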

Journal ArticleDOI
TL;DR: This work proposes a HW/SW implementation, based on Xilinx's Zynq system-on-chip, to detect falls in a home environment using a single camera and an optimized descriptor adapted to real-time tasks.
Abstract: Smart cameras, i.e., cameras that are able to acquire and process images in real time, are a typical example of the new embedded computer vision systems. A key example application is automatic fall detection, which can be useful for helping elderly people in daily life. In this paper, we propose a methodology for the development and fast prototyping of a fall detection system based on such a smart camera, which reduces development time compared to standard approaches. Founded on a supervised classification approach, we propose a HW/SW implementation to detect falls in a home environment using a single camera and an optimized descriptor adapted to real-time tasks. This heterogeneous implementation is based on Xilinx's system-on-chip named Zynq. The main contributions of this work are (i) the proposal of a co-design methodology that enables the HW/SW partitioning to be delayed using a high-level algorithmic description and high-level synthesis tools, which enables fast prototyping and allows rapid architecture exploration and optimisation to be performed, (ii) the design of a hardware accelerator dedicated to boosting-based classification, which is a very popular and efficient algorithm used in image analysis, and (iii) the proposal of fall detection embedded in a smart camera, enabling integration into the environment of elderly people. The performance of our system is finally compared to the state of the art.
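The boosting-based classification stage can be illustrated in plain software with scikit-learn's AdaBoost (the paper accelerates an equivalent step in the Zynq FPGA fabric); the descriptors and labels below are random placeholders, not the paper's optimized descriptor.

```python
# Plain software sketch of boosting-based classification on per-segment descriptors.
# X holds one descriptor per video segment; labels mark fall (1) vs. daily activity (0).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))            # placeholder descriptors (e.g. motion/shape features)
y = rng.integers(0, 2, size=200)          # placeholder labels

clf = AdaBoostClassifier(n_estimators=100)   # default weak learners are decision stumps
clf.fit(X, y)
print(clf.predict(X[:5]))                    # predicted fall / no-fall for the first segments
```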

Journal ArticleDOI
TL;DR: An ultra-low-power smart visual sensor architecture featuring internal analog preprocessing is coupled with an energy-efficient quad-core cluster processor that exploits near-threshold computing within a few-milliwatt power envelope; the capability of the smart camera is demonstrated on a moving-object detection framework.
Abstract: In this paper, we present an ultra-low-power smart visual sensor architecture. A 10.6-μW low-resolution contrast-based imager featuring internal analog preprocessing is coupled with an energy-efficient quad-core cluster processor that exploits near-threshold computing within a few-milliwatt power envelope. We demonstrate the capability of the smart camera on a moving object detection framework. The computational load is distributed among mixed-signal pixel and digital parallel processing. Such local processing reduces the amount of digital data to be sent out of the node by 91%. Exploiting context-aware analog circuits, the imager only dispatches meaningful postprocessed data to the processing unit, lowering the sensor-to-processor bandwidth by 31× with respect to transmitting a full pixel frame. To extract high-level features, an event-driven approach is applied to the sensor data and optimized for parallel runtime execution. A 57.7× system energy saving is reached through the event-driven approach with respect to frame-based processing on a low-power MCU node. The near-threshold parallel processor further reduces the processing energy cost by 6.64×, achieving an overall system energy cost of 1.79 μJ per frame, which is 21.8× and up to 383× lower than, respectively, an event-based imaging system based on an asynchronous visual sensor and a traditional frame-based smart visual sensor.

Journal ArticleDOI
TL;DR: A smart camera aimed at security and law enforcement applications for intelligent transportation systems is presented, with capabilities for automatic detection and recognition of selected parameters of cars; different aspects of the system's efficiency are also discussed.
Abstract: The paper presents a smart camera aimed at security and law enforcement applications for intelligent transportation systems. An extended background is presented first as a scholarly literature review. The smart camera components and their capabilities for automatic detection and recognition of selected parameters of cars, as well as different aspects of the system efficiency, are described and discussed in detail in subsequent sections. Smart features of make and model recognition (MMR), license plate recognition (LPR) and color recognition (CR) are highlighted as the main benefits of the system. Their implementations, flowcharts and recognition rates are described, discussed and finally reported in detail. For MMR, three different approaches, referred to as bag-of-features, scalable vocabulary tree and pyramid match, are also considered. The conclusion includes a discussion of the smart camera system efficiency as a whole, with an insight into potential future improvements.
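A minimal sketch of the bag-of-features variant mentioned for MMR follows, assuming SIFT descriptors and a k-means visual vocabulary; the vocabulary size and descriptor choice are illustrative, not the paper's configuration.

```python
# Bag-of-features sketch: local descriptors are quantized against a visual
# vocabulary, and each car image becomes a normalized word histogram that
# could then be fed to a classifier (e.g. an SVM) for make/model recognition.
import cv2
import numpy as np
from sklearn.cluster import KMeans

sift = cv2.SIFT_create()

def image_descriptors(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(gray, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

def build_vocabulary(paths, k=200):
    all_desc = np.vstack([image_descriptors(p) for p in paths])
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(all_desc)

def bof_histogram(path, vocab):
    desc = image_descriptors(path)
    if desc.shape[0] == 0:
        return np.zeros(vocab.n_clusters)
    words = vocab.predict(desc)
    hist, _ = np.histogram(words, bins=np.arange(vocab.n_clusters + 1))
    return hist / max(hist.sum(), 1)       # normalized word histogram
```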

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the hybrid method is superior to Gabor and wavelet methods in detection accuracy, and actual operation in a textile factory verifies the effectiveness of the inspection system.
Abstract: In this study, an embedded machine vision system using Gabor filters and a Pulse Coupled Neural Network (PCNN) is developed to identify defects of warp-knitted fabrics automatically. The system consists of smart cameras and a Human Machine Interface (HMI) controller. A hybrid detection algorithm combining Gabor filters and PCNN runs on the SoC processor of the smart camera. First, Gabor filters are employed to enhance the contrast of images captured by a CMOS sensor. Second, defect areas are segmented by the PCNN with adaptive parameter setting. Third, the smart cameras notify the controller to stop the warp-knitting machine once defects are found. Experimental results demonstrate that the hybrid method is superior to Gabor and wavelet methods in detection accuracy. Actual operation in a textile factory verifies the effectiveness of the inspection system.
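The Gabor enhancement step can be sketched with OpenCV as shown below; an Otsu threshold stands in for the adaptive PCNN segmentation, so this is only an approximation of the hybrid algorithm, and the input image is a placeholder.

```python
# Sketch of the pre-processing stage: a small bank of oriented Gabor filters
# enhances the fabric texture, then a simple threshold segments defect areas.
import cv2
import numpy as np

def gabor_enhance(gray, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    responses = []
    for theta in thetas:
        kern = cv2.getGaborKernel((21, 21), sigma=4.0, theta=theta,
                                  lambd=10.0, gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kern))
    return np.max(responses, axis=0)       # keep the strongest directional response

gray = cv2.imread("fabric_sample.png", cv2.IMREAD_GRAYSCALE)   # placeholder image
enhanced = cv2.normalize(gabor_enhance(gray), None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, defects = cv2.threshold(enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```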

Journal ArticleDOI
31 May 2016
TL;DR: This paper aims to provide a general overview of intelligent surveillance systems and discusses possible sensor modalities and their fusion scenarios, such as visible cameras (CCTV), infrared cameras, thermal cameras and radar.
Abstract: Intelligent surveillance systems (ISS) have received growing attention due to the increasing demand for security and safety. An ISS is able to automatically analyze image, video, audio or other types of surveillance data without, or with limited, human intervention. Recent developments in sensor devices, computer vision, and machine learning have played an important role in enabling such intelligent systems. This paper aims to provide a general overview of intelligent surveillance systems and to discuss some possible sensor modalities and their fusion scenarios, such as visible cameras (CCTV), infrared cameras, thermal cameras and radar. This paper also discusses the main processing steps in an ISS: background-foreground segmentation, object detection and classification, tracking, and behavioral analysis.
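As a concrete example of the first processing step listed above (background-foreground segmentation), here is a minimal OpenCV sketch using a MOG2 background model; the video source is a placeholder, and the later ISS stages are only indicated by a comment.

```python
# Minimal background-foreground segmentation sketch: foreground blobs extracted
# here would be handed to object classification, tracking and behaviour analysis.
import cv2
import numpy as np

cap = cv2.VideoCapture("surveillance.mp4")        # placeholder video source
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)
kernel = np.ones((3, 3), np.uint8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)     # remove speckle noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    blobs = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 200]
    # `blobs` would feed the detection / classification / tracking stages here.
cap.release()
```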

Journal ArticleDOI
TL;DR: A new approach to compute spatial contrast, based on inter-pixel event communication and less prone to mismatch effects than diffusive networks, is proposed.
Abstract: This paper presents a novel event-based vision sensor with two operation modes: 1) intensity mode and 2) spatial contrast detection. These can be combined with two different readout approaches: 1) pulse density modulation and 2) time-to-first-spike. The sensor is conceived to be a node of a smart camera network made up of several independent and autonomous nodes that send information to a central one. The user can toggle the operation and readout modes with two control bits. The sensor has low latency (below 1 ms under average illumination conditions), low power consumption (19 mA), and reduced data flow when detecting spatial contrast. A new approach to compute the spatial contrast, based on inter-pixel event communication and less prone to mismatch effects than diffusive networks, is proposed. The sensor was fabricated in the standard AMS4M2P 0.35-μm process. A detailed system-level description and experimental results are provided.

Proceedings ArticleDOI
12 Sep 2016
TL;DR: This work designs, develops, and evaluates a system, called COIN (Cloak Of INvisibility), that enables a user to flexibly express her privacy requirement and empowers the photo service provider (or image taker) to exert the privacy protection policy.
Abstract: The wide adoption of smart devices with onboard cameras facilitates photo capturing and sharing, but greatly increases people's concern about privacy infringement. Here we seek a solution that respects the privacy of persons being photographed in a smarter way, such that they can be automatically erased from photos captured by smart devices according to their requirements. To make this work, we need to address three challenges: 1) how to enable users to explicitly express their privacy protection intentions without wearing any visible specialized tag, 2) how to associate the intentions with persons in captured photos accurately and efficiently, and 3) how to ensure that the association process itself does not cause portrait information leakage and is accomplished in a privacy-preserving way. In this work, we design, develop, and evaluate a system, called COIN (Cloak Of INvisibility), that enables a user to flexibly express her privacy requirement and empowers the photo service provider (or image taker) to exert the privacy protection policy. Leveraging the visual distinguishability of people in the field-of-view and the dimension-order-independent property of vector similarity measurement, COIN achieves high accuracy and low overhead. We implement a prototype system, and our evaluation results on both trace-driven and real-life experiments confirm the feasibility and efficiency of our system.
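A highly simplified sketch of the matching-and-erasure idea follows, assuming appearance feature vectors are already available for both detected persons and privacy-requesting users; the privacy-preserving association protocol that is central to COIN is not modelled here, and the threshold and pixelation are arbitrary choices.

```python
# Sketch only: compare detected persons' feature vectors against requesters'
# vectors by cosine similarity and pixelate matched image regions.
import cv2
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def cloak(image, detections, features, requester_features, threshold=0.8):
    """detections: list of (x, y, w, h) boxes; features: one vector per detection."""
    out = image.copy()
    for (x, y, w, h), feat in zip(detections, features):
        if any(cosine(feat, r) > threshold for r in requester_features):
            roi = out[y:y + h, x:x + w]
            small = cv2.resize(roi, (8, 8), interpolation=cv2.INTER_LINEAR)
            out[y:y + h, x:x + w] = cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)
    return out
```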

Patent
02 Jun 2016
TL;DR: In this article, a panoramic camera system is mounted on a mobile device to produce a video image from the image data and display of the video image may be manipulated based on a change of orientation of the mobile device and/or a touch action of the device user.
Abstract: A mobile device-mountable camera apparatus includes a panoramic camera system and a cable-free mounting arrangement. The panoramic camera system includes a panoramic lens assembly and a sensor. The lens assembly provides a vertical field of view in a range of greater than 180° to 360°. The sensor is positioned in image-receiving relation to the lens assembly and is operable to produce image data based on an image received through the lens assembly. The mounting arrangement is configured to removably secure the panoramic camera system to an externally-accessible data port of a mobile computing device to facilitate transfer of the image data to processing circuitry of the mobile device. The mobile device's processing circuitry may produce a video image from the image data and display of the video image may be manipulated based on a change of orientation of the mobile device and/or a touch action of the device user.

Patent
17 Aug 2016
TL;DR: A 3D image capture method and system is described that uses dynamic camera arrays, leveraging depth data for position registration both from frame to frame in the video sequence of individual cameras and from camera to camera, to efficiently and accurately generate 3D data of an object being captured by the camera array.
Abstract: A 3D image capture method and system uses dynamic camera arrays, leveraging depth data for position registration both from frame to frame in the video sequence of individual cameras and from camera to camera, to efficiently and accurately generate 3D data of an object being captured by the camera array, thereby eliminating the constraints of a static camera array and consequently reducing the complexity and cost of interpolating 3D viewpoints between the camera images.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: This paper investigates the use of a photo-realistic simulation tool to address challenges of robust place recognition, visual SLAM and object recognition, and provides a multi-domain demonstration of the beneficial properties of using simulation to characterise and analyse a wide range of robotic vision algorithms.
Abstract: Robotic vision, unlike computer vision, typically involves processing a stream of images from a camera with time-varying pose operating in an environment with time-varying lighting conditions and moving objects. Repeating robotic vision experiments under identical conditions is often impossible, making it difficult to compare different algorithms. For machine learning applications a critical bottleneck is the limited amount of real-world image data that can be captured and labelled for both training and testing purposes. In this paper we investigate the use of a photo-realistic simulation tool to address these challenges, in three specific domains: robust place recognition, visual SLAM and object recognition. For the first two problems we generate images from a complex 3D environment with systematically varying camera paths, camera viewpoints and lighting conditions. For the first time we are able to systematically characterise the performance of these algorithms as paths and lighting conditions change. In particular, we are able to systematically generate varying camera viewpoint datasets that would be difficult or impossible to generate in the real world. We also compare algorithm results for a camera in a real environment and a simulated camera in a simulation model of that real environment. Finally, for the object recognition domain, we generate labelled image data and characterise the viewpoint dependency of a current convolutional neural network in performing object recognition. Together these results provide a multi-domain demonstration of the beneficial properties of using simulation to characterise and analyse a wide range of robotic vision algorithms.

Posted Content
TL;DR: This paper proposes a data-driven algorithm based on convolutional neural networks, which learns features characterizing each camera directly from the acquired pictures, providing an accuracy greater than 94% in discriminating 27 camera models.
Abstract: The possibility of detecting which camera has been used to shoot a specific picture is of paramount importance for many forensics tasks. This is extremely useful for copyright infringement cases, ownership attribution, as well as for detecting the authors of distributed illicit material (e.g., pedo-pornographic shots). Due to its importance, the forensics community has developed a series of robust detectors that exploit characteristic traces left by each camera on the acquired images during the acquisition pipeline. These traces are reverse-engineered in order to attribute a picture to a camera. In this paper, we investigate an alternative approach to solve the camera identification problem. Indeed, we propose a data-driven algorithm based on convolutional neural networks, which learns features characterizing each camera directly from the acquired pictures. The proposed approach is tested on both instance attribution and model attribution, providing an accuracy greater than 94% in discriminating 27 camera models.

Patent
08 Jun 2016
TL;DR: A method and system for rapid searching of video big data, constrained by abnormal-behavior early-warning information, is disclosed, in which the clue-investigation range over massive video surveillance data is narrowed using early-warning information sent out by an intelligent monitoring camera.
Abstract: The present invention discloses a method and system for rapid searching of video big data constrained by abnormal-behavior early-warning information, wherein the clue-investigation range over massive video surveillance data is constrained using abnormal-behavior early-warning information sent out by an intelligent monitoring camera. The method comprises: receiving abnormal-behavior early-warning information from a front-end intelligent camera and establishing an abnormal-behavior library; establishing a correlation model table that stores the correlation between a case or event type and an abnormal-behavior type; and, according to the correlation model table, searching the abnormal-behavior library and locating the corresponding valuable clue at the monitoring point. The disclosed method and system significantly reduce the amount of recorded data retrieved during the video search process and greatly improve the efficiency of discovering valuable clues.

Patent
29 Mar 2016
TL;DR: In this paper, the authors described a system and methods for generating a virtual reality experience including generating a user interface with a plurality of regions on a display in a head-mounted display device.
Abstract: Systems and methods are described for generating a virtual reality experience including generating a user interface with a plurality of regions on a display in a head-mounted display device. The head-mounted display device housing may include at least one pass-through camera device. The systems and methods can include obtaining image content from the at least one pass-through camera device and displaying a plurality of virtual objects in a first region of the plurality of regions in the user interface, the first region substantially filling a field of view of the display in the head-mounted display device. In response to detecting a change in a head position of a user operating the head-mounted display device, the methods and systems can initiate display of updated image content in a second region of the user interface.

Proceedings ArticleDOI
27 Jun 2016
TL;DR: This work captures all the light rays required for stereo panoramas in a single frame using a compact custom designed mirror, thus making the design practical to manufacture and easier to use.
Abstract: We present a practical solution for generating 360° stereo panoramic videos using a single camera. Current approaches either use a moving camera that captures multiple images of a scene, which are then stitched together to form the final panorama, or use multiple cameras that are synchronized. A moving camera limits the solution to static scenes, while multi-camera solutions require dedicated calibrated setups. Our approach improves upon the existing solutions in two significant ways: It solves the problem using a single camera, thus minimizing the calibration problem and providing us the ability to convert any digital camera into a panoramic stereo capture device. It captures all the light rays required for stereo panoramas in a single frame using a compact custom designed mirror, thus making the design practical to manufacture and easier to use. We analyze several properties of the design as well as present panoramic stereo and depth estimation results.

Proceedings ArticleDOI
01 Sep 2016
TL;DR: A histogram compression and feature selection framework based on Sparse Non-negative Matrix Factorization (SNMF) is proposed; experiments affirm that this approach outperforms the state of the art in accuracy and bandwidth usage.
Abstract: Distributed object recognition is a fast-growing research area, mainly motivated by the emergence of high-performance cameras and their integration with modern wireless sensor network technologies. In wireless distributed object recognition, the bandwidth is limited and it is desirable to avoid transmitting redundant visual features from multiple cameras to the base station. In this paper, we propose a histogram compression and feature selection framework based on Sparse Non-negative Matrix Factorization (SNMF). In our proposed method, histograms of the features are modeled as linear combinations of a small set of signature vectors with associated weight vectors. The recognition process in the base station is then performed based on these small sets of weights transmitted from each camera. Furthermore, we propose another novel distributed object recognition scheme based on local classification in each camera, sending the label information to the base station and making the final decision based on majority voting. Experiments on the BMW dataset affirm that our approach outperforms the state of the art in accuracy and bandwidth usage.
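The compression idea can be illustrated with scikit-learn's plain NMF (the paper uses a sparse NMF variant, whose sparsity regularization is omitted here); the histograms below are random placeholders standing in for per-camera feature histograms.

```python
# Each camera factorizes its feature histograms into a shared set of signature
# vectors and only transmits the small weight vectors to the base station.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
histograms = rng.random((50, 1000))        # placeholder: 50 feature histograms of length 1000

nmf = NMF(n_components=12, init="nndsvd", max_iter=500)
weights = nmf.fit_transform(histograms)    # (50, 12): the data actually transmitted
signatures = nmf.components_               # (12, 1000): shared signature dictionary

reconstruction = weights @ signatures      # base station approximates the histograms
print(weights.shape, signatures.shape)
```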

Journal ArticleDOI
01 Oct 2016
TL;DR: A solution is presented that provides continuous online calibration of PTZ cameras which is robust to rapid camera motion and to changes of the environment due to varying illumination or moving objects, and that allows real-time tracking of multiple targets with a high and stable degree of accuracy even at far distances and any zoom level.
Abstract: Pan-tilt-zoom (PTZ) cameras are well suited for object identification and recognition in far-field scenes. However, the effective use of PTZ cameras is complicated by the fact that a continuous online camera calibration is needed and the absolute pan, tilt and zoom values provided by the camera actuators cannot be used because they are not synchronized with the video stream. So, accurate calibration must be directly extracted from the visual content of the frames. Moreover, the large and abrupt scale changes, the scene background changes due to the camera operation and the need for camera motion compensation make target tracking with these cameras extremely challenging. In this paper, we present a solution that provides continuous online calibration of PTZ cameras which is robust to rapid camera motion and changes of the environment due to varying illumination or moving objects. The approach also scales beyond thousands of scene landmarks extracted with the SURF keypoint detector. The method directly derives the relationship between the position of a target in the ground plane and the corresponding scale and position in the image, and allows real-time tracking of multiple targets with a high and stable degree of accuracy even at far distances and any zoom level.
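A sketch of the frame-to-landmark registration that underlies such online calibration follows, using ORB as a freely available stand-in for the SURF keypoints mentioned in the paper; the landmark store and the ground-plane mapping are assumed to exist elsewhere.

```python
# Register the current frame against stored scene landmarks and estimate a
# homography that compensates the pan-tilt-zoom motion.
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=2000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def register_frame(frame_gray, landmark_kp, landmark_desc):
    kp, desc = orb.detectAndCompute(frame_gray, None)
    if desc is None:
        return None
    matches = matcher.match(desc, landmark_desc)
    if len(matches) < 8:
        return None
    src = np.float32([kp[m.queryIdx].pt for m in matches])
    dst = np.float32([landmark_kp[m.trainIdx].pt for m in matches])
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H        # maps current-frame coordinates into the reference view
```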

Journal ArticleDOI
22 Mar 2016-Sensors
TL;DR: This paper models two kinds of cameras, a parallel and a converged one, analyzes the difference between them in vertical and horizontal parallax, and finds that the threshold of shooting distance for converged cameras is 7 m.
Abstract: The Internet of Things is built based on various sensors and networks. Sensors for stereo capture are essential for acquiring information and have been applied in different fields. In this paper, we focus on camera modeling and analysis, which is very important for stereo display and helps with viewing. We model two kinds of cameras, a parallel and a converged one, and analyze the difference between them in vertical and horizontal parallax. Even though different kinds of camera arrays are used in various applications and analyzed in research work, there has been little discussion comparing them. Therefore, we make a detailed analysis of their performance over different shooting distances. From our analysis, we find that the threshold of shooting distance for converged cameras is 7 m. In addition, we design a camera array in our work that can be used as a parallel camera array as well as a converged camera array, and take some images and videos with it to identify the threshold.
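A small numeric illustration of the horizontal-parallax difference between the two rigs, under a simple pinhole, small-angle model; the focal length, baseline and convergence distance are arbitrary example values, not the paper's parameters.

```python
# Horizontal disparity for a parallel rig vs. a toed-in (converged) rig.
def parallax_parallel(f_mm, baseline_m, depth_m):
    """Parallel rig: d = f * B / Z; disparity vanishes only at infinity."""
    return f_mm * baseline_m / depth_m

def parallax_converged(f_mm, baseline_m, depth_m, convergence_m):
    """Small-angle approximation for a converged rig with convergence distance C:
    d ~ f * B * (1/Z - 1/C); points at the convergence distance have zero disparity."""
    return f_mm * baseline_m * (1.0 / depth_m - 1.0 / convergence_m)

for z in (2, 5, 7, 10, 20):   # 7 m is the threshold distance reported above
    print(z, parallax_parallel(35, 0.065, z), parallax_converged(35, 0.065, z, 7.0))
```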

Journal ArticleDOI
TL;DR: The working prototype of a complete standalone automated video surveillance system, including input camera interface, designed motion detection VLSI architecture, and output display interface, with real-time relevant motion detection capabilities, has been implemented on Xilinx ML510 (Virtex-5 FX130T) FPGA platform.
Abstract: The design of automated video surveillance systems is one of the exigent missions in the computer vision community because of their ability to automatically select frames of interest in incoming video streams based on motion detection. This research paper focuses on the real-time hardware implementation of a motion detection algorithm for such vision-based automated surveillance systems. A dedicated VLSI architecture has been proposed and designed for a clustering-based motion detection scheme. The working prototype of a complete standalone automated video surveillance system, including the input camera interface, the designed motion detection VLSI architecture, and the output display interface, with real-time relevant motion detection capabilities, has been implemented on a Xilinx ML510 (Virtex-5 FX130T) FPGA platform. The prototyped system robustly detects relevant motion in real time in live PAL (720 × 576) resolution video streams coming directly from the camera.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: An FPGA-based video HiL (Hardware-in-the-Loop) solution to evaluate an off-the-shelf ADAS (Advanced Driver Assistance System) smart camera consisting of a video sensor and a video processing unit, which allows the video stream from the sensor to be directly captured and stored on a PC.
Abstract: In this paper an FPGA-based video HiL (Hardware-in-the-Loop) solution is presented. It was designed to evaluate an off-the-shelf ADAS (Advanced Driver Assistance System) smart camera consisting of a video sensor (a so-called forward looking camera (FLC)) and a video processing unit. It allows the video stream from the sensor to be directly captured and stored on a PC. Then, the pre-recorded sequences can be injected directly into the video processing unit. Therefore, it is possible to evaluate the vision system in the laboratory under conditions almost identical to those present during real test drives. Moreover, all experiments are fully repeatable. The proposed system supports video streams with a resolution of 1280×960 @ 30 fps in the RCCC (Red/Clear) colour system. It is used in research and development projects at Delphi Technical Center Krakow.