
Showing papers on "Smart camera" published in 2013


Journal ArticleDOI
TL;DR: This paper reviews the recent development of relevant technologies from the perspectives of computer vision and pattern recognition, and discusses how to face emerging challenges of intelligent multi-camera video surveillance.

695 citations


Patent
28 Feb 2013
TL;DR: In this paper, a mobile device includes a camera, a user interface system, and a processor communicatively coupled to the camera and the user interface system; the processor is typically configured for running a first application.
Abstract: A mobile device includes a camera, a user interface system, and a processor communicatively coupled to the camera and the user interface system. The processor is typically configured for running a first application. The first application is typically configured for (i) accessing the camera, (ii) upon the initialization of the first application, initializing the camera, and (iii) maintaining the camera in an initialized state as long as the first application is running. The application may be further configured for focusing the camera and maintaining the camera in a focused state as long as the first application is running.

336 citations


Proceedings ArticleDOI
23 Dec 2013
TL;DR: Qualitative results demonstrate high quality reconstructions even visually comparable to active depth sensor-based systems such as KinectFusion, making such systems even more accessible.
Abstract: MonoFusion allows a user to build dense 3D reconstructions of their environment in real-time, utilizing only a single, off-the-shelf web camera as the input sensor. The camera could be one already available in a tablet, phone, or a standalone device. No additional input hardware is required. This removes the need for power intensive active sensors that do not work robustly in natural outdoor lighting. Using the input stream of the camera we first estimate the 6DoF camera pose using a sparse tracking method. These poses are then used for efficient dense stereo matching between the input frame and a key frame (extracted previously). The resulting dense depth maps are directly fused into a voxel-based implicit model (using a computationally inexpensive method) and surfaces are extracted per frame. The system is able to recover from tracking failures as well as filter out geometrically inconsistent noise from the 3D reconstruction. Our method is both simple to implement and efficient, making such systems even more accessible. This paper details the algorithmic components that make up our system and a GPU implementation of our approach. Qualitative results demonstrate high quality reconstructions even visually comparable to active depth sensor-based systems such as KinectFusion.
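
As a rough illustration of the fusion stage, here is a minimal NumPy sketch of integrating one depth map into a truncated signed distance (TSDF) voxel volume with a running weighted average. The intrinsics `K`, pose `T_wc`, volume layout, and truncation band are all illustrative assumptions; the paper's actual implementation is a GPU pipeline.

```python
import numpy as np

def fuse_depth(tsdf, weights, depth, K, T_wc, voxel_size, trunc=0.05):
    """Integrate one depth map into a TSDF volume via a running weighted
    average. Assumes float volumes anchored at the world origin."""
    nx, ny, nz = tsdf.shape
    ii, jj, kk = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz),
                             indexing="ij")
    pts_w = np.stack([ii, jj, kk], axis=-1).reshape(-1, 3) * voxel_size
    T_cw = np.linalg.inv(T_wc)                    # world -> camera
    pts_c = pts_w @ T_cw[:3, :3].T + T_cw[:3, 3]
    z = pts_c[:, 2]
    uv = (pts_c[:, :2] / np.maximum(z[:, None], 1e-6)) @ K[:2, :2].T + K[:2, 2]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    h, w = depth.shape
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.where(valid, depth[np.clip(v, 0, h - 1), np.clip(u, 0, w - 1)], 0.0)
    sdf = d - z                                   # signed distance along the ray
    valid &= (d > 0) & (sdf > -trunc)
    mask = valid.reshape(tsdf.shape)
    vals = np.clip(sdf / trunc, -1.0, 1.0).reshape(tsdf.shape)
    tsdf[mask] = (tsdf[mask] * weights[mask] + vals[mask]) / (weights[mask] + 1)
    weights[mask] += 1.0
```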

130 citations


Patent
29 Nov 2013
TL;DR: In this paper, a smart camera system is disclosed that can work with cloud data storage systems and a compute cloud; a call center can access the cloud to provide security monitoring services.
Abstract: A smart camera system is disclosed. The camera can work with cloud data storage systems and a compute cloud. A call center can access the cloud to provide security monitoring services.

119 citations


Patent
08 Feb 2013

112 citations


Patent
19 Feb 2013
TL;DR: In this paper, a robotic system that includes a mobile robot and a remote input device is described, where the input device may be a joystick that is used to move a camera and a mobile platform of the robot.
Abstract: A robotic system that includes a mobile robot and a remote input device. The input device may be a joystick that is used to move a camera and a mobile platform of the robot. The system may operate in a mode where the mobile platform moves in a camera reference coordinate system. The camera reference coordinate system is fixed to a viewing image provided by the camera so that movement of the robot corresponds to a direction viewed on a screen. This prevents disorientation during movement of the robot if the camera is panned across a viewing area.
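
The camera-referenced drive mode reduces to a frame rotation: a joystick command expressed in the on-screen (camera) frame is rotated by the camera pan angle before being sent to the platform. A minimal sketch, assuming planar motion and a known pan angle (the names and conventions here are illustrative, not from the patent):

```python
import numpy as np

def joystick_to_platform(vx_cam, vy_cam, pan_rad):
    """Map a joystick command given in the camera's viewing frame into the
    robot platform frame by rotating through the camera pan angle."""
    c, s = np.cos(pan_rad), np.sin(pan_rad)
    vx = c * vx_cam - s * vy_cam
    vy = s * vx_cam + c * vy_cam
    return vx, vy

# Pushing the stick "forward" always drives the robot toward what the
# operator sees, even after the camera has been panned 90 degrees:
print(joystick_to_platform(1.0, 0.0, np.pi / 2))  # ~ (0.0, 1.0)
```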

91 citations


Journal ArticleDOI
TL;DR: A fall detection algorithm employing histograms of edge orientations and strengths is presented, along with an optical flow-based method for activity classification; experimental results show the success of the proposed method.
Abstract: Robust detection of events and activities, such as falling, sitting, and lying down, is a key to a reliable elderly activity monitoring system. While fast and precise detection of falls is critical in providing immediate medical attention, other activities like sitting and lying down can provide valuable information for early diagnosis of potential health problems. In this paper, we present a fall detection and activity classification system using wearable cameras. Since the camera is worn by the subject, monitoring is not limited to confined areas, and extends to wherever the subject may go including indoors and outdoors. Furthermore, since the captured images are not of the subject, privacy concerns are alleviated. We present a fall detection algorithm employing histograms of edge orientations and strengths, and propose an optical flow-based method for activity classification. The first set of experiments has been performed with prerecorded video sequences from eight different subjects wearing a camera on their waist. Each subject performed around 40 trials, which included falling, sitting, and lying down. Moreover, an embedded smart camera implementation of the algorithm was also tested on a CITRIC platform with subjects wearing the CITRIC camera, and each performing 50 falls and 30 non-fall activities. Experimental results show the success of the proposed method.
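
For a sense of the edge-based feature, here is a small NumPy sketch of a strength-weighted histogram of edge orientations for one frame; the bin count and the central-difference gradients are illustrative assumptions, not the paper's exact feature.

```python
import numpy as np

def edge_orientation_histogram(gray, bins=8):
    """Strength-weighted histogram of edge orientations for one frame."""
    gray = gray.astype(float)
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]      # central differences
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]
    strength = np.hypot(gx, gy)
    orient = np.arctan2(gy, gx)                   # in [-pi, pi]
    hist, _ = np.histogram(orient, bins=bins, range=(-np.pi, np.pi),
                           weights=strength)
    return hist / (hist.sum() + 1e-9)             # normalize for comparability

# A fall could then be flagged when this histogram changes abruptly
# between consecutive frames (a dissimilarity spike above a threshold).
```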

83 citations


Proceedings ArticleDOI
21 Oct 2013
TL;DR: This work presents a method to generate aesthetic video from a robotic camera by incorporating a virtual camera operating on a delay, and a hybrid controller which uses feedback from both the robotic and virtual cameras.
Abstract: We present a method to generate aesthetic video from a robotic camera by incorporating a virtual camera operating on a delay, and a hybrid controller which uses feedback from both the robotic and virtual cameras. Our strategy employs a robotic camera to follow a coarse region-of-interest identified by a realtime computer vision system, and then resamples the captured images to synthesize the video that would have been recorded along a smooth, aesthetic camera trajectory. The smooth motion trajectory is obtained by operating the virtual camera on a short delay so that perfect knowledge of immediate future events is available. Previous autonomous camera installations have employed either robotic cameras or stationary wide-angle cameras with subregion cropping. Robotic cameras track the subject using realtime sensor data, and regulate a smoothness-latency trade-off through control gains. Fixed cameras post-process the data and suffer significant reductions in image resolution when the subject moves freely over a large area. Our approach provides a solution for broadcasting events from locations that camera operators cannot easily access. We can also offer broadcasters additional actuated camera angles without the overhead of additional human operators. Experiments on our prototype system for college basketball illustrate how our approach better mimics human operators compared to traditional robotic control approaches, while avoiding the loss in resolution that occurs with fixed camera systems.
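
The key trick is that delaying the virtual camera turns smoothing into a non-causal problem: by the time the virtual camera renders frame t, frames up to t + delay have already been captured. A toy sketch of such a delayed, zero-phase smoother (a plain moving average standing in for the paper's hybrid controller):

```python
import numpy as np

def virtual_pan(raw_pan, delay=15):
    """Smooth a recorded pan trajectory using samples up to `delay` frames in
    the 'future', which are available because output is delayed that long."""
    raw_pan = np.asarray(raw_pan, dtype=float)
    out = np.empty_like(raw_pan)
    n = len(raw_pan)
    for t in range(n):
        lo, hi = max(0, t - delay), min(n, t + delay + 1)
        out[t] = raw_pan[lo:hi].mean()            # symmetric, zero-phase average
    return out
```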

74 citations


Journal ArticleDOI
TL;DR: In this paper, the capability of UAV-based data collection was evaluated for two different consumer camera systems and compared to an aerial survey with a state-of-the-art digital airborne camera system.
Abstract: UAVs are becoming standard platforms for photogrammetric data capture, especially when aiming at large-scale aerial mapping for areas of limited extent. Such applications especially benefit from the very reasonable price of a small, light UAS including the control system and a standard consumer-grade digital camera, which is some orders of magnitude lower compared to digital photogrammetric systems. Within the paper, the capability of UAV-based data collection is evaluated for two different consumer camera systems and compared to an aerial survey with a state-of-the-art digital airborne camera system. During this evaluation, the quality of 3D point clouds generated by dense multiple image matching is used as a benchmark. Due to recent software developments, such point clouds can be generated at a resolution similar to the ground sampling distance of the available imagery and are used for an increasing number of applications. Usually, image matching benefits from the good image quality provided by digital airborne camera systems, which is frequently not available from the low-cost sensor components used for UAV image collection. Within the paper, an investigation of UAV-based 3D data capture is presented. For this purpose, dense 3D point clouds are generated for a test area from three different platforms: first a UAV with a lightweight compact camera, second a system using a system camera, and finally a medium-format airborne digital camera system. Despite the considerable differences in system costs, suitable results can be derived from all data, especially if large redundancy is available: highly overlapping image blocks are not only beneficial during georeferencing, but are especially advantageous when aiming at a dense and accurate image-based 3D surface reconstruction.

59 citations


Patent
24 Apr 2013
TL;DR: An array camera, a mobile terminal, and methods for operating the same are disclosed in this article; the method includes applying a camera environment setting different from that of the other camera modules to at least one of the plurality of camera modules, acquiring images through the camera modules, and combining the acquired images.
Abstract: An array camera, a mobile terminal, and methods for operating the same are disclosed. The method for operating an array camera including a plurality of camera modules includes applying a different camera environment setting from the other camera modules to at least one of the plurality of camera modules, acquiring images through the plurality of camera modules, and combining the acquired images.
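
The patent leaves the combination step open; as one hedged example of what "different settings, then combine" can buy, here is a toy exposure fusion in NumPy that blends grayscale frames taken at different exposures, weighting each pixel by how well exposed it is. The Gaussian weighting and its parameters are my own illustrative choices, not the patent's claimed method.

```python
import numpy as np

def combine_exposures(frames):
    """Blend frames captured by camera modules running different exposure
    settings, weighting each pixel by its well-exposedness."""
    frames = [f.astype(float) / 255.0 for f in frames]
    # Well-exposedness: highest for mid-range pixels, low near 0 or 1.
    weights = [np.exp(-((f - 0.5) ** 2) / (2 * 0.2 ** 2)) for f in frames]
    total = sum(weights)
    fused = sum(w * f for w, f in zip(weights, frames)) / (total + 1e-9)
    return (fused * 255).astype(np.uint8)
```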

58 citations


Journal ArticleDOI
TL;DR: This study reports on the development of an automated method for designing the optimal camera network for a given cultural heritage building or statue using a mathematical non-linear optimization with constraints.

Journal ArticleDOI
TL;DR: A multimodal wireless smart camera equipped with a pyroelectric infrared sensor and solar energy harvester is introduced and simulation results show how perpetual work can be achieved in an outdoor scenario within a typical video surveillance application dealing with abandoned/removed object detection.
Abstract: Surveillance is one of the most promising applications for wireless sensor networks, stimulated by a confluence of simultaneous advances in key disciplines: computer vision, image sensors, embedded computing, energy harvesting, and sensor networks. However, computer vision typically requires a notable amount of computing performance, a considerable memory footprint, and high power consumption. Thus, wireless smart cameras pose a challenge to current hardware capabilities in terms of low power consumption and high imaging performance. For this reason, wireless surveillance systems still require a considerable amount of research in different areas such as mote architectures, video processing algorithms, power management, energy harvesting, and distributed processing. In this paper, we introduce a multimodal wireless smart camera equipped with a pyroelectric infrared sensor and a solar energy harvester. The aim of this work is to achieve the following goals: 1) combining local processing, low-power hardware design, power management, and energy harvesting to develop a low-power, low-cost, power-aware, and self-sustainable wireless video sensor node for on-board video processing; 2) developing an energy-efficient smart camera with highly accurate abandoned/removed object detection capability. The efficiency of our approach is demonstrated by experimental results in terms of power consumption and video processing accuracy, as well as in terms of self-sustainability. Finally, simulation results show how perpetual operation can be achieved in an outdoor scenario within a typical video surveillance application dealing with abandoned/removed object detection.
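
The self-sustainability claim comes down to a simple energy balance: the duty-cycled consumption must stay below the average harvested power. A back-of-the-envelope check, with purely illustrative numbers (not measurements from the paper):

```python
# All figures below are illustrative assumptions for the arithmetic only.
P_ACTIVE_MW = 400.0      # node fully on: imaging + processing + radio
P_SLEEP_MW = 0.5         # PIR-armed sleep, camera off
HARVEST_MW = 60.0        # average solar input over 24 h

def max_duty_cycle(p_active=P_ACTIVE_MW, p_sleep=P_SLEEP_MW, harvest=HARVEST_MW):
    """Largest active fraction d with d*p_active + (1-d)*p_sleep <= harvest."""
    return (harvest - p_sleep) / (p_active - p_sleep)

print(f"sustainable duty cycle: {max_duty_cycle():.1%}")  # ~14.9%
```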

Patent
14 Aug 2013
TL;DR: In this article, a camera system is installed on the front end of a vehicle, either on the left front, the right front, or both sides, either wired or wireless connection to an onboard computer and a navigation display located within the passenger compartment of the vehicle.
Abstract: A camera system is installed on the front end of a vehicle, either on the left front, the right front, or both sides. The camera is linked via a wired or wireless connection to an onboard computer and a navigation display that is located within the passenger compartment of the vehicle. The driver reviews a visual depiction of any oncoming traffic (motor vehicles, pedestrians, cyclists, animals, and the like) on the navigation display via a single screen, split screen, or alternating screens. The camera system can include a speed sensor that detects when the vehicle reaches a threshold speed to activate or de-activate the camera. Alternatively, the computer can activate the system when a turn signal is activated, and de-activate the system when the turn signal is no longer activated. This camera system can be retrofitted into older vehicles.

Patent
15 Mar 2013
TL;DR: In this paper, the authors present methods and apparatus to generate multiple images that are combined into a stereoscopic or a panoramic image using a remote computing device with a camera and a digital compass.
Abstract: Methods and apparatus to create and display stereoscopic and panoramic images are disclosed. Methods and apparatus are provided to generate multiple images that are combined into a stereoscopic or a panoramic image. A controller provides correct camera settings for different conditions. A controller rotationally aligns images of lens/sensor units that are rotationally misaligned. A compact controllable platform holds and rotates a camera. A remote computing device with a camera and a digital compass tracks an object, causing the camera in the platform to track the object.

Journal ArticleDOI
24 Jan 2013-Sensors
TL;DR: The advantage of a smart one-chip camera design with NDVI image performance is demonstrated in terms of low cost and simplified design, and new algorithms for establishing enhanced NDVI image quality for data processing are discussed.
Abstract: The application of (smart) cameras for process control, mapping, and advanced imaging in agriculture has become an element of precision farming that facilitates the conservation of fertilizer, pesticides, and machine time. This technique additionally reduces the amount of energy required in terms of fuel. Although research activities have increased in this field, high camera prices have kept adoption low across all fields of agriculture. Smart, low-cost cameras adapted for agricultural applications can overcome this drawback. The normalized difference vegetation index (NDVI) for each image pixel is an applicable algorithm to discriminate plant information from the soil background, enabled by the large difference in reflectance between the near-infrared (NIR) and red optical frequency bands. Two aligned charge-coupled device (CCD) chips for the red and NIR channels are typically used, but they are expensive because of the precise optical alignment required. Therefore, much attention has been given to the development of alternative camera designs. In this study, the advantage of a smart one-chip camera design with NDVI image performance is demonstrated in terms of low cost and simplified design. The required assembly and pixel modifications are described, and new algorithms for establishing enhanced NDVI image quality for data processing are discussed.
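
The underlying per-pixel computation is standard: NDVI = (NIR - red) / (NIR + red), giving values in [-1, 1] with vegetation typically well above bare soil. A minimal sketch (the threshold below is scene-dependent and only illustrative):

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Per-pixel normalized difference vegetation index in [-1, 1]."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)

# Plants reflect strongly in NIR and absorb red, so a simple threshold
# separates vegetation from the soil background:
# mask = ndvi(nir_channel, red_channel) > 0.3   # threshold is scene-dependent
```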

Journal ArticleDOI
TL;DR: A novel and high-quality strategy for real-time moving object detection by nonparametric modeling is presented, suitable for its application to smart cameras operating in real time in a large variety of scenarios.
Abstract: Recently, the number of electronic devices with smart cameras has grown enormously. These devices require new, fast, and efficient computer vision applications that include moving object detection strategies. In this paper, a novel and high-quality strategy for real-time moving object detection by nonparametric modeling is presented. It is suitable for application to smart cameras operating in real time in a large variety of scenarios. While the background is modeled using an innovative combination of chromaticity and gradients, reducing the influence of shadows and reflected light in the detections, the foreground model combines this information with spatial information. The application of a particle filter allows updating the spatial information and provides a priori knowledge about the areas to analyze in the following images, enabling an important reduction in the computational requirements and improving the segmentation results. The quality of the results and the achieved computational efficiency show the suitability of the proposed strategy to enable new applications and opportunities in the latest generation of electronic devices.
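
As a flavor of the nonparametric test, here is a minimal kernel-density background check on raw intensities. The paper's actual model uses chromaticity and gradient features plus a particle filter; the bandwidth and threshold below are illustrative assumptions.

```python
import numpy as np

def kde_foreground(pixel_history, frame, bandwidth=10.0, thresh=1e-4):
    """Flag pixels whose current value is unlikely under a Gaussian kernel
    density over their recent history.
    pixel_history: (N, H, W) past frames; frame: (H, W) current frame."""
    diff = frame.astype(float)[None] - pixel_history.astype(float)
    kernels = np.exp(-0.5 * (diff / bandwidth) ** 2)
    density = kernels.mean(axis=0) / (bandwidth * np.sqrt(2 * np.pi))
    return density < thresh                        # True = foreground
```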

Proceedings ArticleDOI
01 Dec 2013
TL;DR: A novel design provides a third-person view for a ground vehicle using a dynamic, external camera mounted on a quadcopter, resulting in fewer collisions and more victims being located compared to a front-mounted camera.
Abstract: Navigating remote robots and providing awareness of the remote environment is essential in many teleoperated tasks. An external view of the remote robot, a bird's eye view, is thought to improve operator performance. In this paper we explore a novel design for providing such a third-person view for a ground vehicle using a dynamic, external camera mounted on a quadcopter. Compared to earlier methods that use 3D reconstruction to create third-person views, our approach comprises a true third-person view through a video feed. We thus provide visually rich, live information to the operator. In an experiment simulating a search and rescue mission in a simplified environment, we compared our proposed design to a pole-mounted camera and to a traditional front-mounted camera. The third-person perspective provided by our flying camera and the pole-mounted camera resulted in fewer collisions and more victims being located, compared to the front-mounted camera.

Journal ArticleDOI
TL;DR: A test platform that tracks laparoscopic instruments and automatically moves a camera with no explicit human direction is developed, showing that it is possible to create an autonomous camera system that cooperates with a surgeon without requiring any explicit user input.
Abstract: Introduction: During laparoscopic surgery, the surgeon currently must instruct a human camera operator or a robotic arm to move the camera. This process is distracting, and the camera is not always placed in an ideal location. To mitigate these problems, we have developed a test platform that tracks laparoscopic instruments and automatically moves a camera with no explicit human direction. Materials and Methods: The test platform is designed to mimic a typical laparoscopic working environment, where two hand-operated tools are manipulated through small ports. A pan-tilt-zoom camera is positioned over the tools, which emulates the positioning capabilities of a straight (0°) scope placed through a trocar. A camera control algorithm automatically keeps the tools in the view. In addition, two test tasks that require camera movement have been developed to aid in future evaluation of the system. Results: The system was found to successfully track the laparoscopic instruments in the camera view as intended.
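
The control loop in such a system can be as simple as a proportional pan/tilt update that re-centres the tracked tool tips, with a deadband so the camera does not chase every small motion. A hedged sketch (gains and deadband are illustrative, not the platform's actual tuning):

```python
def ptz_step(tool_px, frame_size, gain=0.1, deadband=0.05):
    """One proportional pan/tilt update that re-centres the tracked tool.
    tool_px: (x, y) pixel centroid of the tool tips."""
    w, h = frame_size
    ex = (tool_px[0] - w / 2) / (w / 2)   # normalized error in [-1, 1]
    ey = (tool_px[1] - h / 2) / (h / 2)
    pan = gain * ex if abs(ex) > deadband else 0.0    # deadband avoids jitter
    tilt = gain * ey if abs(ey) > deadband else 0.0
    return pan, tilt

print(ptz_step((500, 250), (640, 480)))  # small pan right; tilt in deadband
```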

Patent
22 Mar 2013
TL;DR: In this article, a vision system for a vehicle includes a camera and an image processor, which can detect objects that are present forward of the vehicle and outside of the forward field of view of the camera.
Abstract: A vision system for a vehicle includes a camera and an image processor. The camera has a forward field of view exterior of the vehicle. The image processor is operable to process image data captured by the camera. At least one device is operable to detect objects that are present forward of the vehicle and outside of the forward field of view of the camera. The device may include at least one of (i) a sensor, (ii) an element of a vehicle-to-vehicle communication system and (iii) an element of a vehicle-to-infrastructure communication system. Responsive to a detection indicating that the object is about to enter the field of view of the camera, the image processor anticipates the object entering the field of view and, once the object enters the field of view, detects it.

Journal ArticleDOI
TL;DR: The efforts to build a low-bandwidth wireless camera network platform, called CITRIC, and its applications in smart camera networks are provided, and concrete research results will be demonstrated in two areas, namely, distributed coverage hole identification and distributed object recognition.
Abstract: Smart camera networks have recently emerged as a new class of sensor network infrastructure that is capable of supporting high-power in-network signal processing and enabling a wide range of applications. In this article, we provide an exposition of our efforts to build a low-bandwidth wireless camera network platform, called CITRIC, and its applications in smart camera networks. The platform integrates a camera, a microphone, a frequency-scalable (up to 624 MHz) CPU, 16 MB FLASH, and 64 MB RAM onto a single device. The device then connects with a standard sensor network mote to form a wireless camera mote. With reasonably low power consumption and extensive algorithmic libraries running on a decent operating system that is easy to program, CITRIC is ideal for research and applications in distributed image and video processing. Its capability for in-network image processing also reduces communication requirements, which have been high in other existing camera networks with centralized processing. Furthermore, the mote easily integrates with other low-bandwidth sensor networks via the IEEE 802.15.4 protocol. To demonstrate the utility of CITRIC, we present several representative applications. In particular, concrete research results are demonstrated in two areas, namely distributed coverage hole identification and distributed object recognition.

Patent
22 Jul 2013
TL;DR: In this paper, a camera detects devices, such as other cameras, smart devices, and access points, with which the camera may communicate and switches between operating as a wireless station and a wireless access point.
Abstract: A camera detects devices, such as other cameras, smart devices, and access points, with which the camera may communicate. The camera may alternate between operating as a wireless station and a wireless access point. The camera may connect to and receive credentials from a device for another device to which it is not connected. In one embodiment, the camera is configured to operate as a wireless access point, and is configured to receive credentials from a smart device operating as a wireless station. The camera may then transfer the credentials to additional cameras, each configured to operate as wireless stations. The camera and additional cameras may connect to a smart device directly or indirectly (for instance, through an access point), and the smart device may change the camera mode of the cameras. The initial modes of the cameras may be preserved and restored by the smart device upon disconnection.

Journal ArticleDOI
TL;DR: A robust threshold decision algorithm for video object segmentation with a multibackground model and a video object tracking framework based on a particle filter with the likelihood function composed of diffusion distance for measuring color histogram similarity and motion clue from video object segmentsation are proposed.
Abstract: Video object segmentation and tracking are two essential building blocks of smart surveillance systems. However, there are several issues that need to be resolved. Threshold decision is a difficult problem for video object segmentation with a multibackground model. In addition, some conditions make robust video object tracking difficult. These conditions include nonrigid object motion, target appearance variations due to changes in illumination, and background clutter. In this paper, a video object segmentation and tracking framework is proposed for smart cameras in visual surveillance networks with two major contributions. First, we propose a robust threshold decision algorithm for video object segmentation with a multibackground model. Second, we propose a video object tracking framework based on a particle filter with the likelihood function composed of diffusion distance for measuring color histogram similarity and motion clue from video object segmentation. The proposed framework can track nonrigid moving objects under drastic changes in illumination and background clutter. Experimental results show that the presented algorithms perform well for several challenging sequences, and our proposed methods are effective for the aforementioned issues.
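
Diffusion distance compares histograms across several smoothing scales, which makes it tolerant to small bin shifts caused by illumination change. A compact 1-D sketch in the style of Ling and Okada's formulation (the layer count and kernel are illustrative choices):

```python
import numpy as np

def diffusion_distance(h1, h2, layers=4):
    """Diffusion distance between two 1-D histograms: the L1 norm of the
    difference accumulated over progressively smoothed/downsampled layers.
    More robust to bin shifts than a plain L1 comparison."""
    d = np.asarray(h1, float) - np.asarray(h2, float)
    total = np.abs(d).sum()
    kernel = np.array([0.25, 0.5, 0.25])
    for _ in range(layers):
        d = np.convolve(d, kernel, mode="same")[::2]   # smooth, then halve
        total += np.abs(d).sum()
    return total
```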

Posted Content
TL;DR: A comprehensive survey of recent research results is presented to address the problems of intra-camera tracking, topological structure learning, target appearance modeling, and global activity understanding in sparse camera networks.
Abstract: Technological advances in sensor manufacture, communication, and computing are stimulating the development of new applications that are transforming traditional vision systems into pervasive intelligent camera networks. The analysis of visual cues in multi-camera networks enables a wide range of applications, from smart home and office automation to large area surveillance and traffic surveillance. While dense camera networks - in which most cameras have large overlapping fields of view - are well studied, we are mainly concerned with sparse camera networks. A sparse camera network undertakes large area surveillance using as few cameras as possible, and most cameras have non-overlapping fields of view with one another. The task is challenging due to the lack of knowledge about the topological structure of the network, variations in the appearance and motion of specific tracking targets in different views, and the difficulties of understanding composite events in the network. In this review paper, we present a comprehensive survey of recent research results to address the problems of intra-camera tracking, topological structure learning, target appearance modeling, and global activity understanding in sparse camera networks. A number of current open research issues are discussed.

Journal ArticleDOI
08 Jul 2013-Sensors
TL;DR: This paper proposes a distributed activity classification framework in which several camera sensors observe the scene; each camera processes its own observations and, while communicating with other cameras, they come to an agreement about the activity class.
Abstract: With the increasing demand for the usage of smart and networked cameras in intelligent and ambient technology environments, the development of algorithms for such resource-distributed networks is of great interest. Multi-view action recognition addresses many challenges dealing with view invariance and occlusion, and due to the huge amount of data processing and communication in real-life applications, it is not easy to adapt these methods for use in smart camera networks. In this paper, we propose a distributed activity classification framework in which we assume that several camera sensors are observing the scene. Each camera processes its own observations and, while communicating with other cameras, they come to an agreement about the activity class. Our method is based on recovering a low-rank matrix over consensus to perform distributed matrix completion via convex optimization. It is then applied to the problem of human activity classification. We test our approach on the IXMAS and MuHAVi datasets to show the performance and feasibility of the method.
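
The consensus ingredient can be illustrated with the classic linear-averaging iteration, in which each camera repeatedly mixes its class-score vector with its neighbours' until the network agrees. This is a simplified stand-in for the paper's distributed low-rank matrix completion; note the step size must satisfy eps * max_degree < 1 for convergence.

```python
import numpy as np

def consensus_average(local_scores, adjacency, steps=50, eps=0.2):
    """Linear consensus over a camera network.
    local_scores: (n_cameras, n_classes) per-camera class scores;
    adjacency:    (n_cameras, n_cameras) symmetric 0/1 neighbour matrix."""
    x = np.array(local_scores, dtype=float)
    A = np.asarray(adjacency, dtype=float)
    deg = A.sum(axis=1, keepdims=True)
    for _ in range(steps):
        x = x + eps * (A @ x - deg * x)       # diffuse toward neighbours
    return x.argmax(axis=1)                   # agreed class per camera
```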

Journal ArticleDOI
TL;DR: A voting-based motion estimation algorithm is proposed to process the image frames captured by the mobile camera and, under limited network bandwidth, the transmitted image quality can be progressively achieved and the transmission bandwidth utilization can be effectively decreased.
Abstract: In a video-based surveillance system, a mobile camera can provide a dynamic and wider monitoring range, and the video data transmitted from cooperative mobile cameras can be used to actively detect objects of interest. However, it is a difficult task to accurately detect moving objects from the image frames captured by mobile cameras, and the data flow of surveillance video from multiple cameras can be huge. Camera motion usually causes shifting of the static background as well as of the moving objects in the captured image frames. In order to correctly estimate the motion of moving objects, a voting-based motion estimation algorithm is proposed to process the image frames captured by the mobile camera. Based on this estimation, a content-based video transmission mechanism is then implemented to further decrease encoding cost and bandwidth utilization. The overall approach consists of voting-based motion estimation, moving object edge detection, and content-based sampling coding at temporal and spatial scales. Without prior knowledge of the camera motion, the motion estimation algorithm only utilizes the shifting information of static background edges to estimate the camera movement. The shifting information is determined based on the voting decision of several representative regions of interest, and the estimated motion is then used to compensate the visual content obtained from the captured image frames. The proposed algorithms have been experimentally tested on several practical scenarios, and it is demonstrated that, under limited network bandwidth, the transmitted image quality can be progressively achieved and the transmission bandwidth utilization can be effectively decreased.
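
The voting idea can be sketched directly: several background regions of interest each contribute their best block-matching displacement, and the median of the votes is taken as the global camera shift, making the estimate robust to regions that happen to contain moving objects. The search range and cost function below are illustrative, not the paper's exact settings.

```python
import numpy as np

def vote_camera_shift(prev, curr, rois, search=8):
    """Estimate global camera translation: each ROI votes with its best
    block-match displacement; the median vote wins.
    rois: list of (y, x, h, w) background regions in the previous frame."""
    votes = []
    for (y, x, h, w) in rois:
        patch = prev[y:y + h, x:x + w].astype(float)
        best, best_err = (0, 0), np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                ys, xs = y + dy, x + dx
                if ys < 0 or xs < 0:
                    continue
                cand = curr[ys:ys + h, xs:xs + w].astype(float)
                if cand.shape != patch.shape:
                    continue                       # window left the frame
                err = np.abs(cand - patch).mean()  # SAD matching cost
                if err < best_err:
                    best_err, best = err, (dy, dx)
        votes.append(best)
    return tuple(np.median(np.array(votes), axis=0))
```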

Journal ArticleDOI
10 May 2013-Sensors
TL;DR: The experimental results show that the proposed access control system has the advantages of high precision, safety, and reliability, and can be responsive to demands, while preserving the benefits of low cost and high added value.
Abstract: This paper presents an innovative access control system, based on human detection and path analysis, to reduce false automatic door system actions while increasing the added value for security applications. The proposed system can first identify a person in the scene, then track his trajectory to predict his intention to access the entrance, and finally activate the door accordingly. The experimental results show that the proposed system has the advantages of high precision, safety, and reliability, and can be responsive to demands, while preserving the benefits of low cost and high added value.
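
The path-analysis step can be pictured as a simple intent test: activate the door only when the tracked person is both moving and heading toward the entrance. This sketch is my own illustration of that idea, not the paper's method; the thresholds are arbitrary and it assumes at least five recent track positions.

```python
import numpy as np

def approaching_door(track, door_xy, speed_min=0.2, align_min=0.7):
    """Heuristic intent test. track: recent (x, y) positions, oldest first;
    door_xy: entrance position in the same ground coordinates."""
    track = np.asarray(track, float)
    vel = track[-1] - track[-5]                    # displacement over 5 frames
    speed = np.linalg.norm(vel)
    if speed < speed_min:
        return False                               # standing still: ignore
    to_door = np.asarray(door_xy, float) - track[-1]
    align = vel @ to_door / (speed * np.linalg.norm(to_door) + 1e-9)
    return align > align_min                       # heading at the door
```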

Proceedings ArticleDOI
18 Mar 2013
TL;DR: This paper demonstrates how correlating an anonymous "stick figure"-like description of the motion of a user's body parts, provided by a vision system, with on-body sensor signals can be used to determine on which body part a motion sensor is worn.
Abstract: In this paper we investigate how vision-based devices (cameras or the Kinect controller) that happen to be in the user's environment can be used to improve and fine-tune on-body sensor systems for activity recognition. Thus we imagine a user with his on-body activity recognition system passing through a space with a video camera (or a Kinect), picking up some information, and using it to improve his system. The general idea is to correlate an anonymous "stick figure"-like description of the motion of a user's body parts, provided by the vision system, with the sensor signals as a means of analyzing the sensors' properties. In the paper we demonstrate, for example, how such a correlation can be used to determine, without the need to train any classifiers, on which body part a motion sensor is worn.
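
The classifier-free placement test boils down to picking the body part whose stick-figure speed signal correlates best with the worn sensor's acceleration magnitude. A minimal sketch, assuming the two streams are already time-aligned and resampled to a common rate:

```python
import numpy as np

def locate_sensor(accel_mag, part_speeds):
    """Decide which body part wears the motion sensor by correlating the
    sensor's acceleration magnitude with per-part speed signals from the
    vision system's stick figure. part_speeds: {part_name: 1-D signal}."""
    def ncc(a, b):
        a = (a - a.mean()) / (a.std() + 1e-9)      # normalized cross-correlation
        b = (b - b.mean()) / (b.std() + 1e-9)
        return float(np.mean(a * b))
    return max(part_speeds, key=lambda part: ncc(accel_mag, part_speeds[part]))
```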

Journal ArticleDOI
TL;DR: A novel model and the corresponding calibration approach are presented, which take the sensor as an integrated structure with the viewpoint that the camera captures two virtual points formed by the same object point due to the reflection effects of the mirrors.

Proceedings ArticleDOI
01 Oct 2013
TL;DR: A novel method for video camera positioning and reconfiguration is presented to maximize visual coverage in complex indoor environments, taking into account a number of parameters on the quality of view at the camera position.
Abstract: In this paper we present a novel method for video camera positioning and reconfiguration to maximize visual coverage in complex indoor environments. Based on suitable modeling of the camera field of view and of the environmental setup, the optimization procedure determines the most appropriate configuration of cameras to satisfy a coverage objective, taking into account a number of parameters on the quality of view at the camera position. These include the global ground area coverage, the expected geometric distortion, and the entropy of the image. The proposed solution has been validated in different environmental setups, including synthetic settings, taking into account the presence of obstacles and constraints.
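
Coverage maximization of this kind is often approximated greedily: repeatedly add the candidate camera pose that covers the most still-uncovered ground cells. The paper's optimizer also weighs view-quality terms (distortion, entropy); the sketch below handles coverage only and assumes a precomputed visibility matrix.

```python
import numpy as np

def greedy_camera_placement(coverage, k):
    """Greedy approximation of the coverage objective.
    coverage: (n_candidates, n_cells) boolean visibility matrix;
    k: number of cameras to place."""
    coverage = np.asarray(coverage, bool)
    covered = np.zeros(coverage.shape[1], bool)
    chosen = []
    for _ in range(k):
        gains = (coverage & ~covered).sum(axis=1)  # newly covered cells
        best = int(gains.argmax())
        if gains[best] == 0:
            break                                  # nothing left to gain
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered.mean()                  # poses and coverage ratio
```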

Proceedings ArticleDOI
21 Oct 2013
TL;DR: This paper presents an active camera-based realtime high-quality face image acquisition system, which utilizes pan-tilt-zoom parameters of a camera to focus on a human face in a scene and employs a face quality assessment method to log the best quality faces from the captured frames.
Abstract: Traditional still-camera-based facial image acquisition systems in surveillance applications produce low-quality face images. This is mainly due to the distance between the camera and the subjects of interest. Furthermore, people in such videos usually move around and change their head poses and facial expressions. Moreover, imaging conditions like illumination, occlusion, and noise may change. All of these degrade the quality of most of the detected face images in terms of measures like resolution, pose, brightness, and sharpness. To deal with these problems, this paper presents an active camera-based realtime high-quality face image acquisition system, which utilizes the pan-tilt-zoom parameters of a camera to focus on a human face in a scene and employs a face quality assessment method to log the best-quality faces from the captured frames. The system consists of four modules: face detection, camera control, face tracking, and face quality assessment before logging. Experimental results show that the proposed system can effectively log high-quality faces from the active camera in real time (an average of 61.74 ms was spent per frame) with an accuracy of 85.27% compared to human-annotated data.
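
A face quality assessment of the kind described can be approximated by scoring each candidate crop on resolution, sharpness, and brightness, and logging only the best score per track. The weights and target values below are illustrative assumptions, not the paper's trained measure.

```python
import numpy as np

def face_quality(face_gray):
    """Score a grayscale face crop on resolution, sharpness, and brightness."""
    h, w = face_gray.shape
    resolution = min(1.0, (h * w) / (128 * 128))          # saturate at 128x128
    gy, gx = np.gradient(face_gray.astype(float))
    sharpness = min(1.0, np.hypot(gx, gy).mean() / 20.0)  # mean gradient energy
    brightness = 1.0 - abs(face_gray.mean() - 128) / 128  # prefer mid-gray
    return 0.4 * resolution + 0.4 * sharpness + 0.2 * brightness

# Keep only the best-scoring face per track before logging:
# best = max(candidate_crops, key=face_quality)
```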