
Showing papers in "Real-time Imaging in 2005"


Journal ArticleDOI
TL;DR: A real-time algorithm for foreground-background segmentation is presented that can handle scenes containing moving backgrounds or illumination variations and achieves robust detection for different types of videos.
Abstract: We present a real-time algorithm for foreground-background segmentation. Sample background values at each pixel are quantized into codebooks which represent a compressed form of background model for a long image sequence. This allows us to capture structural background variation due to periodic-like motion over a long period of time under limited memory. The codebook representation is efficient in memory and speed compared with other background modeling techniques. Our method can handle scenes containing moving backgrounds or illumination variations, and it achieves robust detection for different types of videos. We compared our method with other multimode modeling techniques. In addition to the basic algorithm, two features improving the algorithm are presented: layered modeling/detection and adaptive codebook updating. For performance evaluation, we have applied perturbation detection rate analysis to four background subtraction algorithms and two videos of different types of scenes.

1,552 citations
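As a rough illustration of the per-pixel codebook matching described above, here is a minimal sketch; the brightness bounds, colour-distortion threshold and update rule are simplified assumptions for illustration, not the authors' exact parameters.

```python
import numpy as np

def color_distortion(x, v):
    """Distance of sample x from the line through the origin and codeword colour v."""
    x2 = np.dot(x, x)
    p2 = np.dot(x, v) ** 2 / max(np.dot(v, v), 1e-9)
    return np.sqrt(max(x2 - p2, 0.0))

def classify_pixel(x, codebook, eps=10.0, alpha=0.6, beta=1.3):
    """Return True (foreground) if no codeword matches the RGB sample x."""
    I = np.linalg.norm(x)                        # brightness of the sample
    for cw in codebook:                          # cw: {'v': colour vector, 'Ihat': brightness}
        low, high = alpha * cw['Ihat'], beta * cw['Ihat']
        if low <= I <= high and color_distortion(x, cw['v']) <= eps:
            cw['v'] = 0.95 * cw['v'] + 0.05 * x  # slow adaptation of the matched codeword
            cw['Ihat'] = max(cw['Ihat'], I)
            return False                         # background
    return True                                  # foreground

# toy usage: one pixel position with a single learned codeword
codebook = [{'v': np.array([100.0, 110.0, 95.0]), 'Ihat': 180.0}]
print(classify_pixel(np.array([102.0, 108.0, 97.0]), codebook))  # False -> background
print(classify_pixel(np.array([30.0, 200.0, 40.0]), codebook))   # True  -> foreground
```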


Journal ArticleDOI
TL;DR: The proposed technique employs a switching scheme based on an impulse detection mechanism using the so-called peer group concept and consistently yields very good results in suppressing both random and fixed-valued impulsive noise.
Abstract: In this paper, a novel approach to impulsive noise removal in color images is presented. The proposed technique employs a switching scheme based on an impulse detection mechanism using the so-called peer group concept. Compared to the vector median filter and other commonly used multichannel filters, the proposed technique consistently yields very good results in suppressing both random and fixed-valued impulsive noise. The main advantage of the proposed noise detection framework is its very high computational speed, which enables efficient filtering of color images in real-time applications.

190 citations
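The switching rule built on the peer group concept can be sketched as follows: a pixel is kept if enough window neighbours are close to it, and replaced by the vector median otherwise; the window size, distance threshold d and minimum peer count m are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def vector_median(window):
    """Pixel of the window minimizing the sum of distances to all other pixels."""
    dists = [np.sum([np.linalg.norm(p - q) for q in window]) for p in window]
    return window[int(np.argmin(dists))]

def filter_pixel(window, center_idx, d=45.0, m=3):
    center = window[center_idx]
    peers = sum(np.linalg.norm(center - p) <= d
                for i, p in enumerate(window) if i != center_idx)
    if peers >= m:
        return center                  # enough close neighbours: keep the pixel untouched
    return vector_median(window)       # likely impulse: switch to the vector median

# 3x3 window flattened to 9 RGB vectors; the centre (index 4) is an impulse
win = [np.array([120., 60., 60.])] * 4 + [np.array([255., 0., 255.])] + [np.array([118., 62., 58.])] * 4
print(filter_pixel(win, 4))            # replaced by a neighbouring colour
```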


Journal ArticleDOI
TL;DR: A ground-based real-time remote sensing system was developed for detecting diseases in arable crops under field conditions and in an early stage of disease development, before symptoms become visible, through sensor fusion of hyper-spectral reflection information between 450 and 900nm and fluorescence imaging.
Abstract: The objective of this research was to develop a ground-based real-time remote sensing system for detecting diseases in arable crops under field conditions and in an early stage of disease development, before symptoms become visible. This was achieved through sensor fusion of hyper-spectral reflection information between 450 and 900nm and fluorescence imaging. The work reported here used yellow rust (Puccinia striiformis) disease of winter wheat as a model system for testing the featured technologies. Hyper-spectral reflection images of healthy and infected plants were taken with an imaging spectrograph under field circumstances and ambient lighting conditions. Multi-spectral fluorescence images were taken simultaneously on the same plants using UV-blue excitation. Through comparison of the 550 and 690nm fluorescence images, it was possible to detect disease presence. The fraction of pixels in one image recognized as diseased was taken as the final fluorescence disease variable, called the lesion index (LI). A spectral reflection method, based on only three wavebands, was developed that could discriminate diseased from healthy plants with an overall error of about 11.3%. The method based on fluorescence was less accurate, with an overall discrimination error of about 16.5%. However, fusing the measurements from the two approaches gave an overall diseased/healthy discrimination accuracy of 94.5% using quadratic discriminant analysis (QDA). Data fusion was also performed using a Self-Organizing Map (SOM) neural network, which decreased the overall classification error to 1%. The possible implementation of the SOM-based disease classifier for rapid retraining in the field is discussed. Further, the real-time aspects of the acquisition and processing of spectral and fluorescence images are discussed. With the proposed adaptations, the multi-sensor fusion disease detection system can be applied to the real-time detection of plant disease in the field.

181 citations
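As a very rough illustration of the lesion index (the fraction of pixels flagged as diseased), here is a minimal sketch; the use of a simple F690/F550 ratio test and its threshold are assumptions made for illustration only, not the authors' actual decision rule.

```python
import numpy as np

def lesion_index(f550, f690, ratio_threshold=1.15):
    """Fraction of pixels whose fluorescence ratio departs from healthy tissue."""
    ratio = f690.astype(float) / np.maximum(f550.astype(float), 1e-6)
    diseased = ratio > ratio_threshold
    return diseased.mean()

f550 = np.full((100, 100), 80.0)
f690 = np.full((100, 100), 85.0)
f690[40:60, 40:60] = 120.0                       # a synthetic lesion patch
print(f"LI = {lesion_index(f550, f690):.3f}")    # ~0.040
```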


Journal ArticleDOI
TL;DR: This work describes the development of a unique industrial inline material sorting system which uses the spectral imaging technique, and the main functional parts and the sensor unit are described in detail.
Abstract: Spectral imaging is becoming increasingly interesting not only for agricultural use but also for industrial applications. Wavelengths in the near infrared (NIR) range, in particular, can be used for materials classification. However, sorting paper according to quality is a very difficult task due to the close similarities between the materials. This work describes the development of a unique industrial inline material sorting system which uses the spectral imaging technique. The main functional parts and the sensor unit are described in detail. Classification methods for cellulose-based materials such as pulp, paper and cardboard will be discussed, as will hardware requirements for the industrial use of spectral imaging solutions, including adjustment and calibration techniques. The description of the software design focuses on the classification speed required.

132 citations


Journal ArticleDOI
TL;DR: It is shown that the new filter outperforms the classical order-statistics filtering techniques and that its performance is similar to FSVF, outperforming it in some cases.
Abstract: In this paper, the problem of impulsive noise reduction in multichannel images is addressed. A new filter is proposed on the basis of a recently introduced family of computationally attractive filters with good detail-preserving ability (FSVF). FSVF is based on privileging the central pixel in each filtering window in order to replace it only when it is really noisy and to preserve the original undistorted image structures. The new filter is based on a novel fuzzy metric and is created by combining the mentioned scheme with the fuzzy metric. The use of the fuzzy metric makes the filter computationally simpler and allows the privilege of the central pixel to be adjusted, giving the filter an adaptive nature. Moreover, it is shown that the new filter outperforms the classical order-statistics filtering techniques and that its performance is similar to FSVF, outperforming it in some cases.

130 citations
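A loose sketch of the central-pixel-privileging idea with a simple fuzzy similarity follows; the similarity function, the constant K and the privilege weight k are illustrative assumptions and do not reproduce the paper's exact fuzzy metric or tuning.

```python
import numpy as np

K = 1024.0

def fuzzy_similarity(x, y):
    """Per-channel ratio similarity in (0, 1]; equals 1 when x == y."""
    return np.prod((np.minimum(x, y) + K) / (np.maximum(x, y) + K))

def filter_window(window, center_idx, k=0.85):
    # accumulated similarity of each pixel to the rest of the window
    acc = np.array([sum(fuzzy_similarity(p, q) for q in window) for p in window])
    acc = acc / acc.max()
    acc[center_idx] /= k          # dividing by k < 1 privileges (boosts) the centre
    return window[int(np.argmax(acc))]

win = [np.array([100., 100., 100.])] * 8 + [np.array([250., 10., 240.])]
print(filter_window(win, 8))                                   # noisy centre is replaced
print(filter_window([np.array([100., 100., 100.])] * 9, 4))    # clean centre is kept
```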


Journal ArticleDOI
TL;DR: A robust approach to extract moving objects on H.264/AVC compressed video using a block-based Markov Random Field model to segment moving objects from the sparse motion vector field obtained directly from the bitstream is proposed.
Abstract: Moving object segmentation in the compressed domain plays an important role in many real-time applications, e.g. video indexing, video transcoding, video surveillance, etc. Because H.264/AVC is the most recent video-coding standard, little work has been reported on video analysis of H.264/AVC compressed video. Compared with the earlier MPEG standards, H.264/AVC employs several new coding tools and provides a different video format. As a consequence, moving object segmentation on H.264/AVC compressed video is a new and challenging task. In this paper, a robust approach to extract moving objects from H.264/AVC compressed video is proposed. Our algorithm employs a block-based Markov Random Field (MRF) model to segment moving objects from the sparse motion vector field obtained directly from the bitstream. In the proposed method, object tracking is integrated into the uniform MRF model and simultaneously exploits the objects' temporal consistency. Experiments show that our approach provides remarkable performance and can extract moving objects efficiently and robustly. The prominent applications of the proposed algorithm are object-based transcoding, fast moving object detection, video analysis on compressed video, etc.

126 citations
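To make the block-labelling idea concrete, below is a minimal sketch that labels blocks as moving or static from their motion-vector magnitudes with a Potts-style smoothness term, optimized by iterated conditional modes (ICM); the data term, weights and per-label motion model are illustrative assumptions rather than the paper's formulation, and real H.264/AVC motion vectors would of course come from the bitstream.

```python
import numpy as np

def segment_blocks(mv_mag, iters=5, t=1.5, beta=0.8):
    labels = (mv_mag > t).astype(int)            # initial guess from MV magnitude
    H, W = labels.shape
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                nbrs = [labels[i + di, j + dj]
                        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                        if 0 <= i + di < H and 0 <= j + dj < W]
                best, best_e = labels[i, j], np.inf
                for l in (0, 1):
                    data = abs(mv_mag[i, j] - (3.0 if l else 0.0))   # crude per-label model
                    smooth = beta * sum(l != n for n in nbrs)        # Potts smoothness
                    if data + smooth < best_e:
                        best, best_e = l, data + smooth
                labels[i, j] = best
    return labels

mv = np.zeros((8, 8)); mv[2:5, 3:6] = 4.0; mv[6, 0] = 2.0   # coherent mover plus one noisy MV
print(segment_blocks(mv))    # the isolated noisy block is smoothed away, the mover is kept
```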


Journal ArticleDOI
TL;DR: The proposed AFM can track deformable, partially occluded objects using a greatly reduced number of feature points rather than the entire shapes used in existing shape-based methods, which makes a real-time, robust tracking system realizable.
Abstract: This paper presents a feature-based object tracking algorithm using optical flow under the non-prior training (NPT) active feature model (AFM) framework. The proposed tracking procedure can be divided into three steps: (i) localization of an object-of-interest, (ii) prediction and correction of the object's position by utilizing spatio-temporal information, and (iii) restoration of occlusion using NPT-AFM. The proposed algorithm can track both rigid and deformable objects, and is robust against the object's sudden motion because both a feature point and the corresponding motion direction are tracked at the same time. Tracking performance is not degraded even with a complicated background because feature points inside an object are completely separated from the background. Finally, the AFM enables stable tracking of occluded objects with up to 60% occlusion. NPT-AFM, which is one of the major contributions of this paper, removes the off-line, preprocessing step for generating an a priori training set. The training set used for model fitting can be updated at each frame to make the object's features more robust under occlusion. The proposed AFM can track deformable, partially occluded objects using a greatly reduced number of feature points rather than the entire shapes used in existing shape-based methods. The on-line updating of the training set and the reduced number of feature points make a real-time, robust tracking system realizable. Experiments have been performed using several in-house video clips from a static camera including objects such as a robot moving on a floor and people walking both indoors and outdoors. In order to show the performance of the proposed tracking algorithm, some experiments have been performed under noisy and low-contrast conditions. For more objective comparison, the PETS 2001 and PETS 2002 datasets were also used.

98 citations


Journal ArticleDOI
TL;DR: The real-time segmentation of surgical instruments with color images used in minimally invasive surgery is addressed and a technique based on a discriminant color feature with robustness capabilities with respect to intensity variations and specularities is developed.
Abstract: In this paper, the real-time segmentation of surgical instruments in color images used in minimally invasive surgery is addressed. This work has been developed in the scope of robotized laparoscopic surgery, specifically for the detection and tracking of gray regions corresponding to images of metallic instruments inside the abdominal cavity. In this environment, the moving background due to breathing motion, the non-uniform and time-varying lighting conditions, and the presence of specularities are the main difficulties to overcome. To achieve an automatic color segmentation suitable for robot control, we developed a technique based on a discriminant color feature with robustness to intensity variations and specularities. We also designed an adaptive region growing with automatic region seed detection and a model-based region classification, both dedicated to laparoscopy. The foreseen application is a good training ground to evaluate the proposed technique, and the effectiveness of this work has been demonstrated through experimental results with endoscopic image sequences, efficiently locating the boundaries of a landmark-free needle-holder at half the video rate.

75 citations


Journal ArticleDOI
TL;DR: This paper presents a detailed description of a real-time correlation-based stereo algorithm running completely on the graphics processing unit (GPU) to free up the main processor for other tasks including high-level interpretation of the stereo results.
Abstract: This paper presents a detailed description of a real-time correlation-based stereo algorithm running completely on the graphics processing unit (GPU). This is important since it frees up the main processor for other tasks, including high-level interpretation of the stereo results. We first introduce a two-view stereo algorithm that includes some advanced features such as adaptive windows and cross-checking. Then we extend it using a plane-sweep approach to allow multiple frames without rectification. By taking advantage of advanced features of recent GPUs, the proposed algorithm runs in real time. Our implementation running on an ATI Radeon 9800 graphics card achieves up to 289 million disparity evaluations per second including all the overhead to download images and read back the disparity map, which is several times faster than commercially available CPU-based implementations.

65 citations
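For reference, a plain CPU sketch of the core correlation matching (sum of absolute differences over a small window, winner-take-all over a disparity range) is shown below; the window size, disparity range and synthetic test images are assumptions, and the paper's adaptive windows, cross-checking, plane sweep and GPU mapping are not reproduced.

```python
import numpy as np

def disparity(left, right, max_d=16, w=3):
    """Winner-take-all SAD block matching; left/right are greyscale arrays."""
    H, W = left.shape
    disp = np.zeros((H, W), dtype=int)
    r = w // 2
    for y in range(r, H - r):
        for x in range(r, W - r):
            best, best_cost = 0, np.inf
            patch = left[y - r:y + r + 1, x - r:x + r + 1].astype(float)
            for d in range(min(max_d, x - r) + 1):
                cand = right[y - r:y + r + 1, x - d - r:x - d + r + 1].astype(float)
                cost = np.abs(patch - cand).sum()        # sum of absolute differences
                if cost < best_cost:
                    best, best_cost = d, cost
            disp[y, x] = best
    return disp

# synthetic pair: the right view is the left view shifted by 4 pixels
left = np.tile(np.arange(64, dtype=float), (32, 1)) % 13
right = np.roll(left, -4, axis=1)
print(np.bincount(disparity(left, right).ravel()))   # interior pixels cluster at d = 4
```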


Journal ArticleDOI
TL;DR: A novel real-time 3D and color sensor for the mid-distance range (0.1-3m) based on color-encoded structured light is presented, designed to assist and complement a face authentication system integrating both 2D and 3D images.
Abstract: In this paper, a novel real-time 3D and color sensor for the mid-distance range (0.1-3m) based on color-encoded structured light is presented. The sensor is integrated using low-cost, off-the-shelf components and allows the combination of 2D and 3D image processing algorithms, since it provides a 2D color image of the scene in addition to the range data. Its design is focused on enabling the system to operate reliably in real-world scenarios, i.e. in uncontrolled environments and with arbitrary scenes. To that end, novel approaches for encoding and recognizing the projected light are used, which make the system practically independent of intrinsic object colors and minimize the influence of the ambient light conditions. The system was designed to assist and complement a face authentication system integrating both 2D and 3D images. Depth information is used for robust face detection, localization and 3D pose estimation. To cope with illumination and pose variations, 3D information is used for the normalization of the input images. The performance and robustness of the proposed system is tested on a face database recorded in conditions similar to those encountered in real-world applications.

60 citations


Journal ArticleDOI
TL;DR: A new implementation of wavelet packet decomposition combined with the SPIHT (Set Partitioning in Hierarchical Trees) compression scheme is introduced, and it is shown that WP-SPIHT significantly outperforms the baseline SPIHT coder for texture images.
Abstract: This paper introduces a new implementation of wavelet packet decomposition which is combined with the SPIHT (Set Partitioning in Hierarchical Trees) compression scheme. We provide an analysis of the problems arising from the application of zerotree-quantisation-based algorithms (such as SPIHT) to wavelet packet transform coefficients. We establish generalized parent-child relationships for wavelet packets, providing complete tree structures for SPIHT. The proposed algorithm can be used with both dyadic wavelet and wavelet packet decompositions (WP-SPIHT). An extensive evaluation of the algorithm was performed, and it has been shown that WP-SPIHT significantly outperforms the baseline SPIHT coder for texture images. For these images the suboptimal WP cost function provides sufficient energy compaction, which is efficiently exploited by WP-SPIHT.

Journal ArticleDOI
TL;DR: Video-based surveillance (or video surveillance) is one of the fastest growing sectors in the security market due to the high amount of useful information that can be extracted from a video sequence.
Abstract: Video-based surveillance (or video surveillance) is one of the fastest growing sectors in the security market. This is due to the high amount of useful information that can be extracted from a video sequence. In particular, the automatic real-time processing of video objects, i.e., the extraction of video objects and related high-level content, is hereby of paramount importance. High-level video content, e.g., object activities and events, is generally related to the movement of video objects. This is related to the human visual system (HVS), which is strongly attracted to moving objects creating luminance change [Nothdurft (1993); Franconeri and Simons (2003); Abrams and Christ (2003)].

Journal ArticleDOI
TL;DR: A new adaptive vector filter is proposed for impulse noise suppression and its relationship with the recent impulse reduction filters is investigated, which outperforms other prior-art methods in suppressing impulse noise in natural color images.
Abstract: A new adaptive vector filter is proposed for impulse noise suppression and its relationship with recent impulse reduction filters is investigated. The new filter detects outliers present in the image through a novel neighborhood evaluation process, which significantly improves the accuracy of noise detection and detail preservation. The computational complexity of the new filter is very competitive. Its two parameters can be configured efficiently using online/offline optimization processes. Extensive simulations indicate that the new filter outperforms other prior-art methods in suppressing impulse noise in natural color images.

Journal ArticleDOI
TL;DR: The SmartSpectra system is described, its performance in the estimation of chlorophyll in plant leaves is demonstrated, and its implications in real-time applications are discussed.
Abstract: SmartSpectra is a smart multispectral system for industrial, environmental, and commercial applications where the use of spectral information beyond the visible range is needed. The SmartSpectra system provides six spectral bands in the range 400-1000nm. The bands are configurable in terms of central wavelength and bandwidth by using electronic tunable filters. SmartSpectra consists of a multispectral sensor and the software that controls the system and simplifies the acquisition process. A first prototype called Autonomous Tunable Filter System is already available. This paper describes the SmartSpectra system, demonstrates its performance in the estimation of chlorophyll in plant leaves, and discusses its implications in real-time applications.

Journal ArticleDOI
TL;DR: A new secret sharing scheme suitable for encrypting color images is introduced that can be used in secure transmission of digital imaging material over untrusted networks or it can serve as stand-alone image encryption solution.
Abstract: A new secret sharing scheme suitable for encrypting color images is introduced. The scheme, which can be viewed as a cost-effective, private-key cryptosystem, encrypts the secret color image into two color shares with dimensions identical to those of the original secret input. Cryptographic operations performed at the bit levels are used to alter both the spectral correlation among the RGB color components and the spatial correlation of the neighboring color vectors in the secret image. The original image is decrypted with perfect reconstruction using inverse logical operations which are applied to the noise-like color shares. The scheme can be used in secure transmission of digital imaging material over untrusted networks or it can serve as stand-alone image encryption solution.
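As a generic illustration of the principle (two noise-like shares with the same dimensions as the secret, perfect reconstruction by inverting a bit-level operation), here is a minimal XOR-based sketch; the paper's specific bit-level operations on the RGB components and its spectral/spatial decorrelation steps are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_shares(secret):
    share1 = rng.integers(0, 256, secret.shape, dtype=np.uint8)   # pure noise
    share2 = np.bitwise_xor(secret, share1)                       # also noise-like
    return share1, share2

def reconstruct(share1, share2):
    return np.bitwise_xor(share1, share2)                         # exact recovery

secret = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)        # stand-in colour image
s1, s2 = make_shares(secret)
print(np.array_equal(reconstruct(s1, s2), secret))                # True
```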

Journal ArticleDOI
TL;DR: The paper demonstrates how, over a sufficient length of time, observations from the monitored scene itself can be used to parameterize the semantic landscape, and this scene knowledge must be automatically learnt to facilitate plug and play functionality.
Abstract: The accuracy of object tracking methodologies can be significantly improved by utilizing knowledge about the monitored scene. Such scene knowledge includes the homography between the camera and ground planes and the occlusion landscape identifying the depth map associated with the static occlusions in the scene. Using the ground plane, a simple method of relating the projected height and width of people objects to image location is used to constrain the dimensions of appearance models. Moreover, trajectory modeling can be greatly improved by performing tracking on the ground plane using global real-world noise models for the observation and dynamic processes. Finally, the occlusion landscape allows the tracker to predict the complete or partial occlusion of object observations. To facilitate plug and play functionality, this scene knowledge must be automatically learnt. The paper demonstrates how, over a sufficient length of time, observations from the monitored scene itself can be used to parameterize the semantic landscape.
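One ingredient of that scene knowledge, mapping an object's image footprint to ground-plane coordinates through the camera-to-ground homography, can be sketched as follows; the homography matrix and bounding-box values below are made-up examples, whereas in the paper the scene knowledge is learnt automatically from observations of the scene.

```python
import numpy as np

# made-up camera-to-ground homography for illustration only
H = np.array([[0.05, 0.000, -10.0],
              [0.00, 0.080, -20.0],
              [0.00, 0.002,   1.0]])

def image_to_ground(u, v):
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w               # ground-plane coordinates

# bottom-centre of a person's bounding box (the foot point), in image pixels
bbox = (300, 150, 40, 120)            # (u, v, width, height)
foot = (bbox[0] + bbox[2] / 2, bbox[1] + bbox[3])
print(image_to_ground(*foot))         # track this point on the ground plane
```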

Journal ArticleDOI
TL;DR: This paper describes the development of a low-cost and high-speed OMR system prototype for marking multiple-choice questions and the novelty of this approach is the implementation of the complete system into a single low- cost Field Programmable Gate Array (FPGA) to achieve the high processing speed.
Abstract: In today's fast-paced, information-driven society, the need for accurate, timely, and cost-effective data collection is critical. Optical mark reader (OMR) systems can be used to meet these requirements. This paper describes the development of a low-cost and high-speed OMR system prototype for marking multiple-choice questions. The novelty of this approach is the implementation of the complete system on a single low-cost Field Programmable Gate Array (FPGA) to achieve the high processing speed. Effective mark detection and verification algorithms have been developed and implemented to achieve real-time performance at low computational cost. The OMR is capable of processing a high-resolution CCD linear sensor with 3456 pixels at 5000 frames/s at the effective maximum clock rate of the sensor of 20MHz (4x5MHz). The performance of the prototype system is tested for different marker colours and marking methods. At the end of the paper the proposed OMR system is compared with commercially available systems and the pros and cons are discussed.

Journal ArticleDOI
TL;DR: Edge-sensing weights and the original color filter array data are used to detect structural elements in the captured image, and correct color components generated by the demosaicking process using adaptive, spectral model-based enhancement operations.
Abstract: This paper presents an efficient post-processing/enhancement solution capable of reducing visual artifacts introduced during the image demosaicking process. Edge-sensing weights and the original color filter array data are used to detect structural elements in the captured image, and correct color components generated by the demosaicking process using adaptive, spectral model-based enhancement operations. The solution produces excellent results in terms of both objective and subjective image quality measures.

Journal ArticleDOI
TL;DR: The design and field programmable gate array (FPGA) implementation of a non-separable 2-D DBWT architecture, which is the heart of the proposed high-definition television (HDTV) compression system and conforms to the JPEG-2000 standard, is reported.
Abstract: Recent advances in image analysis have shown that the application of the 2-D discrete biorthogonal wavelet transform (DBWT) to digital image compression overcomes some of the barriers imposed by block-based transform coding algorithms while offering significant advantages in terms of coding gain, quality, natural compatibility with video formats requiring lower resolution, and graceful performance degradation when compressing at low bit rates. This paper reports on the design and field programmable gate array (FPGA) implementation of a non-separable 2-D DBWT architecture which is the heart of the proposed high-definition television (HDTV) compression system. The architecture adopts periodic symmetric extension at the image boundaries and therefore conforms to the JPEG-2000 standard. It computes the DBWT decomposition of an NxN image in approximately 2N^2/3 clock cycles (ccs). Hardware implementation results based on a Xilinx Virtex-2000E FPGA chip showed that the 2-D DBWT processing can be performed at 105MHz, providing a complete solution for the real-time computation of the 2-D DBWT for HDTV compression.

Journal ArticleDOI
TL;DR: Simulation studies reported here indicate that the new zooming methods produce excellent results, in terms of both objective and subjective evaluation metrics, and outperform conventional zooming schemes operating in the RGB domain.
Abstract: In this paper, zooming methods which operate directly on color filter array (CFA) data are proposed, analyzed, and evaluated. Under the proposed framework enlarged spatial resolution images are generated directly from the CFA-based image sensors. The reduced computational complexity of the proposed schemes makes them ideal for real-time surveillance systems, industrial strength computer vision solutions, and mobile sensor-based visual systems. Simulation studies reported here indicate that the new methods (i) produce excellent results, in terms of both objective and subjective evaluation metrics, and (ii) outperform conventional zooming schemes operating in the RGB domain.

Journal ArticleDOI
TL;DR: This work attempts to solve the 3-D limb tracking problem using only monocular imagery (a single 2-D video source) in largely unconstrained environments and presents a complete visual tracking system which incorporates target detection, target model acquisition/initialization, and target tracking components into a single, cohesive, probabilistic framework.
Abstract: The 3-D visual tracking of human limbs is fundamental to a wide array of computer vision applications including gesture recognition, interactive entertainment, biomechanical analysis, vehicle driver monitoring, and electronic surveillance. The problem of limb tracking is complicated by issues of occlusion, depth ambiguities, rotational ambiguities, and high levels of noise caused by loose fitting clothing. We attempt to solve the 3-D limb tracking problem using only monocular imagery (a single 2-D video source) in largely unconstrained environments. The approach presented is a movement towards full real-time operating capabilities. The described system presents a complete visual tracking system which incorporates target detection, target model acquisition/initialization, and target tracking components into a single, cohesive, probabilistic framework. The presence of a target is detected, using visual cues alone, by recognition of an individual performing a simple pre-defined initialization cue. The physical dimensions of the limb are then learned probabilistically until a statistically stable model estimate has been found. The appearance of the limb is learned in a joint spatial-chromatic domain which incorporates normalized color data with spatial constraints in order to model complex target appearances. The target tracking is performed within a Monte Carlo particle filtering framework which is capable of maintaining multiple state-space hypotheses and propagating ambiguity until less ambiguous data is observed. Multiple image cues are combined within this framework in a principled Bayesian manner. The target detection and model acquisition components are able to perform at near real-time frame rates and are shown to accurately recognize the presence of a target and initialize a target model specific to that user. The target tracking component has demonstrated exceptional resilience to occlusion and temporary target disappearance and contains a natural mechanism for the trade-off between accuracy and speed. At this point, the target tracking component performs at sub real-time frame rates, although several methods to increase the effective operating speed are proposed.

Journal ArticleDOI
TL;DR: A hybrid scheme of inter and intraframe coding is proposed, which achieves higher lossless video compression ratio than existing methods such as JPEG-LS and JPEG-2000.
Abstract: This paper presents a simple, fast coding technique for lossless compression of mosaic video data. The design of a video codec needs to strike a balance between the compression performance and the codec throughput. Aiming to make the encoding throughput high enough for real-time lossless video compression, we propose a hybrid scheme of inter and intraframe coding. Interframe predictive coding is invoked only when the motion between adjacent frames is modest and a simple motion compensation operation can significantly improve the compression performance. Otherwise, still frame compression is performed to keep the complexity low. Experimental results show that the proposed scheme achieves higher lossless video compression ratio than existing methods such as JPEG-LS and JPEG-2000.
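A toy sketch of the inter/intra switching decision follows: interframe (zero-motion) prediction is used only when the previous frame predicts the current one well, otherwise a simple intra (left-neighbour) predictor is used; the cost measure, threshold and predictors are illustrative assumptions, and the paper's actual motion test and entropy coder are not shown.

```python
import numpy as np

def residual_cost(residual):
    return np.abs(residual).mean()                  # crude proxy for coded bits

def choose_mode(cur, prev, thresh=4.0):
    inter_res = cur.astype(int) - prev.astype(int)              # zero-motion interframe prediction
    intra_pred = np.concatenate([cur[:, :1], cur[:, :-1]], axis=1)
    intra_res = cur.astype(int) - intra_pred.astype(int)        # left-neighbour intraframe prediction
    if residual_cost(inter_res) < min(thresh, residual_cost(intra_res)):
        return "inter", inter_res
    return "intra", intra_res

prev = np.random.default_rng(0).integers(0, 256, (32, 32), dtype=np.uint8)
cur_static = prev.copy(); cur_static[5, 5] ^= 1                 # almost unchanged frame
print(choose_mode(cur_static, prev)[0])                         # inter
cur_cut = np.roll(prev, 16, axis=1)                             # large change: scene shift
print(choose_mode(cur_cut, prev)[0])                            # intra
```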

Journal ArticleDOI
TL;DR: An automated visual inspection (AVI) system for detecting and sorting contaminants from wool in real time is presented, and the results demonstrate that the factory AVI system could identify and remove the contaminants at a camera speed of around 800 lines/s and a conveyor speed of 20m/min in real time.
Abstract: In the textile industry, scoured wool contains different types of foreign materials (contaminants) that need to be separated out before it goes into further processing, so that the textile machines are protected from damage and the quality of the final woollen products is ensured. This paper presents an automated visual inspection (AVI) system for detecting and sorting contaminants from wool in real time. The techniques were first developed in the lab and subsequently applied to a large-scale factory system. The combinative use of image processing algorithms in RGB and HSV colour spaces can segment 96% of contaminant types (minimum size around 4cm long and 5mm in diameter) in real-time on the lab test rig. One of the most important aspects of the system is to use the non-linear colour space transformation and merge the threshold algorithm in HSV colour space into the image processing algorithms in RGB colour space to enhance the contaminant identification in real time. The real-time capability of the system is also analysed in detail. The experimental results demonstrate that the factory AVI system could identify and remove the contaminants at a camera speed of around 800 lines/s and the conveyor speed of 20m/min in real time.
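As a crude illustration of merging an RGB rule with an HSV rule for contaminant segmentation, here is a minimal sketch; the thresholds and the synthetic wool/contaminant colours are invented for illustration, and the paper's tuned transformation, morphological clean-up and real-time pipeline are not reproduced.

```python
import numpy as np
import colorsys

def segment(rgb):                                   # rgb: HxWx3 floats in [0, 1]
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rgb_mask = (r - b > 0.15) | (g - b > 0.15)      # colour cast unlike off-white wool
    hsv = np.array([colorsys.rgb_to_hsv(*px) for px in rgb.reshape(-1, 3)]).reshape(rgb.shape)
    hsv_mask = (hsv[..., 1] > 0.35) | (hsv[..., 2] < 0.3)   # saturated or dark material
    return rgb_mask | hsv_mask                      # merge the two rules

wool = np.full((8, 8, 3), 0.85)
wool[2:4, 2:5] = [0.2, 0.3, 0.6]                    # a synthetic blue contaminant
print(segment(wool).astype(int))
```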

Journal ArticleDOI
TL;DR: This paper utilizes the discrete wavelet transform to achieve a computationally efficient implementation of the multi-scale clustering algorithm in a 3D color space and it is shown that the developed method outperforms the popular color quantization methods in terms of color distortion.
Abstract: Color quantization is the process of reducing the number of colors in an image. That is, color quantization maps a large number of colors into a much smaller number of representative colors while keeping color distortion to an acceptable level. The reduction in the number of colors lowers computational complexity associated with color processing, and achieves higher color image compression for storage and transmission purposes. The existing color quantization methods require that the number of representative or prominent colors be specified by the user. This paper presents a scene-adaptive color quantization method which eases this constraint by determining the number of representative colors automatically. This method utilizes the discrete wavelet transform to achieve a computationally efficient implementation of the multi-scale clustering algorithm in a 3D color space. The performance is evaluated in terms of compression ratio or number of representative colors, color distortion, and computational complexity. It is shown that the developed method outperforms the popular color quantization methods in terms of color distortion.

Journal ArticleDOI
TL;DR: A context-based intra-mode selection method is proposed, which uses neighbors' mode information to alleviate the large computational burden of intra-block coding in the H.264/MPEG-4 part 10 video-coding standard.
Abstract: A new mode selection method for intra-block coding in the H.264/MPEG-4 part 10 video-coding standard is proposed. There are nine 4x4 and four 16x16 intra-prediction modes for intra-block coding. Thus, to find the best intra-block coding mode for a 16x16 macroblock, all of the 144 (16x9) 4x4 and four 16x16 intra-modes need to be exhaustively computed and evaluated. We propose a context-based intra-mode selection method which uses neighbors' mode information to alleviate this large computational burden. We have verified the effectiveness of our proposed algorithm experimentally under various conditions.
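The flavour of the context-based selection can be sketched as follows: only the modes suggested by the left and upper neighbours (plus DC) are evaluated instead of all nine 4x4 modes; the candidate rule, the SAD cost and the stand-in predictor are illustrative assumptions rather than the paper's exact criterion (real H.264 intra prediction uses neighbouring reconstructed pixels).

```python
import numpy as np

DC_MODE = 2

def candidate_modes(left_mode, up_mode):
    cands = {DC_MODE}                               # always keep DC as a fallback
    if left_mode is not None:
        cands.add(left_mode)
    if up_mode is not None:
        cands.add(up_mode)
    return sorted(cands)

def best_mode(block, predict, left_mode, up_mode):
    """Evaluate only the context-suggested candidates and pick the least SAD."""
    costs = {m: np.abs(block - predict(block, m)).sum()
             for m in candidate_modes(left_mode, up_mode)}
    return min(costs, key=costs.get)

def toy_predict(block, mode):                       # stand-in predictor, not H.264's
    if mode == 0:
        return np.tile(block[0], (4, 1))            # "vertical": repeat the top row
    if mode == 1:
        return np.tile(block[:, :1], (1, 4))        # "horizontal": repeat the left column
    return np.full((4, 4), block.mean())            # "DC": flat mean

block = np.tile(np.array([[10, 20, 30, 40]]), (4, 1))            # vertically uniform stripes
print(best_mode(block, toy_predict, left_mode=0, up_mode=1))     # 0 (vertical) wins
```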

Journal ArticleDOI
TL;DR: This paper describes a real-time system that has been used for detecting tailgating, an example of complex interactions and activities within a vehicle parking scenario, using an adaptive background learning algorithm and intelligence to overcome the problems of object masking, separation and occlusion.
Abstract: Intelligent surveillance has become an important research issue due to the high cost and low efficiency of human supervisors, and machine intelligence is required to provide a solution for automated event detection. In this paper we describe a real-time system that has been used for detecting tailgating, an example of complex interactions and activities within a vehicle parking scenario, using an adaptive background learning algorithm and intelligence to overcome the problems of object masking, separation and occlusion. We also show how a generalized framework may be developed for the detection of other complex events.

Journal ArticleDOI
TL;DR: A CMOS sensor, processor, and reconfigurable unit integrated on the same chip will allow scalability, flexibility, and high performance in a smart camera with real-time video processing capabilities.
Abstract: Computer-assisted vision plays an important role in our society, in various fields such as personal and goods safety, industrial production, telecommunications, robotics, etc. However, technical developments are still rare and are slowed down by various factors linked to sensor cost, lack of system flexibility, the difficulty of rapidly developing complex and robust applications, and the lack of interaction among these systems themselves or with their environment. This paper describes our proposal for a smart camera with real-time video processing capabilities. A CMOS sensor, processor, and reconfigurable unit integrated on the same chip will allow scalability, flexibility, and high performance.

Journal ArticleDOI
TL;DR: The contributions are the interpretation of spatio-temporal object features to detect context-independent events in real time, the adaptation to noise, and the correction and compensation of low-level processing errors at higher layers where more information is available.
Abstract: The purpose of this paper is to investigate a real-time system to detect context-independent events in video shots. We test the system in video surveillance environments with a fixed camera. We assume that objects have been segmented (not necessarily perfectly) and reason with their low-level features, such as shape, and mid-level features, such as trajectory, to infer events related to moving objects. Our goal is to detect generic events, i.e., events that are independent of the context of where or how they occur. Events are detected based on a formal definition of these and on approximate but efficient world models. This is done by continually monitoring changes and behavior of features of video objects. When certain conditions are met, events are detected. We classify events into four types: primitive, action, interaction, and composite. Our system includes three interacting video processing layers: enhancement to estimate and reduce additive noise, analysis to segment and track video objects, and interpretation to detect context-independent events. The contributions in this paper are the interpretation of spatio-temporal object features to detect context-independent events in real time, the adaptation to noise, and the correction and compensation of low-level processing errors at higher layers where more information is available. The effectiveness and real-time response of our system are demonstrated by extensive experimentation on indoor and outdoor video shots in the presence of multi-object occlusion, different noise levels, and coding artifacts.

Journal ArticleDOI
TL;DR: A novel progressive image transmission scheme based on the quadtree segmentation technique that provides good image qualities at low bit rates and consumes very little computational cost in both image encoding and decoding procedures is introduced.
Abstract: A novel progressive image transmission scheme based on the quadtree segmentation technique is introduced in this paper. A 3-level quadtree is used in the quadtree segmentation technique to partition the original image into blocks of different sizes. Image blocks of different sizes are encoded by their block mean values. The relative addressing technique is employed to cut down the storage cost of block mean values. In the proposed scheme, the number of image hierarchies can be adaptively selected according to the specific application. By exploiting inter-pixel correlation and differently sized blocks for segmentation, the proposed scheme provides good image quality at low bit rates and consumes very little computational cost in both the image encoding and decoding procedures. It is quite suitable for real-time progressive image transmission.
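The quadtree side of the scheme can be sketched as follows: a block is split when its pixels deviate too much from the block mean, and one mean value per leaf block is transmitted; the deviation threshold and the synthetic image are illustrative assumptions, and the relative-addressing bit coding and the progressive ordering of hierarchies are not shown.

```python
import numpy as np

def quadtree_means(img, x=0, y=0, size=None, level=0, max_level=2, thresh=12.0, out=None):
    """Split a square block over up to 3 levels (0..max_level) and collect leaf means."""
    if out is None:
        out, size = [], img.shape[0]
    block = img[y:y + size, x:x + size]
    if level == max_level or np.abs(block - block.mean()).max() <= thresh:
        out.append((level, x, y, float(block.mean())))       # leaf: transmit one mean
    else:
        h = size // 2
        for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):       # recurse into 4 children
            quadtree_means(img, x + dx, y + dy, h, level + 1, max_level, thresh, out)
    return out

img = np.full((16, 16), 100.0)
img[8:16, 8:16] = np.arange(64).reshape(8, 8) * 3.0           # one detailed quadrant
leaves = quadtree_means(img)
print(len(leaves), leaves[:3])                                # 7 leaves in total
```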

Journal ArticleDOI
TL;DR: The building of a multi-camera recording system using off-the-shelf video cameras that, unlike existing multi- camera systems which rely on expensive and time consuming optical alignment of camera views, relies upon the virtual alignment of cameras performed in software and in real-time.
Abstract: For consumer imaging applications, multi-spectral color refers to capturing and displaying images in more than three primary colors in order to achieve color gamuts significantly larger than those produced by RGB capture and display devices. In this paper, we describe the building of a multi-camera recording system using off-the-shelf video cameras that, unlike existing multi-camera systems which rely on expensive and time-consuming optical alignment of camera views, relies upon the virtual alignment of cameras performed in software and in real time. Once images are properly aligned, the described camera system represents a recording platform that scales linearly in cost with the number of color primaries, where new colors are added by simply attaching more cameras; in this paper, we illustrate frames of the color video produced using a five-camera system. The real-time aspects of this work are (1) collecting an arbitrary number of color channels simultaneously while recording video at video rate and (2) synthesizing and displaying the aligned channels on the computer screen on-line and in real time.