
Showing papers on "Channel (digital image)" published in 2008


Proceedings ArticleDOI
09 Dec 2008
TL;DR: A new algorithm for RGB image based steganography that introduces the concept of storing a variable number of bits in each channel based on the actual color values of that pixel: a lower color component stores a higher number of bits.
Abstract: In this paper, we present a new algorithm for RGB image based steganography. Our algorithm introduces the concept of storing a variable number of bits in each channel (R, G or B) of a pixel based on the actual color values of that pixel: a lower color component stores a higher number of bits. Our algorithm offers very high capacity for cover media compared to other existing algorithms. We present experimental results showing the superiority of our algorithm. We also present comparative results with other similar algorithms in image based steganography.
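The idea can be sketched as follows. The partition thresholds and the choice to write into the lowest-valued component are assumptions for illustration only; the abstract does not give the paper's exact allocation rule.

```python
def bits_for(value):
    # Hypothetical allocation rule (the paper's exact partition is not
    # given in the abstract): darker colour components hide more bits.
    if value < 64:
        return 4
    if value < 128:
        return 3
    if value < 192:
        return 2
    return 1

def embed(pixel, payload_bits):
    """Embed the leading payload bits into the lowest-valued component
    of one RGB pixel; returns the stego pixel and the remaining bits."""
    pixel = list(pixel)
    c = pixel.index(min(pixel))             # lower component stores more bits
    n = bits_for(pixel[c])
    chunk = payload_bits[:n].ljust(n, "0")  # pad the final chunk with zeros
    pixel[c] = (pixel[c] & ~((1 << n) - 1)) | int(chunk, 2)
    return tuple(pixel), payload_bits[n:]

pix, rest = embed((200, 40, 90), "10110")   # G=40 is lowest, stores 4 bits
```

A decoder would recompute which component is lowest at each pixel; since only low-order bits change, the component ordering is normally preserved.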

117 citations


Proceedings Article
01 Jan 2008
TL;DR: In this article, a pair of lens-mounted filters were used to enhance the visible images using near-IR information, and the results showed that using information from two different color encodings, depending on the image content, produces vivid, contrasted images that are pleasing to the observers.
Abstract: Current digital camera sensors are inherently sensitive to the near-infrared part of the spectrum. To prevent the near-IR contamination of images, an IR blocking filter (hot mirror) is placed in front of the sensor. In this work, we start by replacing the camera's hot mirror by a piece of clear glass, thus making the camera sensitive to both visible and near-IR light. Using a pair of lens-mounted filters, we explore the differences in operating the camera to take visible and near-IR images of a given scene. Our aim is to enhance the visible images using near-IR information. To do so, we first discuss the physical causes of differences between visible and near-IR natural images, and remark that these causes are not correlated with a particular colour, but with atmospheric conditions and surface characteristics. We then investigate image enhancement by considering the near-IR channel as either colour, luminance, or frequency counterpart to the visible image, and conclude that using information from two different colour encodings, depending on the image content, produces vivid, contrasted images that are pleasing to the observers.

107 citations


Patent
17 Dec 2008
TL;DR: In this article, the authors present a method to obtain first data representing a first chrominance channel of a color image or video, where the first data comprises a watermark signal embedded therein.
Abstract: The present invention relates generally to digital watermarking. One claim recites a method including: obtaining first data representing a first chrominance channel of a color image or video, where the first data comprises a watermark signal embedded therein; obtaining second data representing a second chrominance channel of the color image or video, the second data comprising the watermark signal embedded therein but with a signal polarity that is inversely related to the polarity of the watermark signal in the first data; combining the second data with the first data in a manner that reduces image or video interference relative to the watermark signal, said act of combining yielding third data; using at least a processor or electronic processing circuitry, processing the third data to obtain the watermark signal; and once obtained, providing information associated with the watermark signal. Of course, additional combinations and claims are provided as well.
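A toy numeric sketch of the polarity trick on synthetic data (not the patent's actual embedding): the watermark is added to one chrominance channel and subtracted from the other, so differencing the channels cancels most of the correlated host content while doubling the watermark term.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.choice([-1.0, 1.0], size=(64, 64))        # bipolar watermark signal
cb = rng.normal(0, 5, (64, 64))                   # toy host chrominance
cr = 0.9 * cb + rng.normal(0, 1, (64, 64))        # correlated second channel

cb_marked = cb + w                                # embed with positive polarity
cr_marked = cr - w                                # embed with inverted polarity

# Combining the channels: host content largely cancels, watermark doubles.
third = (cb_marked - cr_marked) / 2.0             # ~ w + (cb - cr)/2
recovered = np.sign(third)
accuracy = np.mean(recovered == w)
```

With strongly correlated chrominance channels the residual host term is small, so the sign of the combined data recovers most watermark chips.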

89 citations


Patent
30 Jun 2008
TL;DR: In this article, an augmented stereoscopic display system outputs a real-time stereoscopic image comprising a three-dimensional presentation of a blend of a stereoscopic visible image and the stereoscopic pair of fluorescence images.
Abstract: An illumination channel, a stereoscopic optical channel and another optical channel are held and positioned by a robotic surgical system. A first capture unit captures a stereoscopic visible image from the first light from the stereoscopic optical channel while a second capture unit captures a fluorescence image from the second light from the other optical channel. An intelligent image processing system receives the captured stereoscopic visible image and the captured fluorescence image and generates a stereoscopic pair of fluorescence images. An augmented stereoscopic display system outputs a real-time stereoscopic image comprising a three-dimensional presentation of a blend of the stereoscopic visible image and the stereoscopic pair of fluorescence images.

85 citations


Proceedings ArticleDOI
28 Jan 2008
TL;DR: In this article, a localized inverse tone mapping method is proposed for efficient inter-layer prediction, which applies a scaling factor and an offset to each macroblock, per color channel, and then the differences are entropy coded.
Abstract: This paper presents a technique for coding high dynamic range videos. The proposed coding scheme is scalable, such that both standard dynamic range and high dynamic range representations of a video can be extracted from one bit stream. A localized inverse tone mapping method is proposed for efficient inter-layer prediction, which applies a scaling factor and an offset to each macroblock, per color channel. The scaling factors and offsets are predicted from neighboring macroblocks, and then the differences are entropy coded. The proposed inter-layer prediction technique is independent of the forward tone mapping method and is able to cover a wide range of bit-depths and various color spaces. Simulations are performed based on H.264/AVC SVC common software and core experiment conditions. Results show the effectiveness of the proposed method.
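The per-macroblock affine model (one scale and one offset per colour channel) can be illustrated with a least-squares fit. Note this is only the prediction model: the actual codec signals quantized scale/offset values predicted from neighbouring macroblocks rather than fitting them this way.

```python
import numpy as np

def predict_hdr_block(sdr_block, hdr_block):
    """Fit a scale/offset pair mapping an SDR macroblock to its co-located
    HDR block (one such pair per colour channel in the coding scheme)."""
    x = sdr_block.ravel().astype(float)
    y = hdr_block.ravel().astype(float)
    scale, offset = np.polyfit(x, y, 1)           # least-squares affine fit
    prediction = scale * sdr_block.astype(float) + offset
    return scale, offset, prediction

# Toy macroblock where the HDR layer really is an affine map of the SDR layer.
sdr = np.arange(16, dtype=float).reshape(4, 4)
hdr = 4.0 * sdr + 10.0
s, o, pred = predict_hdr_block(sdr, hdr)
```

Only the residual between the prediction and the true HDR block would then need to be coded in the enhancement layer.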

78 citations


Journal ArticleDOI
TL;DR: A histogram-based model, derived from exemplars, provides a pragmatic guide for image analysis and enhancement; in AREDS2, the best digital images matched the best film.
Abstract: PURPOSE To analyze brightness, contrast, and color balance of digital versus film retinal images in a multicenter clinical trial, to propose a model image from exemplars, and to optimize both image types for evaluation of age-related macular degeneration (AMD). METHODS The Age-Related Eye Disease Study 2 (AREDS2) is enrolling subjects from 90 clinics, with three quarters of them using digital and one quarter using film cameras. Image brightness (B), contrast (C), and color balance (CB) were measured with three-color luminance histograms. First, the exemplars (film and digital) from expert groups were analyzed, and an AMD-oriented model was constructed. Second, the impact of B/C/CB on the appearance of typical AMD abnormalities was analyzed. Third, B/C/CB in AREDS2 images were compared between film (156 eyes) and digital (605 eyes), and against the model. Fourth, suboptimal images were enhanced by adjusting B/C/CB to bring them into accord with model parameters. RESULTS Exemplar images had similar brightness, contrast, and color balance, supporting an image model. Varying a specimen image through a wide range of B/C/CB revealed greatest contrast of drusen and pigment abnormalities against normal retinal pigment epithelium with the model parameters. AREDS2 digital images were more variable than film, with lower correspondence to our model. Ten percent of digital were too dim and 19% too bright (oversaturated), versus 1% and 4% of film, respectively. On average, digital had lower green channel contrast (giving less retinal detail) than film. Overly red color balance (weaker green) was observed in 23% of digital versus 8% of film. About half of digital (but fewer film) images required enhancement before AMD grading. After optimization of both image types, AREDS2 image quality was judged as good as that in AREDS (all film). CONCLUSIONS A histogram-based model, derived from exemplars, provides a pragmatic guide for image analysis and enhancement. 
In AREDS2, the best digital images matched the best film. Overall, however, digital provided lower contrast of retinal detail. Digital images taken with higher G-to-R ratio showed better brightness and contrast management. Optimization of images in the multicenter study helps standardize documentation of AMD (ClinicalTrials.gov NCT00345176).

69 citations


Proceedings ArticleDOI
01 Nov 2008
TL;DR: An easy algorithm for pupil center and iris boundary localization and a new algorithm for eye state analysis are proposed and incorporated into a four-step system for drowsiness detection: face detection, eye detection, eye state analysis, and drowsy decision.
Abstract: Drowsiness detection is vital in preventing traffic accidents. Eye state analysis - detecting whether the eye is open or closed - is a critical step for drowsiness detection. In this paper, we propose an easy algorithm for pupil center and iris boundary localization and a new algorithm for eye state analysis, which we incorporate into a four-step system for drowsiness detection: face detection, eye detection, eye state analysis, and drowsy decision. This new system requires no training data at any step or special cameras. Our eye detection algorithm uses Eye Map, thus achieving excellent pupil center and iris boundary localization results on the IMM database. Our novel eye state analysis algorithm detects eye state using the saturation (S) channel of the HSV color space. We analyze our eye state analysis algorithm using five video sequences and show superior results compared to the common technique based on distance between eyelids.
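A minimal sketch of the saturation-channel idea. The decision rule and threshold below are assumptions for illustration, since the abstract does not state the paper's exact criterion: an open eye exposes the near-achromatic sclera, which lowers the eye region's mean saturation.

```python
import numpy as np

def saturation(rgb):
    """HSV saturation channel of an RGB image with values in [0, 1]."""
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    # Guard against division by zero for pure black pixels.
    return np.where(mx > 0, (mx - mn) / np.where(mx > 0, mx, 1.0), 0.0)

def eye_open(eye_region_rgb, thresh=0.25):
    # Hypothetical rule: low mean saturation -> visible sclera -> eye open.
    return bool(saturation(eye_region_rgb).mean() < thresh)

sclera = np.full((8, 8, 3), 0.9)                      # near-white patch: S ~ 0
skin = np.stack([np.full((8, 8), 0.8),
                 np.full((8, 8), 0.5),
                 np.full((8, 8), 0.4)], axis=-1)      # skin tone: S = 0.5
```

On these toy patches the sclera-like region classifies as open and the skin-like (closed-lid) region as closed.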

63 citations


Proceedings ArticleDOI
Wei Feng1, Bo Hu1
27 May 2008
TL;DR: This paper proposes a simple but efficient algorithm to calculate the quaternion discrete cosine transform (QDCT) of a quaternion matrix using its Cayley-Dickson form.
Abstract: Quaternions are the extension of complex numbers and have proved to be a very useful tool for digital color image processing. In this paper, we extend the discrete cosine transform (DCT) to the quaternion field and propose a simple but efficient algorithm to calculate the quaternion discrete cosine transform (QDCT) of a quaternion matrix using its Cayley-Dickson form. Since a color image can be represented by a quaternion matrix, we can apply our novel QDCT to the field of color template matching. Rather than separating a color image into three channel images and processing them respectively as the traditional methods, QDCT can handle color image pixels as vectors and process them in a holistic manner. Experimental results have shown the effectiveness and the promising application future of the proposed QDCT.
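Because the DCT kernel is real, the 2-D transform distributes over the Cayley-Dickson form Q = A + Bj (with complex A = w + xi and B = y + zi), so two complex 2-D DCTs replace four real ones. A sketch of that observation (square matrices assumed; this is an illustration, not the paper's full algorithm):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix."""
    j, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * j / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def qdct(w, x, y, z):
    """2-D QDCT of the quaternion matrix Q = w + x*i + y*j + z*k via its
    Cayley-Dickson form: transform the complex halves A = w + x*i and
    B = y + z*i with the real DCT kernel, then read components back."""
    C = dct_matrix(w.shape[0])
    A = C @ (w + 1j * x) @ C.T
    B = C @ (y + 1j * z) @ C.T
    return A.real, A.imag, B.real, B.imag

rng = np.random.default_rng(1)
w, x, y, z = rng.normal(size=(4, 8, 8))   # four components of a random Q
wp, xp, yp, zp = qdct(w, x, y, z)
```

Since the kernel is orthonormal, the transform preserves the total energy of the quaternion matrix, which is a quick sanity check.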

42 citations


Journal ArticleDOI
TL;DR: The first integrated multi-aperture image sensor is reported in this article; it comprises a 166 × 76 array of 16 × 16, 0.7 μm pixel FT-CCD subarrays with local readout circuits, per-column 10-bit ADCs, and control circuits.
Abstract: The first integrated multi-aperture image sensor is reported. It comprises a 166 × 76 array of 16 × 16, 0.7 μm pixel, FT-CCD subarrays with local readout circuit, per-column 10-bit ADCs, and control circuits. The image sensor is fabricated in a 0.11 μm CMOS process modified for buried channel charge transfer. Global snapshot image acquisition with CDS is performed at up to 15 fps with 0.15 V/lux-s responsivity, 3500 e- well capacity, 5 e- read noise, 33 e-/s dark signal, 57 dB dynamic range, and 35 dB peak SNR. When coupled with local optics, the multi-aperture image sensor captures overlapping views of the scene, which can be postprocessed to obtain both a high-resolution 2-D image and a depth map. Other benefits include the ability to image objects at close proximity to the sensor without the need for objective optics, achieve nearly complete color separation through a per-aperture color filter array, relax the requirements on the camera objective optics, and increase the tolerance to defective pixels. The multi-aperture architecture is also highly scalable, making it possible to increase pixel counts well beyond current levels.

38 citations


Journal ArticleDOI
TL;DR: This work presents a detection metric and an analysis determining the detection error rate in TCM, considering an assumed print and scan (PS) channel model, and a perceptual impact model is employed to evaluate the perceptual difference between a modified and a non-modified character.
Abstract: This paper improves the use of text color modulation (TCM) as a reliable text document data hiding method. Using TCM, the characters in a document have their color components modified (possibly imperceptibly) according to a side message to be embedded. This work presents a detection metric and an analysis determining the detection error rate in TCM, considering an assumed print and scan (PS) channel model. In addition, a perceptual impact model is employed to evaluate the perceptual difference between a modified and a non-modified character. Combining this perceptual model with the results of the detection error analysis, it is possible to determine the optimum color modulation values. The proposed detection metric also exploits the orientation characteristics of color halftoning to reduce the error rate. In particular, because color halftoning algorithms use different screen orientation angles for each color channel, this is used as an effective feature to detect the embedded message. Experiments illustrate the validity of the analysis and the applicability of the method.

37 citations


Journal ArticleDOI
TL;DR: In this paper, an improved region-growing pulse coupled neural network (PCNN) algorithm is proposed for multi-value image segmentation; it modifies the linking channel function of the region-growing PCNN model and decreases the complexity of parameter adjustment.

Patent
Gun-Ill Lee1
08 Sep 2008
TL;DR: In this paper, a method for controlling moving picture encoding using channel information of wireless networks is presented, which makes it possible to use pre-verified standard technology in the prescription of a stereoscopic image file format, thereby simplifying the verification procedure for a new standard.
Abstract: A method for controlling moving picture encoding using channel information of wireless networks is provided. By the method, it is possible to use a pre-verified standard technology in the prescription of a stereoscopic image file format, thereby simplifying a verification procedure for a new standard. Also, it is possible to use a new stereoscopic image file format, thereby selecting, generating, and reproducing either a 2D image file or a 3D stereoscopic image file. In particular, according to a system and a method for using a file format used to generate a 3D stereoscopic image, it is possible to reproduce and display a caption in the form of a 2D image during reproduction of the 3D stereoscopic image, thereby reducing eyestrain of a user, and additionally providing an image such as news, or an advertisement, to a user.

Proceedings ArticleDOI
12 May 2008
TL;DR: A novel frame-based approach to channel estimation is proposed that preserves sparsity and improves estimation accuracy; it is compared with a Slepian basis expansion estimator based on DPSS for a variety of mobile channel parameters.
Abstract: Accurate and sparse representation of a moderately fast fading channel using basis functions is achievable when both channel and basis bands align. If a mismatch exists, usually a larger number of basis functions is needed to achieve the same accuracy. In this paper, we propose a novel approach for channel estimation based on frames, which preserves sparsity and improves estimation accuracy. Members of the frame are formed by modulating and varying the bandwidth of discrete prolate spheroidal sequences (DPSS) in order to reflect various scattering scenarios. To achieve the sparsity of the proposed representation, a matching pursuit approach is employed. The estimation accuracy of the scheme is evaluated and compared with the accuracy of a Slepian basis expansion estimator based on DPSS for a variety of mobile channel parameters. The results clearly indicate that for the same number of atoms, a significantly higher estimation accuracy is achievable with the proposed scheme when compared to the DPSS estimator.
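The greedy sparse-approximation step can be sketched with a generic matching pursuit. The modulated rectangular atoms below merely stand in for the paper's bandwidth-varied, modulated DPSS frame members; generating true DPSS atoms is omitted for brevity.

```python
import numpy as np

def matching_pursuit(h, D, n_atoms):
    """Greedy sparse approximation of channel vector h over frame D
    (columns of D are unit-norm atoms)."""
    r = h.astype(complex).copy()
    coeffs = np.zeros(D.shape[1], dtype=complex)
    for _ in range(n_atoms):
        corr = D.conj().T @ r                 # correlate residual with atoms
        k = int(np.argmax(np.abs(corr)))      # pick the best-matching atom
        coeffs[k] += corr[k]
        r -= corr[k] * D[:, k]                # remove its contribution
    return coeffs, r

# Toy frame of modulated unit-norm atoms (stand-ins for modulated DPSS).
N = 64
freqs = np.arange(-8, 9) / N
D = np.exp(2j * np.pi * np.outer(np.arange(N), freqs)) / np.sqrt(N)
h = 0.9 * D[:, 10] + 0.4 * D[:, 3]            # sparse synthetic "channel"
c, res = matching_pursuit(h, D, 2)
```

Because this toy channel is exactly two atoms of an orthogonal sub-dictionary, two pursuit iterations recover it with a vanishing residual.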

Proceedings ArticleDOI
11 Nov 2008
TL;DR: This paper presents a new algorithm for colour digital image watermarking that has been shown to be resistant to JPEG compression, cropping, scaling, low-pass and median filtering, and removal attacks.
Abstract: This paper presents a new algorithm for colour digital image watermarking. The 24 bits/pixel RGB images are used and the watermark is placed on the green channel of the RGB image. The green channel is chosen after an analytical investigation process carried out using some popular measurement metrics. The analysis and embedding processes are carried out using the discrete cosine transform (DCT). The new watermarking method has been shown to be resistant to JPEG compression, cropping, scaling, low-pass and median filtering, and removal attacks. This algorithm produces more than 65 dB of average PSNR.
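A minimal sketch of additive embedding in the green-channel DCT. The coefficient selection, strength, and the non-blind detector are illustrative assumptions; the abstract does not specify the paper's actual choices.

```python
import numpy as np

def dct_mat(n):
    """Orthonormal DCT-II matrix."""
    j, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * j / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def embed_green(rgb, wm, alpha=2.0):
    """Additively embed watermark wm in the DCT of the green channel."""
    C = dct_mat(rgb.shape[0])
    G = C @ rgb[:, :, 1].astype(float) @ C.T   # green channel -> DCT domain
    G += alpha * wm
    out = rgb.astype(float).copy()
    out[:, :, 1] = C.T @ G @ C                  # back to spatial domain
    return out

def detect_green(marked, original, alpha=2.0):
    """Non-blind detector: recover wm from the green-channel difference."""
    C = dct_mat(marked.shape[0])
    diff = C @ (marked[:, :, 1] - original[:, :, 1].astype(float)) @ C.T
    return diff / alpha

rng = np.random.default_rng(2)
rgb = rng.integers(0, 256, (8, 8, 3)).astype(float)
wm = rng.choice([-1.0, 1.0], size=(8, 8))
marked = embed_green(rgb, wm)
recovered = detect_green(marked, rgb)
```

Since the DCT matrix is orthonormal, the round trip recovers the watermark exactly in the absence of attacks.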

Patent
14 Feb 2008
TL;DR: In this paper, a moving platform, at least two image capture devices, a position system, an event multiplexer and a computer system are used to capture images of the Earth.
Abstract: An image capture system for capturing images of an object, such as the Earth. The image capture system includes a moving platform, at least two image capture devices, a position system, an event multiplexer and a computer system. The image capture devices are mounted to the moving platform. Each image capture device has a sensor for capturing an image and an event channel providing an event signal indicating the capturing of an image by the sensor. The position system records data indicative of a position as a function of time related to the moving platform. The event multiplexer has at least two image capture inputs and at least one output port. Each image capture input receives event signals from the event channel of one of the image capture devices. The event multiplexer outputs information indicative of an order of events indicated by the event signals, and identification of image capture devices providing the event signals. The computer system receives and stores the information indicative of the order of events indicated by the event signals, and identification of image capture devices providing the event signals.

Patent
16 Jun 2008
TL;DR: In this article, a processing device for correcting at least one defect pixel value of an image sensor unit is proposed, where the pixel value is estimated by evaluating the values of neighbouring pixels of the defect pixel of the same pixel array.
Abstract: CMOS image sensors usually suffer from fixed pattern noise and random defect pixels. However, for economical reasons and in order to increase the manufacturing yield, some random defective pixels are usually accepted even for professional devices. In this case, the defect pixels are usually corrected by signal processing. A processing device (15) for correcting at least one defect pixel value of an image sensor unit is proposed, the image sensor unit comprising at least a first and a second pixel array (1, 2, 3), wherein the image sensor unit is embodied to project the same image onto each pixel array (1, 2, 3), the processing device (15) comprising at least a first and a second input channel (11, 12, 13) for receiving pixel values of the first and the second pixel array, respectively, wherein the processing device (15) is operable to exchange the defect pixel value by a corrected pixel value, wherein the corrected pixel value is estimated by evaluating the values of neighbouring pixels of the defect pixel of the same pixel array, wherein the corrected pixel value is estimated by evaluating values of a corresponding pixel and its neighbouring pixels of the second pixel array at the same location as the defect pixel of the first pixel array in respect to the projected image.

Proceedings ArticleDOI
22 Sep 2008
TL;DR: This paper presents a series of dialect/accent identification results for three sets of dialects with discriminatively trained Gaussian mixture models and feature compensation using eigen-channel decomposition and an approach to open set dialect scoring is introduced.
Abstract: This paper presents a series of dialect/accent identification results for three sets of dialects with discriminatively trained Gaussian mixture models and feature compensation using eigen-channel decomposition. The classification tasks evaluated in the paper include: 1) the Chinese language classes, 2) American and Indian accented English and 3) discrimination between three Arabic dialects. The first two tasks were evaluated on the 2007 NIST LRE corpus. The Arabic discrimination task was evaluated using data derived from the LDC Arabic set collected by Appen. Analysis is performed for the English accent problem studied and an approach to open set dialect scoring is introduced. The system resulted in equal error rates at or below 10% for each of the tasks studied.

Patent
10 Apr 2008
TL;DR: In this paper, techniques for per-channel image intensity correction are presented that include linear interpolation of each channel of spectral data to generate corrected spectral data.
Abstract: Techniques for per-channel image intensity correction include linear interpolation of each channel of spectral data to generate corrected spectral data.
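Per-channel linear interpolation of this kind can be as simple as mapping each channel's measured calibration levels to the desired reference levels. The calibration points below are hypothetical, purely to illustrate the mechanism.

```python
import numpy as np

def correct_channel(raw, measured, reference):
    """Linearly interpolate raw channel values from the channel's measured
    calibration levels onto the reference levels."""
    return np.interp(raw, measured, reference)

# Hypothetical calibration: a channel that reads 10..200 should span 0..255.
measured = np.array([10.0, 200.0])
reference = np.array([0.0, 255.0])
corrected = correct_channel(np.array([10.0, 105.0, 200.0]), measured, reference)
```

Each colour channel would get its own `measured`/`reference` pair, which is what makes the correction per-channel.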

Patent
26 Sep 2008
TL;DR: Based on a multi-channel pixel intensity data set generated by a synthetic vision system, a single-channel image of terrain comprised of a plurality of intensities of one color may be generated as mentioned in this paper.
Abstract: A present novel and non-trivial system, apparatus, and method for generating HUD image data from synthetic image data is disclosed. Based on a multi-channel pixel intensity data set generated by a synthetic vision system, a single-channel pixel intensity data set representative of a lighted solid image of terrain comprised of a plurality of intensities of one color may be generated. The single-channel pixel intensity image data set may be determined as a function of multi-channel pixel intensity data set and channel weighting, where channel weighting may be based on sky and/or terrain colors employed by an SVS. Based on the multi-channel pixel intensity data set, a three-dimensional perspective scene outside the aircraft may be presented to the pilot on a HUD combiner. Also, the multi-channel pixel intensity data set may be modified by using at least one chroma key, where such chroma key may be assigned to a specific multi-channel pixel intensity value.

Patent
23 May 2008
TL;DR: The use of a Bayer pattern array in digital image sensors to enhance the dynamic range of the sensors is disclosed in this article, where the authors reveal that each Bayer pattern can include three different pixels having a first exposure, and a fourth pixel (which is the same color as one of the other pixels in the array) having a second exposure.
Abstract: The use of a Bayer pattern array in digital image sensors to enhance the dynamic range of the sensors is disclosed. Each Bayer pattern in the array can include three different pixels having a first exposure, and a fourth pixel (which is the same color as one of the other pixels in the array) having a second exposure. The dynamic range of the Bayer pattern array can be enhanced by using different exposure times for the pixels. Each pixel can capture only one channel (i.e. either red (R), green (G) or blue (B) light). Interpolation of neighboring pixels, including those having different exposure times, can enable the pixels in the Bayer pattern array to generate missing color information and effectively become a color pixel, and can allow the Bayer pattern array to have a higher dynamic range.
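One way the two exposures can be combined, as a toy per-pixel rule rather than the patent's full interpolation over neighbouring pixels of different exposure times: use the long-exposure sample unless it clipped, otherwise fall back to the short-exposure sample scaled by the exposure ratio.

```python
import numpy as np

def fuse_exposures(long_px, short_px, ratio=4.0, sat=255.0):
    """Fuse same-colour long/short exposure samples from a dual-exposure
    Bayer pattern: prefer the long sample, but when it saturates,
    substitute the short sample scaled by the exposure ratio."""
    return np.where(long_px < sat, long_px, short_px * ratio)

long_g = np.array([120.0, 255.0])    # second sample is saturated
short_g = np.array([30.0, 90.0])     # short exposure still holds detail
fused = fuse_exposures(long_g, short_g)
```

The scaled short-exposure value extends the representable range past the long exposure's clipping point, which is the dynamic-range gain described above.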

Proceedings ArticleDOI
07 Jul 2008
TL;DR: This work proposes to extract pixel-based evolutions from SITS data by using two different symbolic techniques based on data mining techniques that aim at extracting frequent sequential patterns.
Abstract: Nowadays, there is a growing need for processing huge volumes of observation data due to the increase in size, in resolution, in spectral channel number and in acquisition frequency of remote sensing images. When data is gathered over time for the same geographical zone, this data is said to be a Satellite Image Time Series (SITS). The informational content of SITS is rich because the observed scene is described both in time and in space. In order to exhibit potentially interesting spatio-temporal patterns, we propose to extract pixel-based evolutions from SITS data by using two different symbolic techniques. The first one is based on data mining techniques that aim at extracting frequent sequential patterns. The second one relies on the use of tries for classifying pixels according to their evolution in time. Encouraging experiments on a SPOT SITS are detailed.

Proceedings ArticleDOI
14 Oct 2008
TL;DR: A method for the automatic extraction of blood vessels from retinal images, while capturing points of intersection/overlap and endpoints of the vascular tree is presented.
Abstract: In this paper we present a method for the automatic extraction of blood vessels from retinal images, while capturing points of intersection/overlap and endpoints of the vascular tree. The algorithm performance is evaluated through a comparison with hand-segmented images available on the STARE project database (STructured Analysis of the REtina). The algorithm operates on the green channel of the RGB triad, which can be used to represent the illumination component. The matched filter is used to enhance vessels with respect to the background. The separation between vessels and background is accomplished by a threshold operator based on a Gaussian probability density function. Length filtering removes isolated pixels and segments from the resulting image. Finally, endpoints, intersections and overlapping vessels are extracted.
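The matched-filter step can be sketched with a zero-mean Gaussian line kernel correlated at two orientations; the published method typically uses a bank of rotated kernels and its own threshold rule, so this is only an illustration.

```python
import numpy as np

def gaussian_line_kernel(width=7, sigma=1.5):
    """Matched-filter kernel: a dark Gaussian vessel cross-section,
    made zero-mean so flat background yields no response."""
    x = np.arange(width) - width // 2
    profile = -np.exp(-x**2 / (2 * sigma**2))
    profile -= profile.mean()
    return np.tile(profile, (width, 1))      # constant along the vessel axis

def correlate2d(img, k):
    """Valid-mode 2-D correlation (small images, simple loops)."""
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * k)
    return out

# Toy green channel: bright background with one dark vertical vessel.
g = np.ones((15, 15))
g[:, 7] = 0.2
resp = np.maximum(correlate2d(g, gaussian_line_kernel()),
                  correlate2d(g, gaussian_line_kernel().T))
vessels = resp > 0.5 * resp.max()            # simple global threshold
```

The dark column produces a strong positive response where the kernel's cross-section aligns with it, while the flat background responds with zero.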

Proceedings ArticleDOI
01 Dec 2008
TL;DR: An improved gist model is proposed that not only improves classification accuracy by around 7% but also reduces testing cost by around 50 times compared with the original gist model proposed by Siagian and Itti in TPAMI 2007.
Abstract: In this paper, we unify C1 units and locality preserving projections (LPP) into the conventional gist model for scene classification. For the improved gist model, we first utilize the C1 units, intensity channel and color channel of a color image to represent the color image with a high-dimensional feature, then we project high-dimensional samples to a low-dimensional subspace via LPP to preserve both the local geometry and the discriminative information, and finally, we apply the nearest-neighbour rule with the Euclidean distance for classification. Experimental results based on the USC scene database demonstrate that the proposed gist model not only improves classification accuracy by around 7% but also reduces the testing cost by around 50 times compared with the original gist model proposed by Siagian and Itti in TPAMI 2007.

Patent
30 Jun 2008
TL;DR: In this article, a first capture unit captures: a visible first color component of a visible left image combined with a fluorescence left image from first light from one channel in the endoscope; a visible second color component of the visible left image from the first light; and a visible third color component of the visible left image from the first light.
Abstract: An endoscope with a stereoscopic optical channel is held and positioned by a robotic surgical system. A first capture unit captures: a visible first color component of a visible left image combined with a fluorescence left image from first light from one channel in the endoscope; a visible second color component of the visible left image from the first light; and a visible third color component of the visible left image from the first light. A second capture unit captures: a visible first color component of a visible right image combined with a fluorescence right image from second light from the other channel in the endoscope; a visible second color component of the visible right image from the second light; and a visible third color component of the visible right image from the second light. An augmented stereoscopic display system outputs a real-time stereoscopic image including a three-dimensional presentation including the visible left and right images and the fluorescence left and right images.

Patent
29 Aug 2008
TL;DR: In this article, a video with uniform quality corresponding to a channel environment having a variable bit-rate can be provided, by using at least one of channel state information and a result of encoding a video in a predetermined encoding unit encoded in advance, whether or not to convert a video from RGB (red, green, blue) color format into a YCbCr color format is adaptively determined to perform encoding.
Abstract: Provided are video encoding and decoding methods and apparatuses for encoding a video by variably selecting one from two or more different color formats. Accordingly, by using at least one of channel state information and a result of encoding a video in a predetermined encoding unit encoded in advance, whether or not to convert a video in a current encoding unit from an input RGB (red, green, blue) color format into a YCbCr color format is adaptively determined to perform encoding. Therefore, a video with uniform quality corresponding to a channel environment having a variable bit-rate can be provided.
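The conversion that such an encoder would toggle per encoding unit is a standard linear transform; a full-range BT.601-style RGB-to-YCbCr mapping can be sketched as follows (the patent does not specify which variant it uses).

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 RGB -> YCbCr conversion (JFIF-style)."""
    m = np.array([[ 0.299,     0.587,     0.114   ],
                  [-0.168736, -0.331264,  0.5     ],
                  [ 0.5,      -0.418688, -0.081312]])
    ycc = rgb.astype(float) @ m.T
    ycc[..., 1:] += 128.0          # centre the chroma channels
    return ycc

# Pure white maps to maximum luma and neutral chroma.
white = rgb_to_ycbcr(np.array([[[255.0, 255.0, 255.0]]]))
```

The encoder's adaptive decision is then simply whether to run this transform on the current encoding unit before coding, based on channel state and past coding results.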

Proceedings ArticleDOI
11 Feb 2008
TL;DR: A hybrid method for facial expression recognition is proposed that detects the face region by combining AdaBoost, a skin color model and motion history images, separates feature points representing different facial expressions using optical flow, and classifies them with a support vector machine.
Abstract: Facial expression is a very useful channel for intelligent human-computer communication. In this paper we propose a hybrid method to recognize facial expression. Our main contributions in this study are: first, the face region is detected by combining AdaBoost, a skin color model and motion history images; second, feature points representing different facial expressions are separated using optical flow; third, a support vector machine classifier is used to classify these feature points' information (location, distance, angle); last, tests covering the whole facial expression recognition process are conducted and the results are satisfactory.

Patent
23 Jun 2008
TL;DR: In this article, a system and method of reducing noise in output image data is provided, where pixels which may produce noise are identified, and a mask associated with the image data are generated.
Abstract: A system and method of reducing noise in output image data is provided. Grayscale image data having a plurality of pixels is received and processed. During processing, pixels which may produce noise are identified, and a mask associated with the image data is generated. The mask provides information related to the pixels, such as opaque and transparent regions for overlaying the pixels. The image data and the mask are compressed and stored. The mask assists in preventing the identified pixels from being visible when the image data is output, thereby reducing the noise in the image.

01 Jan 2008
TL;DR: This thesis argues for using channel-coded feature maps in view-based pose estimation, where a continuous pose parameter is estimated from a query image given a number of training views with known pose, and identifies a key feature of the method: it can handle a learning situation in which the correspondence structure between the input and output space is not completely known.
Abstract: This thesis is about channel-coded feature maps applied in view-based object recognition, tracking, and machine learning. A channel-coded feature map is a soft histogram of joint spatial pixel posi
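In one dimension, a channel-coded feature map reduces to a soft histogram in which each sample votes fractionally into overlapping channels. The sketch below uses a linear (first-order B-spline) kernel for simplicity; the thesis works with smooth basis functions over joint feature spaces, so this only illustrates the coding idea.

```python
import numpy as np

def soft_histogram(values, n_channels=8, lo=0.0, hi=1.0):
    """1-D channel coding sketch: each value casts fractional votes
    into its two nearest channel centres (linear B-spline kernel),
    giving a soft histogram instead of hard bins. Total vote mass
    per sample is preserved."""
    h = np.zeros(n_channels)
    # map each value onto continuous channel coordinates [0, n-1]
    pos = (np.asarray(values, dtype=float) - lo) / (hi - lo) * (n_channels - 1)
    for p in np.clip(pos, 0, n_channels - 1):
        i = int(np.floor(p))
        frac = p - i
        h[i] += 1.0 - frac
        if i + 1 < n_channels:
            h[i + 1] += frac
    return h

h = soft_histogram([0.0, 0.5, 1.0], n_channels=3)
print(h)   # each sample here lands exactly on a channel centre
```

Unlike a hard histogram, small shifts in the input move vote mass smoothly between neighbouring channels, which is what makes the representation usable for continuous pose regression.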

Proceedings ArticleDOI
03 Apr 2008
TL;DR: In this paper, the correlation between the target signatures in two different IR frequency bands (3-5 and 8-12 μm) is used to construct a fused IR image with a reduced amount of clutter.
Abstract: We present an algorithm that produces a fused false color representation of a combined multiband IR and visual imaging system for maritime applications. Multispectral IR imaging techniques are increasingly deployed in maritime operations, to detect floating mines or to find small dinghies and swimmers during search and rescue operations. However, maritime backgrounds usually contain a large amount of clutter that severely hampers the detection of small targets. Our new algorithm deploys the correlation between the target signatures in two different IR frequency bands (3-5 and 8-12 μm) to construct a fused IR image with a reduced amount of clutter. The fused IR image is then combined with a visual image in a false color RGB representation for display to a human operator. The algorithm works as follows. First, both individual IR bands are filtered with a morphological opening top-hat transform to extract small details. Second, a common image is extracted from the two filtered IR bands, and assigned to the red channel of an RGB image. Regions of interest that appear in both IR bands remain in this common image, while most uncorrelated noise details are filtered out. Third, the visual band is assigned to the green channel and, after multiplication with a constant (typically 1.6), also to the blue channel. Fourth, the brightness and colors of this intermediate false color image are renormalized by adjusting its first order statistics to those of a representative reference scene. The result of these four steps is a fused color image, with naturalistic colors (bluish sky and grayish water), in which small targets are clearly visible. Keywords: target detection, false color fusion, small targets, clutter, image fusion
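The four steps can be sketched roughly as below. The morphological top-hat is approximated by box-mean background subtraction, the "common image" is taken as the pixelwise minimum of the two filtered bands, and the final statistics renormalization is omitted; this illustrates the structure of the algorithm, not the paper's exact operators.

```python
import numpy as np

def fuse_maritime(ir_mw, ir_lw, vis):
    """Sketch of the four-step maritime fusion. Assumptions: a
    box-mean stand-in for the opening top-hat, and a pixelwise
    minimum as the common-image operator; step 4 (first-order
    statistics renormalisation to a reference scene) is omitted."""
    def tophat(band, k=3):
        g = band.astype(np.float64)
        p = np.pad(g, k // 2, mode='edge')
        bg = sum(p[i:i + g.shape[0], j:j + g.shape[1]]
                 for i in range(k) for j in range(k)) / (k * k)
        return np.clip(g - bg, 0, None)

    common = np.minimum(tophat(ir_mw), tophat(ir_lw))  # steps 1-2
    r = common                                          # red channel
    g = vis.astype(np.float64)                          # step 3: green
    b = np.clip(1.6 * g, 0, 255)                        # 1.6x to blue
    return np.stack([r, g, b], axis=-1).astype(np.uint8)

mw = np.zeros((7, 7)); lw = np.zeros((7, 7)); vis = np.zeros((7, 7))
mw[3, 3] = lw[3, 3] = 200.0   # correlated small target in both bands
mw[1, 1] = 200.0              # uncorrelated clutter in one band only
out = fuse_maritime(mw, lw, vis)
print(out[3, 3, 0], out[1, 1, 0])  # target survives, clutter is zeroed
```

The minimum operator implements the correlation idea directly: a detail must be present in both filtered IR bands to survive into the red channel.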

Proceedings ArticleDOI
Zhiwei Li1, Lei Zhang1, Wei-Ying Ma1
26 Oct 2008
TL;DR: The experimental results and user studies show that the proposed online image ad delivery is a non-intrusive ad mode and the proposed solution is practical, opening multiple new research directions ranging from multimedia to web data mining.
Abstract: We present in this paper a new channel to deliver online advertisements along with Web images and show a new business model to monetize billions of Web images. The idea is intuitively inspired by image displaying processes on the Web, which typically require people to wait a few seconds before they see full resolution images. This is due to large file sizes and limited network bandwidth. To utilize idle time and the display area, we propose an innovative method for non-intrusively embedding ads into images in a visually pleasant manner. To maintain a smooth user experience, we utilize the thumbnail of the full-resolution image because it is small and visually similar to the full-resolution image. At the client side, a rendering engine first enlarges and blurs the thumbnail, and then blends the pre-chosen ads information into the enlarged image. Based on this idea, we propose three typical scenarios that can adopt the proposed image-advertising mode. More importantly, we can encourage providers of images or other users to participate in our online image ads service by tagging or annotating images. We envision revenue sharing with the providers participating in our service, and we expect that a large number of users will actively submit, tag and annotate images using the system. We have implemented a prototype image ads system, and conducted a series of experiments and user studies to evaluate such a new advertisement channel. The experimental results and user studies show that the proposed online image ad delivery is a non-intrusive ad mode, and the proposed solution is practical. This work also opens multiple new research directions ranging from multimedia to web data mining.
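The client-side rendering step (enlarge, blur, blend) can be sketched as below. The nearest-neighbour resampler, 3x3 box blur, blend weight, and corner placement of the ad patch are all illustrative assumptions; the paper does not prescribe these particulars.

```python
import numpy as np

def render_placeholder(thumb, ad, alpha=0.35, scale=4):
    """Sketch of the rendering engine: enlarge the thumbnail
    (nearest-neighbour upscaling), soften it with a 3x3 box blur,
    then alpha-blend an ad patch into the top-left corner while the
    full-resolution image downloads. Parameters are illustrative."""
    big = np.kron(thumb.astype(np.float64),
                  np.ones((scale, scale)))          # enlarge
    p = np.pad(big, 1, mode='edge')                 # 3x3 box blur
    blur = sum(p[i:i + big.shape[0], j:j + big.shape[1]]
               for i in range(3) for j in range(3)) / 9.0
    h, w = ad.shape
    blur[:h, :w] = (1 - alpha) * blur[:h, :w] + alpha * ad
    return blur

thumb = np.zeros((4, 4))            # dark 4x4 thumbnail
ad = np.full((4, 4), 255.0)         # bright ad patch
out = render_placeholder(thumb, ad)
print(out.shape, out[0, 0])
```

Because the blurred enlargement is visually close to the eventual full-resolution image, swapping it out once the download completes keeps the transition unobtrusive.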