Showing papers in "Eurasip Journal on Image and Video Processing in 2013"


Journal ArticleDOI
TL;DR: An automated species identification method for wildlife pictures captured by remote camera traps that uses improved sparse coding spatial pyramid matching (ScSPM), which extracts dense SIFT descriptor and cell-structured LBP as the local features and generates global feature via weighted sparse coding and max pooling using multi-scale pyramid kernel.
Abstract: Image sensors are increasingly being used in biodiversity monitoring, with each study generating many thousands or millions of pictures. Efficiently identifying the species captured by each image is a critical challenge for the advancement of this field. Here, we present an automated species identification method for wildlife pictures captured by remote camera traps. Our process starts with images that are cropped out of the background. We then use improved sparse coding spatial pyramid matching (ScSPM), which extracts dense SIFT descriptors and cell-structured LBP (cLBP) as the local features, generates a global feature via weighted sparse coding and max pooling using a multi-scale pyramid kernel, and classifies the images by a linear support vector machine algorithm. Weighted sparse coding is used to enforce both sparsity and locality of encoding in feature space. We tested the method on a dataset with over 7,000 camera trap images of 18 species from two different field sites, and achieved an average classification accuracy of 82%. Our analysis demonstrates that the combination of SIFT and cLBP can serve as a useful technique for animal species recognition in real, complex scenarios.
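
As a rough illustration of the pooling stage described above, the sketch below max-pools local sparse codes over a multi-scale spatial pyramid into one global feature vector. It is a minimal sketch only: the function name, grid levels, and inputs are assumptions, and the dictionary learning and the weighting of the sparse coding step are omitted. The resulting vector would then be fed to a linear SVM (e.g., sklearn.svm.LinearSVC).

```python
import numpy as np

def spatial_pyramid_max_pool(codes, positions, image_shape, levels=(1, 2, 4)):
    """Max-pool local sparse codes over a multi-scale spatial pyramid.

    codes:     (N, K) sparse codes of N local descriptors (dense SIFT + cLBP).
    positions: (N, 2) (row, col) location of each descriptor in the image.
    Returns one global feature vector of length K * sum(c * c for c in levels).
    """
    H, W = image_shape
    pooled = []
    for cells in levels:
        # assign every descriptor to a pyramid cell at this level
        row_idx = np.minimum(positions[:, 0] * cells // H, cells - 1).astype(int)
        col_idx = np.minimum(positions[:, 1] * cells // W, cells - 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                mask = (row_idx == i) & (col_idx == j)
                pooled.append(codes[mask].max(axis=0) if mask.any()
                              else np.zeros(codes.shape[1]))
    return np.concatenate(pooled)
```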

184 citations


Journal ArticleDOI
TL;DR: A new fast feature extraction strategy that uses the 3D point cloud obtained from the frames in a gait cycle that improves the accuracy significantly, compared with state-of-the-art systems which do not use depth information.
Abstract: This article presents a new approach for gait-based gender recognition using depth cameras that can run in real time. The main contribution of this study is a new fast feature extraction strategy that uses the 3D point cloud obtained from the frames in a gait cycle. For each frame, these points are aligned according to their centroid and grouped. After that, they are projected onto their PCA plane, obtaining a representation of the cycle that is particularly robust against view changes. Then, final discriminative features are computed by first making a histogram of the projected points and then using linear discriminant analysis. To test the method, we have used the DGait database, which is currently the only publicly available database for gait analysis that includes depth information. We have performed experiments on manually labeled cycles and over whole video sequences, and the results show that our method improves the accuracy significantly compared with state-of-the-art systems which do not use depth information. Furthermore, our approach is insensitive to illumination changes, given that it discards the RGB information. That makes the method especially suitable for real applications, as illustrated in the last part of the experiments section.
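
The following sketch illustrates the kind of cycle representation described above: the 3D points gathered over a gait cycle are projected onto their PCA plane and summarized with a 2D histogram. It is a simplified reading of the method, not the authors' implementation; the bin count and the assumption that points are already centered per frame are ours, and the final step would apply linear discriminant analysis (e.g., sklearn's LinearDiscriminantAnalysis) to these histograms.

```python
import numpy as np

def cycle_descriptor(points, bins=16):
    """Project the 3D points accumulated over one gait cycle onto their
    principal (PCA) plane and summarize them with a 2D histogram.

    points: (N, 3) point cloud gathered from all frames of the cycle,
            assumed to be centered per frame on its centroid.
    """
    pts = points - points.mean(axis=0)               # global centering
    # PCA: the two leading right singular vectors span the projection plane
    _, _, vt = np.linalg.svd(pts, full_matrices=False)
    proj = pts @ vt[:2].T                            # (N, 2) projected points
    hist, _, _ = np.histogram2d(proj[:, 0], proj[:, 1], bins=bins, density=True)
    return hist.ravel()
```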

181 citations


Journal ArticleDOI
Xiaolin Shen1, Lu Yu1
TL;DR: This article proposes a CU splitting early termination algorithm to reduce the heavy computational burden on encoder and is modeled as a binary classification problem, on which a support vector machine (SVM) is applied.
Abstract: High efficiency video coding (HEVC) is the latest video coding standard that has been developed by JCT-VC. It employs plenty of efficient coding algorithms (e.g., highly flexible quad-tree coding block partitioning), and outperforms H.264/AVC by a 35–43% bitrate reduction. However, it imposes enormous computational complexity on the encoder due to the optimization processing in the efficient coding tools, especially the rate distortion optimization on coding unit (CU), prediction unit, and transform unit. In this article, we propose a CU splitting early termination algorithm to reduce the heavy computational burden on the encoder. CU splitting is modeled as a binary classification problem, on which a support vector machine (SVM) is applied. In order to reduce the impact of outliers as well as to maintain the RD performance when a misclassification occurs, the RD loss due to misclassification is introduced as a weight in SVM training. Efficient and representative features are extracted and optimized by a wrapper approach to eliminate dependency on video content as well as on encoding configurations. Experimental results show that the proposed algorithm can achieve about 44.7% complexity reduction on average with only 1.35% BD-rate increase under the “random access” configuration, and 41.9% time saving with 1.66% BD-rate increase under the “low delay” setting, compared with the HEVC reference software.
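
A minimal sketch of the weighted-SVM idea above, using scikit-learn's per-sample weights to stand in for the RD-loss weighting; the feature set, file names, and kernel are hypothetical, and the wrapper-based feature selection is not shown.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: one feature row per CU (e.g., current RD cost,
# neighbouring CU depths, residual statistics); label 1 = "split",
# label 0 = "do not split"; rd_loss = RD penalty incurred if this sample
# were misclassified, used as its training weight.
X = np.load("cu_features.npy")          # assumed file names
y = np.load("cu_split_labels.npy")
rd_loss = np.load("cu_rd_loss.npy")

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y, sample_weight=rd_loss)    # RD loss down-weights harmless errors

def early_terminate(cu_features):
    """At encoding time: skip further quad-tree recursion when the classifier
    predicts 'do not split'; otherwise fall back to the normal RDO search."""
    return clf.predict(cu_features.reshape(1, -1))[0] == 0
```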

165 citations


Journal ArticleDOI
TL;DR: The purpose of this survey is to give an overview of landmarking algorithms and their progress over the last decade, categorize them and show comparative performance statistics of the state of the art.
Abstract: Face landmarking, defined as the detection and localization of certain characteristic points on the face, is an important intermediary step for many subsequent face processing operations that range from biometric recognition to the understanding of mental states. Despite its conceptual simplicity, this computer vision problem has proven extremely challenging due to inherent face variability as well as the multitude of confounding factors such as pose, expression, illumination and occlusions. The purpose of this survey is to give an overview of landmarking algorithms and their progress over the last decade, categorize them and show comparative performance statistics of the state of the art. We discuss the main trends and indicate current shortcomings with the expectation that this survey will provide further impetus for the much needed high-performance, real-life face landmarking operating at video rates.

130 citations


Journal ArticleDOI
TL;DR: A novel integrated approach which exploits features of uniform robust scale invariant feature transform (UR-SIFT) and PIIFD and is robust against low content contrast of color images and large content, appearance, and scale changes between color and other retinal image modalities like the fluorescein angiography.
Abstract: Existing algorithms based on scale invariant feature transform (SIFT) and Harris corners, such as edge-driven dual-bootstrap iterative closest point and Harris-partial intensity invariant feature descriptor (PIIFD), respectively, have been shown to be robust in registering multimodal retinal images. However, they fail to register color retinal images with other modalities in the presence of large content or scale changes. Moreover, these approaches need preprocessing operations such as image resizing to do well. This restricts the application of image registration for further analysis such as change detection and image fusion. Motivated by the need for efficient registration of multimodal retinal image pairs, this paper introduces a novel integrated approach which exploits features of uniform robust scale invariant feature transform (UR-SIFT) and PIIFD. The approach is robust against low content contrast of color images and large content, appearance, and scale changes between color and other retinal image modalities like fluorescein angiography. Due to the low efficiency of the standard SIFT detector for multimodal images, the UR-SIFT algorithm extracts highly stable and distinctive features in the full distribution of location and scale in images. As a result, the extracted feature points are sufficient in number and repeatable. Moreover, the PIIFD descriptor is symmetric to contrast, which makes it suitable for robust multimodal image registration. After the UR-SIFT feature extraction and the PIIFD descriptor generation in images, an initial cross-matching process is performed, followed by a mismatch elimination algorithm. Our dataset consists of 120 pairs of multimodal retinal images. Experimental results show that UR-SIFT-PIIFD outperforms Harris-PIIFD and similar algorithms in terms of efficiency and positional accuracy.

97 citations


Journal ArticleDOI
TL;DR: The robustness to noise of the eight following LBP-based descriptors is evaluated: improved LBP, median binary patterns (MBP), local ternary patterns (LTP), improved LTP (ILTP), local quinary patterns, robust LBP, and fuzzy LBP (FLBP).
Abstract: Local binary pattern (LBP) operators have become commonly used texture descriptors in recent years. Several new LBP-based descriptors have been proposed, of which some aim at improving robustness to noise. To do this, the thresholding and encoding schemes used in the descriptors are modified. In this article, the robustness to noise of the eight following LBP-based descriptors is evaluated: improved LBP, median binary patterns (MBP), local ternary patterns (LTP), improved LTP (ILTP), local quinary patterns, robust LBP, and fuzzy LBP (FLBP). To put their performance into perspective, they are compared to three well-known reference descriptors: the classic LBP, Gabor filter banks (GF), and standard descriptors derived from gray-level co-occurrence matrices. In addition, a roughly five times faster implementation of the FLBP descriptor is presented, and a new descriptor which we call shift LBP is introduced as an even faster approximation to the FLBP. The texture descriptors are compared and evaluated on six texture datasets: Brodatz, KTH-TIPS2b, Kylberg, Mondial Marmi, UIUC, and a Virus texture dataset. After optimizing all parameters for each dataset, the descriptors are evaluated under increasing levels of additive Gaussian white noise. The discriminating power of the texture descriptors is assessed using tenfold cross-validation of a nearest neighbor classifier. The results show that several of the descriptors perform well at low levels of noise, while they all suffer, to different degrees, from higher levels of introduced noise. In our tests, ILTP and FLBP show an overall good performance on several datasets. The GF are often very noise robust compared to the LBP family under moderate to high levels of noise but not necessarily the best descriptor under low levels of added noise. In our tests, MBP is neither a good texture descriptor nor stable to noise.
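
To make the modified thresholding schemes concrete, here is a minimal local ternary pattern (LTP) sketch for the 8-neighborhood at radius 1; the tolerance t is an assumed parameter, and this is not the optimized implementation benchmarked in the article.

```python
import numpy as np

def ltp_histograms(img, t=5):
    """Local ternary patterns (LTP) on the 8-neighbourhood, radius 1.

    Neighbours within +/- t of the centre code to 0, which is what gives LTP
    its noise tolerance compared with plain LBP. The ternary code is split
    into an 'upper' and a 'lower' binary pattern, each summarised by a
    256-bin histogram.
    """
    img = img.astype(np.int32)
    c = img[1:-1, 1:-1]
    # 8 neighbours, clockwise from the top-left corner of each 3x3 window
    shifts = [(0, 0), (0, 1), (0, 2), (1, 2),
              (2, 2), (2, 1), (2, 0), (1, 0)]
    upper = np.zeros_like(c)
    lower = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(shifts):
        n = img[dy:dy + c.shape[0], dx:dx + c.shape[1]]
        upper |= ((n >= c + t).astype(np.int32) << bit)
        lower |= ((n <= c - t).astype(np.int32) << bit)
    h_up = np.bincount(upper.ravel(), minlength=256)
    h_lo = np.bincount(lower.ravel(), minlength=256)
    return np.concatenate([h_up, h_lo])
```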

86 citations


Journal ArticleDOI
TL;DR: An automated framework for photo identification of chimpanzees including face detection, face alignment, and face recognition is presented, which can be used by biologists, researchers, and gamekeepers to estimate population sizes faster and more precisely than the current frameworks.
Abstract: Due to the ongoing biodiversity crisis, many species including great apes like chimpanzees are on the brink of extinction. Consequently, there is an urgent need to protect the remaining populations of threatened species. To overcome the catastrophic decline of biodiversity, biologists and gamekeepers recently started to use remote cameras and recording devices for wildlife monitoring in order to estimate the size of remaining populations. However, the manual analysis of the resulting image and video material is extremely tedious, time consuming, and cost intensive. To overcome the burden of time-consuming routine work, we have recently started to develop computer vision algorithms for automated chimpanzee detection and identification of individuals. Based on the assumption that humans and great apes share similar properties of the face, we proposed to adapt and extend face detection and recognition algorithms, originally developed to recognize humans, for chimpanzee identification. In this paper, we not only summarize our earlier work in the field but also extend our previous approaches towards a more robust system which is less prone to difficult lighting situations, various poses and expressions, and partial occlusion by branches, leaves, or other individuals. To overcome the limitations of our previous work, we combine holistic global features and locally extracted descriptors using a decision fusion scheme. We present an automated framework for photo identification of chimpanzees including face detection, face alignment, and face recognition. We thoroughly evaluate our proposed algorithms on two datasets of captive and free-living chimpanzee individuals which were annotated by experts. In three experiments we show that the presented framework outperforms previous approaches in the field of great ape identification and achieves promising results. Therefore, our system can be used by biologists, researchers, and gamekeepers to estimate population sizes faster and more precisely than the current frameworks. Thus, the proposed framework for chimpanzee identification has the potential to open up new avenues in efficient wildlife monitoring and can help researchers to develop innovative protection schemes in the future.
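
The decision fusion step mentioned above can be illustrated with a simple score-level fusion of the holistic and local matchers; the min-max normalization and the weight alpha are assumptions, not the fusion scheme actually used in the paper.

```python
import numpy as np

def fuse_decisions(global_scores, local_scores, alpha=0.5):
    """Score-level fusion of a holistic (global) face matcher and a
    local-descriptor matcher. Scores are per-identity similarity vectors,
    min-max normalised before a weighted sum (alpha is an assumed weight).
    """
    def norm(s):
        s = np.asarray(s, dtype=float)
        return (s - s.min()) / (s.max() - s.min() + 1e-12)
    fused = alpha * norm(global_scores) + (1 - alpha) * norm(local_scores)
    return int(np.argmax(fused))        # index of the predicted identity
```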

69 citations


Journal ArticleDOI
TL;DR: This article explains how several single image defogging methods work using a color ellipsoid framework, which is extended with a Gaussian mixture model to give intuition in more complex observation windows, such as observations at depth discontinuities.
Abstract: The goal of this article is to explain how several single image defogging methods work using a color ellipsoid framework. The foundation of the framework is the atmospheric dichromatic model which is analogous to the reflectance dichromatic model. A key step in single image defogging is the ability to estimate relative depth. Therefore, properties of the color ellipsoids are tied to depth cues within an image. This framework is then extended using a Gaussian mixture model to account for multiple mixtures which gives intuition in more complex observation windows, such as observations at depth discontinuities which is a common problem in single image defogging. A few single image defogging methods are analyzed within this framework and surprisingly tied together with a common approach in using a dark prior. A new single image defogging method based on the color ellipsoid framework is introduced and compared to existing methods.
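
For readers unfamiliar with the "dark prior" the analysis ties these methods to, the following is a standard dark-channel-style transmission estimate (in the spirit of He et al.), shown only to make that prior concrete; it is not the color ellipsoid method of this paper, and the patch size, omega, and atmospheric-light heuristic are assumptions.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel_transmission(img, patch=15, omega=0.95):
    """Dark-channel-style transmission estimate for a float RGB image in [0, 1].

    The dark channel (minimum over colour channels and a local patch) is low
    for fog-free regions, so 1 - omega * dark_channel approximates the
    transmission, i.e. a relative depth cue.
    """
    dark = minimum_filter(img.min(axis=2), size=patch)     # dark channel
    # crude atmospheric light: mean colour of the 100 brightest dark-channel pixels
    idx = np.unravel_index(np.argsort(dark, axis=None)[-100:], dark.shape)
    A = img[idx].mean(axis=0)
    norm_dark = minimum_filter((img / A).min(axis=2), size=patch)
    t = 1.0 - omega * norm_dark                            # transmission map
    return np.clip(t, 0.1, 1.0)
```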

62 citations


Journal ArticleDOI
TL;DR: A fully automated method for the detection and tracking of elephants in wildlife video which has been collected by biologists in the field is proposed and shows that both near- and far-distant elephants can be detected and tracked reliably.
Abstract: Biologists often have to investigate large amounts of video in behavioral studies of animals. These videos are usually not sufficiently indexed, which makes the finding of objects of interest a time-consuming task. We propose a fully automated method for the detection and tracking of elephants in wildlife video which has been collected by biologists in the field. The method dynamically learns a color model of elephants from a few training images. Based on the color model, we localize elephants in video sequences with different backgrounds and lighting conditions. We exploit temporal clues from the video to improve the robustness of the approach and to obtain spatially and temporally consistent detections. The proposed method detects elephants (and groups of elephants) of different sizes and poses performing different activities. The method is robust to occlusions (e.g., by vegetation) and correctly handles camera motion and different lighting conditions. Experiments show that both near- and far-distant elephants can be detected and tracked reliably. The proposed method gives biologists efficient and direct access to their video collections, which facilitates further behavioral and ecological studies. The method does not impose hard constraints tied specifically to elephants and is thus easily adaptable to other animal species.

60 citations


Journal ArticleDOI
TL;DR: This paper illustrates and exemplifies the good practices to be followed in using machine learning in modeling perceptual mechanisms and proves the ability of ML-based approaches to address visual quality assessment.
Abstract: Objective metrics for visual quality assessment often base their reliability on the explicit modeling of the highly non-linear behavior of human perception; as a result, they may be complex and computationally expensive. Conversely, machine learning (ML) paradigms make it possible to tackle the quality assessment task from a different perspective, as the eventual goal is to mimic quality perception instead of designing an explicit model of the human visual system. Several studies have already proved the ability of ML-based approaches to address visual quality assessment; nevertheless, these paradigms are highly prone to overfitting, and their overall reliability may be questionable. In fact, a prerequisite for successfully using ML in modeling perceptual mechanisms is a profound understanding of the advantages and limitations that characterize learning machines. This paper illustrates and exemplifies the good practices to be followed when using ML in modeling perceptual mechanisms for visual quality assessment.

59 citations


Journal ArticleDOI
TL;DR: A novel face-based matcher composed of a multi-resolution hierarchy of patch-based feature descriptors for periocular recognition - recognition based on the soft tissue surrounding the eye orbit is developed.
Abstract: This work develops a novel face-based matcher composed of a multi-resolution hierarchy of patch-based feature descriptors for periocular recognition - recognition based on the soft tissue surrounding the eye orbit. The novel patch-based framework for periocular recognition is compared against other feature descriptors and a commercial full-face recognition system on a set of four uniquely challenging face corpora. The framework, hierarchical three-patch local binary pattern, is compared against the three-patch local binary pattern and the uniform local binary pattern on the soft tissue area around the eye orbit. Each challenge set was chosen for its particular non-ideal face representations that may be summarized as matching against pose, illumination, expression, aging, and occlusions. The MORPH corpus consists of two mug shot datasets labeled Album 1 and Album 2. The Album 1 corpus is the more challenging of the two due to its incorporation of print photographs (legacy) captured with a variety of cameras from the late 1960s to 1990s. The second challenge dataset is the FRGC still image set. Corpus three, the Georgia Tech face database, is a small corpus but one that contains faces under pose, illumination, expression, and eye region occlusions. The final challenge dataset chosen is the Notre Dame Twins database, which comprises 100 sets of identical twins and 1 set of triplets. The proposed framework reports top periocular performance against each dataset, as measured by rank-1 accuracy: (1) MORPH Album 1, 33.2%; (2) FRGC, 97.51%; (3) Georgia Tech, 92.4%; and (4) Notre Dame Twins, 98.03%. Furthermore, this work shows that the proposed periocular matcher (using only a small section of the face, about the eyes) compares favorably to a commercial full-face matcher.

Journal ArticleDOI
TL;DR: This article proposes an efficient coarse-to-fine dual scale technique for cavity detection in chest radiographs that outperforms other existing techniques with respect to true cavity detection rate and segmentation accuracy.
Abstract: Although many lung disease diagnostic procedures can benefit from computer-aided detection (CAD), current CAD systems are mainly designed for lung nodule detection. In this article, we focus on tuberculosis (TB) cavity detection because of its highly infectious nature. Infectious TB, such as adult-type pulmonary TB (APTB) and HIV-related TB, continues to be a public health problem of global proportion, especially in the developing countries. Cavities in the upper lung zone provide a useful cue to radiologists for potential infectious TB. However, the superimposed anatomical structures in the lung field hinder effective identification of these cavities. In order to address the deficiency of existing computer-aided TB cavity detection methods, we propose an efficient coarse-to-fine dual scale technique for cavity detection in chest radiographs. Gaussian-based matching, local binary pattern, and gradient orientation features are applied at the coarse scale, while circularity, gradient inverse coefficient of variation and Kullback–Leibler divergence measures are applied at the fine scale. Experimental results demonstrate that the proposed technique outperforms other existing techniques with respect to true cavity detection rate and segmentation accuracy.
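
As one example of the fine-scale measures listed above, a simple circularity score for a candidate cavity region might look like the sketch below; the boundary-pixel approximation of the perimeter is a simplification, and the Gaussian matching, gradient inverse coefficient of variation, and Kullback–Leibler measures are not shown.

```python
import numpy as np

def circularity(region_mask):
    """Circularity of a candidate cavity region: 4*pi*area / perimeter^2
    (1.0 for a perfect disc). The perimeter is approximated by counting
    boundary pixels of the binary mask."""
    mask = region_mask.astype(bool)
    area = mask.sum()
    padded = np.pad(mask, 1)
    # interior pixels: all four 4-neighbours lie inside the mask
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = (mask & ~interior).sum()
    return 4 * np.pi * area / (perimeter ** 2 + 1e-12)
```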

Journal ArticleDOI
TL;DR: Using the proposed methodology, a relatively high performance (up to 90%) of affect recognition is obtained and several fusion techniques are used to combine the information extracted from the audio and video contents of music video clips.
Abstract: Nowadays, tags play an important role in the search and retrieval process in multimedia content sharing social networks. As the amount of multimedia contents explosively increases, it is a challenging problem to find a content that will be appealing to the users. Furthermore, the retrieval of multimedia contents, which can match users’ current mood or affective state, can be of great interest. One approach to indexing multimedia contents is to determine the potential affective state, which they can induce in users. In this paper, multimedia content analysis is performed to extract affective audio and visual cues from different music video clips. Furthermore, several fusion techniques are used to combine the information extracted from the audio and video contents of music video clips. We show that using the proposed methodology, a relatively high performance (up to 90%) of affect recognition is obtained.

Journal ArticleDOI
TL;DR: The obtained experimental results show the relevance of the idea of combining the XBee (ZigBee or Wireless Fidelity) protocol, known for its high noise immunity, with hyperchaotic encryption to secure communications.
Abstract: In this paper, we propose and demonstrate experimentally a new wireless digital encryption hyperchaotic communication system based on radio frequency (RF) communication protocols for secure real-time data or image transmission. A reconfigurable hardware architecture is developed to ensure the interconnection between two field programmable gate array development platforms through XBee RF modules. To ensure the synchronization and encryption of data between the transmitter and the receiver, a feedback masking hyperchaotic synchronization technique based on a dynamic feedback modulation has been implemented to digitally synchronize the encrypter hyperchaotic systems. The obtained experimental results show the relevance of the idea of combining the XBee (ZigBee or Wireless Fidelity) protocol, known for its high noise immunity, with hyperchaotic encryption to secure communications. In fact, we have recovered the information data or image correctly after real-time encrypted data or image transmission tests at a maximum distance (indoor range) of more than 30 m and with a maximum digital modulation rate of 625,000 baud, allowing a wireless encrypted video transmission rate of 25 images per second at a spatial resolution of 128 × 128 pixels. The obtained performance of the communication system is suitable for secure data or image transmissions in wireless sensor networks.

Journal ArticleDOI
TL;DR: A high performance face recognition system based on local binary pattern (LBP) using the probability distribution functions (PDFs) of pixels in different mutually independent color channels, which is robust to frontal homogeneous illumination and planar rotation, is proposed.
Abstract: In this article, a high performance face recognition system based on local binary pattern (LBP) using the probability distribution functions (PDFs) of pixels in different mutually independent color channels, which is robust to frontal homogeneous illumination and planar rotation, is proposed. The illumination of faces is enhanced by using a state-of-the-art technique based on discrete wavelet transform and singular value decomposition. After equalization, face images are segmented by using local successive mean quantization transform followed by a skin color-based face detection system. The Kullback–Leibler distance between the concatenated PDFs of a given face obtained by LBP and the concatenated PDFs of each face in the database is used as a metric in the recognition process. Various decision fusion techniques have been used in order to improve the recognition rate. The proposed system has been tested on the FERET, HP, and Bosphorus face databases. The proposed system is compared with conventional and state-of-the-art techniques. The recognition rate obtained using the FVF approach for the FERET database is 99.78%, compared with 79.60% and 68.80% for conventional gray-scale LBP and principal component analysis-based face recognition techniques, respectively.
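
A minimal sketch of the matching metric described above: the Kullback–Leibler distance between a probe's concatenated LBP PDFs and each gallery face. The symmetrized form and the epsilon smoothing are assumptions; the paper may use the one-sided divergence.

```python
import numpy as np

def kl_distance(p, q, eps=1e-10):
    """Symmetrised Kullback-Leibler distance between two concatenated
    LBP histograms, normalised to PDFs before comparison."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def recognise(probe_pdf, gallery_pdfs):
    """Return the index of the gallery face with the smallest KL distance."""
    distances = [kl_distance(probe_pdf, g) for g in gallery_pdfs]
    return int(np.argmin(distances))
```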

Journal ArticleDOI
Yun Tie1, Ling Guan1
TL;DR: An efficient and robust method for facial landmark detection and tracking from video sequences using a kernel correlation analysis approach to find the detection likelihood by maximizing a similarity criterion between the target points and the candidate points.
Abstract: Facial landmarks are a set of salient points, usually located on the corners, tips or mid points of the facial components. Reliable facial landmarks and their associated detection and tracking algorithms can be widely used for representing the important visual features for face registration and expression recognition. In this paper we propose an efficient and robust method for facial landmark detection and tracking from video sequences. We select 26 landmark points on the facial region to facilitate the analysis of human facial expressions. They are detected in the first input frame by the scale invariant feature based detectors. Multiple Differential Evolution-Markov Chain (DE-MC) particle filters are applied for tracking these points through the video sequences. A kernel correlation analysis approach is proposed to find the detection likelihood by maximizing a similarity criterion between the target points and the candidate points. The detection likelihood is then integrated into the tracker’s observation likelihood. Sampling efficiency is improved and minimal amount of computation is achieved by using the intermediate results obtained in particle allocations. Three public databases are used for experiments and the results demonstrate the effectiveness of our method.

Journal ArticleDOI
TL;DR: A camera-based lane departure warning system implemented on a field programmable gate array (FPGA) device used as a driver assistance system, which effectively prevents accidents given that it is endowed with the advantages of FPGA technology, including high performance for digital image processing applications, compactness, and low cost.
Abstract: This paper presents a camera-based lane departure warning system implemented on a field programmable gate array (FPGA) device. The system is used as a driver assistance system, which effectively prevents accidents given that it is endowed with the advantages of FPGA technology, including high performance for digital image processing applications, compactness, and low cost. The main contributions of this work are threefold. (1) An improved vanishing point-based steerable filter is introduced and implemented on an FPGA device. Using the vanishing point to guide the orientation at each pixel, this algorithm works well in complex environments. (2) An improved vanishing point-based parallel Hough transform is proposed. Unlike the traditional Hough transform, our improved version moves the coordinate origin to the estimated vanishing point to reduce storage requirements and enhance detection capability. (3) A prototype based on the FPGA is developed. With improvements in the vanishing point-based steerable filter and vanishing point-based parallel Hough transform, the prototype can be used in complex weather and lighting conditions. Experiments conducted on an evaluation platform and on actual roads illustrate the effective performance of the proposed system.
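
Contribution (2) above can be illustrated with a software sketch of a Hough accumulator whose origin is moved to the estimated vanishing point, so that lane candidates keep small rho values and the accumulator stays compact; the bin counts and rho range are assumptions, and the actual FPGA implementation is of course very different.

```python
import numpy as np

def hough_lines(edge_points, vanishing_point, n_theta=180, n_rho=128, rho_max=200):
    """Hough accumulation with the coordinate origin at the vanishing point.

    Lane boundaries pass near the vanishing point, so their rho values stay
    small and the rho axis of the accumulator can be kept short.
    edge_points: iterable of (y, x) edge pixel coordinates.
    """
    vy, vx = vanishing_point
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    acc = np.zeros((n_rho, n_theta), dtype=np.int32)
    for y, x in edge_points:
        rho = (x - vx) * cos_t + (y - vy) * sin_t          # one rho per theta
        rho_idx = np.round((rho + rho_max) * (n_rho - 1) / (2 * rho_max)).astype(int)
        valid = (rho_idx >= 0) & (rho_idx < n_rho)
        acc[rho_idx[valid], np.nonzero(valid)[0]] += 1
    return acc, thetas
```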

Journal ArticleDOI
TL;DR: A semi-automatic selection process for content sets for subjective experiments will be proposed for three-dimensional testing, a newer field that requires new considerations for scene selection.
Abstract: This paper presents recommended techniques for choosing video sequences for subjective experiments. Subjective video quality assessment is a well-understood field, yet scene selection is often driven by convenience or content availability. Three-dimensional testing is a newer field that requires new considerations for scene selection. The impact of experiment design on best practices for scene selection will also be considered. A semi-automatic selection process for content sets for subjective experiments will be proposed.

Journal ArticleDOI
TL;DR: This work presents a stereo vision-based system that is able to detect bees at the beehive entrance and is sufficiently reliable for tracking, and proposes a detect-before-track approach that employs two innovating methods: hybrid segmentation using both intensity and depth images, and tuned 3D multi-target tracking based on the Kalman filter and Global Nearest Neighbor.
Abstract: In response to recent needs of biologists, we lay the foundations for a real-time stereo vision-based system for monitoring flying honeybees in three dimensions at the beehive entrance. Tracking bees is a challenging task as they are numerous, small, and fast-moving targets with chaotic motion. Contrary to current state-of-the-art approaches, we propose to tackle the problem in 3D space. We present a stereo vision-based system that is able to detect bees at the beehive entrance and is sufficiently reliable for tracking. Furthermore, we propose a detect-before-track approach that employs two innovative methods: hybrid segmentation using both intensity and depth images, and tuned 3D multi-target tracking based on the Kalman filter and Global Nearest Neighbor. Tests on robust ground truths for segmentation and tracking have shown that our segmentation and tracking methods clearly outperform standard 2D approaches.
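
A minimal software sketch of the tracking building blocks named above: a constant-velocity Kalman filter in 3D and a Global Nearest Neighbor assignment solved with the Hungarian algorithm. The noise covariances, frame rate, and gating threshold are assumptions, not the tuned values of the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Constant-velocity Kalman filter in 3D: state = [x, y, z, vx, vy, vz]
dt = 1.0 / 30.0                              # assumed frame rate
F = np.eye(6); F[:3, 3:] = dt * np.eye(3)    # state transition
H = np.hstack([np.eye(3), np.zeros((3, 3))]) # only position is observed
Q = 1e-2 * np.eye(6)                         # process noise (assumed)
R = 1e-1 * np.eye(3)                         # measurement noise (assumed)

def kf_predict(x, P):
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P

def gnn_assign(predicted_positions, detections, gate=0.05):
    """Global Nearest Neighbour: one-to-one track/detection assignment that
    minimises the total Euclidean distance, with an assumed gating threshold."""
    cost = np.linalg.norm(predicted_positions[:, None, :] -
                          detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < gate]
```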

Journal ArticleDOI
TL;DR: It is shown that the developed video tracking system outperforms level set-based systems that do not use prior shape knowledge, working well even where these systems fail.
Abstract: Over the years, maritime surveillance has become increasingly important due to the recurrence of piracy. While surveillance has traditionally been a manual task using crew members in lookout positions on parts of the ship, much work is being done to automate this task using digital cameras coupled with a computer that uses image processing techniques to intelligently track objects in the maritime environment. One such technique is level set segmentation, which evolves a contour to objects of interest in a given image. This method works well but gives incorrect segmentation results when a target object is corrupted in the image. This paper explores the possibility of factoring prior knowledge of a ship’s shape into level set segmentation to improve results, a concept that is unaddressed in the maritime surveillance problem. It is shown that the developed video tracking system outperforms level set-based systems that do not use prior shape knowledge, working well even where these systems fail.

Journal ArticleDOI
TL;DR: In this article, a two-stage method for high density noise suppression while preserving the image details is proposed, where the first stage applies an iterative impulse detector, exploiting the image entropy, to identify the corrupted pixels and then employs an adaptive iterative mean filter to restore them.
Abstract: In this paper, we suggest a general model for the fixed-valued impulse noise and propose a two-stage method for high density noise suppression while preserving the image details. In the first stage, we apply an iterative impulse detector, exploiting the image entropy, to identify the corrupted pixels and then employ an Adaptive Iterative Mean filter to restore them. The filter is adaptive in terms of the number of iterations, which is different for each noisy pixel, according to the Euclidean distance from the nearest uncorrupted pixel. Experimental results show that the proposed filter is fast and outperforms the best existing techniques in both objective and subjective performance measures.
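
The two-stage idea can be sketched as below, with one deliberate simplification: instead of the entropy-based detector of the paper, pixels equal to the assumed impulse values are flagged as corrupted. The iterative mean filling still illustrates why pixels far from clean data need more iterations.

```python
import numpy as np

def restore_impulse_noise(img, noise_values=(0, 255)):
    """Simplified two-stage restoration: flag fixed-valued impulse pixels,
    then repeatedly replace each flagged pixel by the mean of its uncorrupted
    3x3 neighbours until every flagged pixel is filled. Pixels far from clean
    data are filled in later passes, mimicking the distance-adaptive
    iteration count of the adaptive iterative mean filter."""
    out = img.astype(float)
    noisy = np.isin(img, noise_values)
    while noisy.any():
        filled_any = False
        for y, x in zip(*np.nonzero(noisy)):
            y0, y1 = max(0, y - 1), min(img.shape[0], y + 2)
            x0, x1 = max(0, x - 1), min(img.shape[1], x + 2)
            window_ok = ~noisy[y0:y1, x0:x1]
            if window_ok.any():
                out[y, x] = out[y0:y1, x0:x1][window_ok].mean()
                noisy[y, x] = False
                filled_any = True
        if not filled_any:        # safety net for pathological inputs
            break
    return out.astype(img.dtype)
```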

Journal ArticleDOI
TL;DR: Comparative analysis reveals that LBP8,1 is the most suitable texture analysis operator for the proposed system due to its perfect classification performance along with the lowest degree of computational complexity.
Abstract: Fault diagnosis of induction motors in the practical industrial fields is always a challenging task due to the difficulty that lies in exact identification of fault signatures at various motor operating conditions in the presence of background noise produced by other mechanical subsystems. Several signal processing approaches have been adopted so far to mitigate the effect of this background noise in the acquired sensor signal so that fault-related features can be extracted effectively. Addressing this issue, this paper proposes a new approach for fault diagnosis of induction motors utilizing two-dimensional texture analysis based on local binary patterns (LBPs). Firstly, time domain vibration signals acquired from the operating motor are converted into two-dimensional gray-scale images. Then, discriminating texture features are extracted from these images employing LBP operator. These local feature descriptors are later utilized by multi-class support vector machine to identify faults of induction motors. The efficient texture analysis capability as well as the gray-scale invariance property of the LBP operators enables the proposed system to achieve impressive diagnostic performance even in the presence of high background noise. Comparative analysis reveals that LBP8,1 is the most suitable texture analysis operator for the proposed system due to its perfect classification performance along with the lowest degree of computational complexity.
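
A minimal sketch of the signal-to-image-to-LBP pipeline described above, using scikit-image's LBP and an SVM; the image width, LBP parameters, and data variables are assumptions rather than the exact configuration of the paper.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def signal_to_image(signal, width=64):
    """Reshape a 1D vibration signal into a 2D grey-scale image by stacking
    consecutive segments row by row and scaling to [0, 255]."""
    n = (len(signal) // width) * width
    img = np.asarray(signal[:n], dtype=float).reshape(-1, width)
    img = 255 * (img - img.min()) / (img.max() - img.min() + 1e-12)
    return img.astype(np.uint8)

def lbp_feature(img, P=8, R=1):
    """Uniform LBP(P, R) histogram of the vibration image."""
    codes = local_binary_pattern(img, P, R, method="uniform")
    return np.bincount(codes.astype(int).ravel(), minlength=P + 2)

# Hypothetical training setup: 'signals' is a list of vibration recordings,
# 'labels' the corresponding fault classes.
# X = np.array([lbp_feature(signal_to_image(s)) for s in signals])
# clf = SVC(kernel="rbf", decision_function_shape="ovr").fit(X, labels)
```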

Journal ArticleDOI
TL;DR: Experimental results on a database, including 480 retinal images obtained from 40 subjects of DRIVE dataset and 40 subjects from STARE dataset, demonstrated an average true recognition accuracy rate equal to 100% for the proposed method.
Abstract: This paper presents a new human recognition method based on features extracted from retinal images. The proposed method is composed of several steps, including feature extraction, a phase correlation technique, and feature matching for recognition. In the proposed method, the Harris corner detector is used for feature extraction. Then, the phase correlation technique is applied to estimate the rotation angle of head or eye movement in front of a retina fundus camera. Finally, a new similarity function is used to compute the similarity between features of different retina images. Experimental results on a database comprising 480 retinal images obtained from 40 subjects of the DRIVE dataset and 40 subjects of the STARE dataset demonstrated an average true recognition accuracy of 100% for the proposed method. The success rate and the number of images used show the effectiveness of the proposed method in comparison to counterpart methods.
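
For reference, basic phase correlation looks like the sketch below; to estimate a rotation angle as in the method above, the same operation is typically applied to images resampled into polar coordinates, where rotation becomes a shift along the angular axis. This is a generic sketch, not the paper's exact procedure.

```python
import numpy as np

def phase_correlation(a, b):
    """Peak of the inverse FFT of the normalised cross-power spectrum gives
    the integer shift between two images (sign depends on which image is
    taken as the reference)."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    cross = A * np.conj(B)
    corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-12))
    dy, dx = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    # wrap shifts larger than half the image size to negative values
    dy = dy - a.shape[0] if dy > a.shape[0] // 2 else dy
    dx = dx - a.shape[1] if dx > a.shape[1] // 2 else dx
    return dy, dx
```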

Journal ArticleDOI
TL;DR: There is a significant advantage in using SIFT classification, the classification-based attack is robust against different SIFT implementations, and the attack is able to impair a state-of-the-art SIFT-based copy-move detector in realistic cases.
Abstract: Copy-move forgeries are very common image manipulations that are often carried out with malicious intents. Among the techniques devised by the ‘Image Forensic’ community, those relying on scale invariant feature transform (SIFT) features are the most effective ones. In this paper, we approach the copy-move scenario from the perspective of an attacker whose goal is to remove such features. The attacks conceived so far against SIFT-based forensic techniques implicitly assume that all SIFT keypoints have similar properties. On the contrary, we base our attacking strategy on the observation that it is possible to classify them in different typologies. Also, one may devise attacks tailored to each specific SIFT class, thus improving the performance in terms of removal rate and visual quality. To validate our ideas, we propose to use a SIFT classification scheme based on the gray scale histogram of the neighborhood of SIFT keypoints. Once the classification is performed, we then attack the different classes by means of class-specific methods. Our experiments lead to three interesting results: (1) there is a significant advantage in using SIFT classification, (2) the classification-based attack is robust against different SIFT implementations, and (3) we are able to impair a state-of-the-art SIFT-based copy-move detector in realistic cases.

Journal ArticleDOI
TL;DR: A real-time single-pass CCA algorithm that adopts the pixel as a scan unit while the line as a labeling unit and manages the correspondence of labels between adjacent rows by designing a multi-layer-index structure is proposed.
Abstract: Due to the demand for real-time processing in real-time automatic target recognition (RTATR) systems, fast connected components analysis (CCA) is significant for RTATR performance improvement. Conventional single-pass CCA algorithms need horizontal blanking periods to resolve equivalences, which makes them difficult to apply when the streamed data is transmitted without horizontal blanking periods. In this paper, a real-time single-pass CCA algorithm is proposed. Unlike the conventional ones, we adopt the pixel as the scan unit and the line as the labeling unit, and manage the correspondence of labels between adjacent rows by designing a multi-layer-index structure. Equivalences are resolved while the image is being scanned, without extra processing time. The proposed algorithm is suitable for hardware acceleration, and the streamed image data can be processed during image transmission without horizontal blanking periods. Experimental results indicate that the hardware-accelerated algorithm achieves real-time CCA in an RTATR system.

Journal ArticleDOI
Ming Xi1, Lianghao Wang1, Qingqing Yang1, Dongxiao Li1, Ming Zhang1 
TL;DR: A depth-image-based rendering (DIBR) method with spatial and temporal texture synthesis is presented, which combines the temporally stationary scene information extracted from the input video and spatial texture in the current frame to fill the disoccluded areas in the virtual views.
Abstract: A depth-image-based rendering (DIBR) method with spatial and temporal texture synthesis is presented in this article. Theoretically, the DIBR algorithm can be used to generate arbitrary virtual views of the same scene in a three-dimensional television system. But the disoccluded area, which is occluded in the original views and becomes visible in the virtual views, makes it very difficult to obtain high image quality in the extrapolated views. The proposed view synthesis method combines the temporally stationary scene information extracted from the input video and spatial texture in the current frame to fill the disoccluded areas in the virtual views. Firstly, the current texture image and a stationary scene image, which is extracted from the input video, are warped to the same virtual perspective position by the DIBR method. Then, the two virtual images are merged together to reduce the hole regions and maintain the temporal consistency of these areas. Finally, an oriented exemplar-based inpainting method is utilized to eliminate the remaining holes. Experimental results are shown to demonstrate the performance and advantage of the proposed method compared with other view synthesis methods.

Journal ArticleDOI
TL;DR: The experimental results show that the proposed scheme achieves much better performance than the existing lossy compression scheme for pixel-value encrypted images and also similar performance to the state-of-the-art lossy compression for pixel permutation-based encrypted images.
Abstract: Compression of encrypted data draws much attention in recent years due to the security concerns in a service-oriented environment such as cloud computing. We propose a scalable lossy compression scheme for images having their pixel value encrypted with a standard stream cipher. The encrypted data are simply compressed by transmitting a uniformly subsampled portion of the encrypted data and some bitplanes of another uniformly subsampled portion of the encrypted data. At the receiver side, a decoder performs content-adaptive interpolation based on the decrypted partial information, where the received bit plane information serves as the side information that reflects the image edge information, making the image reconstruction more precise. When more bit planes are transmitted, higher quality of the decompressed image can be achieved. The experimental results show that our proposed scheme achieves much better performance than the existing lossy compression scheme for pixel-value encrypted images and also similar performance as the state-of-the-art lossy compression for pixel permutation-based encrypted images. In addition, our proposed scheme has the following advantages: at the decoder side, no computationally intensive iteration and no additional public orthogonal matrix are needed. It works well for both smooth and texture-rich images.
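
A toy sketch of the encoder-side split described above; the interleaved subsampling pattern and the choice of most significant bit planes are assumptions, and the content-adaptive interpolation decoder is not shown.

```python
import numpy as np

def encoder_side(encrypted, keep_bitplanes=4):
    """Illustrative encoder split: one uniformly subsampled portion of the
    encrypted image is kept verbatim, and only the most significant bit
    planes of a second, interleaved portion are transmitted as side
    information for the decoder's interpolation."""
    part_a = encrypted[0::2, 0::2]                 # sent in full
    part_b = encrypted[1::2, 1::2]                 # only some bit planes sent
    planes = [(part_b >> k) & 1 for k in range(8 - keep_bitplanes, 8)]
    return part_a, np.stack(planes, axis=0)
```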

Journal ArticleDOI
TL;DR: Techniques to assess the objective quality for stereoscopic 3D video content related to motion and depth map features are proposed and guidelines are obtained after applying the algorithm to quantify the impact over viewer's experience when common cases happen.
Abstract: In this paper, we propose techniques to assess the objective quality for stereoscopic 3D video content, related to motion and depth map features. An analysis has been carried out in order to understand what causes the generation of visual discomfort in the viewer's eye when visualizing a 3D video. Motion is an important feature affecting 3D experience but is also often the cause of visual discomfort. Guidelines are obtained after applying the algorithm to quantify the impact over viewer's experience when common cases happen, such as high motion sequences, scene changes with abrupt parallax changes, or complete absence of stereoscopy.

Journal ArticleDOI
TL;DR: An effective background initialization and foreground segmentation approach for bootstrapping video sequences is proposed, in which a side-match measure is used to determine whether the background is exposed.
Abstract: In this study, an effective background initialization and foreground segmentation approach for bootstrapping video sequences is proposed. First, a modified block representation approach is used to classify each block of the current video frame into one of four categories, namely, “background,” “still object,” “illumination change,” and “moving object.” Then, a new background updating scheme is developed, in which a side-match measure is used to determine whether the background is exposed. Finally, using the edge information, an improved noise removal and shadow suppression procedure with two morphological operations is adopted to enhance the final segmented foreground. Based on the experimental results obtained in this study, as compared with three comparison approaches, the proposed approach produces better background initialization and foreground segmentation results.

Journal ArticleDOI
TL;DR: A novel methodology for automatic American College of Radiology Breast Imaging Reporting and Data System classification using local binary pattern variance descriptor that characterizes the local density in different types of breast tissue patterns information into the LBP histogram is presented.
Abstract: Mammogram tissue density has been found to be a strong indicator for breast cancer risk. Efforts in computer vision of breast parenchymal patterns have been made in order to improve the diagnostic accuracy of radiologists. Motivated by recent results in mammogram tissue density classification, a novel methodology for automatic American College of Radiology Breast Imaging Reporting and Data System classification using the local binary pattern variance descriptor is presented in this article. The proposed approach characterizes the local density information of different types of breast tissue patterns in the LBP histogram. The performance of the macro-calcification detection method is evaluated using the FARABI database. Performance results are given in terms of receiver operating characteristic curves. The area under the curve of the proposed approach has been found to be 79%.