
Showing papers in "Signal, Image and Video Processing in 2008"


Journal ArticleDOI
TL;DR: Experiments show that the proposed feature selection system improves semantic retrieval performance in image retrieval systems when compared against competing techniques from the literature.
Abstract: In this article, we propose a novel system for feature selection, which is one of the key problems in content-based image indexing and retrieval as well as various other research fields such as pattern classification and genomic data analysis. The proposed system aims at enhancing semantic image retrieval results, decreasing retrieval process complexity, and improving the overall system usability for end-users of multimedia search engines. The feature selection system is built from three feature selection criteria and a decision method. Two novel feature selection criteria based on inner-cluster and intercluster relations are proposed in the article. A majority voting-based method is adapted for efficient selection of features and feature combinations. The performance of the proposed criteria is assessed over a large image database and a number of features, and is compared against competing techniques from the literature. Experiments show that the proposed feature selection system improves semantic performance results in image retrieval systems.
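A minimal sketch of the majority-voting decision step, assuming each criterion is a function that scores one feature against the cluster labels (the paper's inner-cluster and intercluster criteria themselves are not reproduced here):

```python
import numpy as np

def majority_vote_select(X, y, criteria, k=10):
    """Keep the features ranked in the top-k under a majority of criteria.

    X: (n_samples, n_features) feature matrix, y: cluster/class labels,
    criteria: list of scoring functions (higher score = better feature).
    """
    n_features = X.shape[1]
    votes = np.zeros(n_features, dtype=int)
    for crit in criteria:
        scores = np.array([crit(X[:, j], y) for j in range(n_features)])
        votes[np.argsort(scores)[-k:]] += 1      # one vote per criterion
    return np.where(votes > len(criteria) / 2)[0]  # majority wins
```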

49 citations


Journal ArticleDOI
TL;DR: A methodology for semantic indexing and retrieval of images, based on techniques of image segmentation and classification combined with fuzzy reasoning is proposed, which allows for the extraction of rich implicit knowledge used for global image classification.
Abstract: The effective management and exploitation of multimedia documents requires the extraction of the underlying semantics. Multimedia analysis algorithms can produce fairly rich, though imprecise, information about a multimedia document, which most of the time remains unexploited. In this paper we propose a methodology for semantic indexing and retrieval of images, based on techniques of image segmentation and classification combined with fuzzy reasoning. In the proposed knowledge-assisted analysis architecture, a segmentation algorithm first generates a set of over-segmented regions. After that, a region classification process is employed to assign semantic labels using a confidence degree and simultaneously merge regions based on their semantic similarity. This information comprises the assertional component of a fuzzy knowledge base which is used for the refinement of mistakenly classified regions and also for the extraction of rich implicit knowledge used for global image classification. This knowledge about images is stored in a semantic repository, permitting image retrieval and ranking.

47 citations


Journal ArticleDOI
TL;DR: A new threshold is presented for better estimating a signal by sparse transform and soft thresholding and it is shown that, when the number of observations is large, this upper bound is from about twice to four times smaller than the standard upper bounds given for the universal and the minimax thresholds.
Abstract: A new threshold is presented for better estimating a signal by sparse transform and soft thresholding. This threshold derives from a non-parametric statistical approach dedicated to the detection of a signal with unknown distribution and unknown probability of presence in independent and additive white Gaussian noise. This threshold is called the detection threshold and is particularly appropriate for selecting the few observations, provided by the sparse transform, whose amplitudes are sufficiently large to consider that they contain information about the signal. An upper bound for the risk of the soft thresholding estimation is computed when the detection threshold is used. For a wide class of signals, it is shown that, when the number of observations is large, this upper bound is from about twice to four times smaller than the standard upper bounds given for the universal and the minimax thresholds. Many real-world signals belong to this class, as illustrated by several experimental results.
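Soft thresholding itself is a one-line shrinkage rule; the sketch below pairs it with the classical universal threshold for comparison (the paper's detection threshold comes from a non-parametric detection argument and is not reproduced here):

```python
import numpy as np

def soft_threshold(coeffs, t):
    """Shrink sparse-transform coefficients toward zero by t."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

def universal_threshold(sigma, n):
    """Donoho-Johnstone universal threshold, the standard baseline."""
    return sigma * np.sqrt(2.0 * np.log(n))
```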

30 citations


Journal ArticleDOI
TL;DR: A new gradient-based optimal operator dedicated to accurate estimation of the direction toward the axis of cylindrical objects when this axis coincides with one of the mask reference axes is presented.
Abstract: This paper introduces low-level operators in the context of detecting the axes of cylindrical objects in 3D images. Knowing the axis of a cylinder is particularly useful since its location, length and curvature derive from this knowledge. This paper introduces a new gradient-based optimal operator dedicated to accurate estimation of the direction toward the axis. The operator relies on Finite Impulse Response filters. The approach is presented first in a 2D context, thus providing optimal gradient masks for locating the center of circular objects. Then, a 3D extension is provided, allowing the exact estimation of the orientation toward the axis of cylindrical objects when this axis coincides with one of the mask reference axes. Applied to more general cylinders and to noisy data, the operator still provides accurate estimation and outperforms classical gradient operators.

24 citations


Journal ArticleDOI
TL;DR: An improved dynamic programming (DP) segmentation technique for detecting the intima-media layer of the far wall of the common carotid artery (CCA) of longitudinal and transversal ultrasound (US) images using optimal search technique is presented.
Abstract: An improved dynamic programming (DP) segmentation technique for detecting the intima-media layer of the far wall of the common carotid artery (CCA) in longitudinal and transversal ultrasound (US) images using an optimal search technique is presented here. The algorithm incorporates normalization and smoothing steps for estimating the intima-media thickness (IMT) of normal and abnormal subjects. The segmentation features of different subjects obtained using the proposed technique have been compared with manual measurements. The results show an inter-observer error of ±0.035 mm and a coefficient of variation of 3.55%. The magnitudes of the IMT values have been used to explore the rate of prediction of blockage existing in cerebrovascular and cardiovascular pathologies, and also hypertension and atherosclerosis.
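The core of such a DP segmentation is a minimum-cost path traced across image columns under a smoothness constraint. A hypothetical sketch, assuming a precomputed cost map that is low where an interface edge is likely:

```python
import numpy as np

def dp_boundary(cost):
    """Trace one interface (e.g. the lumen-intima edge) as a minimum-cost
    path that visits one row per column and moves at most one row between
    neighbouring columns, which keeps the detected boundary smooth.
    """
    rows, cols = cost.shape
    acc = cost.copy()
    back = np.zeros((rows, cols), dtype=int)
    for c in range(1, cols):
        for r in range(rows):
            lo, hi = max(0, r - 1), min(rows, r + 2)
            prev = lo + int(np.argmin(acc[lo:hi, c - 1]))
            back[r, c] = prev
            acc[r, c] += acc[prev, c - 1]
    path = [int(np.argmin(acc[:, -1]))]          # cheapest endpoint
    for c in range(cols - 1, 0, -1):
        path.append(back[path[-1], c])
    return path[::-1]                            # boundary row per column
```

The IMT would then follow as the mean vertical distance between two such paths (lumen-intima and media-adventitia), converted to millimetres by the pixel spacing.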

23 citations


Journal ArticleDOI
TL;DR: This paper investigates the use of 2D discrete wavelet transform (2D-DWT) for the compression of omnidirectional 3D integral images (OII) and achieves better rate-distortion performance and reconstructs the images with much better image quality at very low bit rates than previously reported 3D- DCT-based scheme.
Abstract: Three-dimensional (3D) integral imaging is a method that allows the display of full colour images with continuous parallax within a wide viewing zone. Due to the significant quantity of data required to represent a captured 3D integral image with high resolution, image compression becomes mandatory for the storage and transmission of integral images. This paper investigates the use of the 2D discrete wavelet transform (2D-DWT) for the compression of omnidirectional 3D integral images (OII). The method requires the extraction of different viewpoint images from the integral image. A single viewpoint image is constructed by extracting one pixel from each microlens, then each viewpoint image is decomposed using a 2D-DWT. The resulting array of coefficients contains several frequency bands. The lower frequency bands of the viewpoint images are assembled and compressed using a 3D discrete cosine transform (3D-DCT) followed by Huffman coding. The remaining higher bands, in contrast, are fed directly into a quantisation process followed by arithmetic coding. Simulations are performed on a set of several grey level 3D OII using a uniform scalar quantizer with deadzone. It was found that the algorithm achieves better rate-distortion performance and reconstructs the images with much better image quality at very low bit rates than a previously reported 3D-DCT-based scheme.
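Viewpoint extraction is a simple strided gather; a sketch under the assumption of square m x m pixel microlenses:

```python
import numpy as np

def extract_viewpoints(integral_img, m):
    """Split an integral image into m*m viewpoint images; viewpoint (u, v)
    collects pixel (u, v) from every microlens.
    """
    assert integral_img.shape[0] % m == 0 and integral_img.shape[1] % m == 0
    return [integral_img[u::m, v::m] for u in range(m) for v in range(m)]
```

Each viewpoint image could then be decomposed with any 2D-DWT routine, e.g. `pywt.dwt2(vp, 'bior4.4')`, though the paper's exact wavelet choice is not specified here.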

20 citations


Journal ArticleDOI
TL;DR: An innovative concept of generating three-dimensional interactive multimedia educational games that combine the excitement and looks of popular computer games with the educational potential of e-learning is described, together with the concept’s realization by a software system called S.M.I.L.E.: Smart Multipurpose Interactive Learning Environment.
Abstract: Education accompanies us throughout our whole life. Many innovations in education have originated from modern technologies. However, the majority of learners—especially children or teenagers—find studying from electronic educational sources and web-based information systems less exciting than playing today’s popular computer games that, conversely, lack signs of education. In this paper, we describe an innovative concept of generating three-dimensional interactive multimedia educational games that combine the excitement and looks of popular computer games with the educational potential of e-learning, and the concept’s realization by a software system called S.M.I.L.E.: Smart Multipurpose Interactive Learning Environment. One of its key features is the automatic generation of games based on a model created by teachers, without requiring them to be familiar with programming or game design. Moreover, we consider various learners’ abilities and features that enable different users (including handicapped ones) to learn effectively by playing educational games easily created by teachers. We follow the idea that everyone needs access to quality education and are convinced that by enabling cooperative education not just among learners, but also between handicapped and able-bodied ones, we bring the humane dimension into education.

20 citations


Journal ArticleDOI
TL;DR: A cross-layer model—formulated using interoperable description formats—for the adaptation of scalable H.264/MPEG-4 AVC content in a video streaming system operating on a Wireless LAN access network without QoS mechanisms performs quite well in adapting to limited bandwidth and varying network conditions.
Abstract: This paper presents a cross-layer model—formulated using interoperable description formats—for the adaptation of scalable H.264/MPEG-4 AVC (i.e., SVC) content in a video streaming system operating on a Wireless LAN access network without QoS mechanisms. SVC content adaptation on the server takes place on the application layer using an adaptation process compliant with the MPEG-21 Digital Item Adaptation (DIA) standard, based on input comprised of MPEG-21 DIA descriptions of content and usage environment parameters. The latter descriptions integrate information from different layers, e.g., device characteristics and packet loss rate, in an attempt to increase the interoperability of this cross-layer model, thus making it applicable to other models. For the sake of deriving model parameters, performance measurements from two wireless access point models were taken into account. Throughout the investigation it emerged that the behavior of the system strongly depends on the access point. Therefore, we investigated the use of end-to-end-based rate control algorithms for steering the content adaptation. Simulations of rate adaptation algorithms were subsequently performed, leading to the conclusion that a TFRC-based (TCP-Friendly Rate Control) adaptation technique performs quite well in adapting to limited bandwidth and varying network conditions. In the paper we demonstrate how TFRC-based content adaptation can be realized using MPEG-21 tools.
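TFRC steers the adaptation by bounding the sending rate with the TCP throughput equation from RFC 3448; a sketch of that equation, with all parameters supplied by the caller:

```python
from math import sqrt

def tfrc_rate(s, rtt, p, t_rto=None, b=1):
    """TCP-friendly sending rate in bytes/s (throughput equation, RFC 3448).

    s: packet size in bytes, rtt: round-trip time in seconds,
    p: loss event rate, b: packets acknowledged per ACK.
    """
    if p <= 0:
        return float('inf')          # no loss observed: not equation-limited
    t_rto = t_rto if t_rto is not None else 4 * rtt
    denom = (rtt * sqrt(2 * b * p / 3)
             + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denom
```

The server would then pick the SVC layer combination whose bitrate stays below this bound as the `rtt` and `p` estimates are updated.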

19 citations


Journal ArticleDOI
TL;DR: A novel region-based retrieval framework—dynamic region matching (DRM)—is proposed to bridge the semantic gap in content-based image retrieval, which arises from two aspects: irrelevant visual contents, and unsupervised feature extraction and similarity ranking methods that cannot accurately reveal users’ image perception.
Abstract: This paper considers the semantic gap in content-based image retrieval from two aspects: (1) irrelevant visual contents (e.g. background) scatter the mapping from image to human perception; (2) unsupervised feature extraction and similarity ranking methods cannot accurately reveal users’ image perception. This paper proposes a novel region-based retrieval framework—dynamic region matching (DRM)—to bridge the semantic gap. (1) To address the first issue, a probabilistic fuzzy region matching algorithm is adopted to retrieve and match images precisely at the object level, which copes with the problem of inaccurate segmentation. (2) To address the second issue, a “FeatureBoost” algorithm is proposed to construct an effective “eigen” feature set in the relevance feedback (RF) process, and the significance of each region is dynamically updated in RF learning to automatically capture the user’s region of interest (ROI). (3) The user’s retrieval purpose is predicted using a novel log-learning algorithm, which predicts the retrieval target in the feature space using the accumulated user operations. Extensive experiments have been conducted on the Corel image database with over 10,000 images. The promising experimental results reveal the effectiveness of our scheme in bridging the semantic gap.
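A hypothetical sketch of the region-matching idea, with each image reduced to per-region feature vectors and per-region significance weights (a simplification of the paper's probabilistic fuzzy matching):

```python
import numpy as np

def region_match_distance(feats_a, w_a, feats_b):
    """Soft region-to-region distance between two images.

    feats_a: (na, d) region features of the query, w_a: (na,) learned
    region significances, feats_b: (nb, d) region features of a candidate.
    Each query region is matched to its closest candidate region, so
    significant regions dominate the ranking.
    """
    d = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=2)
    return float(np.sum(w_a * d.min(axis=1)))
```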

18 citations


Journal ArticleDOI
TL;DR: A text-to-speech synthesis system for modern standard Arabic based on artificial neural networks and a residual-excited LPC coder is described, together with a residual-excited all-pole vocal tract model and a prosodic-information synthesizer based on neural networks.
Abstract: Text-to-speech conversion has traditionally been performed either by concatenating short samples of speech or by using rule-based systems to convert a phonetic representation of speech into an acoustic representation, which is then converted into speech. This paper describes a text-to-speech synthesis system for modern standard Arabic based on artificial neural networks and residual excited LPC coder. The networks offer a storage-efficient means of synthesis without the need for explicit rule enumeration. These neural networks require large prosodically labeled continuous speech databases in their training stage. As such databases are not available for the Arabic language, we have developed one for this purpose. Thus, we discuss various stages undertaken for this development process. In addition to interpolation capabilities of neural networks, a linear interpolation of the coder parameters is performed to create smooth transitions at segment boundaries. A residual-excited all pole vocal tract model and a prosodic-information synthesizer based on neural networks are also described in this paper.
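The smoothing at segment boundaries amounts to a linear blend of coder parameters; a minimal sketch, assuming the parameters are stored as per-segment vectors:

```python
import numpy as np

def interpolate_coder_params(params_a, params_b, n_frames):
    """Linearly interpolate LPC coder parameter vectors across a segment
    boundary, producing one parameter set per transition frame.
    """
    w = np.linspace(0.0, 1.0, n_frames)[:, None]   # blend weights per frame
    return (1 - w) * params_a + w * params_b
```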

17 citations


Journal ArticleDOI
TL;DR: An extension of this approach using the fractional Fourier transform (FRFT), referred to here as the method of alternating projections in the FRFT domains (MAPFD), is proposed; it is shown that the mean square error (MSE) between the true signal and the extrapolated signal obtained from the given signal is a function of the angle parameter of the FRFT.
Abstract: The need for extrapolation of signals in the time domain or frequency domain often arises in many applications in the area of signal and image processing. One of the approaches used for the extrapolation of signals is the method of alternating projections (MAP) in conventional Fourier domains (CFD). Here we propose an extension of this approach using the fractional Fourier transform (FRFT), referred to here as the method of alternating projections in the FRFT domains (MAPFD). It is shown through the simulation results that the mean square error (MSE) between the true signal and the extrapolated signal obtained from the given signal is a function of the angle parameter of the FRFT, and the MAPFD gives lower MSE than the MAP in the CFD for the class of signals bandlimited in the FRFT domains, e.g., chirp signals. Moreover, the performance of the extrapolation using the MAPFD is shown to be shift-variant along the time axis.
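For orientation, the classical MAP (Papoulis-Gerchberg) iteration in the ordinary Fourier domain is sketched below; the paper's MAPFD would replace the FFT pair by a fractional Fourier transform of a chosen angle, for which NumPy has no built-in routine:

```python
import numpy as np

def map_extrapolate(known, mask, band, n_iter=200):
    """Alternating projections: enforce band-limitation in the Fourier
    domain, then restore the observed samples in the time domain.

    known: length-N signal (values outside `mask` are ignored),
    mask:  boolean array, True where samples were observed,
    band:  boolean array, True on the frequency bins to retain.
    """
    x = np.where(mask, known, 0.0)
    for _ in range(n_iter):
        X = np.fft.fft(x)
        X[~band] = 0.0                  # project onto band-limited signals
        x = np.fft.ifft(X).real
        x[mask] = known[mask]           # project onto data-consistent signals
    return x
```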

Journal ArticleDOI
TL;DR: It is demonstrated, via simulation results, that the pre-processed compression procedure is computationally efficient and can significantly enhance detection performance.
Abstract: This paper deals with the constant false alarm rate (CFAR) radar detection of targets embedded in Pearson-distributed clutter. We develop new CFAR detection algorithms, notably cell averaging (CA), greatest-of selection (GO) and smallest-of selection (SO) CFAR, operating on Pearson measurements, based on a non-linear compression method for spiky clutter reduction. The technique is similar to that used in non-uniform quantization, where a different law is used. It consists of compressing the noisy output of the square-law detector with respect to a non-linear law in order to reduce the effect of the impulsive noise level. Thus, it can be used as a pre-processing step to improve the performance of automatic target detection, especially at lower generalised signal-to-noise ratios (GSNR). The performance characteristics of the proposed CFAR detectors are presented for different values of the compression parameter. We demonstrate, via simulation results, that the pre-processed compression procedure is computationally efficient and can significantly enhance detection performance.
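A hypothetical sketch of the pipeline, using a logarithmic compressor as a stand-in for the paper's compression law and a plain cell-averaging CFAR stage (threshold factor and window sizes are illustrative):

```python
import numpy as np

def compressed_ca_cfar(x, guard=2, train=16, scale=4.0):
    """Compress the square-law detector output, then run CA-CFAR on it."""
    z = np.log1p(x)                       # assumed non-linear compression law
    n = len(z)
    detections = np.zeros(n, dtype=bool)
    for i in range(train + guard, n - train - guard):
        lead = z[i - guard - train : i - guard]
        lag = z[i + guard + 1 : i + guard + 1 + train]
        noise = (lead.sum() + lag.sum()) / (2 * train)  # CA noise estimate
        detections[i] = z[i] > scale * noise
    return detections
```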

Journal ArticleDOI
TL;DR: The use of a genetic algorithm (GA) tool for the solution of distributed constant false alarm rate (CFAR) detection for Weibull clutter statistics is considered, and an approximate expression for the probability of detection of the ordered statistics CFAR (OS-CFAR) detector in Weibull clutter is derived.
Abstract: The use of a genetic algorithm (GA) tool for the solution of distributed constant false alarm rate (CFAR) detection for Weibull clutter statistics is considered. An approximate expression for the probability of detection (PD) of the ordered statistics CFAR (OS-CFAR) detector in Weibull clutter is derived. Optimal threshold values of distributed maximum likelihood CFAR (ML-CFAR) detectors and distributed OS-CFAR detectors with a known shape parameter of the background statistics are obtained using the GA tool. For distributed ML-CFAR detection, we also consider the case where the shape parameter of the Weibull distribution is unknown. A performance assessment is carried out, and the results are compared and given as a function of the shape parameter and of system parameters.

Journal ArticleDOI
TL;DR: This work proposes a novel cross-layer monitoring architecture that utilizes a new Network QoS to PQoS mapping framework at the application level and aims to provide perceived service performance verification with respect to the QoS guarantees that have been specified in contractual agreements between providers and end-users.
Abstract: One of the future visions of multimedia networking is the provision of multimedia content at a variety of quality and price levels. Of the many approaches to this issue, one of the most predominant techniques is the concept of Perceived Quality of Service (PQoS), which extends the traditional engineering-based QoS concept to the perceptual satisfaction that the user receives from the reception of multimedia content. In this context, PQoS monitoring is becoming crucial to media service providers (SPs) for providing not only quantified PQoS-based services, but also service assurance based on multimedia content adaptation across heterogeneous networks. This work proposes a novel cross-layer monitoring architecture that utilizes a new Network QoS (NQoS) to PQoS mapping framework at the application level. The resulting QoS monitoring should allow the content delivery system to take sophisticated actions for real time media content adaptation, and aims to provide perceived service performance verification with respect to the QoS guarantees that have been specified in contractual agreements between providers and end-users. A subsequent performance evaluation of the proposed model conducted using a real test-bed environment demonstrates both the accuracy and feasibility of the network level measurements, the NQoS to PQoS mapping and the overall feasibility of the proposed end-to-end monitoring solution.

Journal ArticleDOI
TL;DR: Experimental results show that, by using the RA algorithm, the number of bit requests over the feedback channel—and hence the decoder complexity and latency—is significantly reduced, while a very near-to-optimal rate-distortion performance is maintained.
Abstract: In some video coding applications, it is desirable to reduce the complexity of the video encoder at the expense of a more complex decoder. Wyner–Ziv (WZ) video coding is a new paradigm that aims to achieve this. To allocate a proper number of bits to each frame, most WZ video coding algorithms use a feedback channel, which allows the decoder to request additional bits when needed. However, due to these multiple bit requests, the complexity and the latency of WZ video decoders increase massively. To overcome these problems, in this paper we propose a rate allocation (RA) algorithm for pixel-domain WZ video coders. This algorithm estimates at the encoder the number of bits needed for the decoding of every frame while still keeping the encoder complexity low. Experimental results show that, by using our RA algorithm, the number of bit requests over the feedback channel—and hence, the decoder complexity and the latency—are significantly reduced. Meanwhile, a very near-to-optimal rate-distortion performance is maintained.

Journal ArticleDOI
TL;DR: This paper proposes an improvement of the threshold optimization in distributed ordered statistics constant false alarm rate and censored mean level detectors using Evolutionary Strategies (ESs), namely a (μ + λ) evolution strategy in which self-adaptive mutation is used.
Abstract: This paper proposes an improvement of the threshold optimization in distributed ordered statistics constant false alarm rate and censored mean level detectors using Evolutionary Strategies (ESs). The target is assumed to be Rayleigh distributed and the observations are independent from sensor to sensor. Two fusion rules, “AND” and “OR”, were considered. An ES was tested and a comparison with a genetic algorithm improved by tournament selection was also analyzed. Among the variety of evolution strategies, the most popular proposed in the literature are the (μ, λ) strategy and the (μ + λ) strategy. We propose a (μ + λ) evolution strategy in which self-adaptive mutation is used. The results showed that, although the ES is more difficult to implement and is in a certain manner slower than the GA, it improves the performance of the system.
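A minimal (μ + λ) evolution strategy with self-adaptive mutation, sketched for an arbitrary fitness function (the detection-threshold cost itself is not reproduced here):

```python
import numpy as np

def es_mu_plus_lambda(f, x0, mu=5, lam=20, sigma0=0.5, gens=100, rng=None):
    """Minimize f with a (mu + lambda) ES; each individual carries its own
    step size, which is log-normally mutated before the object variables
    (self-adaptation), and parents compete with offspring for survival.
    """
    rng = rng if rng is not None else np.random.default_rng()
    dim = len(x0)
    tau = 1.0 / np.sqrt(2 * dim)                 # learning rate for sigma
    pop = [(x0 + rng.normal(0, sigma0, dim), sigma0) for _ in range(mu)]
    for _ in range(gens):
        offspring = []
        for _ in range(lam):
            x, s = pop[rng.integers(mu)]
            s_new = s * np.exp(tau * rng.normal())          # mutate step size
            offspring.append((x + s_new * rng.normal(0, 1, dim), s_new))
        pop = sorted(pop + offspring, key=lambda ind: f(ind[0]))[:mu]  # plus-selection
    return pop[0][0]                             # best threshold vector found
```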

Journal ArticleDOI
TL;DR: Modifications and extensions are presented of the designs and architectures of wavelet packet based synthesis and analysis pairs of filter bank trees (Sablatash and Lodge in Digital Signal Process 13:58–92, 2003) that can be used as transmultiplexers; these exhibit a number of advantages over the previous designs and address three of their shortcomings.
Abstract: A review is presented first of the evolution of transmultiplexers since about 1966, in the context of a long progression of theoretical advances and developments leading to recent proposals to fundamentally improve OFDM-type systems using principles of perfect reconstruction filter (PRF) banks. The equivalence of transmultiplexers to OFDM-type multi-user systems is discussed. The desirable goals for performance and implementation of transmultiplexers or multiband, multiuser communication systems that are addressed and met in this paper using filter bank trees are set down. Then modifications and extensions are presented of the designs and architectures of wavelet packet based synthesis and analysis pairs of filter bank trees (Sablatash and Lodge in Digital Signal Process 13:58–92, 2003) that can be used as transmultiplexers. These exhibit a number of advantages over the previous designs and address three shortcomings of the designs used to illustrate basic principles in Sablatash and Lodge (Digital Signal Process 13:58–92, 2003). The first of these is the asymmetry of the magnitude frequency responses of the multiplexer channels, which is addressed using a symmetric design for a lowpass and highpass quadrature mirror filter (QMF) pair described herein. The second is the problem of minimizing the total delay of the signal in passing through the analysis and synthesis filter banks. This is addressed using an architecture involving DFT polyphase synthesis filter banks to replace the wideband VSB filters at the roots of the two identical synthesis filter bank trees, at the cost of the multiplexer having fewer levels. In this way, lower delay and complexity are traded off against fewer levels of bandwidth on demand. At the receiver, matching DFT polyphase analysis filters and the other matching analysis filters are implemented. The third shortcoming is the difficulty of designing a synchronization scheme when the filters in the synthesis and analysis filter banks have non-linear phase. This is addressed by designing linear-phase filters that greatly ease and improve the design of the synchronization scheme; exact perfect reconstruction is lost, but the ISI is not affected to any significant degree for communication purposes. The relationship of this work to, and its advantages over, recent research studies and IEEE 802.22 standards proposals that use PR filter banks for multi-user systems to greatly improve on OFDM systems is discussed.

Journal ArticleDOI
TL;DR: With arithmetic coding of codevector indexes and a Laplacian smoothing operator, DRCVQ outperforms the state-of-the-art Wavemesh for non-smooth meshes while performing slightly worse for smooth meshes, and an MPS codevector search acceleration scheme makes the compression real-time.
Abstract: The transmission and storage of large amounts of vertex geometry data are required for rendering geometrically detailed 3D models. To alleviate bandwidth requirements, vector quantisation (VQ) is an effective lossy vertex data compression technique for triangular meshes. This paper presents a novel vertex encoding algorithm using dynamically restricted codebook-based vector quantisation (DRCVQ). In DRCVQ, a parameter is used to control the encoding quality so as to reach the desired compression rate over a range with only one codebook, instead of using different levels of codebooks for different compression rates. During the encoding process, the indexes of the previously encoded residual vectors which have high correlation with the current input vector are prestored in a FIFO, so that both the codevector search range and the bit rate are reduced on average. The proposed scheme also incorporates a very effective Laplacian smoothing operator. Simulation results show that, for mesh models of various sizes, DRCVQ reduces PSNR degradation by about 2.5–6 dB at 10 bits per vertex compared to the conventional vertex encoding method with stationary codebooks, and DRCVQ with arithmetic coding of codevector indexes and Laplacian smoothing can outperform the state-of-the-art Wavemesh for non-smooth meshes while performing slightly worse for smooth meshes. In addition, we use MPS as a codevector search acceleration scheme so that the compression runs in real time.
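A sketch of the FIFO idea, with an assumed acceptance threshold deciding when the restricted search suffices (both the threshold and the FIFO length are illustrative):

```python
from collections import deque
import numpy as np

def fifo_vq_encode(vectors, codebook, fifo_len=32, accept=0.1):
    """Encode residual vectors, trying recently used codevector indexes
    first and falling back to an exhaustive codebook search otherwise.
    """
    fifo = deque(maxlen=fifo_len)
    indexes = []
    for v in vectors:
        best = None
        if fifo:
            cand = np.array(fifo)
            d = np.linalg.norm(codebook[cand] - v, axis=1)
            j = int(np.argmin(d))
            if d[j] < accept:                  # restricted search succeeded
                best = int(cand[j])
        if best is None:                       # fall back to full search
            best = int(np.argmin(np.linalg.norm(codebook - v, axis=1)))
        fifo.append(best)
        indexes.append(best)
    return indexes
```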

Journal ArticleDOI
TL;DR: Three specific rational-coefficient BWFBs with attractive features are obtained by adjusting the parameters and are systematically verified to exhibit performance competitive with several state-of-the-art BWFBs for image compression, yet require lower computational costs.
Abstract: We previously presented a simple technique, based on the theory of Diophantine equations, for the parametrization of popular biorthogonal wavelet filter banks (BWFBs) having linear phase and an arbitrary multiplicity of vanishing moments (VMs), and constructed a type of parametric BWFB with one free parameter [15]. Here we generalize this technique to the case of two parameters, and construct a type of parametric BWFB with two free parameters. The closed-form parameter expressions of the BWFBs are derived, with which any two-parameter family of BWFBs having preassigned VMs can be constructed; six families, i.e., the 9/11, 10/10, 13/11, 10/14, 17/11, and 10/18 families, are considered here. Two parameters provide two degrees of freedom to optimize the resulting BWFBs with respect to other criteria. In particular, in each family, three specific rational-coefficient BWFBs with attractive features are obtained by adjusting the parameters: the first is not only very close to a quadrature mirror filter (QMF) bank, but also has optimum coding gain; the second possesses characteristics that are close to the irrational BWFB with maximum VMs by Cohen et al.; and the last, which has binary coefficients, can realize a multiplication-free discrete wavelet transform. In addition, two of the BWFBs are systematically verified to exhibit performance competitive with several state-of-the-art BWFBs for image compression, yet require lower computational costs.

Journal ArticleDOI
TL;DR: Two separate strategies for fast codebook searching are developed by exploiting the properties of SOM and then combined to develop the proposed method for improved overall performance, which is demonstrated with spatial vector quantization of gray-scale images.
Abstract: We propose a novel method for fast codebook searching in self-organizing map (SOM)-generated codebooks. This method performs a non-exhaustive search of the codebook to find a good match for an input vector. When performing an exhaustive search in a large codebook with high-dimensional vectors, the encoder faces a significant computational barrier. Due to its topology preservation property, SOM holds good promise for fast codebook searching, an aspect that has remained largely unexploited to date. In this paper we first develop two separate strategies for fast codebook searching by exploiting the properties of SOM, and then combine these strategies to develop the proposed method for improved overall performance. Though the method is general enough to be applied to any signal domain, in the present paper we demonstrate its efficacy with spatial vector quantization of gray-scale images.
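One way topology preservation can be exploited, sketched under assumptions: consecutive image blocks are correlated, so the search can be restricted to the map neighbourhood of the previous winner (an illustration of the principle, not the paper's exact strategies):

```python
import numpy as np

def som_restricted_search(codebook, grid, query, start, radius=2):
    """Search only the SOM units whose map position lies within `radius`
    (Chebyshev distance) of a predicted winner `start`, e.g. the winner
    of the previous, correlated input vector.

    codebook: (n_units, dim) weight vectors, grid: (n_units, 2) integer
    map coordinates of each unit.
    """
    near = np.where(np.abs(grid - grid[start]).max(axis=1) <= radius)[0]
    d = np.linalg.norm(codebook[near] - query, axis=1)
    return int(near[np.argmin(d)])      # best match inside the neighbourhood
```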

Journal ArticleDOI
TL;DR: A new method for extracting the system phase from the bispectrum of the system output has been proposed, based on the complete bispectral data computed in the frequency domain and modified group delay.
Abstract: In this paper, a new method for extracting the system phase from the bispectrum of the system output is proposed. It is based on the complete bispectral data computed in the frequency domain and the modified group delay. The frequency-domain bispectrum computation improves the frequency resolution, and the modified group delay reduces the variance while preserving the frequency resolution. The use of the full bispectral data also reduces the variance, as it is used for averaging. For the proposed method, at a signal-to-noise ratio of 5 dB, the root mean square error is reduced by a factor of about 1.5–7 compared with the other methods considered.

Journal ArticleDOI
TL;DR: A variational Bayesian algorithm is derived, in which the number of sources will be estimated through two approaches: automatic relevance determination and comparison of the optimized negative free energy.
Abstract: Most traditional multichannel blind deconvolution algorithms rely on some assumptions on the mixing model, e.g. the number of sources is known a priori; and the mixing environment is noise-free. Unfortunately, these assumptions are not necessarily true in practice. In this paper, we will relax the assumption placed on the number of sources by studying a state space mixing model where the number of sources is assumed to be unknown but not greater than the number of sensors. Based on this mixing model, we will formulate the estimation of the number of sources problem as a model order selection problem. Model comparison, as a common method of model order selection, usually involves the evaluation of multi-variable integrals which is computationally intractable. A variational Bayesian method is therefore used to overcome this multi-variable integral issue. The problem is solved by approximating the true, complicated posteriors with a set of independent, simple, tractable posteriors. To realize the objective of optimal approximation, we maximize an objective function called negative free energy. We will derive a variational Bayesian algorithm, in which the number of sources will be estimated through two approaches: automatic relevance determination and comparison of the optimized negative free energy. The proposed variational Bayesian algorithm will be evaluated on both artificially generated examples, and practical signals.

Journal ArticleDOI
TL;DR: It is demonstrated how iterative development and evaluation can reveal areas where interface adaptation can play a useful role in enhancing the system’s usability.
Abstract: This paper describes the process of iterative design and comprehensive evaluation of the Meeting Miner, a tool for browsing of recorded multimedia meetings. The Meeting Miner provides access to the speech content of recordings by tracking non-verbal interaction events collected in real time during online collaborative meeting activities. It emphasises semantic relationships between speech and discrete actions performed during the meeting and aggregates information through patterns of co-location and co-occurrence of actions. We report on the experience gained through developing functionality to enhance the user’s browsing experience, requirements regarding information feedback and the importance of flexibility in the browsing tool. In particular, we demonstrate how iterative development and evaluation can reveal areas where interface adaptation can play a useful role in enhancing the system’s usability.

Journal ArticleDOI
TL;DR: A novel dilation-run image coding algorithm is developed by taking advantage of both schemes, in which the clustered significant coefficients are extracted by using the morphological dilation operation and the insignificant coefficients between the extracted clusters are coded by using the run-length coding method.
Abstract: The run-length coding and the morphological representation are two classical schemes for wavelet image coding. The run-length coders have the advantage of simplicity, recording the lengths of zero-runs between significant wavelet coefficients, but at the expense of an inferior rate-distortion performance. The morphology-based coders, on the other hand, utilize the morphological dilation operation to delineate the clusters of significant coefficients, improving coding performance. In this paper, a novel dilation-run image coding algorithm is developed by taking advantage of both schemes: the clustered significant coefficients are extracted by using the morphological dilation operation, and the insignificant coefficients between the extracted clusters are coded by using the run-length coding method. The proposed dilation-run image coder is implemented in the framework of bitplane coding for producing embedded bitstreams. Compared with several state-of-the-art wavelet image coding methods, the proposed dilation-run image coding method achieves comparable rate-distortion performance, and is especially attractive for fingerprint-type imagery.
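A sketch of the dilation step on one bitplane, assuming a 3 x 3 structuring element (the run-length pass over the remaining gaps is omitted):

```python
import numpy as np
from scipy.ndimage import binary_dilation

def dilation_clusters(coeffs, t):
    """Grow seeds of significant wavelet coefficients into clusters; the
    positions marked True are coded as clusters, and the zero-runs between
    them would then be run-length coded.
    """
    significant = np.abs(coeffs) >= t
    return binary_dilation(significant, structure=np.ones((3, 3), dtype=bool))
```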

Journal ArticleDOI
TL;DR: This work shows that by using a specific sign modulator the BDWT filter bank can be realized by only two biorthogonal filters, and introduces a fast convolution algorithm for implementation of the filter modules.
Abstract: The biorthogonal discrete wavelet transform (BDWT) has gained general acceptance as an image processing tool. For example, the JPEG2000 standard is completely based on the BDWT. In BDWT, the scaling (low-pass) and wavelet (high-pass) filters are symmetric and linear phase. In this work we show that by using a specific sign modulator the BDWT filter bank can be realized with only two biorthogonal filters. The analysis and synthesis parts use the same scaling and wavelet filters, which especially simplifies VLSI designs of biorthogonal DWT/IDWT transceiver units. Utilizing the symmetry of the scaling and wavelet filters, we introduce a fast convolution algorithm for the implementation of the filter modules. In multiplexer–demultiplexer VLSI applications, both functions can be constructed via two running BDWT filters and the sign modulator.
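The symmetry-based speedup is the classical folded-FIR trick: each pair of equal taps shares one multiplication. A minimal sketch (the paper's sign-modulator structure itself is not reproduced):

```python
import numpy as np

def symmetric_fir(x, h_half):
    """Filter x with a symmetric (linear-phase) FIR filter using one
    multiply per tap pair. h_half = [h0, h1, ..., hm] is the centre tap
    followed by one half of h = [hm ... h1 h0 h1 ... hm].
    """
    m = len(h_half) - 1
    xp = np.pad(x, m, mode='reflect')       # symmetric border extension
    y = h_half[0] * xp[m : m + len(x)]
    for k in range(1, m + 1):
        # x[n-k] and x[n+k] share the single coefficient h_half[k]
        y = y + h_half[k] * (xp[m - k : m - k + len(x)] + xp[m + k : m + k + len(x)])
    return y
```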

Journal ArticleDOI
TL;DR: A graph-based model built from implicit feedback mined from the interactions of previous users of the video search system is used to provide recommendations that aid users in their search tasks, improve the quality of the results they find, and in doing so help them explore a large and difficult information space.
Abstract: In this paper, we present a novel approach to aid users in the difficult task of video search. We use a graph-based model built from implicit feedback mined from the interactions of previous users of our video search system to provide recommendations that aid users in their search tasks. This approach means that users are not burdened with providing explicit feedback, while still getting the benefits of recommendations. The goal of this approach is to improve the quality of the results that users find, and in doing so also to help users explore a large and difficult information space. In particular, we wish to make the challenging task of video search much easier for users. The results of our evaluation indicate that we achieved our goals: the performance of users in retrieving relevant videos improved, and users were able to explore the collection to a greater extent.

Journal ArticleDOI
TL;DR: The results of simulations show that the patchwork algorithm yields a perceptually acceptable texture in a shorter expected running time than the other two algorithms; however, Multipatch is the most efficient in terms of obtaining a good quality texture image.
Abstract: We propose a novel sampling-based texture synthesis algorithm called Multipatch, which improves on the results of previous sampling-based algorithms by using patches of different size, and by minimizing global pasting errors. A key feature of the proposed algorithm is that it always converges to a local minimum. Multipatch, the patchwork algorithm, and Wei and Levoy’s multi-resolution texture synthesis algorithm, which is based on a tree-structured vector quantization method, are statistically analyzed and subjectively evaluated. The results of simulations show that the patchwork algorithm yields a perceptually acceptable texture in a shorter expected running time than the other two algorithms; however, Multipatch is the most efficient in terms of obtaining a good quality texture image.

Journal ArticleDOI
Jingqun Li
TL;DR: It is shown that a band-limited signal can be reconstructed from two undersampled sequences of the signal, and a 2-channel multirate undersampling and reconstruction method for band-limited signals is proposed.
Abstract: In this paper, we show that a band-limited signal can be reconstructed from two undersampled sequences of the signal, and further propose a 2-channel multirate undersampling and reconstruction method for band-limited signals. With such an approach, the sampling rate can be reduced to about one half of the Nyquist sampling rate. This method also has the advantage that no analog filters are needed to decompose the signal into two components with narrower bandwidths. Furthermore, the parallel processing of the two sampled sequences leads to improved system performance. Finally, we consider the synchronization between the two sampled sequences, and show that the two A/D converters can be readily synchronized.
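A toy illustration under strong assumptions: if the two half-rate sequences are taken on interleaved grids offset by one full-rate period, merging them restores a Nyquist-rate sequence (the paper's 2-channel multirate scheme is more general than this special case):

```python
import numpy as np

def interleave_reconstruct(seq_a, seq_b):
    """Merge two equal-length half-rate sequences sampled on interleaved
    grids back into one full-rate sequence.
    """
    assert len(seq_a) == len(seq_b)
    out = np.empty(len(seq_a) + len(seq_b), dtype=float)
    out[0::2] = seq_a   # samples at even multiples of the full-rate period
    out[1::2] = seq_b   # samples offset by one full-rate period
    return out
```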