scispace - formally typeset
Author

Ante Jukic

Other affiliations: Apple Inc.
Bio: Ante Jukic is an academic researcher from University of Oldenburg. The author has contributed to research in topics: Linear prediction & Speech enhancement. The author has an h-index of 9 and has co-authored 24 publications receiving 397 citations. Previous affiliations of Ante Jukic include Apple Inc.

Papers
Journal ArticleDOI
TL;DR: This paper proposes a simple algorithm for Tucker factorization of a tensor with missing data and its application to low-$n$-rank tensor completion and demonstrates in several numerical experiments that the proposed algorithm performs well even when the ranks are significantly overestimated.
Abstract: The problem of tensor completion arises often in signal processing and machine learning. It consists of recovering a tensor from a subset of its entries. The usual structural assumption on a tensor that makes the problem well posed is that the tensor has low rank in every mode. Several tensor completion methods based on minimization of nuclear norm, which is the closest convex approximation of rank, have been proposed recently, with applications mostly in image inpainting problems. It is often stated in these papers that methods based on Tucker factorization perform poorly when the true ranks are unknown. In this paper, we propose a simple algorithm for Tucker factorization of a tensor with missing data and its application to low-$n$-rank tensor completion. The algorithm is similar to a previously proposed method for PARAFAC decomposition with missing data. We demonstrate in several numerical experiments that the proposed algorithm performs well even when the ranks are significantly overestimated. Approximate reconstruction can be obtained when the ranks are underestimated. The algorithm outperforms nuclear norm minimization methods when the fraction of known elements of a tensor is low.

125 citations
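
For orientation, one common way to write the problem described in the abstract above is sketched below. The notation is an assumption chosen for illustration and is not taken from the paper: $\mathcal{T}$ is the partially observed tensor, $\mathcal{W}$ a binary mask that is 1 on known entries, $\mathbf{X}_{(n)}$ the mode-$n$ unfolding, $\mathcal{G}$ the Tucker core and $\mathbf{A}^{(n)}$ the factor matrices. The paper's exact objective and solver may differ.

```latex
% Assumed notation, for illustration only (not quoted from the paper).
% n-rank: the rank of each mode-n unfolding of the tensor.
\begin{aligned}
\operatorname{rank}_n(\mathcal{X}) &= \operatorname{rank}\bigl(\mathbf{X}_{(n)}\bigr),
  \qquad n = 1,\dots,N, \\
% Tucker fit restricted to the observed entries (mask W, elementwise product).
\min_{\mathcal{G},\,\mathbf{A}^{(1)},\dots,\mathbf{A}^{(N)}} \;&
  \tfrac{1}{2}\,\bigl\lVert \mathcal{W} \odot \bigl(\mathcal{T} -
  \mathcal{G} \times_1 \mathbf{A}^{(1)} \times_2 \cdots \times_N \mathbf{A}^{(N)}\bigr)
  \bigr\rVert_F^2 .
\end{aligned}
```

The missing entries are then read off from the fitted model $\mathcal{G} \times_1 \mathbf{A}^{(1)} \cdots \times_N \mathbf{A}^{(N)}$; the size of the core $\mathcal{G}$ plays the role of the (possibly overestimated) ranks.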

Journal ArticleDOI
TL;DR: This paper proposes to model the desired speech signal using a general sparse prior that can be represented in a convex form as a maximization over scaled complex Gaussian distributions, which can be interpreted as a generalization of the commonly used time-varying Gaussian model.
Abstract: The quality of speech signals recorded in an enclosure can be severely degraded by room reverberation. In this paper, we focus on a class of blind batch methods for speech dereverberation in a noiseless scenario with a single source, which are based on multi-channel linear prediction in the short-time Fourier transform domain. Dereverberation is performed by maximum-likelihood estimation of the model parameters that are subsequently used to recover the desired speech signal. Contrary to the conventional method, we propose to model the desired speech signal using a general sparse prior that can be represented in a convex form as a maximization over scaled complex Gaussian distributions. The proposed model can be interpreted as a generalization of the commonly used time-varying Gaussian model. Furthermore, we reformulate both the conventional and the proposed method as an optimization problem with an $\ell_p$-norm cost function, emphasizing the role of sparsity in the considered speech dereverberation methods. Experimental evaluation in different acoustic scenarios shows that the proposed approach results in an improved performance compared to the conventional approach in terms of instrumental measures for speech quality.

97 citations
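
To make the role of an $\ell_p$-norm cost more concrete, the following is a minimal toy sketch of iteratively reweighted least squares (IRLS), a standard way to minimize such costs. It is not the paper's algorithm: it is real-valued, fits a single hypothetical coefficient `g`, and the function name, `eps` smoothing and iteration count are all assumptions made for illustration.

```python
def irls_lp_scalar(a, y, p=1.0, iters=50, eps=1e-8):
    """Toy IRLS: minimize sum_i |y[i] - a[i] * g|**p over a real scalar g.

    Each iteration solves a weighted least-squares problem whose weights
    (r_i**2 + eps) ** (p/2 - 1) come from the current residuals r_i.
    For p = 2 all weights equal 1 and the result is ordinary least squares.
    """
    # Ordinary least-squares initialization.
    g = sum(ai * yi for ai, yi in zip(a, y)) / sum(ai * ai for ai in a)
    for _ in range(iters):
        # Small residuals get large weights, which promotes a sparse residual.
        w = [((yi - ai * g) ** 2 + eps) ** (p / 2 - 1) for ai, yi in zip(a, y)]
        num = sum(wi * ai * yi for wi, ai, yi in zip(w, a, y))
        den = sum(wi * ai * ai for wi, ai in zip(w, a))
        g = num / den
    return g


# Hypothetical data: four consistent samples and one large outlier.
a = [1.0, 1.0, 1.0, 1.0, 1.0]
y = [1.0, 1.0, 1.0, 1.0, 100.0]
g_l2 = irls_lp_scalar(a, y, p=2.0)  # least squares, pulled toward the outlier
g_l1 = irls_lp_scalar(a, y, p=1.0)  # sparse residual, stays near the majority
```

With `p < 2` the fit tolerates a few large residuals in exchange for many near-zero ones, which is the sense in which such a cost favors sparsity.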

Journal ArticleDOI
TL;DR: Experimental results show that the proposed system is effective in suppressing both reverberation and noise while improving the speech quality, and the achieved improvements are particularly significant in conditions with high reverberation times.
Abstract: This paper presents a system aiming at joint dereverberation and noise reduction by applying a combination of a beamformer with a single-channel spectral enhancement scheme. First, a minimum variance distortionless response beamformer with an online estimated noise coherence matrix is used to suppress noise and reverberation. The output of this beamformer is then processed by a single-channel spectral enhancement scheme, based on statistical room acoustics, minimum statistics, and temporal cepstrum smoothing, to suppress residual noise and reverberation. The evaluation is conducted using the REVERB challenge corpus, designed to evaluate speech enhancement algorithms in the presence of both reverberation and noise. The proposed system is evaluated using instrumental speech quality measures, the performance of an automatic speech recognition system, and a subjective evaluation of the speech quality based on a MUSHRA test. The performance of beamforming, single-channel spectral enhancement, and their combination is compared, and experimental results show that the proposed system is effective in suppressing both reverberation and noise while improving the speech quality. The achieved improvements are particularly significant in conditions with high reverberation times.

60 citations
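
The minimum variance distortionless response (MVDR) beamformer mentioned in the abstract above has a well-known closed form, w = R⁻¹d / (dᴴR⁻¹d), applied per frequency bin. The sketch below computes it in plain Python for a small array. It is only an illustration under stated assumptions: `R` stands for a noise covariance (or coherence) matrix and `d` for a steering vector, both hypothetical example values, and the paper's online estimation of the matrix is not reproduced.

```python
def solve(A, b):
    """Solve A x = b for a small complex system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    # Augmented matrix [A | b] with complex entries.
    M = [[complex(v) for v in row] + [complex(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * yv for x, yv in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    d = [complex(v) for v in d]
    Rinv_d = solve(R, d)
    denom = sum(di.conjugate() * xi for di, xi in zip(d, Rinv_d))
    return [x / denom for x in Rinv_d]


def apply_beamformer(w, y):
    """Beamformer output w^H y for one multichannel observation y."""
    return sum(wi.conjugate() * complex(yi) for wi, yi in zip(w, y))


# Hypothetical two-microphone example: Hermitian positive-definite R, steering vector d.
R = [[2.0, 0.5j], [-0.5j, 1.0]]
d = [1.0, 1j]
w = mvdr_weights(R, d)
response = apply_beamformer(w, d)  # distortionless constraint: w^H d = 1
```

The constraint wᴴd = 1 is what keeps the target direction undistorted while the output noise power wᴴRw is minimized.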

Journal ArticleDOI
TL;DR: An adaptive speech dereverberation method based on constrained sparse multichannel linear prediction (MCLP), minimizing the mixed $\ell _{2,p}$ norm of the desired component and using a statistical model for late reverberation to limit the power of the MCLP-based estimate.
Abstract: In this letter, we present an adaptive speech dereverberation method based on constrained sparse multichannel linear prediction (MCLP), minimizing the mixed $\ell _{2,p}$ norm of the desired component. In order to prevent overestimation of the undesired reverberant component, possibly leading to severe distortions of the output, we propose to use a statistical model for late reverberation to limit the power of the MCLP-based estimate. The resulting constrained optimization problem is solved by using the alternating direction method of multipliers, resulting in two variants of the dereverberation algorithm. Simulation results show that the proposed constraint increases the robustness with respect to parameter selection and improves the usability for dynamic scenarios in comparison to the unconstrained method.

33 citations

Proceedings ArticleDOI
04 May 2014
TL;DR: Experimental results, obtained using measured impulse responses, indicate that the proposed approach could be used to improve the dereverberation performance compared to the classical technique.
Abstract: Reverberation has a considerable impact on the quality and intelligibility of captured speech signals. In this paper we present an approach for blind multi-microphone speech dereverberation based on the weighted prediction error method, where the reverberant observations are modeled using multi-channel linear prediction in the short-time Fourier transform domain. Instead of using the commonly employed Gaussian distribution for the desired speech signal, the proposed approach uses a Laplacian distribution which is known to be more accurate in modeling speech signals. Maximum-likelihood estimation is used for estimating the model parameters, leading to a linear programming optimization problem. Experimental results, obtained using measured impulse responses, indicate that the proposed approach could be used to improve the dereverberation performance compared to the classical technique.

31 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented, along with a discussion of combining models, in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Book
21 Feb 1970

986 citations

Journal ArticleDOI
TL;DR: In this paper, the authors survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics and highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes.
Abstract: Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics. ML is a broad family of techniques, which are often based in statistics, for automatically detecting and utilizing patterns in data. Relative to conventional acoustics and signal processing, ML is data-driven. Given sufficient training data, ML can discover complex relationships between features and desired labels or actions, or between features themselves. With large volumes of training data, ML can discover models describing complex acoustic phenomena such as human speech and reverberation. ML in acoustics is rapidly developing with compelling results and significant future promise. We first introduce ML, then highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes.

162 citations

Journal ArticleDOI
TL;DR: It is shown how constrained multiblock tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data.
Abstract: With the increasing availability of various sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that need to be jointly analyzed to explore their latent connections. Various component analysis methods have played an increasingly important role for the analysis of such coupled data. In this paper, we first provide a brief review of existing matrix-based (two-way) component analysis methods for the joint analysis of such data with a focus on biomedical applications. Then, we discuss their important extensions and generalization to multi-block multiway (tensor) data. We show how constrained multi-block tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data. Special emphasis is given to the flexible common and individual feature analysis of multi-block data with the aim to simultaneously extract common and individual latent components with desired properties and types of diversity. Illustrative examples are given to demonstrate their effectiveness for biomedical data analysis.

153 citations