Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

Pattern Recognition and Machine Learning

To say that this is the best book on the quantum theory of fields is no praise, since to my knowledge it is the only book on this subject. But it is a very good and most useful book. The original was written in German and appeared in 1942. This is a translation with some minor changes. A few remarks have been added, concerning meson theory and nuclear forces, also footnotes referring to modern work in this field, and finally an appendix on the symmetrization of the energy-momentum tensor according to Belinfante. Quantum Theory of Fields. Prof. Gregor Wentzel. Translated from the German by Charlotte Houtermans and J. M. Jauch. Pp. ix + 224. (New York and London: Interscience Publishers, Inc., 1949.) 36s.

Quantum Theory of Fields

https://bura.brunel.ac.uk/bitstream/2438/14221/1/FullText.pdf

A survey of deep neural network architectures and their applications

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but also because of the large storage requirements. To this end, a variety of model compression and acceleration techniques have been developed. As a representative type of model compression and acceleration, knowledge distillation effectively learns a small student model from a large teacher model. It has received rapidly increasing attention from the community. This paper provides a comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher-student architecture, distillation algorithms, performance comparison and applications. Furthermore, challenges in knowledge distillation are briefly reviewed, and comments on future research are discussed.

Knowledge Distillation: A Survey
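As a rough illustration of what "learning a small student model from a large teacher model" can mean in practice, the sketch below computes one common response-based distillation loss: the KL divergence between temperature-softened teacher and student output distributions. This particular loss, the function names, and the default temperature are assumptions chosen for illustration; the survey covers many other knowledge categories and training schemes.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is a common convention (an assumption here) that keeps
    gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl
```

In a typical setup this term would be combined with the ordinary supervised loss on the ground-truth labels; the weighting between the two is a tuning choice.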

Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. We present an improved minima controlled recursive averaging (IMCRA) approach for noise estimation in adverse environments involving nonstationary noise, weak speech components, and low input signal-to-noise ratio (SNR). The noise estimate is obtained by averaging past spectral power values, using a time-varying frequency-dependent smoothing parameter that is adjusted by the signal presence probability. The speech presence probability is controlled by the minima values of a smoothed periodogram. The proposed procedure comprises two iterations of smoothing and minimum tracking. The first iteration provides a rough voice activity detection in each frequency band. Then, smoothing in the second iteration excludes relatively strong speech components, which makes the minimum tracking during speech activity robust. We show that in nonstationary noise environments and under low SNR conditions, the IMCRA approach is very effective. In particular, compared to a competitive method, it obtains a lower estimation error, and when integrated into a speech enhancement system, it achieves improved speech quality and lower residual noise.


Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging
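The recursive-averaging step described in the abstract above can be sketched as follows: each frequency bin's noise estimate is smoothed with a parameter that moves toward 1 as the speech presence probability rises, so the estimate is effectively frozen while speech is active. This is a minimal sketch only. The function name and the base smoothing value `alpha_d=0.85` are illustrative assumptions, and the two iterations of smoothing and minimum tracking that produce the speech presence probability are not shown.

```python
def update_noise_estimate(noise_psd, noisy_power, speech_prob, alpha_d=0.85):
    """One frame of probability-controlled recursive averaging.

    noise_psd   : current noise power estimate per frequency bin
    noisy_power : |Y|^2 of the current frame per frequency bin
    speech_prob : speech presence probability per bin, in [0, 1]
    alpha_d     : base smoothing factor (illustrative value, an assumption)
    """
    updated = []
    for lam, y2, p in zip(noise_psd, noisy_power, speech_prob):
        # Time-varying, frequency-dependent smoothing parameter:
        # equals alpha_d when speech is absent, 1 when speech is certain.
        alpha_tilde = alpha_d + (1.0 - alpha_d) * p
        updated.append(alpha_tilde * lam + (1.0 - alpha_tilde) * y2)
    return updated
```

With `speech_prob` equal to 1 in a bin, the previous estimate is carried over unchanged; with 0, the update reduces to ordinary first-order recursive averaging.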

In this letter, we develop a robust voice activity detector (VAD) for application to variable-rate speech coding. The developed VAD employs the decision-directed parameter estimation method for the likelihood ratio test. In addition, we propose an effective hang-over scheme which considers the previous observations by a first-order Markov process modeling of speech occurrences. According to our simulation results, the proposed VAD performs significantly better than the G.729B VAD in low signal-to-noise ratio (SNR) and vehicular noise environments.

A statistical model-based voice activity detection
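A likelihood ratio test with decision-directed parameter estimation, as named in the abstract above, might look roughly like the sketch below. It assumes complex Gaussian models for the speech and noise DFT coefficients, which the abstract itself does not state. The function names, the smoothing weight `alpha`, and the decision `threshold` are hypothetical, and the Markov-based hang-over scheme is not included.

```python
import math

def decision_directed_snr(prev_clean_power, noise_psd, post_snr, alpha=0.98):
    """A priori SNR per bin, estimated in decision-directed fashion.

    Blends the previous frame's clean-speech power estimate (normalized by
    the noise power) with the current instantaneous estimate max(gamma - 1, 0).
    """
    return [alpha * a2 / lam + (1.0 - alpha) * max(g - 1.0, 0.0)
            for a2, lam, g in zip(prev_clean_power, noise_psd, post_snr)]

def mean_log_likelihood_ratio(post_snr, prior_snr):
    """Average per-bin log likelihood ratio (speech present vs. absent).

    Under the Gaussian assumption, each bin contributes
    gamma * xi / (1 + xi) - log(1 + xi).
    """
    terms = [g * xi / (1.0 + xi) - math.log(1.0 + xi)
             for g, xi in zip(post_snr, prior_snr)]
    return sum(terms) / len(terms)

def is_speech(post_snr, prior_snr, threshold=0.5):
    """Declare speech when the averaged log likelihood ratio exceeds a threshold."""
    return mean_log_likelihood_ratio(post_snr, prior_snr) > threshold
```

Here `post_snr` is the a posteriori SNR (noisy power divided by noise power) per bin; when the a priori SNR is zero in every bin the ratio is exactly zero, so the frame is classified as noise for any positive threshold.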

One of the key issues in practical speech processing is to achieve robust voice activity detection (VAD) against the background noise. Most of the statistical model-based approaches have tried to employ the Gaussian assumption in the discrete Fourier transform (DFT) domain, which, however, deviates from the real observation. In this paper, we propose a class of VAD algorithms based on several statistical models. In addition to the Gaussian model, we also incorporate the complex Laplacian and Gamma probability density functions into our analysis of statistical properties. With goodness-of-fit tests, we analyze the statistical properties of the DFT spectra of the noisy speech under various noise conditions. Based on the statistical analysis, the likelihood ratio test under the given statistical models is established for the purpose of VAD. Since the statistical characteristics of the speech signal are affected differently by the noise types and levels, our approach aims to cope with time-varying environments by adaptively finding an appropriate statistical model in an online fashion. The performance of the proposed VAD approaches in both the stationary and nonstationary noise environments is evaluated with the aid of an objective measure.

Voice activity detection based on multiple statistical models

In this letter, we propose a novel speech enhancement technique based on global soft decision. The proposed approach provides a unified framework for such procedures as speech absence probability (SAP) computation, spectral gain modification, and noise spectrum estimation using the same statistical model assumption. The performance of the proposed enhancement algorithm is evaluated by subjective tests under various environments and shows better results compared with the IS-127 standard enhancement method.

Spectral enhancement based on global soft decision

Speech recognition in noisy environments using first-order vector Taylor series

In this letter, we propose a new statistical model, two-sided generalized gamma distribution (GΓD) for an efficient parametric characterization of speech spectra. GΓD forms a generalized class of parametric distributions, including the Gaussian, Laplacian, and Gamma probability density functions (pdfs) as special cases. We also propose a computationally inexpensive online maximum likelihood (ML) parameter estimation algorithm for GΓD. Likelihoods, coefficients of variation (CVs), and Kolmogorov-Smirnov (KS) tests show that GΓD can model the distribution of the real speech signal more accurately than the conventional Gaussian, Laplacian, Gamma, or generalized Gaussian distribution (GGD).
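To make the "special cases" claim concrete, the sketch below evaluates a two-sided generalized gamma density in one common parameterization, f(x) = γ β^η / (2 Γ(η)) · |x|^(ηγ−1) · exp(−β |x|^γ). The parameter names and this exact form are assumptions made for illustration and may differ from the notation used in the letter. Under this form, (γ=2, η=0.5) gives a Gaussian, (γ=1, η=1) a Laplacian, and (γ=1, η=0.5) a two-sided Gamma density.

```python
import math

def two_sided_gen_gamma_pdf(x, gamma, eta, beta):
    """Two-sided generalized gamma density (assumed parameterization).

    gamma, eta : shape parameters (> 0)
    beta       : scale-related parameter (> 0)
    Note: when eta * gamma < 1 the density is unbounded at x = 0, and this
    function raises ZeroDivisionError there.
    """
    coef = gamma * beta ** eta / (2.0 * math.gamma(eta))
    ax = abs(x)
    return coef * ax ** (eta * gamma - 1.0) * math.exp(-beta * ax ** gamma)

def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density, used only to check the special case."""
    return math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
```

For example, with `gamma=2`, `eta=0.5`, and `beta=1/(2*sigma**2)`, `two_sided_gen_gamma_pdf` agrees with `gaussian_pdf` for the same `sigma`.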

Nam Soo Kim

Papers

A statistical model-based voice activity detection

Voice activity detection based on multiple statistical models

Spectral enhancement based on global soft decision

Speech recognition in noisy environments using first-order vector Taylor series

Statistical modeling of speech signals based on generalized gamma distribution