Author

Hongxue Wang

Bio: Hongxue Wang is an academic researcher from Shanghai University. The author has contributed to research in topics: Audio signal processing & Speech coding. The author has an h-index of 1 and has co-authored 1 publication receiving 8 citations.

Papers
Proceedings ArticleDOI
16 Jul 2012
TL;DR: An improved version of the audio fingerprint extraction algorithm originally proposed by Shazam is presented; it uses a combinatorial hashed time-frequency analysis of the audio, yielding unusual properties in which multiple tracks mixed together may each be identified.
Abstract: Audio fingerprinting, like human fingerprinting, successfully identifies audio clips from a large number of databases, even when the audio signals are slightly or seriously distorted. In this paper, based on 2-D, we propose an improved version of the audio fingerprint extraction algorithm originally proposed by Shazam. The algorithm uses a combinatorial hashed time-frequency analysis of the audio, yielding unusual properties in which multiple tracks mixed together may each be identified. The experimental results verify the improvement in retrieval speed and accuracy.

8 citations
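
The "combinatorial hashed time-frequency analysis" mentioned in the abstract can be illustrated with a minimal sketch. It assumes spectrogram peaks have already been extracted as (time frame, frequency bin) pairs; the function name `pair_hashes`, the parameters `fan_out` and `max_dt`, and the 10-bit packing are hypothetical choices for illustration, not details taken from the paper or from Shazam's implementation.

```python
from typing import List, Tuple

Peak = Tuple[int, int]  # (time frame index, frequency bin), assumed precomputed


def pair_hashes(peaks: List[Peak], fan_out: int = 3, max_dt: int = 64) -> List[Tuple[int, int]]:
    """Pair each anchor peak with up to `fan_out` later peaks and hash each pair.

    Returns (hash, anchor_time) tuples. The hash depends only on the two
    frequency bins and the time difference, so it does not change when the
    whole clip is shifted in time.
    """
    ordered = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(ordered):
        paired = 0
        for t2, f2 in ordered[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are even further away
            if dt == 0:
                continue  # skip peaks in the same frame as the anchor
            # Hypothetical packing: 10 bits each for f1, f2 and dt.
            h = ((f1 & 0x3FF) << 20) | ((f2 & 0x3FF) << 10) | (dt & 0x3FF)
            hashes.append((h, t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


if __name__ == "__main__":
    clip = [(0, 10), (1, 20), (2, 30)]
    for h, t in pair_hashes(clip, fan_out=2):
        print(f"anchor_time={t} hash={h:#x}")
```

Because each hash encodes a relative time difference rather than an absolute position, a query excerpt taken from the middle of a track produces the same hash values as the corresponding region of the reference track; only the anchor times differ by a constant offset.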


Cited by
Journal ArticleDOI
TL;DR: An inclusive survey of key indoor technologies and techniques is carried out with a view to exploring their various benefits, limitations, and areas for improvement; it advocates hybridization of technologies as an effective approach to achieving reliable IoT-based indoor systems.

88 citations

Journal ArticleDOI
09 Jan 2018-Sensors
TL;DR: This paper reviews audio fingerprinting techniques that can be used with acoustic data acquired from mobile devices, covering works published between 2002 and 2017.
Abstract: An increase in the accuracy of identification of Activities of Daily Living (ADL) is very important for different goals of Enhanced Living Environments and for Ambient Assisted Living (AAL) tasks. This increase may be achieved through identification of the surrounding environment. Although this is usually used to identify the location, ADL recognition can be improved with the identification of the sound in that particular environment. This paper reviews audio fingerprinting techniques that can be used with the acoustic data acquired from mobile devices. A comprehensive literature search was conducted in order to identify relevant English language works aimed at the identification of the environment of ADLs using data acquired with mobile devices, published between 2002 and 2017. In total, 40 studies were analyzed and selected from 115 citations. The results highlight several audio fingerprinting techniques, including Modified discrete cosine transform (MDCT), Mel-frequency cepstrum coefficients (MFCC), Principal Component Analysis (PCA), Fast Fourier Transform (FFT), Gaussian mixture models (GMM), likelihood estimation, logarithmic moduled complex lapped transform (LMCLT), support vector machine (SVM), constant Q transform (CQT), symmetric pairwise boosting (SPB), Philips robust hash (PRH), linear discriminant analysis (LDA) and discrete cosine transform (DCT).

26 citations

Journal ArticleDOI
TL;DR: Combined with linear prediction-minimum mean squared error (LP-MMSE), an efficient perceptual hashing algorithm based on improved spectral entropy is proposed for speech authentication; experimental results show that the proposed algorithm is better than other existing methods in compactness.
Abstract: Combined with linear prediction-minimum mean squared error (LP-MMSE), an efficient perceptual hashing algorithm based on improved spectral entropy is proposed for speech authentication. Linear prediction analysis is performed on the speech signal after preprocessing, framing, and windowing to obtain the minimum mean squared error coefficient matrix. The spectral entropy parameter matrix of each frame is then calculated using the improved spectral entropy method. The final binary perceptual hash sequence is generated from these two matrices, and speech authentication is completed. Comparing the experimental results of combining the Teager energy operator (TEO) with the linear predictive coefficients (LPC), LP-MMSE, and line spectrum pair (LSP) coefficients respectively shows that the proposed algorithm achieves a good compromise between robustness, discrimination, and authentication efficiency, and can meet the requirements of real-time speech authentication in speech communication. Experimental results show that the proposed algorithm is better than other existing methods in compactness.

16 citations
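
As a rough illustration of how a per-frame spectral entropy can be turned into a binary hash sequence, the sketch below uses plain Shannon spectral entropy and a simple "did the entropy rise from one frame to the next" bit rule. Both are assumptions made for illustration: the abstract's improved spectral entropy and its LP-MMSE coefficient matrix are not specified here and are not reproduced. The names `spectral_entropy`, `entropy_hash`, and `bit_error_rate` are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def spectral_entropy(frame: Sequence[float]) -> float:
    """Shannon entropy (in bits) of the normalized power spectrum of one frame."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):  # naive DFT over non-negative frequencies
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        power.append(abs(s) ** 2)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: define entropy as zero
    return -sum((p / total) * math.log2(p / total) for p in power if p > 0.0)


def entropy_hash(signal: Sequence[float], frame_len: int = 64) -> List[int]:
    """Binary sequence: bit is 1 when entropy rises from one frame to the next."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    ent = [spectral_entropy(f) for f in frames]
    return [1 if later > earlier else 0 for earlier, later in zip(ent, ent[1:])]


def bit_error_rate(h1: Sequence[int], h2: Sequence[int]) -> float:
    """Fraction of differing bits between two equal-length hash sequences."""
    if len(h1) != len(h2):
        raise ValueError("hash sequences must have the same length")
    if not h1:
        return 0.0
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
    click = [1.0] + [0.0] * 63
    print(entropy_hash(tone + click + tone))
```

In an authentication setting, two hash sequences would typically be compared with a distance such as `bit_error_rate` against a threshold; the threshold itself would have to be chosen experimentally.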

Patent
12 Jan 2018
TL;DR: An audio matching method, device, and electronic equipment are described, in which the audio matching results of the audio segments are merged to obtain a matching result for the to-be-matched audio data.
Abstract: The invention discloses an audio matching method and device and electronic equipment. The method includes the steps that firstly, to-be-matched audio data is acquired; secondly, the to-be-matched audio data is segmented to obtain multiple to-be-matched audio segments after segmentation; thirdly, audio fingerprint features of each to-be-matched audio segment are extracted, according to the extracted audio fingerprint features, audio matching is conducted on each to-be-matched audio segment by using a pre-built audio matching library, and audio matching results of the to-be-matched audio segments are obtained; fourthly, the audio matching results of the to-be-matched audio segments are merged to obtain a matching result of the to-be-matched audio data. By means of the method, the audio retrieval efficiency can be improved.

11 citations
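
The four steps in the abstract (acquire, segment, match each segment, merge) can be sketched as follows. The abstract does not say how fingerprints are extracted or how segment results are merged, so this sketch takes the per-segment matcher as a caller-supplied function and merges by majority vote; both choices, and the names `segment` and `match_audio`, are assumptions for illustration only.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence


def segment(samples: Sequence[float], segment_len: int) -> List[Sequence[float]]:
    """Split the audio into consecutive segments of at most `segment_len` samples."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [samples[i:i + segment_len] for i in range(0, len(samples), segment_len)]


def match_audio(
    samples: Sequence[float],
    segment_len: int,
    match_segment: Callable[[Sequence[float]], Optional[str]],
) -> Optional[str]:
    """Match each segment independently, then merge the results by majority vote.

    `match_segment` stands in for fingerprint extraction plus lookup in a
    pre-built matching library; it returns a track label, or None if no match.
    """
    results = [match_segment(seg) for seg in segment(samples, segment_len)]
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy library keyed on the raw segment contents (not a real fingerprint).
    library = {(1, 1): "track_a", (2, 2): "track_b"}
    print(match_audio([1, 1, 2, 2, 1, 1], 2, lambda seg: library.get(tuple(seg))))
```

Since the segments are matched independently of one another, the per-segment lookups could be run in parallel, which is one plausible route to the retrieval-efficiency gain the abstract mentions.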

Proceedings ArticleDOI
24 Oct 2013
TL;DR: A fast algorithm is proposed for content-based audio retrieval using the fingerprint technique, based on extracting frequency features of the audio and a hash function; it has a high success rate and a lower response time than other techniques.
Abstract: Fingerprinting is one of the most widely used techniques for searching for and identifying audio, with a wide spectrum of applications. Different algorithms define different fingerprint extraction and matching techniques, with different efficiency, computational load, robustness, response time, and location search. Music audio retrieval currently faces two main challenges in order to be efficient: robustness and speed. This article proposes a fast algorithm for content-based audio retrieval using the fingerprint technique, based on extracting frequency features of the audio and a hash function. Experiments showed a high success rate and a lower response time than other techniques, making it well suited to real-time applications such as monitoring radio stations or identifying songs.

5 citations