
Showing papers on "Word error rate" published in 2010


Proceedings Article
01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around a 50% reduction of perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
Abstract: A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around a 50% reduction of perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model. Speech recognition experiments show around an 18% reduction of word error rate on the Wall Street Journal task when comparing models trained on the same amount of data, and around 5% on the much harder NIST RT05 task, even when the backoff model is trained on much more data than the RNN LM. We provide ample empirical evidence to suggest that connectionist language models are superior to standard n-gram techniques, except for their high computational (training) complexity. Index Terms: language modeling, recurrent neural networks, speech recognition
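
To make the model concrete, here is a minimal numpy sketch of an Elman-style recurrent LM of the kind the abstract describes, with a toy perplexity loop; the class name, dimensions, and initialization are illustrative choices, not Mikolov's implementation:

```python
import numpy as np

class TinyRNNLM:
    """Minimal Elman-style recurrent LM (illustrative sketch only)."""
    def __init__(self, vocab_size, hidden=100, seed=0):
        rng = np.random.default_rng(seed)
        self.U = rng.normal(0, 0.1, (hidden, vocab_size))  # input word -> hidden
        self.W = rng.normal(0, 0.1, (hidden, hidden))      # hidden -> hidden (recurrence)
        self.V = rng.normal(0, 0.1, (vocab_size, hidden))  # hidden -> output logits
        self.h = np.zeros(hidden)                          # recurrent state s(t-1)

    def step(self, word_id):
        # s(t) = sigmoid(U w(t) + W s(t-1));  y(t) = softmax(V s(t))
        self.h = 1.0 / (1.0 + np.exp(-(self.U[:, word_id] + self.W @ self.h)))
        logits = self.V @ self.h
        e = np.exp(logits - logits.max())
        return e / e.sum()

def perplexity(model, word_ids):
    # PPL = exp(-(1/N) * sum_t log P(w_t | history)), scored over bigram steps
    logp = sum(np.log(model.step(prev)[cur])
               for prev, cur in zip(word_ids[:-1], word_ids[1:]))
    return float(np.exp(-logp / (len(word_ids) - 1)))
```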

5,751 citations


Journal ArticleDOI
TL;DR: This work presents a simple and efficient preprocessing chain that eliminates most of the effects of changing illumination while still preserving the essential appearance details that are needed for recognition, and improves robustness by adding Kernel principal component analysis (PCA) feature extraction and incorporating rich local appearance cues from two complementary sources.
Abstract: Making recognition more reliable under uncontrolled lighting conditions is one of the most important challenges for practical face recognition systems. We tackle this by combining the strengths of robust illumination normalization, local texture-based face representations, distance transform based matching, kernel-based feature extraction and multiple feature fusion. Specifically, we make three main contributions: 1) we present a simple and efficient preprocessing chain that eliminates most of the effects of changing illumination while still preserving the essential appearance details that are needed for recognition; 2) we introduce local ternary patterns (LTP), a generalization of the local binary pattern (LBP) local texture descriptor that is more discriminant and less sensitive to noise in uniform regions, and we show that replacing comparisons based on local spatial histograms with a distance transform based similarity metric further improves the performance of LBP/LTP based face recognition; and 3) we further improve robustness by adding Kernel principal component analysis (PCA) feature extraction and incorporating rich local appearance cues from two complementary sources-Gabor wavelets and LBP-showing that the combination is considerably more accurate than either feature set alone. The resulting method provides state-of-the-art performance on three data sets that are widely used for testing recognition under difficult illumination conditions: Extended Yale-B, CAS-PEAL-R1, and Face Recognition Grand Challenge version 2 experiment 4 (FRGC-204). For example, on the challenging FRGC-204 data set it halves the error rate relative to previously published methods, achieving a face verification rate of 88.1% at 0.1% false accept rate. Further experiments show that our preprocessing method outperforms several existing preprocessors for a range of feature sets, data sets and lighting conditions.
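
The local ternary pattern descriptor introduced as contribution 2) is simple enough to sketch. Below is a rough numpy rendition of the Tan-Triggs idea: each neighbour is coded +1/0/-1 against the centre pixel with a dead zone of width 2t, and the ternary code is split into two binary LBP-like codes; the threshold value is an illustrative default, not the paper's tuned setting.

```python
import numpy as np

def ltp_codes(img, t=5):
    """LTP over the 8-neighbourhood of each interior pixel of a grayscale image.
    Neighbours more than t above the centre code as +1, more than t below as -1,
    else 0; the ternary code is split into two binary (LBP-like) half-codes."""
    img = img.astype(np.int32)
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]   # 8 neighbours, clockwise
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    pos = np.zeros((h - 2, w - 2), np.uint8)    # the '+1' half-code
    neg = np.zeros((h - 2, w - 2), np.uint8)    # the '-1' half-code
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        pos |= (nb >= centre + t).astype(np.uint8) << bit
        neg |= (nb <= centre - t).astype(np.uint8) << bit
    return pos, neg
```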

2,981 citations


Proceedings Article
06 Dec 2010
TL;DR: This work uses the mean-covariance restricted Boltzmann machine (mcRBM) to learn features of speech data that serve as input into a standard DBN, and achieves a phone error rate superior to all published results on speaker-independent TIMIT to date.
Abstract: Straightforward application of Deep Belief Nets (DBNs) to acoustic modeling produces a rich distributed representation of speech data that is useful for recognition and yields impressive results on the speaker-independent TIMIT phone recognition task. However, the first-layer Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM) has an important limitation, shared with mixtures of diagonal-covariance Gaussians: GRBMs treat different components of the acoustic input vector as conditionally independent given the hidden state. The mean-covariance restricted Boltzmann machine (mcRBM), first introduced for modeling natural images, is a much more representationally efficient and powerful way of modeling the covariance structure of speech data. Every configuration of the precision units of the mcRBM specifies a different precision matrix for the conditional distribution over the acoustic space. In this work, we use the mcRBM to learn features of speech data that serve as input into a standard DBN. The mcRBM features combined with DBNs allow us to achieve a phone error rate of 20.5%, which is superior to all published results on speaker-independent TIMIT to date.

326 citations


Journal ArticleDOI
TL;DR: A system that can separate and recognize the simultaneous speech of two people recorded in a single channel is presented, and it is shown how belief propagation reduces the complexity of temporal inference from exponential to linear in the number of sources and the size of the language model.

187 citations


Journal ArticleDOI
TL;DR: It is proposed that doubly confusable pairs, rather than high neighborhood density, may better explain phonetic neighborhood errors in human speech processing.

164 citations


Patent
11 Oct 2010
TL;DR: In this patent, a method is described for identifying possible errors made by a speech recognition system without using a transcript of the words input to the system. This method does not, however, consider the use of a word-to-word model.
Abstract: Methods are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. A method for model adaptation for a speech recognition system includes determining an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The method may further include adjusting an adaptation, of the model for the word or various models for the various words, based on the error rate. Apparatus are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. An apparatus for model adaptation for a speech recognition system includes a processor adapted to estimate an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The apparatus may further include a controller adapted to adjust an adaptation of the model for the word or various models for the various words, based on the error rate.

164 citations


Journal ArticleDOI
TL;DR: The new approach of phonetic feature bundling for modeling coarticulation in EMG-based speech recognition is described, and results are reported on the EMG-PIT corpus, a recently collected multiple-speaker large-vocabulary database of silent and audible EMG speech recordings.

161 citations


Proceedings Article
06 Dec 2010
TL;DR: A theorem is stated showing that a certain perceptron-like learning rule, involving feature vectors derived from loss-adjusted inference, directly corresponds to the gradient of task loss.
Abstract: In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss. In binary classification one typically tries to minimize the error rate. But in structured prediction each task often has its own measure of performance such as the BLEU score in machine translation or the intersection-over-union score in PASCAL segmentation. The most common approaches to structured prediction, structural SVMs and CRFs, do not minimize the task loss: the former minimizes a surrogate loss with no guarantees for task loss and the latter minimizes log loss independent of task loss. The main contribution of this paper is a theorem stating that a certain perceptron-like learning rule, involving feature vectors derived from loss-adjusted inference, directly corresponds to the gradient of task loss. We give empirical results on phonetic alignment of a standard test set from the TIMIT corpus, which surpass all previously reported results on this problem.
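
The learning rule in question is easy to state as code. The sketch below shows a "toward-worse" loss-adjusted perceptron update over an explicit candidate set; the function names and the finite candidate enumeration are simplifications for illustration (real structured prediction uses efficient loss-adjusted inference instead of enumeration):

```python
import numpy as np

def direct_loss_update(w, phi, loss, candidates, x, y_star, eta=0.1, eps=0.5):
    """'Toward-worse' perceptron-like update from loss-adjusted inference.

    y_hat : ordinary inference      argmax_y  w . phi(x, y)
    y_adj : loss-adjusted inference argmax_y  w . phi(x, y) + eps * loss(y, y_star)
    update: w += (eta / eps) * (phi(x, y_hat) - phi(x, y_adj))
    As eps -> 0, the expected update follows the negative gradient of task loss.
    """
    candidates = list(candidates)
    score = lambda y: float(w @ phi(x, y))
    y_hat = max(candidates, key=score)
    y_adj = max(candidates, key=lambda y: score(y) + eps * loss(y, y_star))
    return w + (eta / eps) * (phi(x, y_hat) - phi(x, y_adj))
```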

144 citations


Proceedings Article
11 Jul 2010
TL;DR: This paper presents a method that shares similarities with both spell checking and machine translation approaches on normalizing SMS messages, and is entirely based on models trained from a corpus.
Abstract: In recent years, research in natural language processing has increasingly focused on normalizing SMS messages. Different well-defined approaches have been proposed, but the problem remains far from being solved: the best systems achieve an 11% Word Error Rate. This paper presents a method that shares similarities with both spell checking and machine translation approaches. The normalization part of the system is entirely based on models trained from a corpus. Evaluated in French by 10-fold cross-validation, the system achieves a 9.3% Word Error Rate and a 0.83 BLEU score.
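
Since Word Error Rate is the headline metric here (and throughout this page), a reference implementation is short enough to include; this is the standard Levenshtein-over-words definition, not code from the paper:

```python
def word_error_rate(ref, hyp):
    """WER = (substitutions + deletions + insertions) / len(ref),
    computed by Levenshtein alignment over words."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                      # delete all remaining reference words
    for j in range(len(h) + 1):
        d[0][j] = j                      # insert all remaining hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

# e.g. word_error_rate("je vais bien", "je bien") == 1/3 (one deletion)
```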

135 citations


01 Jan 2010
TL;DR: This paper proposes a new architecture for text-independent speaker verification systems that can be satisfactorily trained with a limited amount of application-specific data, supplemented with a sufficient amount of training data from some other context.
Abstract: It is widely believed that speaker verification systems perform better when there is sufficient background training data to deal with nuisance effects of transmission channels. It is also known that these systems perform at their best when the sound environment of the training data is similar to that of the context of use (test context). For some applications, however, training data from the same type of sound environment is scarce, whereas a considerable amount of data from a different type of environment is available. In this paper, we propose a new architecture for text-independent speaker verification systems that can be satisfactorily trained with a limited amount of application-specific data, supplemented with a sufficient amount of training data from some other context. This architecture is based on the extraction of parameters (i-vectors) from a low-dimensional space (total variability space) proposed by Dehak [1]. Our aim is to extend Dehak's work to speaker recognition on sparse data, namely microphone speech. The main challenge is to overcome the fact that insufficient application-specific data is available to accurately estimate the total variability covariance matrix. We propose a method based on Joint Factor Analysis (JFA) to estimate microphone eigenchannels (sparse data) with telephone eigenchannels (sufficient data). For classification, we experimented with two approaches: Support Vector Machines (SVM) and a Cosine Distance Scoring (CDS) classifier based on cosine distances. We present recognition results for the female part of the interview data of the NIST 2008 SRE. The best performance is obtained when our system is fused with the state-of-the-art JFA. We achieve a 13% relative improvement in equal error rate, and the minimum value of the detection cost function decreases from 0.0219 to 0.0164.
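
Of the two classifiers, Cosine Distance Scoring is the simpler to illustrate. The sketch below scores a trial as the cosine similarity between enrolment and test i-vectors; in practice the accept threshold would be tuned on development data:

```python
import numpy as np

def cds_score(w_enrol, w_test):
    """Cosine Distance Scoring: the trial score is the cosine of the angle
    between the enrolment and test i-vectors; a trial is accepted when the
    score exceeds a threshold tuned on development data."""
    return float(w_enrol @ w_test /
                 (np.linalg.norm(w_enrol) * np.linalg.norm(w_test)))
```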

110 citations


Proceedings ArticleDOI
03 Aug 2010
TL;DR: This work proposes and validates a novel physics-based method to detect images recaptured from printed material using only a single image, and shows that the classifier generalizes to contrast-enhanced recaptured images and LCD screen recaptured images without re-training, demonstrating the robustness of the approach.
Abstract: Face recognition is an increasingly popular method for user authentication. However, face recognition is susceptible to playback attacks. Therefore, a reliable way to detect malicious attacks is crucial to the robustness of the system. We propose and validate a novel physics-based method to detect images recaptured from printed material using only a single image. Micro-textures present in printed paper manifest themselves in the specular component of the image. Features extracted from this component allow a linear SVM classifier to achieve a 2.2% False Acceptance Rate and 13% False Rejection Rate (6.7% Equal Error Rate). We also show that the classifier generalizes to contrast-enhanced recaptured images and LCD screen recaptured images without re-training, demonstrating the robustness of our approach.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This paper presents a description of the MIT Lincoln Laboratory language recognition system submitted to the NIST 2009 Language Recognition Evaluation (LRE), which consists of a fusion of three core recognizers, two based on spectral similarity and one based on tokenization.
Abstract: This paper presents a description of the MIT Lincoln Laboratory language recognition system submitted to the NIST 2009 Language Recognition Evaluation (LRE). This system consists of a fusion of three core recognizers, two based on spectral similarity and one based on tokenization. The 2009 LRE differed from previous ones in that the test data included narrowband segments from worldwide Voice of America broadcasts as well as conventional recorded conversational telephone speech. Results are presented for the 23-language closed-set and open-set detection tasks at the 30, 10, and 3 second durations, along with a discussion of the language-pair task. On the 30 second 23-language closed-set detection task, the system achieved a 1.64% average error rate.

Proceedings ArticleDOI
28 Mar 2010
TL;DR: The iterative receiver design is extended to enable fast convergence for N ≫ 64 and to improve the error rate performance for N ≤ 64; the extensions include a novel low-complexity syndrome decoder which uses the redundancy that is transmitted for synchronization or other purposes.
Abstract: We consider orthogonal frequency division multiplexing (OFDM) for high data rate narrowband power line communication (PLC) in the frequency bands up to 500 kHz. In narrowband PLC, the performance is strongly influenced by impulsive noise with very large amplitudes and short durations. Simple iterative impulsive noise suppression algorithms can effectively improve the error rate performance of OFDM systems. However, the convergence speed depends on the number of subcarriers, N. For N ≤ 256, the algorithms converge slowly or not at all. In this paper, we extend the iterative receiver design to enable fast convergence for N ≫ 64 and to improve the error rate performance for N ≤ 64. These extensions include 1) a clipping and nulling technique at the input of the iterative algorithm, and 2) a novel low-complexity syndrome decoder which uses the redundancy that is transmitted for synchronization or other purposes. Simulation results are provided to show the improvement in error rate.
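
Extension 1) is a conventional front-end idea that can be sketched directly: received samples whose magnitude exceeds a clipping threshold are limited, and extreme samples are zeroed. The thresholds below are illustrative placeholders, not the paper's derived values:

```python
import numpy as np

def clip_and_null(r, clip_thresh, null_thresh):
    """Front-end impulsive-noise suppression sketch (null_thresh >= clip_thresh
    assumed): samples with magnitude above clip_thresh are limited to that
    magnitude, and extreme samples above null_thresh are zeroed before the
    iterative suppression algorithm. r: complex baseband samples."""
    mag = np.abs(r)
    out = np.where(mag > clip_thresh, clip_thresh * r / np.maximum(mag, 1e-12), r)
    return np.where(mag > null_thresh, 0, out)
```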

Journal ArticleDOI
TL;DR: An on-line signature authentication system based on an ensemble of local, regional, and global matchers is presented and a template protection scheme employing the BioHashing and the BioConvolving approaches, two well known template protection techniques for biometric recognition, is discussed.
Abstract: In this work an on-line signature authentication system based on an ensemble of local, regional, and global matchers is presented. Specifically, the following matching approaches are taken into account: the fusion of two local methods employing Dynamic Time Warping, a Hidden Markov Model based approach where each signature is described by means of its regional properties, and a Linear Programming Descriptor classifier trained on global features. Moreover, a template protection scheme employing the BioHashing and the BioConvolving approaches, two well-known template protection techniques for biometric recognition, is discussed. The reported experimental results, evaluated on the public MCYT signature database, show that our best ensemble obtains an impressive Equal Error Rate of 3% when only five genuine signatures are acquired for each user during enrollment. Moreover, when the proposed protected system is taken into account, the Equal Error Rate achieved in the worst-case scenario, that is, when an "impostor" is able to steal the hash keys, is equal to 4.51%, whereas an Equal Error Rate of ~0 can be obtained when nobody steals the hash keys.
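
BioHashing itself can be summarised in a few lines: project the biometric feature vector onto a user-specific, key-seeded random orthonormal basis and binarise. The sketch below is a generic rendition with assumed parameter names, not the authors' implementation; it also makes clear why a stolen hash key matters, since the key fully determines the projection:

```python
import numpy as np

def biohash(features, user_key, n_bits=64, tau=0.0):
    """Generic BioHashing sketch: user_key seeds a random basis, which is
    orthonormalized and used to project the feature vector; the projection is
    binarized against tau. Assumes len(features) >= n_bits. An attacker who
    steals user_key can reproduce the projection, which is the worst-case
    scenario the abstract evaluates."""
    rng = np.random.default_rng(user_key)
    basis = rng.normal(size=(len(features), n_bits))
    Q, _ = np.linalg.qr(basis)                   # orthonormal columns
    return (features @ Q > tau).astype(np.uint8)
```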

Journal ArticleDOI
TL;DR: Novel unsupervised frequency domain and cepstral domain equalizations that increase ASR resistance to LE are proposed and incorporated in a recognition scheme employing a codebook of noisy acoustic models and provide an absolute word error rate reduction on 10-dB signal-to-noise ratio data.
Abstract: In the presence of environmental noise, speakers tend to adjust their speech production in an effort to preserve intelligible communication. The noise-induced speech adjustments, called Lombard effect (LE), are known to severely impact the accuracy of automatic speech recognition (ASR) systems. The reduced performance results from the mismatch between the ASR acoustic models trained typically on noise-clean neutral (modal) speech and the actual parameters of noisy LE speech. In this study, novel unsupervised frequency domain and cepstral domain equalizations that increase ASR resistance to LE are proposed and incorporated in a recognition scheme employing a codebook of noisy acoustic models. In the frequency domain, short-time speech spectra are transformed towards neutral ASR acoustic models in a maximum-likelihood fashion. Simultaneously, dynamics of cepstral samples are determined from the quantile estimates and normalized to a constant range. A codebook decoding strategy is applied to determine the noisy models best matching the actual mixture of speech and noisy background. The proposed algorithms are evaluated side by side with conventional compensation schemes on connected Czech digits presented in various levels of background car noise. The resulting system provides an absolute word error rate (WER) reduction on 10-dB signal-to-noise ratio data of 8.7% and 37.7% for female neutral and LE speech, respectively, and of 8.7% and 32.8% for male neutral and LE speech, respectively, when compared to the baseline recognizer employing perceptual linear prediction (PLP) coefficients and cepstral mean and variance normalization.
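
The baseline normalisation (CMVN) and the flavour of the proposed quantile-based equalisation can both be sketched briefly; the second function is a loose reading of "dynamics of cepstral samples are determined from the quantile estimates and normalized to a constant range", with illustrative quantile levels:

```python
import numpy as np

def cmvn(cepstra):
    """Baseline cepstral mean and variance normalization: per-utterance
    zero mean and unit variance per coefficient. cepstra: (frames, coeffs)."""
    return (cepstra - cepstra.mean(axis=0)) / (cepstra.std(axis=0) + 1e-8)

def quantile_range_norm(cepstra, lo=0.05, hi=0.95):
    """Loose sketch of quantile-based range equalization: estimate each
    coefficient's dynamic range from its lo/hi quantiles and rescale that
    range to a constant [0, 1] (quantile levels are illustrative)."""
    q_lo = np.quantile(cepstra, lo, axis=0)
    q_hi = np.quantile(cepstra, hi, axis=0)
    return (cepstra - q_lo) / (q_hi - q_lo + 1e-8)
```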

Proceedings Article
02 Jun 2010
TL;DR: It is shown that Meteor-next improves correlation with HTER over baseline metrics, including earlier versions of Meteor, and approaches the correlation level of a state-of-the-art metric, TER-plus (TERp).
Abstract: This paper presents Meteor-next, an extended version of the Meteor metric designed to have high correlation with post-editing measures of machine translation quality. We describe changes made to the metric's sentence aligner and scoring scheme as well as a method for tuning the metric's parameters to optimize correlation with human-targeted Translation Edit Rate (HTER). We then show that Meteor-next improves correlation with HTER over baseline metrics, including earlier versions of Meteor, and approaches the correlation level of a state-of-the-art metric, TER-plus (TERp).

Proceedings ArticleDOI
14 Mar 2010
TL;DR: The effectiveness of combining MFCC with phase information for speaker identification in noisy environments is described.
Abstract: In conventional speaker recognition methods based on MFCC, the phase information has been ignored. Recently, we proposed a speaker recognition method that integrates MFCC with the phase information. Using the phase information, the speaker identification error rate was reduced by 78% for clean speech. In this paper, we describe the effectiveness of phase information for speaker identification in noisy environments. Integrating MFCC with phase information, the speaker identification error rates were reduced by 20%∼70% in comparison with using only MFCC in noisy environments.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: A cover song detection system is described that does not require prior knowledge of the number of cover songs in a test set in order to identify cover(s) to a reference song, and that performs classification using either a support vector machine (SVM) or a multi-layer perceptron (MLP).
Abstract: Existing cover song detection systems require prior knowledge of the number of cover songs in a test set in order to identify cover(s) to a reference song. We describe a system that does not require such prior knowledge. The input to the system is a reference track and a test track, and the output is a binary classification of whether the inputs are either a reference and a cover or a reference and a non-cover. The system differs from state-of-the-art detectors by calculating multiple input features, performing a novel type of test song normalization to combat “impostor” tracks, and performing classification using either a support vector machine (SVM) or a multi-layer perceptron (MLP). On the covers80 test set, the system achieves an equal error rate of 10%, compared to 21.3% achieved by the 2007 LabROSA cover song detection system.

Journal ArticleDOI
TL;DR: This paper deals with distributed demodulation of space-time transmissions of a common message from a multi-antenna access point (AP) to a wireless sensor network, developing algorithms with distinct merits in terms of error performance and resilience to non-ideal inter-sensor links.
Abstract: This paper deals with distributed demodulation of space-time transmissions of a common message from a multi-antenna access point (AP) to a wireless sensor network. Based on local message exchanges with single-hop neighboring sensors, two algorithms are developed for distributed demodulation. In the first algorithm, sensors consent on the estimated symbols. By relaxing the finite-alphabet constraints on the symbols, the demodulation task is formulated as a distributed convex optimization problem that is solved iteratively using the method of multipliers. Distributed versions of the centralized zero-forcing (ZF) and minimum mean-square error (MMSE) demodulators follow as special cases. In the second algorithm, sensors iteratively reach consensus on the average (cross-) covariances of locally available per-sensor data vectors with the corresponding AP-to-sensor channel matrices, which constitute sufficient statistics for maximum likelihood demodulation. Distributed versions of the sphere decoding algorithm and the ZF/MMSE demodulators are also developed. These algorithms offer distinct merits in terms of error performance and resilience to non-ideal inter-sensor links. In both cases, the per-iteration error performance is analyzed, and the approximate number of iterations needed to attain a prescribed error rate are quantified. Simulated tests verify the analytical claims. Interestingly, only a few consensus iterations (roughly as many as the number of sensors), suffice for the distributed demodulators to approach the performance of their centralized counterparts.
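
The centralized MMSE demodulator that the first distributed algorithm specializes to is the standard linear estimator. A compact sketch (our notation, assuming H stacks the AP-to-sensor channel matrices and y the corresponding observations; the QPSK slicer is an illustrative choice):

```python
import numpy as np

def mmse_demodulate(H, y, noise_var):
    """Centralized linear MMSE estimate, which the distributed algorithm
    approaches via consensus iterations:
        x_hat = (H^H H + noise_var * I)^{-1} H^H y
    followed by a per-symbol hard decision (unit-energy QPSK here)."""
    x_hat = np.linalg.solve(H.conj().T @ H + noise_var * np.eye(H.shape[1]),
                            H.conj().T @ y)
    return (np.sign(x_hat.real) + 1j * np.sign(x_hat.imag)) / np.sqrt(2)
```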

Journal ArticleDOI
TL;DR: Two experiments aimed at selecting utterances from lists of responses indicate that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29–26% to 10–8%.
Abstract: Computer-Assisted Language Learning (CALL) applications for improving the oral skills of low-proficient learners have to cope with non-native speech that is particularly challenging. Since unconstrained non-native ASR is still problematic, a possible solution is to elicit constrained responses from the learners. In this paper, we describe experiments aimed at selecting utterances from lists of responses. The first experiment, on utterance selection, indicates that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29-26% to 10-8%. Since giving feedback on incorrectly recognized utterances is confusing, we verify the correctness of the utterance before providing feedback. The results of the second experiment, on utterance verification, indicate that combining duration-related features with a likelihood ratio (LR) yields an equal error rate (EER) of 10.3%, which is significantly better than the EER for the other measures in isolation.
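
Equal error rate, used to report the verification results, is worth pinning down: it is the operating point where the false-accept and false-reject rates coincide. A simple threshold-sweep estimate (our sketch, assuming higher scores indicate genuine utterances):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Threshold-sweep EER estimate. genuine/impostor are 1-D score arrays;
    higher scores are assumed to indicate genuine trials."""
    best_gap, eer = float("inf"), 1.0
    for t in np.sort(np.concatenate([genuine, impostor])):
        far = np.mean(impostor >= t)   # false accepts: impostors over threshold
        frr = np.mean(genuine < t)     # false rejects: genuine under threshold
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```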

Book ChapterDOI
17 Aug 2010
TL;DR: Stable-PUF-marking is proposed as an alternative to error correction to get reproducible (i.e. stable) outputs from physical unclonable functions (PUF).
Abstract: We propose a new technique called stable-PUF-marking as an alternative to error correction to get reproducible (i.e. stable) outputs from physical unclonable functions (PUF). The concept is based on the influence of mismatch on the stability of the PUF cells' output. To exploit this fact, cells providing a high mismatch between their crucial transistors are selected to substantially lower the error rate. To verify the concept, a statistical view of this approach is given. Furthermore, an SRAM-like PUF implementation is suggested that puts the approach into practice.
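
The selection idea can be illustrated with a toy sketch: read the power-up state repeatedly and keep the cells that almost never flip (these are the high-mismatch cells). This is an illustration of the concept only, not the authors' circuit-level marking procedure:

```python
import numpy as np

def select_stable_cells(powerup_reads, k):
    """Keep the k cells whose power-up value is most reproducible across
    repeated reads (high-mismatch cells flip rarely).
    powerup_reads: (n_reads, n_cells) array of 0/1 power-up states."""
    p_one = powerup_reads.mean(axis=0)    # fraction of reads giving '1'
    stability = np.abs(p_one - 0.5)       # 0.5 = coin flip; 0 or 1 = stable
    return np.argsort(-stability)[:k]     # indices of the k stablest cells
```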

Proceedings ArticleDOI
04 May 2010
TL;DR: Experimental results show that the accuracy of action recognition is more than 95% in a cross-validation test on the training data set, and that the error rate of PDR localization is reduced from 4% of the walking distance to 2% in the total scenario within the office environment by using the results of action recognition to adjust the estimated location.
Abstract: We present a method of estimating the location and orientation of a pedestrian while simultaneously recognizing his/her actions, using a single low-cost inertial measurement unit (IMU) mounted at the waist of the user. Some actions other than walking, such as standing up from or sitting down on a chair and bending over to slip through obstacles, are mostly observed at particular locations where the objects and building facilities that induce them are placed. Conversely, by knowing the current location and its attribute describing the actions possibly taken there, the action recognition process can be improved with this contextual information, since prior knowledge about the occurrence of actions is given as an attribute in the map. Additionally, when the posture (such as sitting, standing, and getting to one knee) of the pedestrian is known, falsely recognized actions can be rejected. Experimental results show that the accuracy of action recognition on six types of action (forward walking, backward walking, side stepping, sitting down on/standing up from a chair, going downstairs/upstairs, and bending over) is more than 95% in a cross-validation test on the training data set, and the results also show that the error rate of the PDR localization is reduced from 4% of the walking distance to 2% in the total scenario within the office environment by using the results of action recognition to adjust the estimated location.

Journal ArticleDOI
TL;DR: A novel string-to-dependency algorithm for statistical machine translation is proposed that employs a target dependency language model during decoding to exploit long-distance word relations, which cannot be modeled with a traditional n-gram language model.
Abstract: We propose a novel string-to-dependency algorithm for statistical machine translation. This algorithm employs a target dependency language model during decoding to exploit long distance word relations, which cannot be modeled with a traditional n-gram language model. Experiments show that the algorithm achieves significant improvement in MT performance over a state-of-the-art hierarchical string-to-string system on NIST MT06 and MT08 newswire evaluation sets.

Proceedings Article
11 Jul 2010
TL;DR: The experimental results show that linguistic features alone outperform word posterior probability based confidence estimation in error detection, and that linguistic features can further provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate and improve the F measure.
Abstract: Automatic error detection is desired in the post-processing to improve machine translation quality. The previous work is largely based on confidence estimation using system-based features, such as word posterior probabilities calculated from N-best lists or word lattices. We propose to incorporate two groups of linguistic features, which convey information from outside machine translation systems, into error detection: lexical and syntactic features. We use a maximum entropy classifier to predict translation errors by integrating word posterior probability feature and linguistic features. The experimental results show that 1) linguistic features alone outperform word posterior probability based confidence estimation in error detection; and 2) linguistic features can further provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate by 18.52% and improve the F measure by 16.37%.

Journal ArticleDOI
TL;DR: The Monte Carlo-based simulation results show that the proposed approach significantly improves target detection performance, and can also be used to guide the actual threshold selection in practical sensor network implementation under certain error rate constraints.
Abstract: We propose a binary decision fusion rule that reaches a global decision on the presence of a target by integrating local decisions made by multiple sensors. Without requiring a priori probability of target presence, the fusion threshold bounds derived using Chebyshev's inequality ensure a higher hit rate and lower false alarm rate compared to the weighted averages of individual sensors. The Monte Carlo-based simulation results show that the proposed approach significantly improves target detection performance, and can also be used to guide the actual threshold selection in practical sensor network implementation under certain error rate constraints.
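
The fusion rule itself is a counting rule, sketched below with a toy Monte Carlo check of the fused hit and false alarm rates; the per-sensor probabilities and the threshold are made-up illustration values, and the paper's actual contribution, the Chebyshev-based bounds on the threshold, is not reproduced here:

```python
import numpy as np

def fuse(local_decisions, T):
    """Counting-rule fusion: declare 'target present' when at least T of the
    local binary sensor decisions are positive. The paper bounds T via
    Chebyshev's inequality; here T is simply a parameter."""
    return int(np.sum(local_decisions) >= T)

# Toy Monte Carlo check (per-sensor probabilities and T are made up):
rng = np.random.default_rng(0)
n, T, trials = 10, 4, 20_000
p_hit, p_fa = 0.7, 0.1                   # per-sensor hit / false alarm prob.
hit = np.mean([fuse(rng.random(n) < p_hit, T) for _ in range(trials)])
fa = np.mean([fuse(rng.random(n) < p_fa, T) for _ in range(trials)])
print(f"fused hit rate ~ {hit:.3f}, fused false alarm rate ~ {fa:.3f}")
```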

Journal ArticleDOI
TL;DR: Applications of the results are discussed, which include optimum power allocation in spatial multiplexing systems, optimum power/time sharing to decrease or increase (jamming problem) error rate, an implication for fading channels, and optimization of a unitary-precoded OFDM system.
Abstract: Motivated by a recent surge of interest in convex optimization techniques, convexity/concavity properties of error rates of the maximum likelihood detector operating in the AWGN channel are studied and extended to frequency-flat slow-fading channels. Generic conditions are identified under which the symbol error rate (SER) is convex/concave for arbitrary multidimensional constellations. In particular, the SER is convex in SNR for any one- and two-dimensional constellation, and also in higher dimensions at high SNR. Pairwise error probability and bit error rate are shown to be convex at high SNR, for arbitrary constellations and bit mapping. Universal bounds for the SER first and second derivatives are obtained, which hold for arbitrary constellations and are tight for some of them. Applications of the results are discussed, which include optimum power allocation in spatial multiplexing systems, optimum power/time sharing to decrease or increase (jamming problem) error rate, an implication for fading channels ("fading is never good in low dimensions"), and optimization of a unitary-precoded OFDM system. For example, the error rate bounds of a unitary-precoded OFDM system with QPSK modulation, which reveal the best and worst precoding, are extended to arbitrary constellations, which may also include coding. The reported results also apply to the interference channel under Gaussian approximation, to the bit error rate when it can be expressed or approximated as a nonnegative linear combination of individual symbol error rates, and to coded systems.
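
As a concrete one-dimensional instance of the convexity claim (a standard textbook example, not reproduced from the paper): for BPSK in AWGN the SER at SNR gamma is Q(sqrt(2*gamma)), and both derivatives can be checked directly:

```latex
% BPSK over AWGN: symbol error rate as a function of SNR \gamma
P_e(\gamma) = Q\!\left(\sqrt{2\gamma}\right), \qquad
Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\,dt,
\qquad\text{so}\qquad
\frac{dP_e}{d\gamma} = -\frac{e^{-\gamma}}{2\sqrt{\pi\gamma}} < 0,
\qquad
\frac{d^2 P_e}{d\gamma^2} = \frac{(2\gamma + 1)\,e^{-\gamma}}{4\sqrt{\pi}\,\gamma^{3/2}} > 0.
```

That is, the BPSK error rate is strictly decreasing and convex in SNR for all gamma > 0, consistent with the paper's statement for one-dimensional constellations.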

Journal ArticleDOI
TL;DR: The exact bit-error rate for binary phase-shift keying and outage probability are developed for equal gain diversity and closed-form expressions of diversity order and coding gain are provided with both diversity receptions.
Abstract: Exact error rate performances are studied for coherent free-space optical communication systems under strong turbulence with diversity reception. Equal gain and selection diversity are considered as practical schemes to mitigate turbulence. The exact bit-error rate for binary phase-shift keying and outage probability are developed for equal gain diversity. Analytical expressions are obtained for the bit-error rate of differential phase-shift keying and asynchronous frequency-shift keying, as well as for outage probability using selection diversity. Furthermore, we provide the closed-form expressions of diversity order and coding gain with both diversity receptions. The analytical results are verified by computer simulations and are suitable for rapid error rates calculation.

Proceedings ArticleDOI
12 Apr 2010
TL;DR: A novel online support vector machine algorithm, compatible with accurate multidimensional link quality metrics, that is able to optimize AMC to the unique (potentially dynamic) hardware characteristics of each wireless device in selective channels is proposed.
Abstract: Optimizing the performance of adaptive modulation and coding (AMC) in practice has proven challenging. Prior research has struggled to find link quality metrics that are suitable for look-up-tables and simultaneously provide an injective mapping to error rate in wireless links that feature selective channels with hardware nonlinearities and non-Gaussian noise effects. This paper proposes a novel online support vector machine algorithm, compatible with accurate multidimensional link quality metrics, that is able to optimize AMC to the unique (potentially dynamic) hardware characteristics of each wireless device in selective channels. IEEE 802.11n simulations show that our proposed algorithm allows each individual wireless device to optimize the operating point in the rate/reliability tradeoff through frame-by-frame error evaluation. These simulations also show that our algorithm displays identical performance to alternative online AMC algorithms while drastically reducing complexity.

Journal ArticleDOI
TL;DR: Three modifications to the MT training data are presented to improve the accuracy of a state-of-the-art syntax MT system: re-structuring changes the syntactic structure of training parse trees to enable reuse of substructures; re-labeling alters bracket labels to enrich rule application context; and re-aligning unifies word alignment across sentences to remove bad word alignments and refine good ones.
Abstract: This article shows that the structure of bilingual material from standard parsing and alignment tools is not optimal for training syntax-based statistical machine translation (SMT) systems. We present three modifications to the MT training data to improve the accuracy of a state-of-the-art syntax MT system: re-structuring changes the syntactic structure of training parse trees to enable reuse of substructures; re-labeling alters bracket labels to enrich rule application context; and re-aligning unifies word alignment across sentences to remove bad word alignments and refine good ones. Better structures, labels, and word alignments are learned by the EM algorithm. We show that each individual technique leads to improvement as measured by BLEU, and we also show that the greatest improvement is achieved by combining them. We report an overall 1.48 BLEU improvement on the NIST08 evaluation set over a strong baseline in Chinese/English translation.