
Showing papers on "Word error rate" published in 2010


Proceedings Article
01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around a 50% reduction of perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
Abstract: A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible to obtain around a 50% reduction of perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model. Speech recognition experiments show around an 18% reduction of word error rate on the Wall Street Journal task when comparing models trained on the same amount of data, and around 5% on the much harder NIST RT05 task, even when the backoff model is trained on much more data than the RNN LM. We provide ample empirical evidence to suggest that connectionist language models are superior to standard n-gram techniques, except for their high computational (training) complexity. Index Terms: language modeling, recurrent neural networks, speech recognition
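
To make the model concrete, here is a minimal numpy sketch of an Elman-style recurrent LM of the kind the abstract describes, with a toy perplexity loop; the class name, dimensions, and initialization are illustrative choices, not Mikolov's implementation:

```python
import numpy as np

class TinyRNNLM:
    """Minimal Elman-style recurrent LM (illustrative sketch only)."""
    def __init__(self, vocab_size, hidden=100, seed=0):
        rng = np.random.default_rng(seed)
        self.U = rng.normal(0, 0.1, (hidden, vocab_size))  # input word -> hidden
        self.W = rng.normal(0, 0.1, (hidden, hidden))      # hidden -> hidden (recurrence)
        self.V = rng.normal(0, 0.1, (vocab_size, hidden))  # hidden -> output logits
        self.h = np.zeros(hidden)                          # recurrent state s(t-1)

    def step(self, word_id):
        # s(t) = sigmoid(U w(t) + W s(t-1));  y(t) = softmax(V s(t))
        self.h = 1.0 / (1.0 + np.exp(-(self.U[:, word_id] + self.W @ self.h)))
        logits = self.V @ self.h
        e = np.exp(logits - logits.max())
        return e / e.sum()

def perplexity(model, word_ids):
    # PPL = exp(-(1/N) * sum_t log P(w_t | history)), scored over bigram steps
    logp = sum(np.log(model.step(prev)[cur])
               for prev, cur in zip(word_ids[:-1], word_ids[1:]))
    return float(np.exp(-logp / (len(word_ids) - 1)))
```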

5,751 citations


Journal ArticleDOI
TL;DR: This work presents a simple and efficient preprocessing chain that eliminates most of the effects of changing illumination while still preserving the essential appearance details that are needed for recognition, and improves robustness by adding Kernel principal component analysis (PCA) feature extraction and incorporating rich local appearance cues from two complementary sources.
Abstract: Making recognition more reliable under uncontrolled lighting conditions is one of the most important challenges for practical face recognition systems. We tackle this by combining the strengths of robust illumination normalization, local texture-based face representations, distance transform based matching, kernel-based feature extraction and multiple feature fusion. Specifically, we make three main contributions: 1) we present a simple and efficient preprocessing chain that eliminates most of the effects of changing illumination while still preserving the essential appearance details that are needed for recognition; 2) we introduce local ternary patterns (LTP), a generalization of the local binary pattern (LBP) local texture descriptor that is more discriminant and less sensitive to noise in uniform regions, and we show that replacing comparisons based on local spatial histograms with a distance transform based similarity metric further improves the performance of LBP/LTP based face recognition; and 3) we further improve robustness by adding Kernel principal component analysis (PCA) feature extraction and incorporating rich local appearance cues from two complementary sources-Gabor wavelets and LBP-showing that the combination is considerably more accurate than either feature set alone. The resulting method provides state-of-the-art performance on three data sets that are widely used for testing recognition under difficult illumination conditions: Extended Yale-B, CAS-PEAL-R1, and Face Recognition Grand Challenge version 2 experiment 4 (FRGC-204). For example, on the challenging FRGC-204 data set it halves the error rate relative to previously published methods, achieving a face verification rate of 88.1% at 0.1% false accept rate. Further experiments show that our preprocessing method outperforms several existing preprocessors for a range of feature sets, data sets and lighting conditions.
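
The local ternary pattern descriptor introduced as contribution 2) is simple enough to sketch. Below is a rough numpy rendition of the Tan-Triggs idea: each neighbour is coded +1/0/-1 against the centre pixel with a dead zone of width 2t, and the ternary code is split into two binary LBP-like codes; the threshold value is an illustrative default, not the paper's tuned setting.

```python
import numpy as np

def ltp_codes(img, t=5):
    """LTP over the 8-neighbourhood of each interior pixel of a grayscale image.
    Neighbours more than t above the centre code as +1, more than t below as -1,
    else 0; the ternary code is split into two binary (LBP-like) half-codes."""
    img = img.astype(np.int32)
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]   # 8 neighbours, clockwise
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    pos = np.zeros((h - 2, w - 2), np.uint8)    # the '+1' half-code
    neg = np.zeros((h - 2, w - 2), np.uint8)    # the '-1' half-code
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        pos |= (nb >= centre + t).astype(np.uint8) << bit
        neg |= (nb <= centre - t).astype(np.uint8) << bit
    return pos, neg
```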

2,981 citations


Proceedings Article
06 Dec 2010
TL;DR: This work uses the mean-covariance restricted Boltzmann machine (mcRBM) to learn features of speech data that serve as input into a standard DBN, and achieves a phone error rate superior to all published results on speaker-independent TIMIT to date.
Abstract: Straightforward application of Deep Belief Nets (DBNs) to acoustic modeling produces a rich distributed representation of speech data that is useful for recognition and yields impressive results on the speaker-independent TIMIT phone recognition task. However, the first-layer Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM) has an important limitation, shared with mixtures of diagonal-covariance Gaussians: GRBMs treat different components of the acoustic input vector as conditionally independent given the hidden state. The mean-covariance restricted Boltzmann machine (mcRBM), first introduced for modeling natural images, is a much more representationally efficient and powerful way of modeling the covariance structure of speech data. Every configuration of the precision units of the mcRBM specifies a different precision matrix for the conditional distribution over the acoustic space. In this work, we use the mcRBM to learn features of speech data that serve as input into a standard DBN. The mcRBM features combined with DBNs allow us to achieve a phone error rate of 20.5%, which is superior to all published results on speaker-independent TIMIT to date.

326 citations


Journal ArticleDOI
TL;DR: A system that can separate and recognize the simultaneous speech of two people recorded in a single channel is presented, and it is shown how belief propagation reduces the complexity of temporal inference from exponential to linear in the number of sources and the size of the language model.

187 citations


Journal ArticleDOI
TL;DR: It is proposed that doubly confusable pairs, rather than high neighborhood density, may better explain phonetic neighborhood errors in human speech processing.

164 citations


Patent
11 Oct 2010
TL;DR: In this patent, a method is described for identifying possible errors made by a speech recognition system without using a transcript of the words input to the system. This method does not, however, consider the use of a word-to-word model.
Abstract: Methods are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. A method for model adaptation for a speech recognition system includes determining an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The method may further include adjusting an adaptation, of the model for the word or various models for the various words, based on the error rate. Apparatus are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. An apparatus for model adaptation for a speech recognition system includes a processor adapted to estimate an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The apparatus may further include a controller adapted to adjust an adaptation of the model for the word or various models for the various words, based on the error rate.

164 citations


Journal ArticleDOI
TL;DR: The new approach of phonetic feature bundling for modeling coarticulation in EMG-based speech recognition is described, and results are reported on the EMG-PIT corpus, a recently collected multiple-speaker large-vocabulary database of silent and audible EMG speech recordings.

161 citations


Proceedings Article
06 Dec 2010
TL;DR: A theorem is stated showing that a certain perceptron-like learning rule, involving feature vectors derived from loss-adjusted inference, directly corresponds to the gradient of task loss.
Abstract: In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss. In binary classification one typically tries to minimize the error rate. But in structured prediction each task often has its own measure of performance such as the BLEU score in machine translation or the intersection-over-union score in PASCAL segmentation. The most common approaches to structured prediction, structural SVMs and CRFs, do not minimize the task loss: the former minimizes a surrogate loss with no guarantees for task loss and the latter minimizes log loss independent of task loss. The main contribution of this paper is a theorem stating that a certain perceptron-like learning rule, involving feature vectors derived from loss-adjusted inference, directly corresponds to the gradient of task loss. We give empirical results on phonetic alignment of a standard test set from the TIMIT corpus, which surpass all previously reported results on this problem.
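
The learning rule in question is easy to state as code. The sketch below shows a "toward-worse" loss-adjusted perceptron update over an explicit candidate set; the function names and the finite candidate enumeration are simplifications for illustration (real structured prediction uses efficient loss-adjusted inference instead of enumeration):

```python
import numpy as np

def direct_loss_update(w, phi, loss, candidates, x, y_star, eta=0.1, eps=0.5):
    """'Toward-worse' perceptron-like update from loss-adjusted inference.

    y_hat : ordinary inference      argmax_y  w . phi(x, y)
    y_adj : loss-adjusted inference argmax_y  w . phi(x, y) + eps * loss(y, y_star)
    update: w += (eta / eps) * (phi(x, y_hat) - phi(x, y_adj))
    As eps -> 0, the expected update follows the negative gradient of task loss.
    """
    candidates = list(candidates)
    score = lambda y: float(w @ phi(x, y))
    y_hat = max(candidates, key=score)
    y_adj = max(candidates, key=lambda y: score(y) + eps * loss(y, y_star))
    return w + (eta / eps) * (phi(x, y_hat) - phi(x, y_adj))
```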

144 citations


Proceedings Article
11 Jul 2010
TL;DR: This paper presents a method that shares similarities with both spell checking and machine translation approaches on normalizing SMS messages, and is entirely based on models trained from a corpus.
Abstract: In recent years, research in natural language processing has increasingly focused on normalizing SMS messages. Different well-defined approaches have been proposed, but the problem remains far from being solved: the best systems achieve an 11% Word Error Rate. This paper presents a method that shares similarities with both spell checking and machine translation approaches. The normalization part of the system is entirely based on models trained from a corpus. Evaluated in French by 10-fold cross-validation, the system achieves a 9.3% Word Error Rate and a 0.83 BLEU score.
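
Since Word Error Rate is the headline metric here (and throughout this page), a reference implementation is short enough to include; this is the standard Levenshtein-over-words definition, not code from the paper:

```python
def word_error_rate(ref, hyp):
    """WER = (substitutions + deletions + insertions) / len(ref),
    computed by Levenshtein alignment over words."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                      # delete all remaining reference words
    for j in range(len(h) + 1):
        d[0][j] = j                      # insert all remaining hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

# e.g. word_error_rate("je vais bien", "je bien") == 1/3 (one deletion)
```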

135 citations


01 Jan 2010
TL;DR: This paper proposes a new architecture for text-independent speaker verification systems that can be satisfactorily trained with a limited amount of application-specific data, supplemented with a sufficient amount of training data from some other context.
Abstract: It is widely believed that speaker verification systems perform better when there is sufficient background training data to deal with nuisance effects of transmission channels. It is also known that these systems perform at their best when the sound environment of the training data is similar to that of the context of use (test context). For some applications, however, training data from the same type of sound environment is scarce, whereas a considerable amount of data from a different type of environment is available. In this paper, we propose a new architecture for text-independent speaker verification systems that can be satisfactorily trained with a limited amount of application-specific data, supplemented with a sufficient amount of training data from some other context. This architecture is based on the extraction of parameters (i-vectors) from a low-dimensional space (total variability space) proposed by Dehak [1]. Our aim is to extend Dehak's work to speaker recognition on sparse data, namely microphone speech. The main challenge is to overcome the fact that insufficient application-specific data is available to accurately estimate the total variability covariance matrix. We propose a method based on Joint Factor Analysis (JFA) to estimate microphone eigenchannels (sparse data) with telephone eigenchannels (sufficient data). For classification, we experimented with two approaches: Support Vector Machines (SVM) and a Cosine Distance Scoring (CDS) classifier based on cosine distances. We present recognition results for the female part of the interview data of the NIST 2008 SRE. The best performance is obtained when our system is fused with the state-of-the-art JFA. We achieve a 13% relative improvement in equal error rate, and the minimum value of the detection cost function decreases from 0.0219 to 0.0164.
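
Of the two classifiers, Cosine Distance Scoring is the simpler to illustrate. The sketch below scores a trial as the cosine similarity between enrolment and test i-vectors; in practice the accept threshold would be tuned on development data:

```python
import numpy as np

def cds_score(w_enrol, w_test):
    """Cosine Distance Scoring: the trial score is the cosine of the angle
    between the enrolment and test i-vectors; a trial is accepted when the
    score exceeds a threshold tuned on development data."""
    return float(w_enrol @ w_test /
                 (np.linalg.norm(w_enrol) * np.linalg.norm(w_test)))
```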

110 citations


Proceedings ArticleDOI
03 Aug 2010
TL;DR: This work proposes and validates a novel physics-based method to detect images recaptured from printed material using only a single image, and shows that the classifier generalizes to contrast-enhanced recaptured images and LCD screen recaptured images without re-training, demonstrating the robustness of the approach.
Abstract: Face recognition is an increasingly popular method for user authentication. However, face recognition is susceptible to playback attacks. Therefore, a reliable way to detect malicious attacks is crucial to the robustness of the system. We propose and validate a novel physics-based method to detect images recaptured from printed material using only a single image. Micro-textures present in printed paper manifest themselves in the specular component of the image. Features extracted from this component allow a linear SVM classifier to achieve a 2.2% False Acceptance Rate and 13% False Rejection Rate (6.7% Equal Error Rate). We also show that the classifier generalizes to contrast-enhanced recaptured images and LCD screen recaptured images without re-training, demonstrating the robustness of our approach.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This paper presents a description of the MIT Lincoln Laboratory language recognition system submitted to the NIST 2009 Language Recognition Evaluation (LRE), which consists of a fusion of three core recognizers, two based on spectral similarity and one based on tokenization.
Abstract: This paper presents a description of the MIT Lincoln Laboratory language recognition system submitted to the NIST 2009 Language Recognition Evaluation (LRE). This system consists of a fusion of three core recognizers, two based on spectral similarity and one based on tokenization. The 2009 LRE differed from previous ones in that the test data included narrowband segments from worldwide Voice of America broadcasts as well as conventional recorded conversational telephone speech. Results are presented for the 23-language closed-set and open-set detection tasks at the 30, 10, and 3 second durations, along with a discussion of the language-pair task. On the 30 second 23-language closed-set detection task, the system achieved a 1.64% average error rate.

Proceedings ArticleDOI
28 Mar 2010
TL;DR: The iterative receiver design is extended to enable fast convergence for N ≫ 64 and to improve the error rate performance for N ≤ 64; the extensions include a novel low-complexity syndrome decoder which uses the redundancy that is transmitted for synchronization or other purposes.
Abstract: We consider orthogonal frequency division multiplexing (OFDM) for high data rate narrowband power line communication (PLC) in the frequency bands up to 500 kHz. In narrowband PLC, the performance is strongly influenced by impulsive noise with very large amplitudes and short durations. Simple iterative impulsive noise suppression algorithms can effectively improve the error rate performance of OFDM systems. However, the convergence speed depends on the number of subcarriers, N. For N ≤ 256, the algorithms converge slowly or not at all. In this paper, we extend the iterative receiver design to enable fast convergence for N ≫ 64 and to improve the error rate performance for N ≤ 64. These extensions include 1) a clipping and nulling technique at the input of the iterative algorithm, and 2) a novel low-complexity syndrome decoder which uses the redundancy that is transmitted for synchronization or other purposes. Simulation results are provided to show the improvement in error rate.
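
Extension 1) is a conventional front-end idea that can be sketched directly: received samples whose magnitude exceeds a clipping threshold are limited, and extreme samples are zeroed. The thresholds below are illustrative placeholders, not the paper's derived values:

```python
import numpy as np

def clip_and_null(r, clip_thresh, null_thresh):
    """Front-end impulsive-noise suppression sketch (null_thresh >= clip_thresh
    assumed): samples with magnitude above clip_thresh are limited to that
    magnitude, and extreme samples above null_thresh are zeroed before the
    iterative suppression algorithm. r: complex baseband samples."""
    mag = np.abs(r)
    out = np.where(mag > clip_thresh, clip_thresh * r / np.maximum(mag, 1e-12), r)
    return np.where(mag > null_thresh, 0, out)
```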

Journal ArticleDOI
TL;DR: An on-line signature authentication system based on an ensemble of local, regional, and global matchers is presented and a template protection scheme employing the BioHashing and the BioConvolving approaches, two well known template protection techniques for biometric recognition, is discussed.
Abstract: In this work an on-line signature authentication system based on an ensemble of local, regional, and global matchers is presented. Specifically, the following matching approaches are taken into account: the fusion of two local methods employing Dynamic Time Warping, a Hidden Markov Model based approach where each signature is described by means of its regional properties, and a Linear Programming Descriptor classifier trained on global features. Moreover, a template protection scheme employing the BioHashing and the BioConvolving approaches, two well-known template protection techniques for biometric recognition, is discussed. The reported experimental results, evaluated on the public MCYT signature database, show that our best ensemble obtains an impressive Equal Error Rate of 3% when only five genuine signatures are acquired for each user during enrollment. Moreover, when the proposed protected system is taken into account, the Equal Error Rate achieved in the worst-case scenario, that is, when an "impostor" is able to steal the hash keys, is equal to 4.51%, whereas an Equal Error Rate of ~0 can be obtained when nobody steals the hash keys.
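
BioHashing itself can be summarised in a few lines: project the biometric feature vector onto a user-specific, key-seeded random orthonormal basis and binarise. The sketch below is a generic rendition with assumed parameter names, not the authors' implementation; it also makes clear why a stolen hash key matters, since the key fully determines the projection:

```python
import numpy as np

def biohash(features, user_key, n_bits=64, tau=0.0):
    """Generic BioHashing sketch: user_key seeds a random basis, which is
    orthonormalized and used to project the feature vector; the projection is
    binarized against tau. Assumes len(features) >= n_bits. An attacker who
    steals user_key can reproduce the projection, which is the worst-case
    scenario the abstract evaluates."""
    rng = np.random.default_rng(user_key)
    basis = rng.normal(size=(len(features), n_bits))
    Q, _ = np.linalg.qr(basis)                   # orthonormal columns
    return (features @ Q > tau).astype(np.uint8)
```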

Journal ArticleDOI
TL;DR: Novel unsupervised frequency domain and cepstral domain equalizations that increase ASR resistance to LE are proposed and incorporated in a recognition scheme employing a codebook of noisy acoustic models and provide an absolute word error rate reduction on 10-dB signal-to-noise ratio data.
Abstract: In the presence of environmental noise, speakers tend to adjust their speech production in an effort to preserve intelligible communication. The noise-induced speech adjustments, called Lombard effect (LE), are known to severely impact the accuracy of automatic speech recognition (ASR) systems. The reduced performance results from the mismatch between the ASR acoustic models trained typically on noise-clean neutral (modal) speech and the actual parameters of noisy LE speech. In this study, novel unsupervised frequency domain and cepstral domain equalizations that increase ASR resistance to LE are proposed and incorporated in a recognition scheme employing a codebook of noisy acoustic models. In the frequency domain, short-time speech spectra are transformed towards neutral ASR acoustic models in a maximum-likelihood fashion. Simultaneously, dynamics of cepstral samples are determined from the quantile estimates and normalized to a constant range. A codebook decoding strategy is applied to determine the noisy models best matching the actual mixture of speech and noisy background. The proposed algorithms are evaluated side by side with conventional compensation schemes on connected Czech digits presented in various levels of background car noise. The resulting system provides an absolute word error rate (WER) reduction on 10-dB signal-to-noise ratio data of 8.7% and 37.7% for female neutral and LE speech, respectively, and of 8.7% and 32.8% for male neutral and LE speech, respectively, when compared to the baseline recognizer employing perceptual linear prediction (PLP) coefficients and cepstral mean and variance normalization.
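
The baseline normalisation (CMVN) and the flavour of the proposed quantile-based equalisation can both be sketched briefly; the second function is a loose reading of "dynamics of cepstral samples are determined from the quantile estimates and normalized to a constant range", with illustrative quantile levels:

```python
import numpy as np

def cmvn(cepstra):
    """Baseline cepstral mean and variance normalization: per-utterance
    zero mean and unit variance per coefficient. cepstra: (frames, coeffs)."""
    return (cepstra - cepstra.mean(axis=0)) / (cepstra.std(axis=0) + 1e-8)

def quantile_range_norm(cepstra, lo=0.05, hi=0.95):
    """Loose sketch of quantile-based range equalization: estimate each
    coefficient's dynamic range from its lo/hi quantiles and rescale that
    range to a constant [0, 1] (quantile levels are illustrative)."""
    q_lo = np.quantile(cepstra, lo, axis=0)
    q_hi = np.quantile(cepstra, hi, axis=0)
    return (cepstra - q_lo) / (q_hi - q_lo + 1e-8)
```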

Proceedings Article
02 Jun 2010
TL;DR: It is shown that Meteor-next improves correlation with HTER over baseline metrics, including earlier versions of Meteor, and approaches the correlation level of a state-of-the-art metric, TER-plus (TERp).
Abstract: This paper presents Meteor-next, an extended version of the Meteor metric designed to have high correlation with post-editing measures of machine translation quality. We describe changes made to the metric's sentence aligner and scoring scheme as well as a method for tuning the metric's parameters to optimize correlation with human-targeted Translation Edit Rate (HTER). We then show that Meteor-next improves correlation with HTER over baseline metrics, including earlier versions of Meteor, and approaches the correlation level of a state-of-the-art metric, TER-plus (TERp).

Proceedings ArticleDOI
14 Mar 2010
TL;DR: The effectiveness of combining MFCC with phase information for speaker identification in noisy environments is described.
Abstract: In conventional speaker recognition methods based on MFCC, the phase information has been ignored. Recently, we proposed a speaker recognition method that integrates MFCC with the phase information. Using the phase information, the speaker identification error rate was reduced by 78% for clean speech. In this paper, we describe the effectiveness of phase information for speaker identification in noisy environments. Integrating MFCC with phase information, the speaker identification error rates were reduced by 20%∼70% in comparison with using only MFCC in noisy environments.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: A cover song detection system is described that does not require prior knowledge of the number of cover songs in a test set in order to identify cover(s) to a reference song, and that performs classification using either a support vector machine (SVM) or a multi-layer perceptron (MLP).
Abstract: Existing cover song detection systems require prior knowledge of the number of cover songs in a test set in order to identify cover(s) to a reference song. We describe a system that does not require such prior knowledge. The input to the system is a reference track and a test track, and the output is a binary classification of whether the inputs are either a reference and a cover or a reference and a non-cover. The system differs from state-of-the-art detectors by calculating multiple input features, performing a novel type of test song normalization to combat “impostor” tracks, and performing classification using either a support vector machine (SVM) or a multi-layer perceptron (MLP). On the covers80 test set, the system achieves an equal error rate of 10%, compared to 21.3% achieved by the 2007 LabROSA cover song detection system.

Journal ArticleDOI
TL;DR: This paper deals with distributed demodulation of space-time transmissions of a common message from a multi-antenna access point (AP) to a wireless sensor network, developing algorithms with distinct merits in terms of error performance and resilience to non-ideal inter-sensor links.
Abstract: This paper deals with distributed demodulation of space-time transmissions of a common message from a multi-antenna access point (AP) to a wireless sensor network. Based on local message exchanges with single-hop neighboring sensors, two algorithms are developed for distributed demodulation. In the first algorithm, sensors consent on the estimated symbols. By relaxing the finite-alphabet constraints on the symbols, the demodulation task is formulated as a distributed convex optimization problem that is solved iteratively using the method of multipliers. Distributed versions of the centralized zero-forcing (ZF) and minimum mean-square error (MMSE) demodulators follow as special cases. In the second algorithm, sensors iteratively reach consensus on the average (cross-) covariances of locally available per-sensor data vectors with the corresponding AP-to-sensor channel matrices, which constitute sufficient statistics for maximum likelihood demodulation. Distributed versions of the sphere decoding algorithm and the ZF/MMSE demodulators are also developed. These algorithms offer distinct merits in terms of error performance and resilience to non-ideal inter-sensor links. In both cases, the per-iteration error performance is analyzed, and the approximate number of iterations needed to attain a prescribed error rate are quantified. Simulated tests verify the analytical claims. Interestingly, only a few consensus iterations (roughly as many as the number of sensors), suffice for the distributed demodulators to approach the performance of their centralized counterparts.
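
The centralized MMSE demodulator that the first distributed algorithm specializes to is the standard linear estimator. A compact sketch (our notation, assuming H stacks the AP-to-sensor channel matrices and y the corresponding observations; the QPSK slicer is an illustrative choice):

```python
import numpy as np

def mmse_demodulate(H, y, noise_var):
    """Centralized linear MMSE estimate, which the distributed algorithm
    approaches via consensus iterations:
        x_hat = (H^H H + noise_var * I)^{-1} H^H y
    followed by a per-symbol hard decision (unit-energy QPSK here)."""
    x_hat = np.linalg.solve(H.conj().T @ H + noise_var * np.eye(H.shape[1]),
                            H.conj().T @ y)
    return (np.sign(x_hat.real) + 1j * np.sign(x_hat.imag)) / np.sqrt(2)
```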

Journal ArticleDOI
TL;DR: Two experiments aimed at selecting utterances from lists of responses indicate that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29–26% to 10–8%.
Abstract: Computer-Assisted Language Learning (CALL) applications for improving the oral skills of low-proficient learners have to cope with non-native speech that is particularly challenging. Since unconstrained non-native ASR is still problematic, a possible solution is to elicit constrained responses from the learners. In this paper, we describe experiments aimed at selecting utterances from lists of responses. The first experiment, on utterance selection, indicates that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29-26% to 10-8%. Since giving feedback on incorrectly recognized utterances is confusing, we verify the correctness of the utterance before providing feedback. The results of the second experiment, on utterance verification, indicate that combining duration-related features with a likelihood ratio (LR) yields an equal error rate (EER) of 10.3%, which is significantly better than the EER for the other measures in isolation.
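
Equal error rate, used to report the verification results, is worth pinning down: it is the operating point where the false-accept and false-reject rates coincide. A simple threshold-sweep estimate (our sketch, assuming higher scores indicate genuine utterances):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Threshold-sweep EER estimate. genuine/impostor are 1-D score arrays;
    higher scores are assumed to indicate genuine trials."""
    best_gap, eer = float("inf"), 1.0
    for t in np.sort(np.concatenate([genuine, impostor])):
        far = np.mean(impostor >= t)   # false accepts: impostors over threshold
        frr = np.mean(genuine < t)     # false rejects: genuine under threshold
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```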

Book ChapterDOI
17 Aug 2010
TL;DR: Stable-PUF-marking is proposed as an alternative to error correction to get reproducible (i.e. stable) outputs from physical unclonable functions (PUF).
Abstract: We propose a new technique called stable-PUF-marking as an alternative to error correction to get reproducible (i.e. stable) outputs from physical unclonable functions (PUF). The concept is based on the influence of mismatch on the stability of the PUF cells' output. To exploit this fact, cells providing a high mismatch between their crucial transistors are selected to substantially lower the error rate. To verify the concept, a statistical view of this approach is given. Furthermore, an SRAM-like PUF implementation is suggested that puts the approach into practice.
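
The selection idea can be illustrated with a toy sketch: read the power-up state repeatedly and keep the cells that almost never flip (these are the high-mismatch cells). This is an illustration of the concept only, not the authors' circuit-level marking procedure:

```python
import numpy as np

def select_stable_cells(powerup_reads, k):
    """Keep the k cells whose power-up value is most reproducible across
    repeated reads (high-mismatch cells flip rarely).
    powerup_reads: (n_reads, n_cells) array of 0/1 power-up states."""
    p_one = powerup_reads.mean(axis=0)    # fraction of reads giving '1'
    stability = np.abs(p_one - 0.5)       # 0.5 = coin flip; 0 or 1 = stable
    return np.argsort(-stability)[:k]     # indices of the k stablest cells
```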

Proceedings ArticleDOI
04 May 2010
TL;DR: Experimental results show that the accuracy of action recognition is more than 95% in a cross-validation test on the training data set, and that the error rate of PDR localization is reduced from 4% of the walking distance to 2% in the total scenario within the office environment by using the results of action recognition to adjust the estimated location.
Abstract: We present a method of estimating the location and orientation of a pedestrian while simultaneously recognizing his/her actions, using a single low-cost inertial measurement unit (IMU) mounted at the waist of the user. Some actions other than walking, such as standing up from or sitting down on a chair and bending over to slip through obstacles, are mostly observed at particular locations where the objects and building facilities that induce them are placed. Conversely, by knowing the current location and its attribute describing the actions possibly taken there, the action recognition process can be improved with this contextual information, since prior knowledge about the occurrence of actions is given as an attribute in the map. Additionally, when the posture (such as sitting, standing, and getting to one knee) of the pedestrian is known, falsely recognized actions can be rejected. Experimental results show that the accuracy of action recognition on six types of action (forward walking, backward walking, side stepping, sitting down on/standing up from a chair, going downstairs/upstairs, and bending over) is more than 95% in a cross-validation test on the training data set, and the results also show that the error rate of the PDR localization is reduced from 4% of the walking distance to 2% in the total scenario within the office environment by using the results of action recognition to adjust the estimated location.

Journal ArticleDOI
TL;DR: A novel string-to-dependency algorithm for statistical machine translation is proposed that employs a target dependency language model during decoding to exploit long-distance word relations, which cannot be modeled with a traditional n-gram language model.
Abstract: We propose a novel string-to-dependency algorithm for statistical machine translation. This algorithm employs a target dependency language model during decoding to exploit long distance word relations, which cannot be modeled with a traditional n-gram language model. Experiments show that the algorithm achieves significant improvement in MT performance over a state-of-the-art hierarchical string-to-string system on NIST MT06 and MT08 newswire evaluation sets.

Proceedings Article
11 Jul 2010
TL;DR: The experimental results show that linguistic features alone outperform word posterior probability based confidence estimation in error detection, and that linguistic features can further provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate and improve the F measure.
Abstract: Automatic error detection is desired in the post-processing to improve machine translation quality. The previous work is largely based on confidence estimation using system-based features, such as word posterior probabilities calculated from N-best lists or word lattices. We propose to incorporate two groups of linguistic features, which convey information from outside machine translation systems, into error detection: lexical and syntactic features. We use a maximum entropy classifier to predict translation errors by integrating word posterior probability feature and linguistic features. The experimental results show that 1) linguistic features alone outperform word posterior probability based confidence estimation in error detection; and 2) linguistic features can further provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate by 18.52% and improve the F measure by 16.37%.

Journal ArticleDOI
TL;DR: The Monte Carlo-based simulation results show that the proposed approach significantly improves target detection performance, and can also be used to guide the actual threshold selection in practical sensor network implementation under certain error rate constraints.
Abstract: We propose a binary decision fusion rule that reaches a global decision on the presence of a target by integrating local decisions made by multiple sensors. Without requiring a priori probability of target presence, the fusion threshold bounds derived using Chebyshev's inequality ensure a higher hit rate and lower false alarm rate compared to the weighted averages of individual sensors. The Monte Carlo-based simulation results show that the proposed approach significantly improves target detection performance, and can also be used to guide the actual threshold selection in practical sensor network implementation under certain error rate constraints.
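
The fusion rule itself is a counting rule, sketched below with a toy Monte Carlo check of the fused hit and false alarm rates; the per-sensor probabilities and the threshold are made-up illustration values, and the paper's actual contribution, the Chebyshev-based bounds on the threshold, is not reproduced here:

```python
import numpy as np

def fuse(local_decisions, T):
    """Counting-rule fusion: declare 'target present' when at least T of the
    local binary sensor decisions are positive. The paper bounds T via
    Chebyshev's inequality; here T is simply a parameter."""
    return int(np.sum(local_decisions) >= T)

# Toy Monte Carlo check (per-sensor probabilities and T are made up):
rng = np.random.default_rng(0)
n, T, trials = 10, 4, 20_000
p_hit, p_fa = 0.7, 0.1                   # per-sensor hit / false alarm prob.
hit = np.mean([fuse(rng.random(n) < p_hit, T) for _ in range(trials)])
fa = np.mean([fuse(rng.random(n) < p_fa, T) for _ in range(trials)])
print(f"fused hit rate ~ {hit:.3f}, fused false alarm rate ~ {fa:.3f}")
```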

Journal ArticleDOI
TL;DR: Applications of the results are discussed, which include optimum power allocation in spatial multiplexing systems, optimum power/time sharing to decrease or increase (jamming problem) error rate, an implication for fading channels, and optimization of a unitary-precoded OFDM system.
Abstract: Motivated by a recent surge of interest in convex optimization techniques, convexity/concavity properties of error rates of the maximum likelihood detector operating in the AWGN channel are studied and extended to frequency-flat slow-fading channels. Generic conditions are identified under which the symbol error rate (SER) is convex/concave for arbitrary multidimensional constellations. In particular, the SER is convex in SNR for any one- and two-dimensional constellation, and also in higher dimensions at high SNR. Pairwise error probability and bit error rate are shown to be convex at high SNR, for arbitrary constellations and bit mapping. Universal bounds for the SER first and second derivatives are obtained, which hold for arbitrary constellations and are tight for some of them. Applications of the results are discussed, which include optimum power allocation in spatial multiplexing systems, optimum power/time sharing to decrease or increase (jamming problem) error rate, an implication for fading channels ("fading is never good in low dimensions"), and optimization of a unitary-precoded OFDM system. For example, the error rate bounds of a unitary-precoded OFDM system with QPSK modulation, which reveal the best and worst precoding, are extended to arbitrary constellations, which may also include coding. The reported results also apply to the interference channel under Gaussian approximation, to the bit error rate when it can be expressed or approximated as a nonnegative linear combination of individual symbol error rates, and to coded systems.
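
As a concrete one-dimensional instance of the convexity claim (a standard textbook example, not reproduced from the paper): for BPSK in AWGN the SER at SNR gamma is Q(sqrt(2*gamma)), and both derivatives can be checked directly:

```latex
% BPSK over AWGN: symbol error rate as a function of SNR \gamma
P_e(\gamma) = Q\!\left(\sqrt{2\gamma}\right), \qquad
Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\,dt,
\qquad\text{so}\qquad
\frac{dP_e}{d\gamma} = -\frac{e^{-\gamma}}{2\sqrt{\pi\gamma}} < 0,
\qquad
\frac{d^2 P_e}{d\gamma^2} = \frac{(2\gamma + 1)\,e^{-\gamma}}{4\sqrt{\pi}\,\gamma^{3/2}} > 0.
```

That is, the BPSK error rate is strictly decreasing and convex in SNR for all gamma > 0, consistent with the paper's statement for one-dimensional constellations.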

Journal ArticleDOI
TL;DR: The exact bit-error rate for binary phase-shift keying and outage probability are developed for equal gain diversity and closed-form expressions of diversity order and coding gain are provided with both diversity receptions.
Abstract: Exact error rate performances are studied for coherent free-space optical communication systems under strong turbulence with diversity reception. Equal gain and selection diversity are considered as practical schemes to mitigate turbulence. The exact bit-error rate for binary phase-shift keying and outage probability are developed for equal gain diversity. Analytical expressions are obtained for the bit-error rate of differential phase-shift keying and asynchronous frequency-shift keying, as well as for outage probability using selection diversity. Furthermore, we provide the closed-form expressions of diversity order and coding gain with both diversity receptions. The analytical results are verified by computer simulations and are suitable for rapid error rates calculation.

Proceedings ArticleDOI
12 Apr 2010
TL;DR: A novel online support vector machine algorithm, compatible with accurate multidimensional link quality metrics, that is able to optimize AMC to the unique (potentially dynamic) hardware characteristics of each wireless device in selective channels is proposed.
Abstract: Optimizing the performance of adaptive modulation and coding (AMC) in practice has proven challenging. Prior research has struggled to find link quality metrics that are suitable for look-up-tables and simultaneously provide an injective mapping to error rate in wireless links that feature selective channels with hardware nonlinearities and non-Gaussian noise effects. This paper proposes a novel online support vector machine algorithm, compatible with accurate multidimensional link quality metrics, that is able to optimize AMC to the unique (potentially dynamic) hardware characteristics of each wireless device in selective channels. IEEE 802.11n simulations show that our proposed algorithm allows each individual wireless device to optimize the operating point in the rate/reliability tradeoff through frame-by-frame error evaluation. These simulations also show that our algorithm displays identical performance to alternative online AMC algorithms while drastically reducing complexity.

Journal ArticleDOI
TL;DR: Three modifications to the MT training data are presented to improve the accuracy of a state-of-the-art syntax MT system: re-structuring changes the syntactic structure of training parse trees to enable reuse of substructures; re-labeling alters bracket labels to enrich rule application context; and re-aligning unifies word alignment across sentences to remove bad word alignments and refine good ones.
Abstract: This article shows that the structure of bilingual material from standard parsing and alignment tools is not optimal for training syntax-based statistical machine translation (SMT) systems. We present three modifications to the MT training data to improve the accuracy of a state-of-the-art syntax MT system: re-structuring changes the syntactic structure of training parse trees to enable reuse of substructures; re-labeling alters bracket labels to enrich rule application context; and re-aligning unifies word alignment across sentences to remove bad word alignments and refine good ones. Better structures, labels, and word alignments are learned by the EM algorithm. We show that each individual technique leads to improvement as measured by BLEU, and we also show that the greatest improvement is achieved by combining them. We report an overall 1.48 BLEU improvement on the NIST08 evaluation set over a strong baseline in Chinese/English translation.