Normalizing SMS: are Two Metaphors Better than One ?
Citations
1,351 citations
Cites background from "Normalizing SMS: are Two Metaphors ..."
...Like SMS (Kobus et al., 2008), tweets are particularly terse and difficult (See Table 1)....
187 citations
Cites methods from "Normalizing SMS: are Two Metaphors ..."
...We aim for a robust text normalization system with “broad coverage”, i.e., for any user-created nonstandard token, the system should be able to restore the correct word within its top n candidates (n = 1, 3, 10...)....
...(Kobus et al., 2008) showed that using a statistical MT system in combination with an analogy of the ASR system improved performance in French SMS normalization....
168 citations
164 citations
Cites background from "Normalizing SMS: are Two Metaphors ..."
...use an idiosyncratic language subset with abbreviations, phonetic contractions, bad punctuation, emoticons, etc., which is different to the more traditional written language more typically used in emails (Kobus et al., 2008; Ling, 2005)....
References
21,126 citations
6,008 citations
"Normalizing SMS: are Two Metaphors ..." refers methods in this paper
...Preliminary experiments suggest that using n-best list outputs from Moses instead of just the one best could buy us a small additional WER decrease....
...Giza++ (Och and Ney, 2003) is used to induce, based on statistical principles (Brown et al., 1990), an automatic word alignment of SMS tokens with their normalized counterparts; Moses (Koehn et al., 2007) is used to learn the various parameters of the phrase-based model, to optimize the weight combination and to perform the translation using a multi-stack search algorithm; the SRI language model toolkit (Stolcke, 2002) is finally used to estimate statistical language models....
4,904 citations
4,402 citations
1,860 citations
"Normalizing SMS: are Two Metaphors ..." refers methods in this paper
...Giza++ (Och and Ney, 2003) is used to induce, based on statistical principles (Brown et al., 1990), an automatic word alignment of SMS tokens with their normalized counterparts; Moses (Koehn et al., 2007) is used to learn the various parameters of the phrase-based model, to optimize the weight…