Moving beyond the 'beads-on-a-string' model of speech
Citations
2,817 citations
Cites background from "Moving beyond the 'beads-on-a-strin..."
...Likewise, it has also been well understood for a long time that the use of phonetic or its finer state sequences, even with contextual dependency, in engineering speech recognition systems, is inadequate in representing such rich structure [86, 273, 355], and thus leaving a promising open direction to improve the speech recognition systems’ performance....
[...]
244 citations
207 citations
Cites background from "Moving beyond the 'beads-on-a-strin..."
...The standard approach to acoustic modeling continues to be the “beads on a string” model (Ostendorf, 1999) in which the speech signal is represented as a concatenation of phones....
[...]
126 citations
Cites methods from "Moving beyond the 'beads-on-a-strin..."
...In the field of ASR, AFs are often put forward as a more flexible alternative (Kirchhoff, 1999; Wester, 2003; Wester et al., 2001) to modelling the variation in speech using the standard ‘beads-on-a-string’ paradigm (Ostendorf, 1999), in which the acoustic signal is described in terms of (linear sequences of) phones, and words as phone sequences....
[...]
109 citations
Cites background from "Moving beyond the 'beads-on-a-strin..."
...the “beads-on-a-string” paradigm [1], makes it extremely difficult to model the variation that is present in spontaneous, conversational speech....
[...]
References
1,043 citations
"Moving beyond the 'beads-on-a-strin..." refers background in this paper
...First, it has been observed that certain sets of features tend to spread or modify together in groups that can be characterized by a hierarchical organization [34]....
[...]
781 citations
"Moving beyond the 'beads-on-a-strin..." refers background in this paper
...[20], which can learn both contextual and temporal structure (i....
[...]
680 citations
"Moving beyond the 'beads-on-a-strin..." refers background in this paper
...Improved acoustic models may require additional layers of hidden states at different time scales, mixed memory Markov models [41], a mixed continuous and discrete hidden state [42], a discrete event model [43], and/or other alternatives....
[...]
551 citations
"Moving beyond the 'beads-on-a-strin..." refers background in this paper
...However, such phenomena may be more directly described in terms of prosodic structure [38], i....
[...]
373 citations
"Moving beyond the 'beads-on-a-strin..." refers background in this paper
...Such short segments are quite frequent, as evidenced by distributional data in hand-labeled phonetic transcriptions [7] and by the high percentage of phones mapped to the minimum allowed duration in a forced alignment using a single-pronunciation dictionary (observed in several studies)....
[...]
...syllable onsets are most often preserved and codas are most often deleted [7]....
[...]