Open Access
LaBB-CAT: an Annotation Store
Robert Fromont, Jennifer Hay
- pp 113-117
Abstract:
“ONZE Miner”, an open-source tool for storing and automatically annotating Transcriber transcripts, has been redeveloped to use “annotation graphs” as its data model. The annotation graph framework provides the new software, “LaBB-CAT”, greater flexibility for automatic and manual annotation of corpus data at various independent levels of granularity, and allows more sophisticated annotation structures, opening up new possibilities for corpus mining and conversion between tool formats.
Citations
Proceedings Article
Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi.
TL;DR: The Montreal Forced Aligner (MFA) is an update to the Prosodylab-Aligner, and maintains its key functionality of trainability on new data, as well as incorporating improved architecture (triphone acoustic models and speaker adaptation), and other features.
Journal Article
Tracking word frequency effects through 130 years of sound change.
TL;DR: The changing pronunciation of New Zealand English in a large set of recordings of speakers born over a 130-year period is analyzed, and it is shown that low-frequency words were at the forefront of these changes while higher-frequency words lagged behind.
Journal Article
Emu-sdms
TL;DR: The next iteration of the EMU system is introduced; although based on the core concepts of the legacy system, it is a newly designed and almost entirely rewritten set of modern spoken language database management tools.
Journal Article
The private life of stops: VOT in a real-time corpus of spontaneous Glaswegian
TL;DR: The authors used a semi-automated procedure for analyzing positive voice onset time (VOT) in spontaneous speech, and applied it to stressed syllable-initial stops from a real- and apparent-time corpus of naturally occurring spontaneous Glaswegian vernacular speech.
Journal Article
Gemination and degemination in English prefixation: Phonetic evidence for morphological organization
Sonia Ben Hedia, Ingo Plag
TL;DR: The authors investigated the gemination behavior of the English prefixes un-, negative in- and locative in-, and found that the more segmentable the prefix, the longer the nasal duration.
References
Proceedings Article
Accurate Unlexicalized Parsing
Dan Klein, Christopher D. Manning
TL;DR: It is demonstrated that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar.
Posted Content
A Formal Framework for Linguistic Annotation
Steven Bird, Mark Liberman
TL;DR: The authors survey a wide variety of existing annotation formats and demonstrate a common conceptual core, the annotation graph, which provides a formal framework for constructing, maintaining and searching linguistic annotations, while remaining consistent with many alternative data structures and file formats.
Proceedings Article
Annotation by category - ELAN and ISO DCR
Han Sloetjes, Peter Wittenburg
TL;DR: The first steps that have been taken to provide users of the multimedia annotation tool ELAN with the means to create references from tiers and annotations to data categories defined in the ISO Data Category Registry are described.