
LaBB-CAT: an Annotation Store

Robert Fromont, +1 more
- pp 113-117
Abstract
“ONZE Miner”, an open-source tool for storing and automatically annotating Transcriber transcripts, has been redeveloped to use “annotation graphs” as its data model. The annotation graph framework provides the new software, “LaBB-CAT”, greater flexibility for automatic and manual annotation of corpus data at various independent levels of granularity, and allows more sophisticated annotation structures, opening up new possibilities for corpus mining and conversion between tool formats.
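
As a rough illustration of the data model the abstract refers to, the sketch below models an annotation graph as anchors (nodes with optional time offsets) joined by labelled annotations (arcs), each belonging to a named layer. This is a minimal sketch under stated assumptions, not LaBB-CAT's actual schema or API: the class names, the layer names, and the sample offsets are all hypothetical and chosen only to show how layers of different granularity can share anchors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Anchor:
    """A graph node; its time offset (in seconds) may be unknown."""
    id: str
    offset: Optional[float] = None


@dataclass(frozen=True)
class Annotation:
    """A labelled arc between two anchors, belonging to one layer."""
    layer: str
    label: str
    start: Anchor
    end: Anchor


@dataclass
class AnnotationGraph:
    """Hypothetical container: a flat list of annotations over shared anchors."""
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, layer: str, label: str, start: Anchor, end: Anchor) -> Annotation:
        ann = Annotation(layer, label, start, end)
        self.annotations.append(ann)
        return ann

    def layer(self, name: str) -> List[Annotation]:
        """All annotations on one layer, in insertion order."""
        return [a for a in self.annotations if a.layer == name]

    def within(self, outer: Annotation, layer: str) -> List[Annotation]:
        """Annotations on `layer` whose time span lies inside `outer`.

        Assumes `outer` has known offsets; inner annotations with
        unknown offsets are skipped.
        """
        return [
            a for a in self.layer(layer)
            if a.start.offset is not None and a.end.offset is not None
            and outer.start.offset <= a.start.offset
            and a.end.offset <= outer.end.offset
        ]


# Hypothetical data: one utterance, two words, and segments for the first word.
a0, a1, a2, a3, a4 = (
    Anchor(f"a{i}", t) for i, t in enumerate([0.0, 0.12, 0.31, 0.45, 0.90])
)
g = AnnotationGraph()
utt = g.add("utterance", "hello world", a0, a4)
hello = g.add("word", "hello", a0, a3)
world = g.add("word", "world", a3, a4)
g.add("segment", "h", a0, a1)
g.add("segment", "e", a1, a2)
g.add("segment", "l", a2, a3)

print([a.label for a in g.within(hello, "segment")])
print([a.label for a in g.within(utt, "word")])
```

Because each layer is just a set of arcs over the same anchors, a new layer (for example, automatically generated tags) can be added without touching the existing ones, which is the kind of independence between levels of granularity the abstract describes.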
