Open Access
Using Pause Information for More Accurate Entity Recognition
Sahas Dendukuri, Pooja Chitkara, Joel Ruben Antony Moniz, Xiao Yang, Manos Tsagkias, Stephen Pulman, et al.
pp. 243–250
TL;DR
This article shows that a linguistic observation on pauses can be used to improve accuracy in machine-learnt language understanding tasks, applying pause duration to enrich contextual embeddings and improve shallow parsing of entities.

Abstract
Entity tags in human-machine dialog are integral to natural language understanding (NLU) tasks in conversational assistants. However, current systems struggle to parse spoken queries accurately when, as is typical, only text input is used, and they often fail to understand the user's intent. Previous work in linguistics has identified a cross-language tendency for longer speech pauses around nouns than around verbs. We demonstrate that this linguistic observation on pauses can be used to improve accuracy in machine-learnt language understanding tasks. Analysis of pauses in French and English utterances from a commercial voice assistant shows a statistically significant difference in pause duration at multi-token entity span boundaries compared to within entity spans. Additionally, in contrast to text-based NLU, we apply pause duration to enrich contextual embeddings and improve shallow parsing of entities. Results show that our proposed embeddings reduce the relative error rate by up to 8% consistently across three domains for French, without any added annotation or alignment costs to the parser.
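The core mechanism described above (enriching each token's contextual embedding with its pause duration before a shallow entity parser) can be illustrated with a short sketch. The PyTorch snippet below is a minimal, hypothetical rendering: the class name PauseEnrichedTagger, the concatenation-based fusion, and all dimensions are assumptions rather than the paper's published architecture.

```python
import torch
import torch.nn as nn

class PauseEnrichedTagger(nn.Module):
    """Sketch: append a scalar pause-duration feature to each token's
    contextual embedding before a shallow (linear) entity tagger.
    Names, dimensions, and the fusion-by-concatenation choice are
    illustrative assumptions, not the paper's exact architecture."""

    def __init__(self, embed_dim=768, num_tags=9):
        super().__init__()
        # +1 input feature for the pause duration appended to each token
        self.classifier = nn.Linear(embed_dim + 1, num_tags)

    def forward(self, token_embeddings, pause_durations):
        # token_embeddings: (batch, seq_len, embed_dim), e.g. from BERT
        # pause_durations:  (batch, seq_len), seconds of silence after each token
        fused = torch.cat([token_embeddings, pause_durations.unsqueeze(-1)], dim=-1)
        return self.classifier(fused)  # (batch, seq_len, num_tags) tag logits

# Toy usage with random embeddings standing in for a pre-trained encoder.
model = PauseEnrichedTagger()
emb = torch.randn(2, 5, 768)
pauses = torch.rand(2, 5)      # e.g. obtained from an ASR system's timestamps
logits = model(emb, pauses)
print(logits.shape)            # torch.Size([2, 5, 9])
```

Concatenation is the simplest possible fusion strategy; the paper may combine the signals differently, but the sketch shows where an acoustic feature can enter an otherwise text-only tagger.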
References
Proceedings Article
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can then be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
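As a concrete illustration of the "one additional output layer" fine-tuning pattern this TL;DR describes, the sketch below uses the Hugging Face transformers library (an assumption made for illustration; the original BERT release shipped its own TensorFlow code). The checkpoint name and label count are arbitrary choices.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Checkpoint and num_labels are illustrative; AutoModelForTokenClassification
# places one randomly initialized linear output layer on top of pre-trained BERT.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)

inputs = tokenizer("Play jazz in the kitchen", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, num_wordpieces, num_labels)
print(logits.shape)
```

Fine-tuning would then train this output head (and, typically, the encoder underneath) on task-labeled data.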
Proceedings Article
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
TL;DR: Introduces the CoNLL-2003 shared task on language-independent named entity recognition (NER), describing the data sets and evaluation method and giving a general overview of the participating systems and their performance.
Proceedings Article
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
TL;DR: Presents two parameter-reduction techniques that lower memory consumption and increase the training speed of BERT, together with a self-supervised loss that focuses on modeling inter-sentence coherence.
Book Chapter
Text Chunking Using Transformation-Based Learning
Lance Ramshaw, Mitchell Marcus, et al.
TL;DR: Shows that the transformation-based learning approach can be applied at a higher level of textual interpretation to locate chunks in tagged text, including non-recursive "baseNP" chunks.
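The non-recursive "baseNP" chunks mentioned here are conventionally encoded with the inside/outside/begin (IOB) tags that this work popularized. The toy Python snippet below, with made-up tokens and tags, shows how such labels delimit chunk spans; the helper extract_chunks is hypothetical.

```python
# IOB encoding: B- begins a chunk, I- continues it, O is outside any chunk.
tokens = ["The", "quick", "fox", "jumped", "over", "the", "lazy", "dog"]
tags   = ["B-NP", "I-NP", "I-NP", "O", "O", "B-NP", "I-NP", "I-NP"]

def extract_chunks(tokens, tags):
    """Collect contiguous B-/I- spans into (label, phrase) chunks."""
    chunks, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                chunks.append(current)
            current = (tag[2:], [tok])       # start a new chunk
        elif tag.startswith("I-") and current:
            current[1].append(tok)           # extend the open chunk
        else:
            if current:
                chunks.append(current)       # O tag closes any open chunk
            current = None
    if current:
        chunks.append(current)
    return [(label, " ".join(words)) for label, words in chunks]

print(extract_chunks(tokens, tags))
# [('NP', 'The quick fox'), ('NP', 'the lazy dog')]
```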
Proceedings Article
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
TL;DR: Proposes Mockingjay, a new speech representation learning approach in which bidirectional Transformer encoders are pre-trained on a large amount of unlabeled speech.