Author

Steve Young

Other affiliations: University of Cambridge
Bio: Steve Young is an academic researcher from Apple Inc. The author has contributed to research in topics: Markov decision process & Liquid-crystal display. The author has an h-index of 4 and has co-authored 7 publications receiving 264 citations. Previous affiliations of Steve Young include University of Cambridge.

Papers
Journal Article
TL;DR: This paper proposes Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. The method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones.
Abstract: We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. Attract-Repel facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialized cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that Attract-Repel-specialized vectors boost performance in the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.
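
A heavily simplified sketch of the attract/repel idea: synonym pairs are pulled together, antonym pairs pushed apart, and a regularization term keeps vectors near their original positions. This is not the paper's exact objective (which uses margin losses against negative examples sampled within each mini-batch); the margins, learning rate, and update rule below are illustrative only.

```python
import numpy as np

def specialize(vectors, attract_pairs, repel_pairs,
               attract_margin=0.6, repel_margin=0.0,
               reg=0.05, lr=0.1, epochs=5):
    """Simplified attract/repel specialization of word vectors."""
    vecs = {w: v / np.linalg.norm(v) for w, v in vectors.items()}
    original = {w: v.copy() for w, v in vecs.items()}
    for _ in range(epochs):
        for a, b in attract_pairs:                  # synonyms: pull together
            va, vb = vecs[a].copy(), vecs[b].copy()
            if va @ vb < attract_margin:
                vecs[a] += lr * (vb - va)
                vecs[b] += lr * (va - vb)
        for a, b in repel_pairs:                    # antonyms: push apart
            va, vb = vecs[a].copy(), vecs[b].copy()
            if va @ vb > repel_margin:
                vecs[a] -= lr * vb
                vecs[b] -= lr * va
        for w in vecs:                              # stay close to the original space
            vecs[w] += reg * (original[w] - vecs[w])
            vecs[w] /= np.linalg.norm(vecs[w])
    return vecs
```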

177 citations

Patent
Steve Young, Nigel Foster
03 Jun 1988
TL;DR: In this patent, an error detection system for liquid crystal display (LCD) devices is presented, consisting of a photo-scanning device coupled to a computer and an LCD display driver.
Abstract: The present invention is an error detection system for liquid crystal display (LCD) devices. The system comprises a photo-scanning device coupled to a computer and an LCD display driver. Faulty LCD cells are detected by using the photo-scanner to produce two images. The first image is produced by displaying a predetermined pattern on the LCD device; this pattern is inverted to produce the second image. By analyzing these two images, the locations of the faulty LCD cells are determined.
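
A minimal NumPy sketch of the two-image comparison the abstract describes: a working cell changes brightness between the pattern image and its inverse, while a stuck cell does not. The image shapes, contrast threshold, and example below are illustrative, not taken from the patent.

```python
import numpy as np

def find_faulty_cells(pattern_image, inverted_image, min_contrast=0.5):
    """Return (row, col) indices of LCD cells whose brightness barely changes
    between the test-pattern image and the inverted-pattern image."""
    pattern = np.asarray(pattern_image, dtype=float)
    inverted = np.asarray(inverted_image, dtype=float)
    contrast = np.abs(pattern - inverted)            # per-cell change between the two scans
    return [(int(r), int(c)) for r, c in np.argwhere(contrast < min_contrast)]

# Illustrative 4x4 display with one stuck cell at (1, 2).
pattern = np.array([[1, 0, 1, 0]] * 4, dtype=float)
inverted = 1.0 - pattern
inverted[1, 2] = pattern[1, 2]                       # this cell fails to invert
print(find_faulty_cells(pattern, inverted))          # -> [(1, 2)]
```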

69 citations

01 Jan 2013
TL;DR: Partially observable Markov decision processes (POMDPs), as discussed by the authors, provide a framework that reduces the cost of laboriously handcrafting complex dialog managers and provides robustness against the errors created by speech recognizers, with the dialog policy optimized via a reward-driven process.
Abstract: Statistical dialog systems (SDSs) are motivated by the need for a data-driven framework that reduces the cost of laboriously handcrafting complex dialog managers and that provides robustness against the errors created by speech recognizers operating in noisy environments. By including an explicit Bayesian model of uncertainty and by optimizing the policy via a reward-driven process, partially observable Markov decision processes (POMDPs) provide such a framework. However, exact model representation and optimization is computationally intractable. Hence, the practical application of POMDP-based systems requires efficient algorithms and carefully constructed approximations. This review article provides an overview of the current state of the art in the development of POMDP-based spoken dialog systems.
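
At the heart of a POMDP-based dialog manager is the Bayesian belief update over hidden dialog states, b'(s') ∝ O(o | s', a) Σ_s T(s' | s, a) b(s). The sketch below shows just that update on a toy two-state example; the transition and observation matrices are made up for illustration and do not come from any particular system.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One POMDP belief update: predict with the transition model for the
    chosen action, weight by the observation likelihood, renormalize."""
    predicted = belief @ T[action]                    # sum_s T(s'|s,a) b(s)
    updated = predicted * O[action][:, observation]   # * O(o|s',a)
    return updated / updated.sum()

# Toy example: two hidden user goals, one system action, two noisy ASR observations.
T = [np.array([[0.9, 0.1],
               [0.1, 0.9]])]
O = [np.array([[0.8, 0.2],
               [0.3, 0.7]])]
b = np.array([0.5, 0.5])
print(belief_update(b, action=0, observation=0))      # belief mass shifts toward state 0
```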

31 citations

Dataset
06 Oct 2015
TL;DR: This dataset is in JSON format and contains log files of interactions between a turn-taking spoken dialogue system and Amazon Mechanical Turk workers, collected from previous live trials.
Abstract: This dataset is in JSON format and contains log files of interactions between a turn-taking spoken dialogue system and Amazon Mechanical Turk workers, collected from our previous live trials. It covers two application domains, San Francisco restaurants and hotels, each with around 1,000 logs. The user responses are the 1-best ASR hypotheses recognised by our ASR system, and the system responses were collected by running another round of data collection on AMT. Around 5.1K system responses were collected in total for each domain. All users are anonymous.
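
A small sketch for inspecting such a directory of JSON logs. The directory name and the `turns` field used below are hypothetical placeholders, not the dataset's documented schema; check the keys in the released files before relying on them.

```python
import json
from pathlib import Path

def summarize_logs(log_dir):
    """Count dialogues and turns across a directory of JSON log files.
    The 'turns' key is a hypothetical placeholder for the real field name."""
    n_dialogues, n_turns = 0, 0
    for path in Path(log_dir).glob("*.json"):
        with open(path, encoding="utf-8") as f:
            log = json.load(f)
        n_dialogues += 1
        n_turns += len(log.get("turns", []))     # hypothetical key
    return n_dialogues, n_turns

dialogues, turns = summarize_logs("sf_restaurant_logs")   # hypothetical directory
print(f"{dialogues} dialogues, {turns} turns")
```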

13 citations

Dataset
01 Dec 2012
TL;DR: This package contains the data needed to reproduce the results in the paper Discriminative Spoken Language Understanding Using Word Confusion Networks (Henderson et al.), or to train and test new semantic decoders.
Abstract: This package contains the necessary data to reproduce the results in the paper, Discriminative Spoken Language Understanding Using Word Confusion Networks (Henderson et al.), or to train and test new semantic decoders.

3 citations


Cited by
Posted Content
TL;DR: The Multi-Domain Wizard-of-Oz dataset (MultiWOZ) as discussed by the authors is a fully-labeled collection of human-human written conversations spanning multiple domains and topics.
Abstract: Even though machine learning has become the major scene in the dialogue research community, the real breakthrough has been blocked by the scale of data available. To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning multiple domains and topics. At a size of 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora. The contribution of this work, apart from the open-sourced dataset labelled with dialogue belief states and dialogue actions, is two-fold: firstly, a detailed description of the data collection procedure along with a summary of data structure and analysis is provided. The proposed data-collection pipeline is entirely based on crowd-sourcing without the need to hire professional annotators; secondly, a set of benchmark results of belief tracking, dialogue act and response generation is reported, which shows the usability of the data and sets a baseline for future studies.

623 citations

Journal Article
TL;DR: The LSTM-seq2seq model shows sufficient predictive power and could be used to improve forecast accuracy in short-term flood forecasting applications; the seq2seq approach was demonstrated to be effective for time series prediction in hydrology.

262 citations

Proceedings Article
01 Jun 2019
TL;DR: This paper presents a new data set of 57k annotated utterances in English, Spanish, and Thai and uses it to evaluate three different cross-lingual transfer methods, finding that given several hundred training examples in the target language, the latter two methods outperform translating the training data.
Abstract: One of the first steps in the utterance interpretation pipeline of many task-oriented conversational AI systems is to identify user intents and the corresponding slots. Since data collection for machine learning models for this task is time-consuming, it is desirable to make use of existing data in a high-resource language to train models in low-resource languages. However, development of such models has largely been hindered by the lack of multilingual training data. In this paper, we present a new data set of 57k annotated utterances in English (43k), Spanish (8.6k) and Thai (5k) across the domains weather, alarm, and reminder. We use this data set to evaluate three different cross-lingual transfer methods: (1) translating the training data, (2) using cross-lingual pre-trained embeddings, and (3) a novel method of using a multilingual machine translation encoder as contextual word representations. We find that given several hundred training examples in the target language, the latter two methods outperform translating the training data. Further, in very low-resource settings, multilingual contextual word representations give better results than using cross-lingual static embeddings. We also compare the cross-lingual methods to using monolingual resources in the form of contextual ELMo representations and find that given just small amounts of target language data, this method outperforms all cross-lingual methods, which highlights the need for more sophisticated cross-lingual methods.

238 citations

Posted Content
TL;DR: This work presents BAE, a powerful black box attack for generating grammatically correct and semantically coherent adversarial examples, and shows that BAE performs a stronger attack on three widely used models for seven text classification datasets.
Abstract: Modern text classification models are susceptible to adversarial examples, perturbed versions of the original text indiscernible by humans which get misclassified by the model. Recent works in NLP use rule-based synonym replacement strategies to generate adversarial examples. These strategies can lead to out-of-context and unnaturally complex token replacements, which are easily identifiable by humans. We present BAE, a black box attack for generating adversarial examples using contextual perturbations from a BERT masked language model. BAE replaces and inserts tokens in the original text by masking a portion of the text and leveraging the BERT-MLM to generate alternatives for the masked tokens. Through automatic and human evaluations, we show that BAE performs a stronger attack, in addition to generating adversarial examples with improved grammaticality and semantic coherence as compared to prior work.
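
The replacement step the abstract describes can be approximated with an off-the-shelf masked language model: mask one token position and take the masked LM's top proposals as candidate perturbations. The sketch below uses Hugging Face's fill-mask pipeline for that single step only; it is not the authors' released code and omits the rest of the attack (importance ranking of positions, semantic-similarity filtering, insertion as well as replacement, and querying the victim classifier).

```python
from transformers import pipeline

# BERT masked LM used to propose contextual substitutes for one token.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

def propose_replacements(tokens, position, top_k=5):
    """Mask tokens[position] and return the masked LM's candidate substitutes,
    excluding the original token so every proposal is an actual perturbation."""
    masked = list(tokens)
    masked[position] = unmasker.tokenizer.mask_token
    candidates = unmasker(" ".join(masked), top_k=top_k)
    return [c["token_str"] for c in candidates
            if c["token_str"].lower() != tokens[position].lower()]

print(propose_replacements("the movie was absolutely wonderful".split(), 4))
```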

224 citations

Proceedings Article
01 May 2020
TL;DR: It is demonstrated that the inexpensive few-shot transfer (i.e., additional fine-tuning on a few target-language instances) is surprisingly effective across the board, warranting more research efforts reaching beyond the limiting zero-shot conditions.
Abstract: Massively multilingual transformers (MMTs) pretrained via language modeling (e.g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance. Current evaluations, however, verify their efficacy in transfers (a) to languages with sufficiently large pretraining corpora, and (b) between close languages. In this work, we analyze the limitations of downstream language transfer with MMTs, showing that, much like cross-lingual word embeddings, they are substantially less effective in resource-lean scenarios and for distant languages. Our experiments, encompassing three lower-level tasks (POS tagging, dependency parsing, NER) and two high-level tasks (NLI, QA), empirically correlate transfer performance with linguistic proximity between source and target languages, but also with the size of target language corpora used in MMT pretraining. Most importantly, we demonstrate that the inexpensive few-shot transfer (i.e., additional fine-tuning on a few target-language instances) is surprisingly effective across the board, warranting more research efforts reaching beyond the limiting zero-shot conditions.
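
A minimal sketch of the few-shot transfer recipe the paper argues for: take a massively multilingual encoder (already fine-tuned on source-language data in the real setting) and run a few additional gradient steps on a handful of target-language examples. The model name, example sentences, labels, and hyperparameters below are illustrative only.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-multilingual-cased"      # stand-in for a source-tuned MMT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# A handful of (hypothetical) target-language examples with task labels.
texts = ["Das Hotel liegt im Stadtzentrum.", "Der Zug ist pünktlich angekommen."]
labels = torch.tensor([0, 1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                               # a few extra steps, not full training
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```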

218 citations