Kurt Shuster
Researcher at Facebook
Publications - 39
Citations - 3894
Kurt Shuster is an academic researcher at Facebook. His research focuses on computer science and conversational AI. He has an h-index of 16 and has co-authored 28 publications receiving 1,561 citations.
Papers
Journal Article
OPT: Open Pre-trained Transformer Language Models
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Zidan Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer +18 more
TL;DR: This work presents Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which the authors aim to fully and responsibly share with interested researchers.
Proceedings Article
Wizard of Wikipedia: Knowledge-Powered Conversational Agents
TL;DR: The best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while a new benchmark allows for measuring further improvements in this important research direction.
Posted Content
Recipes for building an open-domain chatbot
Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric Michael Smith, Y-Lan Boureau, Jason Weston +11 more
TL;DR: Human evaluations show that the best models outperform existing approaches in multi-turn dialogue on engagingness and humanness measurements; the limitations of this work are discussed by analyzing failure cases of the models.
Book Chapter
The Second Conversational Intelligence Challenge (ConvAI2)
Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander H. Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Vlad Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander I. Rudnicky, Jason D. Williams, Joelle Pineau, Mikhail S. Burtsev, Jason Weston +18 more
TL;DR: To improve performance on multi-turn conversations with humans, future systems must go beyond single-word metrics like perplexity and measure performance across sequences of utterances (conversations) in terms of repetition, consistency, and balance of dialogue acts.
Proceedings Article
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
TL;DR: This work develops a new transformer architecture, the Poly-encoder, that learns global rather than token-level self-attention features and achieves state-of-the-art results on four tasks.