Author

Angelina McMillan-Major

Bio: Angelina McMillan-Major is an academic researcher. The author has contributed to research in the topics Natural language generation and Benchmark (computing), has an h-index of 3, and has co-authored 6 publications receiving 61 citations.

Papers
Posted Content
TL;DR: GEM, as discussed by the authors, is a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics; it provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested.
Abstract: We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the data for which we are organizing a shared task at our ACL 2021 Workshop and to which we invite the entire NLG community to participate.

44 citations
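
The GEM tasks are distributed through the Hugging Face datasets library, so applying a model to a task starts with a single load call. A minimal sketch, assuming the GEM v1 config name "common_gen" and its "concepts"/"target" fields:

```python
# Minimal sketch: load one GEM task through the Hugging Face `datasets`
# library. The config name ("common_gen") and the field names
# ("concepts", "target") are assumptions based on the GEM v1 release.
from datasets import load_dataset

task = load_dataset("gem", "common_gen", split="validation")

for example in task.select(range(3)):
    # Each example pairs a structured input with a reference text.
    print(example["concepts"], "->", example["target"])
```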

Posted Content
TL;DR: Datasets, as discussed by the authors, is a community library for contemporary NLP designed to support the growing scale, variety, and quantity of publicly-available NLP datasets as researchers propose new tasks, larger models, and novel benchmarks.
Abstract: The scale, variety, and quantity of publicly-available NLP datasets have grown rapidly as researchers propose new tasks, larger models, and novel benchmarks. Datasets is a community library for contemporary NLP designed to support this ecosystem. Datasets aims to standardize end-user interfaces, versioning, and documentation, while providing a lightweight front-end that behaves similarly for small datasets as for internet-scale corpora. The design of the library incorporates a distributed, community-driven approach to adding datasets and documenting usage. After a year of development, the library now includes more than 650 unique datasets, has more than 250 contributors, and has helped support a variety of novel cross-dataset research projects and shared tasks. The library is available at https://github.com/huggingface/datasets.

28 citations
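
To make the standardized front-end concrete, a minimal sketch of the library's end-user interface; the dataset names are illustrative, and the streaming argument assumes a recent library version:

```python
# Minimal sketch of the `datasets` front-end described above: one call
# loads a named dataset with typed features and split handling.
from datasets import load_dataset

squad = load_dataset("squad", split="train")
print(squad.features)        # typed schema shared across datasets
print(squad[0]["question"])  # rows behave like a list of dicts

# The same interface can stream an internet-scale corpus without
# downloading it in full (recent library versions).
c4 = load_dataset("c4", "en", split="train", streaming=True)
print(next(iter(c4))["text"][:80])
```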

Proceedings ArticleDOI
02 Feb 2021
TL;DR: GEM, as discussed by the authors, is a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics; it provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested.
Abstract: We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the data for the 2021 shared task at the associated GEM Workshop.

26 citations

Proceedings ArticleDOI
TL;DR: In this paper, the authors present two case studies of efforts that aim to develop reusable documentation templates for datasets and models, including the HuggingFace data card, a general-purpose card for datasets in NLP, and the GEM benchmark data and model cards with a focus on natural language generation.
Abstract: Developing documentation guidelines and easy-to-use templates for datasets and models is a challenging task, especially given the variety of backgrounds, skills, and incentives of the people involved in the building of natural language processing (NLP) tools. Nevertheless, the adoption of standard documentation practices across the field of NLP promotes more accessible and detailed descriptions of NLP datasets and models, while supporting researchers and developers in reflecting on their work. To help with the standardization of documentation, we present two case studies of efforts that aim to develop reusable documentation templates -- the HuggingFace data card, a general-purpose card for datasets in NLP, and the GEM benchmark data and model cards with a focus on natural language generation. We describe our process for developing these templates, including the identification of relevant stakeholder groups, the definition of a set of guiding principles, the use of existing templates as our foundation, and iterative revisions based on feedback.

5 citations
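
For a sense of what such a template looks like in practice, a hypothetical sketch that emits a dataset card skeleton; the section headings approximate the HuggingFace template, and the helper itself is illustrative rather than the authors' tooling:

```python
# Hypothetical sketch: generate a dataset card skeleton. The section
# headings approximate the HuggingFace dataset card template; this
# helper is illustrative and not part of the paper's tooling.
SECTIONS = [
    "Dataset Description",
    "Dataset Structure",
    "Dataset Creation",
    "Considerations for Using the Data",
    "Additional Information",
]

def card_skeleton(dataset_name: str) -> str:
    lines = [f"# Dataset Card for {dataset_name}", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", "[More Information Needed]", ""]
    return "\n".join(lines)

print(card_skeleton("my-dataset"))
```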

Proceedings Article
01 Nov 2021
TL;DR: Datasets, as discussed by the authors, is a community library for contemporary NLP designed to support the growing scale, variety, and quantity of publicly-available NLP datasets as researchers propose new tasks, larger models, and novel benchmarks.
Abstract: The scale, variety, and quantity of publicly-available NLP datasets have grown rapidly as researchers propose new tasks, larger models, and novel benchmarks. Datasets is a community library for contemporary NLP designed to support this ecosystem. Datasets aims to standardize end-user interfaces, versioning, and documentation, while providing a lightweight front-end that behaves similarly for small datasets as for internet-scale corpora. The design of the library incorporates a distributed, community-driven approach to adding datasets and documenting usage. After a year of development, the library now includes more than 650 unique datasets, has more than 250 contributors, and has helped support a variety of novel cross-dataset research projects and shared tasks. The library is available at https://github.com/huggingface/datasets.

2 citations


Cited by
Posted Content
TL;DR: This paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models.
Abstract: The paper surveys evaluation methods of natural language generation (NLG) systems that have been developed in the last few years. We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics. For each category, we discuss the progress that has been made and the challenges still being faced, with a focus on the evaluation of recently proposed NLG tasks and neural NLG models. We then present two examples of task-specific NLG evaluation, for automatic text summarization and long text generation, and conclude the paper by proposing future research directions.

186 citations
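
As a concrete instance of category (2), an automatic metric that requires no training, corpus-level BLEU can be computed with the sacrebleu package (the strings below are toy examples):

```python
# Example of an untrained automatic metric (category 2 in the survey):
# corpus-level BLEU via sacrebleu. Inputs are toy strings.
import sacrebleu

hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(round(bleu.score, 2))
```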

Posted Content
TL;DR: A comprehensive review of the research on knowledge-enhanced text generation over the past five years is presented, which includes two parts: (i) general methods and architectures for integrating knowledge into text generation; (ii) specific techniques and applications according to different forms of knowledge data.
Abstract: The goal of text generation is to make machines express themselves in human language. It is one of the most important yet challenging tasks in natural language processing (NLP). Since 2014, various neural encoder-decoder models pioneered by Seq2Seq have been proposed to achieve the goal by learning to map input text to output text. However, the input text alone often provides limited knowledge with which to generate the desired output, so the performance of text generation is still far from satisfactory in many real-world scenarios. To address this issue, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models. This research direction is known as knowledge-enhanced text generation. In this survey, we present a comprehensive review of the research on knowledge-enhanced text generation over the past five years. The main content includes two parts: (i) general methods and architectures for integrating knowledge into text generation; (ii) specific techniques and applications according to different forms of knowledge data. This survey is aimed at a broad audience of researchers and practitioners in academia and industry.

115 citations
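
The simplest pattern the survey covers, feeding external knowledge to a standard encoder-decoder alongside the input text, can be sketched as follows; the model name and the knowledge string are placeholders, not a specific method from the survey:

```python
# Minimal sketch: concatenate retrieved knowledge with the input of a
# standard encoder-decoder model. "t5-small" and the strings below are
# placeholders, not a particular method from the survey.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

question = "where do polar bears live?"
knowledge = "Polar bears live mainly within the Arctic Circle."  # e.g., a retrieved passage

inputs = tokenizer(f"question: {question} context: {knowledge}",
                   return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```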

Posted Content
TL;DR: This article presents a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017, as well as the rich methodologies in the presence of parallel and non-parallel data.
Abstract: Text style transfer (TST) is an important task in natural language generation (NLG), which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others. It has a long history in the field of natural language processing (NLP), and recently has regained significant attention thanks to the promising performance brought by deep neural models. In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017. We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data. We also provide discussions on a variety of important topics regarding the future development of TST. Our curated paper list is at this https URL

101 citations

Proceedings ArticleDOI
18 Jan 2022
TL;DR: It is argued that by curating and analyzing large interaction datasets, the HCI community can foster more incisive examinations of LMs’ generative capabilities; the paper presents CoAuthor, a dataset designed to reveal GPT-3’s capabilities in assisting creative and argumentative writing.
Abstract: Large language models (LMs) offer unprecedented language generation capabilities and exciting opportunities for interaction design. However, their highly context-dependent capabilities are difficult to grasp and are often subjectively interpreted. In this paper, we argue that by curating and analyzing large interaction datasets, the HCI community can foster more incisive examinations of LMs’ generative capabilities. Exemplifying this approach, we present CoAuthor, a dataset designed for revealing GPT-3’s capabilities in assisting creative and argumentative writing. CoAuthor captures rich interactions between 63 writers and four instances of GPT-3 across 1445 writing sessions. We demonstrate that CoAuthor can address questions about GPT-3’s language, ideation, and collaboration capabilities, and reveal its contribution as a writing “collaborator” under various definitions of good collaboration. Finally, we discuss how this work may facilitate a more principled discussion around LMs’ promises and pitfalls in relation to interaction design. The dataset and an interface for replaying the writing sessions are publicly available at https://coauthor.stanford.edu.

81 citations
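
A hypothetical sketch of working with the released sessions, assuming each session is stored as a JSON event stream with an "eventSource" field distinguishing writer and model events; consult the released data for the exact schema:

```python
# Hypothetical sketch: tally who produced each event in CoAuthor-style
# session logs. The file layout and the "eventSource" field are
# assumptions about the schema, not the documented format.
import json
from collections import Counter
from pathlib import Path

def count_event_sources(session_dir: str) -> Counter:
    counts = Counter()
    for path in Path(session_dir).glob("*.jsonl"):
        with open(path) as f:
            for line in f:
                event = json.loads(line)
                counts[event.get("eventSource", "unknown")] += 1
    return counts

print(count_event_sources("coauthor_sessions/"))
```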

Posted Content
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ B. Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri S. Chatterji, Annie Chen, Kathleen Creel, Jared Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah D. Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Ahmad Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf H. Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Yang Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
TL;DR: The authors provide a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications.
Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

76 citations