Author

Saad Mahamood

Other affiliations: King's College, Aberdeen
Bio: Saad Mahamood is an academic researcher from the University of Aberdeen. The author has contributed to research in the topics of Natural Language Generation and Computer Science. The author has an h-index of 9 and has co-authored 19 publications receiving 387 citations. Previous affiliations of Saad Mahamood include King's College, Aberdeen.

Papers
Journal ArticleDOI
TL;DR: This paper describes recent and ongoing work on building systems that automatically generate textual summaries of neonatal data, showing that the technology is viable and comparable in its effectiveness for decision support to existing presentation modalities.
Abstract: Contemporary Neonatal Intensive Care Units collect vast amounts of patient data in various formats, making efficient processing of information by medical professionals difficult. Moreover, different stakeholders in the neonatal scenario, which include parents as well as staff occupying different roles, have different information requirements. This paper describes recent and ongoing work on building systems that automatically generate textual summaries of neonatal data. Our evaluation results show that the technology is viable and comparable in its effectiveness for decision support to existing presentation modalities. We discuss the lessons learned so far, as well as the major challenges involved in extending current technology to deal with a broader range of data types, and to improve the textual output in the form of more coherent summaries.

138 citations

Proceedings Article
01 Dec 2020
TL;DR: Due to a pervasive lack of clarity in reports and extreme diversity in approaches, human evaluation in NLG presents as extremely confused in 2020, and the field is in urgent need of standard methods and terminology.
Abstract: Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches and a proliferation of different quality criteria used by researchers make it difficult to compare results and draw conclusions across papers, with adverse implications for meta-evaluation and reproducibility. In this paper, we present (i) our dataset of 165 NLG papers with human evaluations, (ii) the annotation scheme we developed to label the papers for different aspects of evaluations, (iii) quantitative analyses of the annotations, and (iv) a set of recommendations for improving standards in evaluation reporting. We use the annotations as a basis for examining information included in evaluation reports, and levels of consistency in approaches, experimental design and terminology, focusing in particular on the 200+ different terms that have been used for evaluated aspects of quality. We conclude that due to a pervasive lack of clarity in reports and extreme diversity in approaches, human evaluation in NLG presents as extremely confused in 2020, and that the field is in urgent need of standard methods and terminology.

95 citations

Proceedings Article
28 Sep 2011
TL;DR: This paper presents several affective NLG strategies for generating medical texts for parents of pre-term neonates, and shows that all recipients preferred texts generated with the affective strategies, regardless of predicted stress level.
Abstract: This paper presents several affective NLG strategies for generating medical texts for parents of pre-term neonates. Initially, these were meant to be personalised according to a model of the recipient's level of stress. However, our evaluation showed that all recipients preferred texts generated with the affective strategies, regardless of predicted stress level.

55 citations

Proceedings ArticleDOI
01 Sep 2015
TL;DR: A snapshot of end-to-end NLG system evaluations as presented in conference and journal papers over the last ten years is presented to better understand the nature and type of evaluations that have been undertaken.
Abstract: In this paper we present a snapshot of end-to-end NLG system evaluations as presented in conference and journal papers over the last ten years in order to better understand the nature and type of evaluations that have been undertaken. We find that researchers tend to favour specific evaluation methods, and that their evaluation approaches are also correlated with the publication venue. We further discuss what factors may influence the types of evaluation used for a given NLG system.

49 citations

Posted Content
TL;DR: GEM as discussed by the authors is a living benchmark for natural language generation (NLG), its Evaluation and Metrics, which provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested.
Abstract: We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the data for which we are organizing a shared task at our ACL 2021 Workshop and to which we invite the entire NLG community to participate.

44 citations
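GEM's tasks are commonly accessed through the Hugging Face `datasets` library, so a minimal sketch of applying a model to one of its tasks might look like the following. The dataset name `gem` and the `common_gen` configuration are assumptions for illustration, not details taken from the abstract above.

```python
# Minimal sketch: loading one GEM task with the Hugging Face `datasets` library.
# The dataset name "gem" and the configuration "common_gen" are assumptions;
# substitute the task configuration you actually want to benchmark on.
from datasets import load_dataset

dataset = load_dataset("gem", "common_gen")

# Each split pairs a structured input with one or more reference texts,
# so a single model can be applied to many tasks through the same interface.
example = dataset["validation"][0]
print(sorted(example.keys()))
print(example)
```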


Cited by
Journal ArticleDOI
TL;DR: A survey of the state of the art in natural language generation can be found in this article, with an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organized.
Abstract: This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past two decades, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of NLP, with an emphasis on different evaluation methods and the relationships between them.

562 citations

Proceedings ArticleDOI
10 Sep 2017
TL;DR: A wide range of metrics are investigated, including state-of-the-art word-based and novel grammar-based ones, and it is demonstrated that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG.
Abstract: The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. We also show that metric performance is data- and system-specific. Nevertheless, our results also suggest that automatic metrics perform reliably at system-level and can support system development by finding cases where a system performs poorly.

421 citations
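Word-overlap metrics of the kind discussed above are typically compared against human judgements by computing a rank correlation over per-output scores. A minimal sketch of that style of meta-evaluation, using sentence-level BLEU from `sacrebleu` and Spearman's rho from `scipy`, is shown below; the outputs, references, and ratings are invented placeholders, not data from the paper.

```python
# Minimal sketch of metric meta-evaluation: score each system output against
# its reference with sentence-level BLEU, then check how well those scores
# track human judgements via Spearman's rank correlation.
# All strings and ratings below are illustrative placeholders.
import sacrebleu
from scipy.stats import spearmanr

system_outputs = [
    "there is a moderately priced restaurant in the city centre",
    "the restaurant serves french food near the riverside",
    "it is a family friendly place",
]
references = [
    "there is a moderately priced restaurant in the centre of town",
    "a french restaurant can be found near the riverside",
    "the venue is not family friendly",
]
human_scores = [5.0, 4.0, 2.0]  # e.g. mean adequacy ratings on a 1-5 scale

bleu_scores = [
    sacrebleu.sentence_bleu(hyp, [ref]).score
    for hyp, ref in zip(system_outputs, references)
]

rho, p_value = spearmanr(bleu_scores, human_scores)
print(f"Spearman rho between sentence BLEU and human ratings: {rho:.2f} (p={p_value:.2f})")
```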

Journal ArticleDOI
TL;DR: BLOOM as discussed by the authors is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total).
Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

407 citations
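Because the BLOOM models and code are released openly, a small usage sketch is possible. The one below loads a reduced checkpoint through the Hugging Face `transformers` library; the `bigscience/bloom-560m` model id and the generation settings are assumptions for illustration, not details taken from the paper above.

```python
# Minimal sketch: generating text with an open BLOOM checkpoint via the
# Hugging Face `transformers` library. The small bloom-560m variant is used
# purely for illustration; the full 176B model requires multi-GPU serving.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # assumed checkpoint id for this sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Natural language generation is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```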

Journal ArticleDOI
TL;DR: A prototype, called BT-45, is presented, which generates textual summaries of about 45 minutes of continuous physiological signals and discrete events and brings together techniques from the different areas of signal processing, medical reasoning, knowledge engineering, and natural language generation.

279 citations
