Author

Terrence Szymanski

Other affiliations: University of Michigan

Bio: Terrence Szymanski is an academic researcher at University College Dublin. The author has contributed to research on the topics of newspapers and word embeddings, has an h-index of 5, and has co-authored 10 publications that have received 354 citations. Previous affiliations of Terrence Szymanski include the University of Michigan.

Papers

Proceedings Article•

Diachronic word embeddings and semantic shifts: a survey

Andrey Kutuzov¹, Lilja Øvrelid¹, Terrence Szymanski², Erik Velldal¹•Institutions (2)

University of Oslo¹, University College Dublin²

09 Jun 2018

TL;DR: This paper surveys the current state of academic research on diachronic word embeddings and the detection of semantic shifts, proposes several axes along which these methods can be compared, and outlines the main challenges facing this emerging subfield of NLP.

Abstract: Recent years have witnessed a surge of publications aimed at tracing temporal changes in lexical semantics using distributional methods, particularly prediction-based word embedding models. However, this vein of research lacks the cohesion, common terminology and shared practices of more established areas of natural language processing. In this paper, we survey the current state of academic research related to diachronic word embeddings and semantic shifts detection. We start with discussing the notion of semantic shifts, and then continue with an overview of the existing methods for tracing such time-related shifts with word embedding models. We propose several axes along which these methods can be compared, and outline the main challenges before this emerging subfield of NLP, as well as prospects and possible applications.

191 citations

Posted Content•

Diachronic word embeddings and semantic shifts: a survey

Andrey Kutuzov¹, Lilja Øvrelid¹, Terrence Szymanski², Erik Velldal¹•Institutions (2)

University of Oslo¹, University College Dublin²

09 Jun 2018-arXiv: Computation and Language

TL;DR: This article surveys the current state of academic research on diachronic word embeddings and the detection of semantic shifts. The authors discuss the notion of semantic shifts and then give an overview of existing methods for tracing such time-related shifts with word embedding models.

124 citations

Proceedings Article•DOI•

Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings

Terrence Szymanski¹•Institutions (1)

University College Dublin¹

04 Aug 2017

TL;DR: It is shown that temporal word analogies can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space.

Abstract: This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies (“word w1 is to word w2 as word w3 is to word w4”) through vector addition. Here, I show that temporal word analogies (“word w1 at time t𝛼 is like word w2 at time t𝛽”) can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. When applied to a diachronic corpus of news articles, this method is able to identify temporal word analogies such as “Ronald Reagan in 1987 is like Bill Clinton in 1997”, or “Walkman in 1987 is like iPod in 2007”.
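
Illustration: the lookup step behind a temporal word analogy can be sketched as a nearest-neighbour search by cosine similarity. This is a minimal sketch, not the paper's implementation. It assumes the embedding spaces for the two time periods have already been transformed into a common vector space (the abstract does not say how, and that step is omitted here). The function names and the toy three-dimensional vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def temporal_analogy(word, source_space, target_space):
    """Return the word in target_space closest to `word` as used in source_space.

    Both spaces are dicts mapping words to vectors and are ASSUMED to be
    aligned to a common vector space already.
    """
    query = source_space[word]
    return max(target_space, key=lambda w: cosine(query, target_space[w]))


# Hypothetical toy vectors, invented purely for illustration.
space_1987 = {"walkman": [0.9, 0.1, 0.0]}
space_2007 = {
    "ipod": [0.8, 0.2, 0.1],
    "walkman": [0.2, 0.9, 0.1],
    "newspaper": [0.0, 0.1, 0.9],
}

match = temporal_analogy("walkman", space_1987, space_2007)
print("walkman in 1987 is like", match, "in 2007")
```

With these toy vectors the 1987 vector for "walkman" lies closest to the 2007 vector for "ipod", which mirrors the kind of pair the abstract reports. Real embeddings would come from models trained on each time slice of the corpus.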

52 citations

Proceedings Article•DOI•

UCD : Diachronic Text Classification with Character, Word, and Syntactic N-grams

Terrence Szymanski¹, Gerard Lynch¹•Institutions (1)

University College Dublin¹

05 Jun 2015

TL;DR: This work extracts n-gram features from the text at the letter, word, and syntactic level, and uses these to train a classifier on date-labeled training data, and incorporates date probabilities of syntactic features as estimated from a very large external corpus of books.

Abstract: We present our submission to SemEval-2015 Task 7: Diachronic Text Evaluation, in which we approach the task of assigning a date to a text as a multi-class classification problem. We extract n-gram features from the text at the letter, word, and syntactic level, and use these to train a classifier on date-labeled training data. We also incorporate date probabilities of syntactic features as estimated from a very large external corpus of books. Our system achieved the highest performance of all systems on subtask 2: identifying texts by specific time language use.
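
Illustration: the feature-extraction step described in the abstract can be sketched for the character and word levels as follows. This is a hedged sketch under stated assumptions, not the authors' system. The function names, the n-gram sizes, and the "char:"/"word:" feature prefixes are hypothetical. Syntactic n-grams, the external book corpus, and the classifier itself are left out.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams, using whitespace tokenization."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def extract_features(text, char_n=3, word_n=2):
    """Merge character and word n-gram counts into one feature dictionary.

    The n-gram sizes are assumed defaults, not values taken from the paper.
    """
    features = Counter()
    for gram, count in char_ngrams(text.lower(), char_n).items():
        features["char:" + gram] = count
    for gram, count in word_ngrams(text, word_n).items():
        features["word:" + " ".join(gram)] = count
    return features


features = extract_features("the cat the cat")
print(features["char:the"], features["word:the cat"])
```

A feature dictionary of this shape could then be vectorized and passed to any multi-class classifier trained on date-labeled texts.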

16 citations

Posted Content•

Helping News Editors Write Better Headlines: A Recommender to Improve the Keyword Contents & Shareability of News Headlines

Terrence Szymanski, Claudia Orellana-Rodriguez, Mark T. Keane

26 May 2017-arXiv: Computation and Language

TL;DR: In this article, the authors present a software tool that employs state-of-the-art NLP and machine learning techniques to help newspaper editors compose effective headlines for online publication.

Abstract: We present a software tool that employs state-of-the-art natural language processing (NLP) and machine learning techniques to help newspaper editors compose effective headlines for online publication. The system identifies the most salient keywords in a news article and ranks them based on both their overall popularity and their direct relevance to the article. The system also uses a supervised regression model to identify headlines that are likely to be widely shared on social media. The user interface is designed to simplify and speed the editor's decision process on the composition of the headline. As such, the tool provides an efficient way to combine the benefits of automated predictors of engagement and search-engine optimization (SEO) with human judgments of overall headline quality.

13 citations

Cited by

Quantitative Analysis of Culture Using Millions of Digitized Books

Björn-Olav Dozo

17 Dec 2010

TL;DR: The authors survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000, using a corpus of digitized texts containing about 4% of all books ever printed.

Abstract: Article published in Science on one of the first analytical uses of Google Books, based on n-grams (Google Ngrams). We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can ...

735 citations

Journal Article•DOI•

Bias in data-driven artificial intelligence systems—An introductory survey

Eirini Ntoutsi¹, Pavlos Fafalios², Ujwal Gadiraju¹, Vasileios Iosifidis¹, Wolfgang Nejdl¹, Maria-Esther Vidal, Salvatore Ruggieri³, Franco Turini³, Symeon Papadopoulos, Emmanouil Krasanakis, Ioannis Kompatsiaris, Katharina Kinder-Kurlanda⁴, Claudia Wagner⁴, Fariba Karimi⁴, Miriam Fernandez⁵, Harith Alani⁵, Bettina Berendt⁶,⁷, Tina Kruegel¹, Christian Heinze¹, Klaus Broelemann⁸, Gjergji Kasneci⁸, Thanassis Tiropanis⁹, Steffen Staab¹,⁹,¹⁰•Institutions (10)

Leibniz University of Hanover¹, Foundation for Research & Technology – Hellas², University of Pisa³, Leibniz Association⁴, Open University⁵, University of Copenhagen Faculty of Science⁶, Technical University of Berlin⁷, Harvard University⁸, University of Southampton⁹, University of Stuttgart¹⁰

01 May 2020-Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery

TL;DR: This survey provides a broad multidisciplinary overview of bias in AI systems, focusing on technical challenges and solutions, and suggests new research directions toward approaches well‐grounded in a legal frame.

Abstract: Artificial Intelligence (AI)‐based systems are widely employed nowadays to make decisions that have far‐reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training, and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well‐grounded in a legal frame. In this survey, we focus on data‐driven AI, as a large part of AI is powered nowadays by (big) data and powerful machine learning algorithms. If otherwise not specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the bases of demographic features such as race, sex, and so forth.

271 citations

Proceedings Article•

Diachronic word embeddings and semantic shifts: a survey

Andrey Kutuzov¹, Lilja Øvrelid¹, Terrence Szymanski², Erik Velldal¹•Institutions (2)

University of Oslo¹, University College Dublin²

09 Jun 2018

191 citations

Journal Article•DOI•

A Bayesian Model of Diachronic Meaning Change

Lea Frermann¹, Mirella Lapata¹•Institutions (1)

University of Edinburgh¹

19 Feb 2016-Transactions of the Association for Computational Linguistics

TL;DR: A dynamic Bayesian model of diachronic meaning change is presented, which infers temporal word representations as a set of senses and their prevalence. Applied to the SemEval-2015 temporal classification benchmark datasets, the model performs on par with highly optimized task-specific systems.

Abstract: Word meanings change over time and an automated procedure for extracting this information from text would be useful for historical exploratory studies, information retrieval or question answering. We present a dynamic Bayesian model of diachronic meaning change, which infers temporal word representations as a set of senses and their prevalence. Unlike previous work, we explicitly model language change as a smooth, gradual process. We experimentally show that this modeling decision is beneficial: our model performs competitively on meaning change detection tasks whilst inducing discernible word senses and their development over time. Application of our model to the SemEval-2015 temporal classification benchmark datasets further reveals that it performs on par with highly optimized task-specific systems.

152 citations

Posted Content•

Diachronic word embeddings and semantic shifts: a survey
