
Showing papers by "Marie-Francine Moens" published in 2019


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work presents the Talk2Car dataset, which is the first object referral dataset that contains commands written in natural language for self-driving cars, and provides a detailed comparison with related datasets such as ReferIt, RefCOCO, RefCOCO+, RefCOCOg, Cityscape-Ref and CLEVR-Ref.
Abstract: A long-term goal of artificial intelligence is to have an agent execute commands communicated through natural language. In many cases the commands are grounded in a visual environment shared by the human who gives the command and the agent. Execution of the command then requires mapping the command into the physical visual space, after which the appropriate action can be taken. In this paper we consider the former; more specifically, we consider the problem in an autonomous driving setting, where a passenger requests an action that can be associated with an object found in a street scene. Our work presents the Talk2Car dataset, which is the first object referral dataset that contains commands written in natural language for self-driving cars. We provide a detailed comparison with related datasets such as ReferIt, RefCOCO, RefCOCO+, RefCOCOg, Cityscape-Ref and CLEVR-Ref. Additionally, we include a performance analysis using strong state-of-the-art models. The results show that the proposed object referral task is a challenging one for which the models show promising results but still require additional research in natural language processing, computer vision and the intersection of these fields. The dataset can be found on our website: http://macchina-ai.eu/

45 citations
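To make the object referral task above concrete, here is a minimal sketch of how a command could be grounded to one of several candidate object regions: the command is encoded with a recurrent network and each region's visual features are scored against it. The architecture, dimensions and names are illustrative assumptions, not the models evaluated in the paper.

```python
# Minimal sketch of the object-referral setup described above: given a natural-
# language command and a set of candidate object regions from a street scene,
# score each region against the command and return the most likely referent.
# This is an illustrative baseline, not the models evaluated in the paper.
import torch
import torch.nn as nn


class CommandGrounder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, region_dim=2048, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.region_proj = nn.Linear(region_dim, hidden)  # project CNN region features

    def forward(self, command_ids, region_feats):
        # command_ids: (1, T) token ids; region_feats: (N, region_dim) per candidate box
        _, (h, _) = self.encoder(self.embed(command_ids))
        cmd = h[-1]                                  # (1, hidden) command encoding
        regions = self.region_proj(region_feats)     # (N, hidden)
        scores = regions @ cmd.t()                   # (N, 1) similarity per region
        return scores.squeeze(1)


# Toy usage: 5 candidate regions, a 6-token command.
model = CommandGrounder(vocab_size=1000)
command = torch.randint(0, 1000, (1, 6))
regions = torch.randn(5, 2048)
scores = model(command, regions)
print("referred region:", scores.argmax().item())
```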


Proceedings ArticleDOI
01 Jun 2019
TL;DR: This work proposes a new robust framework for learning unsupervised multilingual word embeddings that mitigates the instability issues and yields results that are on par with the state-of-the-art methods in the bilingual lexicon induction (BLI) task, while simultaneously obtaining state-of-the-art scores on two downstream tasks: multilingual document classification and multilingual dependency parsing.
Abstract: Recent research has discovered that a shared bilingual word embedding space can be induced by projecting monolingual word embedding spaces from two languages using a self-learning paradigm without any bilingual supervision. However, it has also been shown that for distant language pairs such fully unsupervised self-learning methods are unstable and often get stuck in poor local optima due to reduced isomorphism between starting monolingual spaces. In this work, we propose a new robust framework for learning unsupervised multilingual word embeddings that mitigates the instability issues. We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space. Through the gradual language addition the method can leverage the interdependencies between the new language and all other languages in the current multilingual space. We find that it is beneficial to project more distant languages later in the iterative process. Our fully unsupervised multilingual embedding spaces yield results that are on par with the state-of-the-art methods in the bilingual lexicon induction (BLI) task, and simultaneously obtain state-of-the-art scores on two downstream tasks: multilingual document classification and multilingual dependency parsing, outperforming even supervised baselines. This finding also accentuates the need to establish evaluation protocols for cross-lingual word embeddings beyond the omnipresent intrinsic BLI task in future work.

29 citations
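The incremental procedure described above can be sketched as follows: languages are added one at a time, each new monolingual space being mapped into the current multilingual space with an orthogonal (Procrustes) projection. The seed-pair induction stands in for the unsupervised self-learning step; the helper `induce_seed_pairs` and all data are purely hypothetical.

```python
# Illustrative sketch of incremental multilingual embedding alignment: new
# languages are projected one by one into the current shared space with an
# orthogonal (Procrustes) mapping.  Seed-pair induction stands in for the
# unsupervised self-learning step and is only a placeholder here.
import numpy as np


def procrustes(X, Y):
    """Orthogonal map W minimising ||XW - Y||_F (solved via SVD)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def induce_seed_pairs(space_a, space_b, k=1000):
    # Placeholder for unsupervised seed induction; here random pairs purely
    # for illustration.
    rng = np.random.default_rng(0)
    return rng.integers(0, len(space_a), k), rng.integers(0, len(space_b), k)


def add_language(shared_space, new_space):
    """Map a new monolingual space into the current multilingual space."""
    idx_new, idx_shared = induce_seed_pairs(new_space, shared_space)
    W = procrustes(new_space[idx_new], shared_space[idx_shared])
    return new_space @ W


# Toy usage: start from one language and add two more, closer languages first.
dim = 300
shared = np.random.randn(5000, dim)           # embeddings of the first language
for lang_space in [np.random.randn(5000, dim), np.random.randn(5000, dim)]:
    projected = add_language(shared, lang_space)
    shared = np.vstack([shared, projected])    # grow the multilingual space
print(shared.shape)
```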


Journal ArticleDOI
TL;DR: This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on the integration of symbolic reasoning with machine learning-based information extraction systems.
Abstract: Time is deeply woven into how people perceive, and communicate about, the world. Almost unconsciously, we provide our language utterances with temporal cues, like verb tenses, and we can hardly produce sentences without such cues. Extracting temporal cues from text, and constructing a global temporal view about the order of described events, is a major challenge of automatic natural language understanding. Temporal reasoning, the process of combining different temporal cues into a coherent temporal view, plays a central role in temporal information extraction. This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on how combining symbolic reasoning with machine learning-based information extraction systems can improve performance. It gives a clear overview of the methodologies used for temporal reasoning, and explains how temporal reasoning can be, and has been, successfully integrated into temporal information extraction systems. Based on the distillation of existing work, this survey also suggests currently unexplored research areas. We argue that the level of temporal reasoning that current systems use is still incomplete for the full task of temporal information extraction, and that a deeper understanding of how the various types of temporal information can be integrated into temporal reasoning is required to drive future research in this area.

22 citations
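A toy example of the kind of temporal reasoning the survey covers: pairwise relations extracted from text are composed until a fixed point, exposing orderings that were never stated explicitly. Real systems use richer algebras (e.g. Allen's interval relations) and handle uncertainty and disjunctions; this sketch only illustrates the idea.

```python
# Toy illustration of temporal reasoning by relation composition: extracted
# pairwise relations between events are combined (e.g. BEFORE o BEFORE =
# BEFORE) until a fixed point, revealing implicit orderings.
from itertools import product

COMPOSE = {
    ("BEFORE", "BEFORE"): "BEFORE",
    ("AFTER", "AFTER"): "AFTER",
}


def temporal_closure(relations):
    """relations: dict {(event_a, event_b): relation}.  Returns the closure."""
    closed = dict(relations)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closed), repeat=2):
            if b != c or (a, d) in closed:
                continue
            inferred = COMPOSE.get((closed[(a, b)], closed[(c, d)]))
            if inferred:
                closed[(a, d)] = inferred
                changed = True
    return closed


# "The meeting happened before lunch; lunch happened before the flight."
facts = {("meeting", "lunch"): "BEFORE", ("lunch", "flight"): "BEFORE"}
print(temporal_closure(facts).get(("meeting", "flight")))  # -> BEFORE
```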


Journal ArticleDOI
TL;DR: It is demonstrated through experiments that NER models trained on labeled data from a source domain can be used as base models and then be fine-tuned with a small amount of labeled data for recognition of different named entity classes in a target domain.
Abstract: Recent deep learning approaches have shown promising results for named entity recognition (NER). A reasonable assumption for training robust deep learning models is that a sufficient amount of high-quality annotated training data is available. However, in many real-world scenarios, labeled training data is scarce. In this paper we consider two use cases: generic entity extraction from financial and from biomedical documents. First, we have developed a character-based model for NER in financial documents and a word- and character-based model with attention for NER in biomedical documents. Further, we have analyzed how transfer learning addresses the problem of limited training data in a target domain. We demonstrate through experiments that NER models trained on labeled data from a source domain can be used as base models and then be fine-tuned with a small amount of labeled data for recognition of different named entity classes in a target domain. We also observe a growing interest in language models to improve NER as a way of coping with limited labeled data. The currently most successful language model is BERT. Because of its success in state-of-the-art models, we integrate representations based on BERT in our biomedical NER model along with word and character information. The results are compared with a state-of-the-art model applied to a benchmarking biomedical corpus.

21 citations
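The transfer-learning recipe described above can be sketched as: reuse the encoder of a tagger trained on the source domain, replace its output layer to match the target domain's entity classes, and fine-tune on the small amount of target-domain data. The word-level BiLSTM below is an illustrative assumption, not the paper's exact architecture (which also uses character information and, for the biomedical case, BERT representations).

```python
# Hedged sketch of the transfer-learning recipe described above.
import torch
import torch.nn as nn


class WordTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_tags)  # per-token tag scores

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h)


# 1) Pretend this model was trained on the (large) source-domain corpus.
source_model = WordTagger(vocab_size=20000, num_tags=9)

# 2) Reuse its encoder weights and swap in a new tag layer for the target
#    domain's entity classes (e.g. biomedical entities).
target_model = WordTagger(vocab_size=20000, num_tags=5)
target_model.embed.load_state_dict(source_model.embed.state_dict())
target_model.lstm.load_state_dict(source_model.lstm.state_dict())

# 3) Fine-tune on the few labelled target sentences (toy random batch here).
optimiser = torch.optim.Adam(target_model.parameters(), lr=1e-4)
tokens = torch.randint(0, 20000, (8, 30))          # 8 sentences of 30 tokens
tags = torch.randint(0, 5, (8, 30))
loss = nn.functional.cross_entropy(
    target_model(tokens).reshape(-1, 5), tags.reshape(-1))
loss.backward()
optimiser.step()
print(float(loss))
```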


Proceedings ArticleDOI
15 Oct 2019
TL;DR: This paper proposes an artwork-type-enriched image captioning model where the encoder represents an input artwork image as a 512-dimensional vector and the decoder generates a corresponding caption based on the input image vector.
Abstract: The neural encoder-decoder framework is widely adopted for image captioning of natural images. However, few works have contributed to generating captions for cultural images using this scheme. In this paper, we propose an artwork-type-enriched image captioning model where the encoder represents an input artwork image as a 512-dimensional vector and the decoder generates a corresponding caption based on the input image vector. The artwork type is first predicted by a convolutional neural network classifier and then merged into the decoder. We investigate multiple approaches to integrating the artwork type into the captioning model, among which is one that applies a step-wise weighted sum of the artwork type vector and the hidden representation vector of the decoder. This model outperforms three baseline image captioning models on all evaluation metrics for a Chinese art image captioning dataset. One of the baselines is a state-of-the-art approach fusing textual image attributes into the captioning model for natural images. The proposed model also obtains promising results on another Egyptian art image captioning dataset.

16 citations
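The step-wise weighted sum mentioned in the abstract can be illustrated with a single decoding step: at each step the artwork-type vector is mixed with the decoder hidden state before the next word is predicted. The gating choice, dimensions and names are assumptions made for the sketch, not the paper's exact formulation.

```python
# Sketch of the fusion idea: at every decoding step the artwork-type vector is
# merged into the decoder state with a weighted sum before predicting the word.
import torch
import torch.nn as nn


class TypeFusedDecoderStep(nn.Module):
    def __init__(self, vocab_size, hidden=512, num_types=10):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, hidden)
        self.type_embed = nn.Embedding(num_types, hidden)
        self.cell = nn.GRUCell(hidden, hidden)
        self.alpha = nn.Parameter(torch.tensor(0.5))   # learned mixing weight
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, prev_word, h, artwork_type):
        h = self.cell(self.word_embed(prev_word), h)
        t = self.type_embed(artwork_type)
        fused = self.alpha * t + (1 - self.alpha) * h  # step-wise weighted sum
        return self.out(fused), h


# Toy usage: one decoding step for a batch of 2 images.
step = TypeFusedDecoderStep(vocab_size=5000)
h = torch.zeros(2, 512)            # e.g. initialised from the 512-d image vector
logits, h = step(torch.tensor([1, 2]), h, torch.tensor([3, 7]))
print(logits.shape)                # (2, 5000)
```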


Proceedings ArticleDOI
TL;DR: The Talk2Car dataset, presented in this paper, is the first object referral dataset that contains commands written in natural language for self-driving cars, in which a passenger requests an action that can be associated with an object found in a street scene.
Abstract: A long-term goal of artificial intelligence is to have an agent execute commands communicated through natural language. In many cases the commands are grounded in a visual environment shared by the human who gives the command and the agent. Execution of the command then requires mapping the command into the physical visual space, after which the appropriate action can be taken. In this paper we consider the former; more specifically, we consider the problem in an autonomous driving setting, where a passenger requests an action that can be associated with an object found in a street scene. Our work presents the Talk2Car dataset, which is the first object referral dataset that contains commands written in natural language for self-driving cars. We provide a detailed comparison with related datasets such as ReferIt, RefCOCO, RefCOCO+, RefCOCOg, Cityscape-Ref and CLEVR-Ref. Additionally, we include a performance analysis using strong state-of-the-art models. The results show that the proposed object referral task is a challenging one for which the models show promising results but still require additional research in natural language processing, computer vision and the intersection of these fields. The dataset can be found on our website: http://macchina-ai.eu/

14 citations


Journal ArticleDOI
TL;DR: An integrated neural network approach across visual and textual data is proposed to both determine and justify a medical diagnosis; it achieves excellent diagnosis accuracy and captioning quality when compared to current state-of-the-art single-task methods.

13 citations
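Only the TL;DR is available for this entry, so the following is a loose sketch of the kind of multi-task network it describes: a shared image encoder feeds both a diagnosis classifier and a caption decoder that produces the textual justification. Everything in the sketch, including the crude unrolled decoder, is an illustrative assumption rather than the paper's architecture.

```python
# Loose sketch of a joint diagnose-and-justify network: one shared encoder,
# one classification head (diagnosis) and one caption decoder (justification).
import torch
import torch.nn as nn


class DiagnoseAndJustify(nn.Module):
    def __init__(self, num_diagnoses=14, vocab_size=8000, feat=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                     nn.Linear(16, feat))
        self.classifier = nn.Linear(feat, num_diagnoses)      # diagnosis head
        self.decoder = nn.GRU(feat, feat, batch_first=True)   # justification head
        self.word_out = nn.Linear(feat, vocab_size)

    def forward(self, image, caption_len=20):
        z = self.encoder(image)                                # shared representation
        diagnosis = self.classifier(z)
        steps = z.unsqueeze(1).repeat(1, caption_len, 1)       # crude unrolling
        words, _ = self.decoder(steps)
        return diagnosis, self.word_out(words)


model = DiagnoseAndJustify()
diag, caption_logits = model(torch.randn(2, 3, 224, 224))
print(diag.shape, caption_logits.shape)   # (2, 14) (2, 20, 8000)
```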


Journal ArticleDOI
TL;DR: The paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment.
Abstract: When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.

7 citations
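One of the SCATE topics, fuzzy matching for translation memories, can be illustrated with a simple baseline: stored source segments are scored by string similarity to the new sentence and the best match is returned if it clears a threshold. The plain edit-distance-style ratio below is a baseline sketch with made-up data, not SCATE's improved matcher.

```python
# Baseline fuzzy matching against a tiny translation memory.
from difflib import SequenceMatcher

TM = {
    "The battery lasts eight hours.": "La batterie dure huit heures.",
    "Press the power button twice.": "Appuyez deux fois sur le bouton.",
}


def fuzzy_match(sentence, memory, threshold=0.7):
    best_src, best_score = None, 0.0
    for src in memory:
        score = SequenceMatcher(None, sentence.lower(), src.lower()).ratio()
        if score > best_score:
            best_src, best_score = src, score
    if best_score >= threshold:
        return memory[best_src], best_score
    return None, best_score


print(fuzzy_match("The battery lasts ten hours.", TM))
```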


Book ChapterDOI
21 Jul 2019
TL;DR: A multimodal neural machine translation model in which the decoder that generates the translation attends to visually grounded representations that capture both the semantics of the fashion words in the source language and regions in the fashion image.
Abstract: Neural networks have become extremely popular in artificial intelligence. In this paper we show how they aid in automatically translating fashion item descriptions and how they use fashion images to generate the translations. More specifically, we propose a multimodal neural machine translation model in which the decoder that generates the translation attends to visually grounded representations that capture both the semantics of the fashion words in the source language and regions in the fashion image. We introduce this novel neural architecture in the context of fashion e-commerce, where product descriptions need to be available in multiple languages. We report state-of-the-art multimodal translation results on a real-world fashion e-commerce dataset.

4 citations
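The attention step described above can be sketched in a few lines: each source word is paired with attended image-region features to form a visually grounded representation, over which a decoder step then attends. The shapes, random data and simple dot-product attention are illustrative assumptions, not the paper's model.

```python
# Sketch: a decoder step attending over visually grounded source representations.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
src_len, img_regions, d = 7, 4, 64

word_enc = torch.randn(src_len, d)            # encoder states for the source words
region_feats = torch.randn(img_regions, d)    # pooled CNN features per image region

# One grounded vector per source word: word encoding + its attended image context.
word_to_region = F.softmax(word_enc @ region_feats.t(), dim=-1)           # (src_len, regions)
grounded = torch.cat([word_enc, word_to_region @ region_feats], dim=-1)   # (src_len, 2d)

# A single decoder step attends over the grounded representations.
dec_state = torch.randn(1, 2 * d)
attn = F.softmax(dec_state @ grounded.t(), dim=-1)                        # (1, src_len)
context = attn @ grounded                                                 # (1, 2d)
print(context.shape)
```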


Book ChapterDOI
TL;DR: This paper shows that incorporating content from related written sources in the training of the classification model is beneficial, and qualitatively demonstrates that these representations, to a certain extent, indirectly correct the transcription noise.
Abstract: Today's content, including user-generated content, is increasingly found in multimedia format. It is known that speech data are sometimes incorrectly transcribed, especially when they are spoken by voices on which the transcribers have not been trained or when they contain unfamiliar words. A familiar mining task that helps in storage, indexing and retrieval is automatic classification with predefined category labels. Although state-of-the-art classifiers like neural networks, support vector machines (SVM) and logistic regression classifiers perform quite satisfactorily when categorizing written text, their performance degrades when applied to speech data transcribed by automatic speech recognition (ASR) due to transcription errors like insertion and deletion of words, grammatical errors and words that are just transcribed wrongly. In this paper, we show that incorporating content from related written sources in the training of the classification model is beneficial. We especially focus on and compare different representations that make this integration possible, such as representations of speech data that embed content from the written text and simple concatenation of speech and written content. In addition, we qualitatively demonstrate that these representations, to a certain extent, indirectly correct the transcription noise.

3 citations
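One of the representations the abstract mentions, simple concatenation of speech and written content, can be sketched by concatenating feature vectors of the (noisy) ASR transcript and of related written text and training an ordinary classifier on the result. The toy sentences, labels and the TF-IDF features are illustrative assumptions, not the paper's setup.

```python
# Sketch: concatenate transcript features with related written-text features
# and train a standard classifier on the combined vector.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

transcripts = ["the patient reported chest pane", "stock prizes fell sharply today"]
written = ["chest pain and cardiac symptoms", "stock prices and market reports"]
labels = [0, 1]   # 0 = medical, 1 = finance

vec_speech = TfidfVectorizer().fit(transcripts)
vec_text = TfidfVectorizer().fit(written)

X = np.hstack([vec_speech.transform(transcripts).toarray(),
               vec_text.transform(written).toarray()])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```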


Book ChapterDOI
14 Apr 2019
TL;DR: A novel approach to passage retrieval for multimodal question answering of ancient artworks, where the caption of the query image is provided as additional evidence to state-of-the-art retrieval models in the cultural heritage domain trained on a small dataset.
Abstract: Passage retrieval for multimodal question answering, spanning natural language processing and computer vision, is a challenging task, particularly when the documentation to search from contains poor punctuation or obsolete word forms and little labeled training data is available. Here, we introduce a novel approach to passage retrieval for multimodal question answering of ancient artworks, where the caption of the query image is provided as additional evidence to state-of-the-art retrieval models in the cultural heritage domain trained on a small dataset. The query image caption is generated with an advanced image captioning model trained on an external dataset. Consequently, the retrieval model obtains transferred knowledge from the external dataset. Extensive experiments demonstrate the effectiveness of this approach on a benchmark dataset compared to state-of-the-art approaches.
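The caption-as-extra-evidence idea can be sketched as follows: the caption generated for the query image is appended to the textual question before passages are scored. TF-IDF cosine similarity stands in for the retrieval models used in the paper; the passages and the caption are made up for illustration.

```python
# Sketch: augment the textual question with the generated image caption,
# then score passages with TF-IDF cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = [
    "The limestone relief shows a pharaoh making an offering to the gods.",
    "The tapestry was woven in the sixteenth century in Brussels.",
]
question = "Who is depicted making the offering?"
generated_caption = "a limestone relief of a pharaoh offering to the gods"

vectorizer = TfidfVectorizer().fit(passages + [question, generated_caption])
query = vectorizer.transform([question + " " + generated_caption])
scores = cosine_similarity(query, vectorizer.transform(passages))[0]
print(passages[scores.argmax()])
```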

DOI
01 Jan 2019
TL;DR: This report documents the program and the outcomes of Dagstuhl Seminar 19021 "Joint Processing of Language and Visual Data for Better Automated Understanding".
Abstract: This report documents the program and the outcomes of Dagstuhl Seminar 19021 "Joint Processing of Language and Visual Data for Better Automated Understanding". It includes a discussion of the motivation and overall organization, the abstracts of the talks, and a report of each working group.


Journal ArticleDOI
17 Jan 2019
TL;DR: The Second Workshop on Exploitation of Social Media for Emergency Relief and Preparedness (SMERP) was held in conjunction with The Web Conference (WWW) 2018 at Lyon, France to promote multi-modal and multi-view information retrieval from the social media content in disaster situations.
Abstract: The Second Workshop on Exploitation of Social Media for Emergency Relief and Preparedness (SMERP) was held in conjunction with The Web Conference (WWW) 2018 at Lyon, France. A primary aim of the workshop was to promote multi-modal and multi-view information retrieval from the social media content in disaster situations. The workshop programme included keynote talks, a peer-reviewed paper track, and a panel discussion on the relevant research problems in the scope of the workshop.