Author

Didier Cherix

Bio: Didier Cherix is an academic researcher from Leipzig University. The author has contributed to research in topics: Semantic Web & Question answering. The author has an h-index of 5 and has co-authored 7 publications receiving 294 citations.

Papers
Proceedings ArticleDOI
18 May 2015
TL;DR: GERBIL aims to become a focal point for the state of the art, driving the research agenda of the community by presenting comparable objective evaluation results.
Abstract: We present GERBIL, an evaluation framework for semantic entity annotation. The rationale behind our framework is to provide developers, end users and researchers with easy-to-use interfaces that allow for the agile, fine-grained and uniform evaluation of annotation tools on multiple datasets. By these means, we aim to ensure that both tool developers and end users can derive meaningful insights pertaining to the extension, integration and use of annotation applications. In particular, GERBIL provides comparable results to tool developers so as to allow them to easily discover the strengths and weaknesses of their implementations with respect to the state of the art. With the permanent experiment URIs provided by our framework, we ensure the reproducibility and archiving of evaluation results. Moreover, the framework generates data in a machine-processable format, allowing for the efficient querying and post-processing of evaluation results. Finally, the tool diagnostics provided by GERBIL allow deriving insights pertaining to the areas in which tools should be further refined, thus allowing developers to create an informed agenda for extensions and end users to detect the right tools for their purposes. GERBIL aims to become a focal point for the state of the art, driving the research agenda of the community by presenting comparable objective evaluation results.
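The kind of uniform evaluation GERBIL performs can be illustrated by comparing a tool's entity annotations against a gold standard with micro- and macro-averaged F1. The sketch below is a simplified illustration, not GERBIL's actual API; annotations are modelled as (start, end, entity_uri) tuples per document.

```python
# Simplified sketch of annotation evaluation as done in frameworks like GERBIL:
# per-document (macro) and corpus-level (micro) F1 over annotation sets.

def f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def evaluate(gold, predicted):
    """gold/predicted: one set of (start, end, uri) annotations per document."""
    doc_f1, tp_sum, fp_sum, fn_sum = [], 0, 0, 0
    for g, p in zip(gold, predicted):
        tp, fp, fn = len(g & p), len(p - g), len(g - p)
        doc_f1.append(f1(tp, fp, fn))
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    return {
        "macro_f1": sum(doc_f1) / len(doc_f1),  # average of per-document F1
        "micro_f1": f1(tp_sum, fp_sum, fn_sum),  # F1 over pooled counts
    }

gold = [{(0, 6, "dbr:Berlin")}, {(0, 5, "dbr:Paris"), (10, 16, "dbr:France")}]
pred = [{(0, 6, "dbr:Berlin")}, {(0, 5, "dbr:Paris")}]
scores = evaluate(gold, pred)
```

Micro and macro averages can diverge noticeably on skewed datasets, which is one reason comparable frameworks report both.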

219 citations

Book ChapterDOI
29 May 2016
TL;DR: This work provides an approach driven by a core QA vocabulary that is aligned to existing, powerful ontologies provided by domain-specific communities; it is agnostic to implementation details and inherently follows the linked data principles.
Abstract: It is very challenging to access the knowledge expressed within big data sets. Question answering (QA) aims at making sense out of data via a simple-to-use interface. However, QA systems are very complex and earlier approaches are mostly singular and monolithic implementations for QA in specific domains. Therefore, it is cumbersome and inefficient to design and implement new or improved approaches, in particular as many components are not reusable. Hence, there is a strong need for enabling best-of-breed QA systems, where the best performing components are combined, aiming at the best quality achievable in the given domain. Taking into account the high variety of functionality that might be of use within a QA system and therefore reused in new QA systems, we provide an approach driven by a core QA vocabulary that is aligned to existing, powerful ontologies provided by domain-specific communities. We achieve this by a methodology for binding existing vocabularies to our core QA vocabulary without re-creating the information provided by external components. We thus provide a practical approach for rapidly establishing new domain-specific QA systems, while the core QA vocabulary is re-usable across multiple domains. To the best of our knowledge, this is the first approach to open QA systems that is agnostic to implementation details and that inherently follows the linked data principles.
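The binding idea can be sketched as emitting linked-data triples that align a domain ontology to a core QA vocabulary instead of duplicating the domain information. The namespaces and property names below are placeholders for illustration, not the paper's actual qa vocabulary.

```python
# Illustrative sketch of binding a domain annotation to a core QA vocabulary,
# emitted as N-Triples using only the standard library.

QA = "http://example.org/qa#"        # assumed core QA namespace (placeholder)
DOMAIN = "http://example.org/geo#"   # assumed domain ontology (placeholder)
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

def triple(s, p, o):
    return f"<{s}> <{p}> <{o}> ."

def annotate_question(question_uri, span_uri, domain_class):
    """Link a question span to a domain concept via the core vocabulary."""
    return [
        triple(span_uri, QA + "annotationOf", question_uri),
        # alignment: the domain class is declared a subclass of the core
        # concept, so domain communities keep their own ontologies intact
        triple(domain_class, RDFS + "subClassOf", QA + "Concept"),
        triple(span_uri, QA + "hasConcept", domain_class),
    ]

triples = annotate_question("urn:q1", "urn:q1-span-0-6", DOMAIN + "City")
```

Because the alignment is expressed as ordinary triples, existing domain data needs no re-modelling; a reasoner or SPARQL query can follow the subclass link at query time.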

44 citations

Book ChapterDOI
29 May 2016
TL;DR: This work follows the research agenda of establishing an ecosystem for components of QA systems, which will enable the QA community to elevate the reusability of such components and to intensify their research activities.
Abstract: Question answering (QA) systems focus on making sense out of data via an easy-to-use interface. However, these systems are very complex and integrate a lot of technology tightly. Previously presented QA systems are mostly singular and monolithic implementations. Hence, their reusability is limited. In contrast, we follow the research agenda of establishing an ecosystem for components of QA systems, which will enable the QA community to elevate the reusability of such components and to intensify their research activities.

37 citations

Book ChapterDOI
05 Jun 2017
TL;DR: The main goal is to show how the research community can use Qanary to gain new insights into QA processes; this is illustrated by focusing on the Entity Linking task w.r.t. textual natural language input, a fundamental step in most QA processes.
Abstract: The field of Question Answering (QA) is very multi-disciplinary as it requires expertise from a large number of areas such as natural language processing (NLP), artificial intelligence, machine learning, information retrieval, speech recognition and semantic technologies. In the past years a large number of QA systems were proposed using approaches from different fields and focusing on particular tasks in the QA process. Unfortunately, most of these systems cannot be easily reused or extended, and their results cannot be easily reproduced, since the systems are mostly implemented in a monolithic fashion, lack standardized interfaces and are often not open source or available as Web services. To address these issues we developed the knowledge-based Qanary methodology for choreographing QA pipelines distributed over the Web. Qanary employs the qa vocabulary as an exchange format for typical QA components. As a result, QA systems can be built using the Qanary methodology in a simpler, more flexible and standardized way while becoming knowledge-driven instead of being process-oriented. This paper presents the components and services that are integrated using the qa vocabulary and the Qanary methodology within the Qanary ecosystem. Moreover, we show how the Qanary ecosystem can be used to analyse QA processes to detect weaknesses and research gaps. We illustrate this by focusing on the Entity Linking (EL) task w.r.t. textual natural language input, which is a fundamental step in most QA processes. Additionally, we contribute the first EL benchmark for QA, as open source. Our main goal is to show how the research community can use Qanary to gain new insights into QA processes.
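The Entity Linking step analysed here can be reduced to two sub-tasks: candidate generation from a label index and candidate ranking. The toy index and popularity scores below are invented for illustration; real Qanary EL components query knowledge bases such as DBpedia.

```python
# Minimal Entity Linking (EL) sketch: look up candidate entities for a
# surface form, then rank them. Index contents are toy data.

LABEL_INDEX = {  # surface form -> list of (entity_uri, popularity)
    "paris": [("dbr:Paris", 0.9), ("dbr:Paris,_Texas", 0.2)],
    "berlin": [("dbr:Berlin", 0.95)],
}

def link(surface_form):
    """Return the most popular candidate entity, or None if unknown."""
    candidates = LABEL_INDEX.get(surface_form.lower(), [])
    if not candidates:
        return None
    # rank candidates by popularity and keep the best one
    return max(candidates, key=lambda c: c[1])[0]

linked = link("Paris")
```

A benchmark like the one contributed in the paper measures exactly how often such a linker picks the referent intended by the question, which popularity alone often gets wrong for ambiguous names.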

23 citations

01 Jan 2014
TL;DR: This system provides a semi-automatic approach for instance-level error detection in ontologies which is agnostic of the underlying Linked Data knowledge base and works at very low cost.
Abstract: Over the past years, a vast number of datasets have been published based on Semantic Web standards, which provides an opportunity for creating novel industrial applications. However, industrial requirements on data quality are high while the time to market as well as the required costs for data preparation have to be kept low. Unfortunately, many Linked Data sources are error-prone, which prevents their direct use in productive systems. Hence, (semi-)automatic quality assurance processes are needed as manual ontology repair procedures by domain experts are expensive and time consuming. In this article, we present CROCUS – a pipeline for cluster-based ontology data cleansing. Our system provides a semi-automatic approach for instance-level error detection in ontologies which is agnostic of the underlying Linked Data knowledge base and works at very low cost. CROCUS was evaluated on two datasets. The experiments show that we are able to detect errors with high recall.
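The core intuition of cluster-based error detection can be shown with a deliberately simplified stand-in: embed each instance as a numeric feature vector and flag instances unusually far from the centroid. CROCUS itself clusters Linked Data instances; this z-score sketch only illustrates the idea, and the data is toy data.

```python
# Simplified stand-in for CROCUS-style instance-level error detection.
import math

def mean(vectors):
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def error_candidates(instances, threshold=1.5):
    """Flag instance ids whose distance to the centroid exceeds the mean
    distance by more than threshold standard deviations."""
    ids, vectors = zip(*instances)
    c = mean(vectors)
    dists = [distance(v, c) for v in vectors]
    mu = sum(dists) / len(dists)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in dists) / len(dists))
    return [i for i, d in zip(ids, dists) if d > mu + threshold * sigma]

# toy data: a numeric property of "city" instances; the last one is wrong
cities = [("c1", [3.5]), ("c2", [2.1]), ("c3", [1.8]), ("bad", [900.0])]
outliers = error_candidates(cities)
```

As in CROCUS, the flagged instances are only candidates: a semi-automatic process still hands them to a domain expert for confirmation, which keeps costs low without fully trusting the statistics.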

7 citations


Cited by
Journal ArticleDOI
TL;DR: An in-depth overview of the current state-of-the-art of aspect-level sentiment analysis is given, showing the tremendous progress that has been made in finding the target, which can be an entity as such or some aspect of it, and the corresponding sentiment.
Abstract: The field of sentiment analysis, in which sentiment is gathered, analyzed, and aggregated from text, has seen a lot of attention in the last few years. The corresponding growth of the field has resulted in the emergence of various subareas, each addressing a different level of analysis or research question. This survey focuses on aspect-level sentiment analysis, where the goal is to find and aggregate sentiment on entities mentioned within documents or aspects of them. An in-depth overview of the current state-of-the-art is given, showing the tremendous progress that has already been made in finding the target, which can be an entity as such or some aspect of it, and the corresponding sentiment. Aspect-level sentiment analysis yields very fine-grained sentiment information which can be useful for applications in various domains. Current solutions are categorized based on whether they provide a method for aspect detection, sentiment analysis, or both. Furthermore, a breakdown based on the type of algorithm used is provided. For each discussed study, the reported performance is included. To facilitate the quantitative evaluation of the various proposed methods, a call is made for the standardization of the evaluation methodology that includes the use of shared data sets. Semantically-rich concept-centric aspect-level sentiment analysis is discussed and identified as one of the most promising future research directions.
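The two sub-tasks the survey categorizes, aspect detection and sentiment assignment, can be sketched with a tiny lexicon-based baseline: detect aspect terms, then attribute each opinion word to the nearest aspect. The lexicons here are toy data; the surveyed systems use far richer models.

```python
# Tiny lexicon-based sketch of aspect-level sentiment analysis.

ASPECTS = {"battery", "screen", "price"}          # toy aspect lexicon
OPINIONS = {"great": 1, "good": 1, "poor": -1, "terrible": -1}  # toy lexicon

def aspect_sentiment(sentence):
    tokens = sentence.lower().replace(",", "").split()
    aspect_pos = [(i, t) for i, t in enumerate(tokens) if t in ASPECTS]
    scores = {t: 0 for _, t in aspect_pos}
    for i, tok in enumerate(tokens):
        if tok in OPINIONS and aspect_pos:
            # attribute the opinion word to the closest aspect term
            _, nearest = min(aspect_pos, key=lambda a: abs(a[0] - i))
            scores[nearest] += OPINIONS[tok]
    return scores

result = aspect_sentiment("The battery is great, but the screen is terrible")
```

Even this baseline shows why aspect-level output is more informative than a single document score: the same sentence carries opposite polarities for different aspects.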

579 citations

Journal ArticleDOI
TL;DR: An overview of the techniques used in current QA systems over KBs is given, which were evaluated on a popular series of benchmarks: Question Answering over Linked Data and WebQuestions.
Abstract: The Semantic Web contains an enormous amount of information in the form of knowledge bases (KB). To make this information available, many question answering (QA) systems over KBs were created in the last years. Building a QA system over KBs is difficult because there are many different challenges to be solved. In order to address these challenges, QA systems generally combine techniques from natural language processing, information retrieval, machine learning and Semantic Web. The aim of this survey is to give an overview of the techniques used in current QA systems over KBs. We present the techniques used by the QA systems which were evaluated on a popular series of benchmarks: Question Answering over Linked Data. Techniques that solve the same task are first grouped together and then described. The advantages and disadvantages are discussed for each technique. This allows a direct comparison of similar techniques. Additionally, we point to techniques that are used over WebQuestions and SimpleQuestions, which are two other popular benchmarks for QA systems.
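The core task the surveyed systems solve is mapping a natural language question to a SPARQL query over a KB. A deliberately naive template-based sketch (one of the technique families such surveys cover) is shown below; the patterns and property URIs are toy stand-ins for full QA pipelines.

```python
# Naive template-based question-to-SPARQL mapping (illustrative only).
import re

TEMPLATES = [
    # (question pattern, SPARQL template filled with the matched label)
    (re.compile(r"who wrote (.+)\?", re.I),
     'SELECT ?author WHERE {{ ?book rdfs:label "{0}"@en ; dbo:author ?author }}'),
    (re.compile(r"where is (.+)\?", re.I),
     'SELECT ?place WHERE {{ ?thing rdfs:label "{0}"@en ; dbo:location ?place }}'),
]

def to_sparql(question):
    """Return a SPARQL query string, or None if no template matches."""
    for pattern, template in TEMPLATES:
        m = pattern.match(question.strip())
        if m:
            return template.format(m.group(1))
    return None

query = to_sparql("Who wrote Dracula?")
```

Templates are brittle, which is exactly why the surveyed systems layer entity linking, relation classification and learning-based query construction on top of this basic mapping step.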

268 citations

Proceedings ArticleDOI
03 Apr 2017
TL;DR: This work trains a neural network for answering simple questions in an end-to-end manner, leaving all decisions to the model; it contains a nested word/character-level question encoder that handles out-of-vocabulary and rare words while still exploiting word-level semantics.
Abstract: Question Answering (QA) systems over Knowledge Graphs (KG) automatically answer natural language questions using facts contained in a knowledge graph. Simple questions, which can be answered by the extraction of a single fact, constitute a large part of questions asked on the web but still pose challenges to QA systems, especially when asked against a large knowledge resource. Existing QA systems usually rely on various components, each specialised in solving a different sub-task of the problem (such as segmentation, entity recognition, disambiguation, and relation classification). In this work, we follow a quite different approach: we train a neural network for answering simple questions in an end-to-end manner, leaving all decisions to the model. It learns to rank subject-predicate pairs to enable the retrieval of relevant facts given a question. The network contains a nested word/character-level question encoder which makes it possible to handle out-of-vocabulary and rare words while still exploiting word-level semantics. Our approach achieves results competitive with state-of-the-art end-to-end approaches that rely on an attention mechanism.
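The rank-and-retrieve setup described above can be illustrated without the neural model: score each subject-predicate pair against the question and return the fact of the best-scoring pair. As a non-neural stand-in for the learned encoder, this sketch uses character-trigram Jaccard similarity; the facts are toy data.

```python
# Non-neural stand-in for ranking subject-predicate pairs against a question.

def trigrams(text):
    t = f"##{text.lower()}##"  # pad so short tokens still yield trigrams
    return {t[i:i + 3] for i in range(len(t) - 2)}

def score(question, subject_label, predicate_label):
    """Jaccard similarity between question and candidate-pair trigrams."""
    q = trigrams(question)
    cand = trigrams(subject_label) | trigrams(predicate_label)
    return len(q & cand) / len(q | cand)

def best_fact(question, candidates):
    """candidates: list of (subject_label, predicate_label, answer)."""
    return max(candidates, key=lambda c: score(question, c[0], c[1]))

facts = [
    ("Berlin", "country", "Germany"),
    ("Berlin", "population", "3.6 million"),
]
answer = best_fact("What country is Berlin in?", facts)[2]
```

Character-level matching is also why the paper's nested word/character encoder copes with rare and out-of-vocabulary words: sub-word units still overlap even when a full word was never seen in training.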

248 citations

Journal ArticleDOI
TL;DR: This survey analyzes 62 different SQA systems, which are systematically and manually selected using predefined inclusion and exclusion criteria, leading to 72 selected publications out of 1960 candidates, and identifies common challenges, structure solutions, and provide recommendations for future systems.
Abstract: Semantic Question Answering (SQA) removes two major access requirements to the Semantic Web: the mastery of a formal query language like SPARQL and knowledge of a specific vocabulary. Because of the complexity of natural language, SQA presents difficult challenges and many research opportunities. Instead of a shared effort, however, many essential components are redeveloped, which is an inefficient use of researchers' time and resources. This survey analyzes 62 different SQA systems, which are systematically and manually selected using predefined inclusion and exclusion criteria, leading to 72 selected publications out of 1960 candidates. We identify common challenges, structure solutions, and provide recommendations for future systems. This work is based on publications from the end of 2010 to July 2015 and is also compared to older but similar surveys.

205 citations

Book ChapterDOI
11 Oct 2015
TL;DR: TabEL differs from previous work by weakening the assumption that the semantics of a table can be mapped to pre-defined types and relations found in the target KB, and enforces soft constraints in the form of a graphical model that assigns higher likelihood to sets of entities that tend to co-occur in Wikipedia documents and tables.
Abstract: Web tables form a valuable source of relational data. The Web contains an estimated 154 million HTML tables of relational data, with Wikipedia alone containing 1.6 million high-quality tables. Extracting the semantics of Web tables to produce machine-understandable knowledge has become an active area of research. A key step in extracting the semantics of Web content is entity linking (EL): the task of mapping a phrase in text to its referent entity in a knowledge base (KB). In this paper we present TabEL, a new EL system for Web tables. TabEL differs from previous work by weakening the assumption that the semantics of a table can be mapped to pre-defined types and relations found in the target KB. Instead, TabEL enforces soft constraints in the form of a graphical model that assigns higher likelihood to sets of entities that tend to co-occur in Wikipedia documents and tables. In experiments, TabEL significantly reduces error when compared to current state-of-the-art table EL systems, including a 75% error reduction on Wikipedia tables and a 60% error reduction on Web tables. We also make our parsed Wikipedia table corpus and test datasets publicly available for future work.
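The soft-constraint idea can be sketched as scoring a candidate assignment of entities to table cells by how often the chosen entities co-occur, then picking the most coherent joint assignment. The co-occurrence counts below are invented, and the exhaustive search stands in for TabEL's graphical-model inference.

```python
# Sketch of coherence-based table entity linking via co-occurrence counts.
from itertools import combinations, product

COOCCUR = {  # symmetric co-occurrence counts between entity pairs (toy data)
    frozenset({"dbr:Paris", "dbr:France"}): 120,
    frozenset({"dbr:Paris,_Texas", "dbr:France"}): 1,
}

def coherence(entity_set):
    """Sum pairwise co-occurrence over all entity pairs in the set."""
    return sum(COOCCUR.get(frozenset(p), 0)
               for p in combinations(entity_set, 2))

def best_assignment(candidate_lists):
    """Pick one entity per cell so the joint set is most coherent
    (exhaustive search; real systems use approximate inference)."""
    return max(product(*candidate_lists), key=coherence)

# one candidate list per table cell in the same row
cells = [["dbr:Paris", "dbr:Paris,_Texas"], ["dbr:France"]]
chosen = best_assignment(cells)
```

Because the co-occurrence statistics come from data rather than a fixed KB schema, the constraint stays soft: an unusual but genuine entity combination is merely penalized, not ruled out.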

162 citations