
Showing papers on "Ontology-based data integration" published in 2011


Book ChapterDOI
TL;DR: This paper reports results and lessons learned from the Ontology Alignment Evaluation Initiative (OAEI), a benchmarking initiative for ontology matching, and describes the evaluation design used in the OAEI campaigns in terms of datasets, evaluation criteria and workflows.
Abstract: In the area of semantic technologies, benchmarking and systematic evaluation is not yet as established as in other areas of computer science, e.g., information retrieval. In spite of successful attempts, more effort and experience are required in order to achieve such a level of maturity. In this paper, we report results and lessons learned from the Ontology Alignment Evaluation Initiative (OAEI), a benchmarking initiative for ontology matching. The goal of this work is twofold: on the one hand, we document the state of the art in evaluating ontology matching methods and provide potential participants of the initiative with a better understanding of the design and the underlying principles of the OAEI campaigns. On the other hand, we report experiences gained in this particular area of semantic technologies to potential developers of benchmarking for other kinds of systems. For this purpose, we describe the evaluation design used in the OAEI campaigns in terms of datasets, evaluation criteria and workflows, provide a global view on the results of the campaigns carried out from 2005 to 2010 and discuss upcoming trends, both specific to ontology matching and generally relevant for the evaluation of semantic technologies. Finally, we argue that there is a need for a further automation of benchmarking to shorten the feedback cycle for tool developers.
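The precision/recall-style scoring these campaigns apply to submitted alignments can be summarised in a few lines. The sketch below assumes correspondences are simple (entity1, entity2, relation) tuples and uses invented data rather than an actual OAEI reference alignment.

```python
# Minimal sketch of the standard alignment evaluation used in matching
# campaigns: compare a system alignment against a reference alignment.
# Correspondences are modeled as (entity1, entity2, relation) tuples;
# the data is illustrative, not an actual OAEI dataset.

def evaluate(system, reference):
    """Return (precision, recall, f_measure) for two sets of correspondences."""
    correct = len(system & reference)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

reference = {("o1:Paper", "o2:Article", "="), ("o1:Author", "o2:Writer", "=")}
system = {("o1:Paper", "o2:Article", "="), ("o1:Author", "o2:Person", "=")}

print(evaluate(system, reference))  # (0.5, 0.5, 0.5)
```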

290 citations


Journal ArticleDOI
TL;DR: MASTRO is a Java tool for ontology-based data access (OBDA) developed at Sapienza Università di Roma and at the Free University of Bozen-Bolzano that provides optimized algorithms for answering expressive queries, as well as features for intensional reasoning and consistency checking.
Abstract: In this paper we present MASTRO, a Java tool for ontology-based data access (OBDA) developed at Sapienza Università di Roma and at the Free University of Bozen-Bolzano. MASTRO manages OBDA systems in which the ontology is specified in DL-Lite_{A,id}, a logic of the DL-Lite family of tractable Description Logics specifically tailored to ontology-based data access, and is connected to external JDBC-enabled data management systems through semantic mappings that associate SQL queries over the external data to the elements of the ontology. Advanced forms of integrity constraints, which turned out to be very useful in practical applications, are also enabled over the ontologies. Optimized algorithms for answering expressive queries are provided, as well as features for intensional reasoning and consistency checking. MASTRO provides a proprietary API, an OWLAPI-compatible interface, and a plugin for the Protégé 4 ontology editor. It has been successfully used in several projects carried out in collaboration with important organizations, on which we briefly comment in this paper.
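To illustrate the general OBDA idea of mappings that pair SQL queries over the sources with ontology elements, here is a minimal, self-contained sketch over an in-memory SQLite table; the mapping format, schema, and names are invented for illustration and are not MASTRO's actual mapping language.

```python
# Illustrative sketch of the OBDA mapping idea: each mapping pairs an SQL
# query over the source database with a template producing ontology (ABox)
# assertions.  Schema, mapping format, and names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Ada", "R&D"), (2, "Bob", "Sales")])

# Each mapping: SQL query -> function turning a result row into assertions.
mappings = [
    ("SELECT id, name FROM employee",
     lambda r: [(f":emp{r[0]}", "rdf:type", ":Employee"),
                (f":emp{r[0]}", ":name", r[1])]),
    ("SELECT id FROM employee WHERE dept = 'R&D'",
     lambda r: [(f":emp{r[0]}", "rdf:type", ":Researcher")]),
]

virtual_abox = [triple
                for sql, to_triples in mappings
                for row in conn.execute(sql)
                for triple in to_triples(row)]

for triple in virtual_abox:
    print(triple)
```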

282 citations


Journal ArticleDOI
TL;DR: A new measure based on the exploitation of the taxonomical structure of a biomedical ontology is proposed; evaluated with SNOMED CT as the input ontology, it outperforms most of the previous measures while avoiding some of their limitations.

239 citations


Proceedings ArticleDOI
16 Jul 2011
TL;DR: The combined approach is described, which incorporates the information given by the ontology into the data and employs query rewriting to eliminate spurious answers in ontology-based data access.
Abstract: The use of ontologies for accessing data is one of the most exciting new applications of description logics in databases and other information systems. A realistic way of realising sufficiently scalable ontology-based data access in practice is by reduction to querying relational databases. In this paper, we describe the combined approach, which incorporates the information given by the ontology into the data and employs query rewriting to eliminate spurious answers. We illustrate this approach for ontologies given in the DL-Lite family of description logics and briefly discuss the results obtained for the EL family.
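A toy illustration of the "incorporate the ontology into the data" half of the combined approach is given below: the data is saturated under simple class inclusions before querying. The rewriting/filtering step that eliminates spurious answers caused by existentially implied individuals is deliberately omitted, and all names are illustrative.

```python
# Toy sketch of the data-expansion half of the combined approach: saturate a
# set of facts under simple class inclusions, then answer a query over the
# enriched data.  Real DL-Lite/EL query answering also needs a rewriting or
# filtering step for existentially implied individuals, omitted here.

inclusions = {            # A ⊑ B represented as "A": {"B", ...}
    "Professor": {"Teacher"},
    "Teacher": {"Person"},
}

facts = {("mary", "Professor"), ("john", "Student")}

def saturate(facts, inclusions):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for ind, cls in list(facts):
            for sup in inclusions.get(cls, ()):
                if (ind, sup) not in facts:
                    facts.add((ind, sup))
                    changed = True
    return facts

enriched = saturate(facts, inclusions)
print(sorted(ind for ind, cls in enriched if cls == "Person"))  # ['mary']
```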

165 citations


Journal ArticleDOI
TL;DR: This paper proposes a set of guidelines for importing required terms from an external resource into a target ontology, describes the methodology and its implementation, presents some examples of its application, and outlines future work and extensions.
Abstract: While the Web Ontology Language OWL provides a mechanism to import ontologies, this mechanism is not always suitable. Current editing tools present challenges for working with large ontologies and direct OWL imports can prove impractical for day-to-day development. Furthermore, external ontologies often undergo continuous change which can introduce conflicts when integrated with multiple efforts. Finally, importing heterogeneous ontologies in their entirety may lead to inconsistencies or unintended inferences. In this paper we propose a set of guidelines for importing required terms from an external resource into a target ontology. We describe the methodology, its implementation, present some examples of this application, and outline future work and extensions.
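As a rough illustration of importing individual terms rather than whole ontologies, the sketch below copies only a term's label and asserted named superclasses from a source graph into a target graph using rdflib. The term IRI, the source ontology, and the choice of which axioms to copy are assumptions for illustration, not the paper's exact guidelines.

```python
# Hedged sketch of importing a single external term (IRI, label, direct
# named superclasses) into a target ontology instead of owl:imports-ing the
# whole source.  The source URL is large and fetched over the network; the
# chosen term and copied axioms are illustrative assumptions.
from rdflib import Graph, URIRef, RDFS

source = Graph().parse("http://purl.obolibrary.org/obo/cl.owl")
target = Graph()

term = URIRef("http://purl.obolibrary.org/obo/CL_0000084")  # example term

# Copy only the minimal information needed in the target ontology.
for label in source.objects(term, RDFS.label):
    target.add((term, RDFS.label, label))
for parent in source.objects(term, RDFS.subClassOf):
    if isinstance(parent, URIRef):          # skip anonymous restrictions
        target.add((term, RDFS.subClassOf, parent))

target.serialize("imported_terms.ttl", format="turtle")
```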

165 citations


Book ChapterDOI
01 Jan 2011
TL;DR: This chapter presents a survey of the most relevant methods, techniques and tools used for the task of ontology learning, explaining how BOEMIE addresses problems observed in existing systems and contributes to issues that are not frequently considered by existing approaches.
Abstract: Ontology learning is the process of acquiring (constructing or integrating) an ontology (semi-)automatically. Being a knowledge acquisition task, it is a complex activity, which becomes even more complex in the context of the BOEMIE project, due to the management of multimedia resources and the multi-modal semantic interpretation that they require. The purpose of this chapter is to present a survey of the most relevant methods, techniques and tools used for the task of ontology learning. Adopting a practical perspective, an overview of the main activities involved in ontology learning is presented. This breakdown of the learning process is used as a basis for the comparative analysis of existing tools and approaches. The comparison is done along dimensions that emphasize the particular interests of the BOEMIE project. In this context, ontology learning in BOEMIE is treated and compared to the state of the art, explaining how BOEMIE addresses problems observed in existing systems and contributes to issues that are not frequently considered by existing approaches.

158 citations


Journal ArticleDOI
TL;DR: LexInfo, a model for the linguistic grounding of ontologies, is presented; implemented as an OWL ontology and freely available together with an API, it allows us to associate linguistic information with elements in an ontology at any level of linguistic description and expressivity.

147 citations


Journal ArticleDOI
TL;DR: This work investigates the literature on both metamodelling and ontologies in order to identify ways in which they can be made compatible and linked in such a way as to benefit both communities and create a contribution to a coherent underpinning theory for software engineering.

143 citations


Journal ArticleDOI
TL;DR: This restructured ontology can be used to identify immune cells by flow cytometry, supports sophisticated biological queries involving cells, and helps generate new hypotheses about cell function based on similarities to other cell types.
Abstract: The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and assist with classification. Here we report on the generation of computable definitions for the hematopoietic cell types in the CL. Computable definitions for over 340 CL classes have been created using a genus-differentia approach. These define cell types according to multiple axes of classification such as the protein complexes found on the surface of a cell type, the biological processes participated in by a cell type, or the phenotypic characteristics associated with a cell type. We employed automated reasoners to verify the ontology and to reveal mistakes in manual curation. The implementation of this process exposed areas in the ontology where new cell type classes were needed to accommodate species-specific expression of cellular markers. Our use of reasoners also inferred new relationships within the CL, and between the CL and the contributing ontologies. This restructured ontology can be used to identify immune cells by flow cytometry, supports sophisticated biological queries involving cells, and helps generate new hypotheses about cell function based on similarities to other cell types. Use of computable definitions enhances the development of the CL and supports the interoperability of OBO ontologies.
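The genus-differentia pattern described here can be written down compactly with an OWL toolkit. The owlready2 sketch below defines a hypothetical CD4-positive T cell class as "a T cell that has some CD4 receptor complex as plasma membrane part"; the class and property names are simplified stand-ins, not the actual CL/PRO identifiers.

```python
# Hedged sketch of a genus-differentia computable definition, in the spirit
# of the CL hematopoietic cell definitions.  Names are simplified stand-ins
# for the real CL/PRO terms.
from owlready2 import Thing, ObjectProperty, get_ontology

onto = get_ontology("http://example.org/cells.owl")

with onto:
    class Cell(Thing): pass
    class TCell(Cell): pass
    class ProteinComplex(Thing): pass
    class CD4ReceptorComplex(ProteinComplex): pass
    class has_plasma_membrane_part(ObjectProperty): pass

    class CD4PositiveTCell(Cell):
        # genus: TCell; differentia: has_plasma_membrane_part some CD4ReceptorComplex
        equivalent_to = [TCell & has_plasma_membrane_part.some(CD4ReceptorComplex)]

# A DL reasoner (e.g. owlready2's sync_reasoner()) can then classify
# individuals against such definitions and surface curation errors.
```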

140 citations


Journal ArticleDOI
TL;DR: This article describes how to adapt a semi-automatic method for learning OWL class expressions to the ontology engineering use case and performs rigorous performance optimization of the underlying algorithms for providing instant suggestions to the user.

134 citations


Journal ArticleDOI
TL;DR: Methods developed in the fields of Natural Language Processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents.

Book ChapterDOI
01 Jul 2011
TL;DR: In recent decades, the use of ontologies in information systems has become more and more popular in various fields, such as web technologies, database integration, multi-agent systems, and natural language processing.
Abstract: In recent decades, the use of ontologies in information systems has become more and more popular in various fields, such as web technologies, database integration, multi-agent systems, natural language processing, etc. Artificial intelligence researchers initially borrowed the word "ontology" from philosophy; the word then spread to many scientific domains, and ontologies are now used in a wide range of developments.

Journal ArticleDOI
TL;DR: Preliminary results of an ongoing effort to normalize the Gene Ontology by explicitly stating the definitions of compositional classes in a form that can be used by reasoners are presented.

Proceedings ArticleDOI
16 Jul 2011
TL;DR: A new approach is reported that enables us to efficiently extract a polynomial representation of the family of all locality-based modules of an ontology, and the fundamental algorithm to pursue this task is described.
Abstract: Extracting a subset of a given ontology that captures all the ontology's knowledge about a specified set of terms is a well-understood task. This task can be based, for instance, on locality-based modules. However, a single module allows us to understand neither the topicality, connectedness, structure, or superfluous parts of an ontology, nor the agreement between actual and intended modeling. The strong logical properties of locality-based modules suggest that the family of all such modules of an ontology can support comprehension of the ontology as a whole. However, extracting that family is not feasible, since the number of locality-based modules of an ontology can be exponential w.r.t. its size. In this paper we report on a new approach that enables us to efficiently extract a polynomial representation of the family of all locality-based modules of an ontology. We also describe the fundamental algorithm to pursue this task, and report on the experiments carried out and the results obtained.
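To make the notion of a locality-based module concrete, here is a deliberately simplified extractor for a taxonomy of atomic subclass axioms, where ⊥-locality degenerates to checking whether an axiom's left-hand side is already in the signature. Real extractors handle arbitrary OWL axioms; the data is illustrative.

```python
# Toy sketch of locality-based module extraction over atomic subclass axioms
# (sub, sup).  For such axioms, ⊥-locality w.r.t. a signature reduces to:
# the axiom is local iff its left-hand side is outside the signature.

axioms = [("Professor", "Teacher"), ("Teacher", "Person"),
          ("Student", "Person"), ("Course", "Event")]

def bottom_module(axioms, signature):
    module, sig = set(), set(signature)
    changed = True
    while changed:
        changed = False
        for ax in axioms:
            sub, sup = ax
            if ax not in module and sub in sig:   # not ⊥-local: pull it in
                module.add(ax)
                sig.update(ax)                    # module signature grows
                changed = True
    return module

print(sorted(bottom_module(axioms, {"Professor"})))
# [('Professor', 'Teacher'), ('Teacher', 'Person')]
```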

Proceedings ArticleDOI
24 Oct 2011
TL;DR: This talk provides an introduction to ontology-based data management, illustrating the main ideas and techniques for using an ontology to access the data layer of an information system, and discusses several important issues that are still the subject of extensive investigation, including the need for inconsistency-tolerant query answering methods and the need to support update operations expressed over the ontology.
Abstract: Ontology-based data management aims at accessing and using data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. This new paradigm provides several interesting features, many of which have already proved effective in managing complex information systems. On the other hand, several important issues remain open, and constitute stimulating challenges for the research community. In this talk we first provide an introduction to ontology-based data management, illustrating the main ideas and techniques for using an ontology to access the data layer of an information system, and then we discuss several important issues that are still the subject of extensive investigation, including the need for inconsistency-tolerant query answering methods and the need to support update operations expressed over the ontology.

Journal ArticleDOI
TL;DR: The design and development of the NanoParticle Ontology, which is developed within the framework of the Basic Formal Ontology (BFO) and implemented in the Web Ontology Language (OWL) using well-defined ontology design principles, is discussed.

Book ChapterDOI
28 Jun 2011
TL;DR: Pythia compositionally constructs meaning representations using a vocabulary aligned to the vocabulary of a given ontology; it relies on a deep linguistic analysis that allows it to construct formal queries even for complex natural language questions.
Abstract: In this paper we present the ontology-based question answering system Pythia. It compositionally constructs meaning representations using a vocabulary aligned to the vocabulary of a given ontology. In doing so it relies on a deep linguistic analysis, which allows it to construct formal queries even for complex natural language questions (e.g. involving quantification and superlatives).

Journal ArticleDOI
03 Oct 2011 - PLOS ONE
TL;DR: The work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context is described.
Abstract: Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the application of cheminformatics in these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently being developed to represent objects across all the domains of science. Ontologies enable the definition, classification, and support for querying objects in a particular domain, enabling intelligent computer applications to be built which support the work of scientists both within the domain of interest and across interrelated neighbouring domains. Modern chemical research relies on computational techniques to filter and organise data to maximise research productivity. The objects which are manipulated in these algorithms and procedures, as well as the algorithms and procedures themselves, enjoy a kind of virtual life within computers. We will call these information entities. Here, we describe our work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. Our ontology distinguishes algorithmic, or procedural information from declarative, or factual information, and renders of particular importance the annotation of provenance to calculated data. The Chemical Information Ontology is being developed as an open collaborative project. More details, together with a downloadable OWL file, are available at http://code.google.com/p/semanticchemistry/ (license: CC-BY-SA).

Journal ArticleDOI
TL;DR: Changes and improvements made to SO are reported including new relationships to better define the mereological, spatial and temporal aspects of biological sequence.

Book ChapterDOI
23 Oct 2011
TL;DR: This paper discusses several approaches to learning a matching function between two ontologies using a small set of manually aligned concepts, and evaluates them on different pairs of financial accounting standards, showing that multilingual information can indeed improve the matching quality, even in cross-lingual scenarios.
Abstract: Ontology matching is a task that has attracted considerable attention in recent years. With very few exceptions, however, research in ontology matching has focused primarily on the development of monolingual matching algorithms. As more and more resources become available in more than one language, novel algorithms are required which are capable of matching ontologies which share more than one language, or ontologies which are multilingual but do not share any languages. In this paper, we discuss several approaches to learning a matching function between two ontologies using a small set of manually aligned concepts, and evaluate them on different pairs of financial accounting standards, showing that multilingual information can indeed improve the matching quality, even in cross-lingual scenarios. In addition to this, as current research on ontology matching does not make a satisfactory distinction between multilingual and cross-lingual ontology matching, we provide precise definitions of these terms in relation to monolingual ontology matching, and quantify their effects on different matching algorithms.
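A minimal sketch of this supervised-matching setup (not the paper's actual features or classifier) is shown below: each candidate concept pair is represented by simple label-based features and a classifier is fit on a handful of manually aligned pairs. In a genuinely cross-lingual setting the features would be computed over translations or multilingual resources; the data is invented.

```python
# Minimal sketch of learning a matching function from a small set of manually
# aligned concept pairs.  Features, classifier, and data are illustrative; in
# a cross-lingual setting the label features would come from translations or
# multilingual resources rather than raw string overlap.
from difflib import SequenceMatcher
from sklearn.linear_model import LogisticRegression

def features(label1, label2):
    sim = SequenceMatcher(None, label1.lower(), label2.lower()).ratio()
    return [sim, abs(len(label1) - len(label2))]

# (label in ontology 1, label in ontology 2, 1 = match / 0 = no match)
training = [("Revenue", "Revenues", 1), ("Liability", "Liabilities", 1),
            ("Asset", "Liability", 0), ("Equity", "Expense", 0)]

X = [features(a, b) for a, b, _ in training]
y = [label for _, _, label in training]
model = LogisticRegression().fit(X, y)

print(model.predict([features("Asset", "Assets")]))  # expected: [1]
```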

Journal ArticleDOI
TL;DR: An approach for ontology extraction on top of an RDB that incorporates a concept hierarchy as background knowledge is proposed; it is more efficient than current approaches and can be applied in fields such as eGovernment, eCommerce, and so on.
Abstract: The Relational Database (RDB) has been widely used as the back-end database of information systems. Containing a wealth of high-quality information, an RDB provides the conceptual model and metadata needed in ontology construction. However, most existing ontology building approaches convert the RDB schema without considering the knowledge residing in the database. This paper proposes an approach for ontology extraction on top of an RDB that incorporates a concept hierarchy as background knowledge. Incorporating the background knowledge in the process of building the Web Ontology Language (OWL) ontology gives two main advantages: (1) it accelerates the building process, thereby minimizing the conversion cost; and (2) the background knowledge guides the extraction of the knowledge residing in the database. The experimental simulation using a gold standard shows that the Taxonomic F-measure (TF) evaluation reaches 90% while Relation Overlap (RO) is 83.33%. In terms of processing time, this approach is more efficient than current approaches. In addition, our approach can be applied in fields such as eGovernment, eCommerce, and so on.
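The table-to-class direction of such extraction is easy to illustrate. In the sketch below, tables become classes, columns become datatype properties, and foreign keys become object properties, while a small background concept hierarchy contributes extra subclass axioms; the schema, hierarchy, and vocabulary are invented and do not reproduce the paper's algorithm.

```python
# Toy sketch of the common table-to-class direction of RDB-to-ontology
# extraction, with a background concept hierarchy adding subclass axioms.
# Schema, hierarchy, and output vocabulary are illustrative.

schema = {
    "Customer": {"columns": ["name", "email"], "fks": {}},
    "Order":    {"columns": ["date"], "fks": {"customer_id": "Customer"}},
}
background_hierarchy = {"Customer": "Agent"}   # domain knowledge, not in the DB

triples = []
for table, info in schema.items():
    triples.append((f":{table}", "rdf:type", "owl:Class"))
    if table in background_hierarchy:
        triples.append((f":{table}", "rdfs:subClassOf",
                        f":{background_hierarchy[table]}"))
    for col in info["columns"]:
        triples.append((f":{table}_{col}", "rdf:type", "owl:DatatypeProperty"))
        triples.append((f":{table}_{col}", "rdfs:domain", f":{table}"))
    for fk, target in info["fks"].items():
        triples.append((f":has{target}", "rdf:type", "owl:ObjectProperty"))
        triples.append((f":has{target}", "rdfs:domain", f":{table}"))
        triples.append((f":has{target}", "rdfs:range", f":{target}"))

print("\n".join(" ".join(t) + " ." for t in triples))
```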

Book ChapterDOI
29 May 2011
TL;DR: This work presents a solution for automatically finding schema-level links between two LOD ontologies - in the sense of ontology alignment - and shows that this solution significantly outperforms existing ontology alignment solutions on the same task.
Abstract: The Linked Open Data (LOD) is a major milestone towards realizing the Semantic Web vision, and can enable applications such as robust Question Answering (QA) systems that can answer queries requiring multiple, disparate information sources. However, realizing these applications requires relationships at both the schema and instance level, but currently the LOD only provides relationships for the latter. To address this limitation, we present a solution for automatically finding schema-level links between two LOD ontologies - in the sense of ontology alignment. Our solution, called BLOOMS+, extends our previous solution (i.e. BLOOMS) in two significant ways. BLOOMS+ 1) uses a more sophisticated metric to determine which classes between two ontologies to align, and 2) considers contextual information to further support (or reject) an alignment. We present a comprehensive evaluation of our solution using schema-level mappings from LOD ontologies to Proton (an upper level ontology) - created manually by human experts for a real world application called FactForge. We show that our solution performed well on this task. We also show that our solution significantly outperformed existing ontology alignment solutions (including our previously published work on BLOOMS) on this same task.
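The second ingredient, using contextual information to support or reject an alignment, can be illustrated with a simple neighborhood-overlap score. BLOOMS+ itself builds category forests from Wikipedia, so the structures, equivalences, and score below are only an invented stand-in for that idea.

```python
# Illustrative sketch of using contextual information to support or reject a
# candidate class alignment: compare the superclass neighborhoods of the two
# classes with a Jaccard-style score.  The data and threshold are invented.

superclasses = {
    ("lod1", "Museum"): {"Building", "CulturalInstitution", "Place"},
    ("lod2", "Museum"): {"Edifice", "CulturalInstitution", "Location"},
    ("lod2", "Mouse"):  {"Rodent", "Mammal", "Animal"},
}

def contextual_support(c1, c2, equiv):
    ctx1, ctx2 = superclasses[c1], superclasses[c2]
    # Treat known-equivalent superclasses as shared context.
    shared = {a for a in ctx1 for b in ctx2 if a == b or (a, b) in equiv}
    return len(shared) / len(ctx1 | ctx2)

equiv = {("Building", "Edifice"), ("Place", "Location")}
print(contextual_support(("lod1", "Museum"), ("lod2", "Museum"), equiv))  # 0.6
print(contextual_support(("lod1", "Museum"), ("lod2", "Mouse"), equiv))   # 0.0
```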

Journal ArticleDOI
TL;DR: A novel method, dubbed DiShIn, is proposed that effectively exploits the multiple inheritance relationships present in many biomedical ontologies by modifying the way traditional semantic similarity measures calculate the shared information content of two ontology concepts.
Abstract: The large-scale effort in developing, maintaining and making biomedical ontologies available motivates the application of similarity measures to compare ontology concepts or, by extension, the entities described therein. A common approach, known as semantic similarity, compares ontology concepts through the information content they share in the ontology. However, different disjunctive ancestors in the ontology are frequently neglected, or not properly explored, by semantic similarity measures. This paper proposes a novel method, dubbed DiShIn, that effectively exploits the multiple inheritance relationships present in many biomedical ontologies. DiShIn calculates the shared information content of two ontology concepts, based on the information content of the disjunctive common ancestors of the concepts being compared. DiShIn identifies these disjunctive ancestors through the number of distinct paths from the concepts to their common ancestors. DiShIn was applied to Gene Ontology and its performance was evaluated against state-of-the-art measures using CESSM, a publicly available evaluation platform of protein similarity measures. By modifying the way traditional semantic similarity measures calculate the shared information content, DiShIn was able to obtain a statistically significant higher correlation between semantic and sequence similarity. Moreover, the incorporation of DiShIn in existing applications that exploit multiple inheritance would reduce their execution time.
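The core idea, selecting one common ancestor per distinct path-count difference and averaging their information content instead of taking only the single most informative common ancestor, can be sketched over a toy DAG as follows. The taxonomy, IC values, and selection rule are illustrative simplifications, not the exact DiShIn definition.

```python
# Hedged sketch of the disjunctive-common-ancestor idea behind DiShIn-style
# similarity: in a DAG with multiple inheritance, keep one common ancestor
# per distinct path-count difference and average their information content
# (IC) rather than using only the most informative common ancestor.

parents = {               # child -> parents (multiple inheritance allowed)
    "A": [], "B": ["A"], "C": ["A"],
    "D": ["B", "C"], "E": ["B"],
}
ic = {"A": 0.0, "B": 1.0, "C": 1.0, "D": 2.0, "E": 2.0}

def ancestors(x):
    result = {x}
    for p in parents[x]:
        result |= ancestors(p)
    return result

def path_count(x, a):
    """Number of distinct upward paths from x to ancestor a."""
    if x == a:
        return 1
    return sum(path_count(p, a) for p in parents[x] if p == a or a in ancestors(p))

def shared_ic(x, y):
    common = ancestors(x) & ancestors(y)
    best = {}  # path-count difference -> max IC among those ancestors
    for a in common:
        diff = abs(path_count(x, a) - path_count(y, a))
        best[diff] = max(best.get(diff, 0.0), ic[a])
    return sum(best.values()) / len(best)

print(shared_ic("D", "E"))  # 0.5: averages IC of B (diff 0) and A (diff 1)
```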

Journal ArticleDOI
TL;DR: The underlying knowledge base, which is based on the formal ontology OntoCAPE, is presented, the design and implementation of a prototypical integration software are described, and the application of the software prototype in a large industrial use case is reported.

Journal ArticleDOI
01 Mar 2011
TL;DR: A new method, called FFCA-Merge, is proposed that combines WordNet and Fuzzy Formal Concept Analysis (FFCA) techniques for merging ontologies of the same domain; experimental results indicate that it can merge domain ontologies effectively.
Abstract: Many different contents and structures exist in constructed ontologies, including those that exist in the same domain. If extant domain ontologies can be used, time and money can be saved. However, domain knowledge changes fast. In addition, the extant domain ontologies may require updates to solve domain problems. The reuse of extant ontologies is an important topic for their application. Thus, the integration of extant domain ontologies is of considerable importance. In this paper, we propose a new method for combining the WordNet and Fuzzy Formal Concept Analysis (FFCA) techniques for merging ontologies with the same domain, called FFCA-Merge. Through the method, two extant ontologies can be converted into a fuzzy ontology. The new fuzzy ontology is more flexible than a general ontology. The experimental results indicate that our method can merge domain ontologies effectively.

Journal ArticleDOI
TL;DR: OntoCmaps is presented, a domain-independent and open ontology learning tool that extracts deep semantic representations from corpora and generates rich conceptual representations in the form of concept maps; an innovative filtering mechanism based on metrics from graph theory is also proposed.
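As a generic illustration of filtering extracted terms with graph-theoretic metrics (not OntoCmaps' actual mechanism), the sketch below keeps only the terms whose betweenness centrality in a small co-occurrence graph exceeds a threshold; the graph, metric choice, and threshold are assumptions.

```python
# Sketch of filtering candidate concepts by graph-theoretic importance:
# build a graph of extracted terms and their relation/co-occurrence links,
# then keep terms whose centrality exceeds a threshold.  The graph, metric,
# and threshold are illustrative.
import networkx as nx

g = nx.Graph()
g.add_edges_from([("ontology", "learning"), ("ontology", "concept"),
                  ("concept", "relation"), ("learning", "corpus"),
                  ("corpus", "tokenizer")])

centrality = nx.betweenness_centrality(g)
kept = [term for term, score in centrality.items() if score >= 0.3]
print(sorted(kept))  # ['concept', 'corpus', 'learning', 'ontology']
```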

Journal ArticleDOI
TL;DR: This paper proposes a new ontology, called OM (Ontology of units of Measure and related concepts), which defines the complete set of concepts in the domain as distinguished in the textual standards, and can answer a wider range of competency questions than the existing approaches do.

Journal ArticleDOI
01 Jul 2011
TL;DR: This paper presents an approach to extract relevant ontology concepts and their relationships from a knowledge base of heterogeneous text documents, shows the architecture of the implemented system, and discusses experiments in a real-world context.
Abstract: Ontologies have been frequently employed in order to solve problems derived from the management of shared distributed knowledge and the efficient integration of information across different applications. However, the process of ontology building is still a lengthy and error-prone task. Therefore, a number of research studies to (semi-)automatically build ontologies from existing documents have been developed. In this paper, we present our approach to extract relevant ontology concepts and their relationships from a knowledge base of heterogeneous text documents. We also show the architecture of the implemented system and discuss the experiments in a real-world context.

Journal ArticleDOI
TL;DR: A methodology for building a semantically annotated multi-faceted ontology for product family modelling that is able to automatically suggest semantically-related annotations based on the design and manufacturing repository is proposed.