Ontology-based data integration

About: Ontology-based data integration is a research topic. Over its lifetime, 11,065 publications have been published within this topic, receiving 216,888 citations.


Papers
Book Chapter
29 Sep 2008
TL;DR: The paper discusses an ontology developed to represent different aspects of workflows for collaborative ontology development; this ontology is a key component of the customizable workflow support in Protege.
Abstract: As knowledge engineering moves to the Semantic Web, ontologies become dynamic products of collaborative development rather than artifacts produced in a closed environment of a single research group. However, the projects differ--sometimes significantly--in the way the community members can contribute, the different roles they play, the mechanisms they use to carry out discussions and to achieve consensus. We are currently developing a flexible mechanism to support a wide range of collaborative workflows in the Protege environment. In this paper, we analyze workflows for several active projects, and describe the properties of these workflows. We discuss an ontology that we developed to represent different aspects of workflows for collaborative ontology development. This ontology is a key component of the customizable workflow support in Protege. We evaluate the coverage and flexibility of this ontology by using it to represent formally two different collaborative workflows described in the literature, Diligent and BiomedGT. This evaluation demonstrates that our workflow ontology is sufficiently flexible to represent these very different workflows.

56 citations

Journal Article
TL;DR: The use of common ontologies is demonstrated in building a unified conceptualization for scenario and assessment projects, with a case study on the development of SEAMLESS-IF, an integrated modelling framework to assess agricultural and environmental policy options as to their contribution to sustainable development.
Abstract: Integrated Assessment and Modelling (IAM) provides an interdisciplinary approach to support ex-ante decision-making by combining quantitative models representing different systems and scales into a framework for integrated assessment. Scenarios in IAM are developed in the interaction between scientists and stakeholders to explore possible pathways of future development. As IAM typically combines models from different disciplines, there is a clear need for a consistent definition and implementation of scenarios across models, policy problems and scales. This paper presents such a unified conceptualization for scenario and assessment projects. We demonstrate the use of common ontologies in building this unified conceptualization, e.g. a common ontology on assessment projects and scenarios. The common ontology and the process of ontology engineering are used in a case study, which refers to the development of SEAMLESS-IF, an integrated modelling framework to assess agricultural and environmental policy options as to their contribution to sustainable development. The presented common ontology on assessment projects and scenarios can be reused by IAM consortia and if required, adapted by using the process of ontology engineering as proposed in this paper.

56 citations

Journal Article
TL;DR: The notion of ontology localization is revisited, a new definition is proposed and the layers of an ontology that can be affected by the process of localizing it are specified.
Abstract: We revisit the notion of ontology localization, propose a new definition and clearly specify the layers of an ontology that can be affected by the process of localizing it. We also work out a number of dimensions that allow us to characterize the type of ontology localization performed and to predict the layers that will be affected. Overall our aim is to contribute to a better understanding of the task of localizing an ontology.

56 citations

Journal Article
TL;DR: This paper starts by tackling the basic issue of matching heterogeneous dimensions and provides a number of general properties that a dimension matching should fulfill, and proposes two different approaches to the problem of integration that try to enforce matchings satisfying these properties.
Abstract: In this paper we address the problem of integrating independent and possibly heterogeneous data warehouses, a problem that has received little attention so far, but that arises very often in practice. We start by tackling the basic issue of matching heterogeneous dimensions and provide a number of general properties that a dimension matching should fulfill. We then propose two different approaches to the problem of integration that try to enforce matchings satisfying these properties. The first approach refers to a scenario of loosely coupled integration, in which we just need to identify the common information between data sources and perform join operations over the original sources. The goal of the second approach is the derivation of a materialized view built by merging the sources, and refers to a scenario of tightly coupled integration in which queries are performed against the view. We also illustrate the architecture and functionality of a practical system that we have developed to demonstrate the effectiveness of our integration strategies.

56 citations
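
The loosely coupled scenario in the data warehouse abstract above (identify the common information between sources, then join over the original sources) can be pictured with a small sketch. This is not the authors' system: the dimension tables, sales figures, and the helper names `match_dimensions` and `join_sales` are all hypothetical, and matching by identical labels is a simplifying assumption.

```python
# Hypothetical data: two independent warehouses, each with its own product
# dimension (member key -> label) and fact list (member key, quantity).
# None of these names or values come from the paper.
store_a_dim = {"p1": "laptop", "p2": "phone", "p3": "tablet"}
store_b_dim = {"x9": "phone", "x7": "laptop", "x5": "camera"}

store_a_sales = [("p1", 10), ("p2", 4), ("p3", 7)]
store_b_sales = [("x9", 6), ("x7", 3), ("x5", 2)]


def match_dimensions(dim_a, dim_b):
    """Map member keys of dim_a to member keys of dim_b that share a label.

    Assumption: equal labels denote the same real-world member.
    """
    label_to_b = {label: key for key, label in dim_b.items()}
    return {
        key_a: label_to_b[label]
        for key_a, label in dim_a.items()
        if label in label_to_b
    }


def join_sales(sales_a, sales_b, matching, dim_a):
    """Join both fact lists over matched members only.

    Returns label -> (quantity in source A, quantity in source B).
    The original sources are left untouched; nothing is materialized.
    """
    totals_b = {}
    for key_b, qty in sales_b:
        totals_b[key_b] = totals_b.get(key_b, 0) + qty

    joined = {}
    for key_a, qty in sales_a:
        if key_a in matching:
            label = dim_a[key_a]
            qty_a_so_far = joined.get(label, (0, 0))[0]
            joined[label] = (qty_a_so_far + qty, totals_b.get(matching[key_a], 0))
    return joined


matching = match_dimensions(store_a_dim, store_b_dim)
combined = join_sales(store_a_sales, store_b_sales, matching, store_a_dim)
```

Members without a counterpart ("tablet", "camera") drop out of the join, which is the sense in which only the common information is used. The tightly coupled approach would instead build and store a merged view and run queries against it.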

Book Chapter
TL;DR: This chapter presents a straightforward categorization of SS measures and describes the main strategies they employ, summarizes comparative assessment studies, highlighting the top measures in different settings, and compares different implementation strategies and their use.
Abstract: Gene Ontology-based semantic similarity (SS) allows the comparison of GO terms or entities annotated with GO terms, by leveraging the ontology structure and properties and annotation corpora. In the last decade the number and diversity of SS measures based on GO has grown considerably, and their applications include functional coherence evaluation, protein interaction prediction, and disease gene prioritization. Understanding how SS measures work, what issues can affect their performance and how they compare to each other in different evaluation settings is crucial to gain a comprehensive view of this area and choose the most appropriate approaches for a given application. In this chapter, we provide a guide to understanding and selecting SS measures for biomedical researchers. We present a straightforward categorization of SS measures and describe the main strategies they employ. We discuss the intrinsic and external issues that affect their performance, and how these can be addressed. We summarize comparative assessment studies, highlighting the top measures in different settings, and compare different implementation strategies and their use. Finally, we discuss some of the extant challenges and opportunities, namely the increased semantic complexity of GO and the need for fast and efficient computation, pointing the way towards the future generation of SS measures.

56 citations
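
As an illustration of how an SS measure can combine ontology structure with an annotation corpus, the sketch below implements Resnik's information-content measure on a toy hierarchy. The abstract above does not single out any particular measure, so the choice of Resnik is an assumption made here for illustration, and the term names and annotation counts are hypothetical rather than real GO data.

```python
import math

# Toy ontology (child -> parents). Term names are hypothetical, not GO ids.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Hypothetical corpus: entities annotated directly to each term.
# Assumption: each entity is annotated to exactly one term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 4,
    "dna_binding": 1,
    "rna_binding": 1,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t), where p(t) is the share of annotations made
    to t or to any of its descendants."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(covered / total)


def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


sibling_score = resnik("dna_binding", "rna_binding")   # share "binding"
distant_score = resnik("dna_binding", "catalysis")     # share only "root"
```

The two sibling terms score higher than the pair that meets only at the root, because their shared ancestor covers a smaller share of the corpus. The sketch assumes every term has at least one annotation at or below it; a term with none would make the logarithm undefined.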


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Software development: 73.8K papers, 1.4M citations, 84% related
User interface: 85.4K papers, 1.7M citations, 84% related
Support vector machine: 73.6K papers, 1.7M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    37
2022    149
2021    11
2020    11
2019    19
2018    43