Topic

Ontology-based data integration

About: Ontology-based data integration is a research topic. Over its lifetime, 11,065 publications have appeared within this topic, receiving 216,888 citations.


Papers
Journal Article
TL;DR: ENVO has been shaped into an ontology that bridges multiple domains, including biomedicine, natural and anthropogenic ecology, ‘omics, and socioeconomic development, and its growth is anticipated to accelerate in 2017.
Abstract: The Environment Ontology (ENVO; http://www.environmentontology.org/ ), first described in 2013, is a resource and research target for the semantically controlled description of environmental entities. The ontology's initial aim was the representation of the biomes, environmental features, and environmental materials pertinent to genomic and microbiome-related investigations. However, the need for environmental semantics is common to a multitude of fields, and ENVO's use has steadily grown since its initial description. We have thus expanded, enhanced, and generalised the ontology to support its increasingly diverse applications. We have updated our development suite to promote expressivity, consistency, and speed: we now develop ENVO in the Web Ontology Language (OWL) and employ templating methods to accelerate class creation. We have also taken steps to better align ENVO with the Open Biological and Biomedical Ontologies (OBO) Foundry principles and interoperate with existing OBO ontologies. Further, we applied text-mining approaches to extract habitat information from the Encyclopedia of Life and automatically create experimental habitat classes within ENVO. Relative to its state in 2013, ENVO's content, scope, and implementation have been enhanced and much of its existing content revised for improved semantic representation. ENVO now offers representations of habitats, environmental processes, anthropogenic environments, and entities relevant to environmental health initiatives and the global Sustainable Development Agenda for 2030. Several branches of ENVO have been used to incubate and seed new ontologies in previously unrepresented domains such as food and agronomy. The current release version of the ontology, in OWL format, is available at http://purl.obolibrary.org/obo/envo.owl . ENVO has been shaped into an ontology which bridges multiple domains including biomedicine, natural and anthropogenic ecology, ‘omics, and socioeconomic development. 
Through continued interactions with our users and partners, particularly those performing data archiving and synthesis, we anticipate that ENVO’s growth will accelerate in 2017. As always, we invite further contributions and collaboration to advance the semantic representation of the environment, ranging from geographic features and environmental materials, across habitats and ecosystems, to everyday objects in household settings.

181 citations

01 Jan 2007
TL;DR: Different data qualities are analyzed with respect to their semantic and spatial structure leading to the distinction of six categories regarding the spatio-semantic coherence of 3D city models, and it is shown how spatial data with complex object descriptions support the integration process.
Abstract: An increasing number of applications rely on 3D geoinformation. In addition to 3D geometry, these applications particularly require complex semantic information. In the context of spatial data infrastructures, the needed data are drawn from distributed sources and are often thematically and spatially fragmented. Straightforward joining of 3D objects would inevitably lead to geometrical inconsistencies such as cracks and permeations. Semantic information can help to reduce the ambiguities for geometric integration if it is coherently structured with respect to geometry. The paper discusses these problems with special focus on virtual 3D city models and the semantic data model CityGML, an emerging standard for the representation and the exchange of 3D city models based on ISO 191xx standards and GML3. Different data qualities are analyzed with respect to their semantic and spatial structure, leading to the distinction of six categories regarding the spatio-semantic coherence of 3D city models. Furthermore, it is shown how spatial data with complex object descriptions support the integration process. The derived categories will help in the future development of automatic integration methods for complex 3D geodata.

181 citations

Journal Article
Jie Tang1, Juanzi Li1, Bangyong Liang1, Xiaotong Huang1, Yi Li1, Kehong Wang1 
TL;DR: An approach called Risk Minimization based Ontology Mapping (RiMOM) is proposed, which automates the discovery of 1:1, n:1, 1:null and null:1 mappings and uses a thesaurus and statistical techniques to deal with name conflicts in the mapping process.

180 citations

Journal Article
01 Mar 2007
TL;DR: A novel episode-based ontology construction mechanism is presented that extracts a domain ontology from unstructured text documents and uses fuzzy numbers to compute conceptual similarity for concept clustering and taxonomic relation definition.
Abstract: Ontology is playing an increasingly important role in knowledge management and the Semantic Web. This study presents a novel episode-based ontology construction mechanism to extract domain ontology from unstructured text documents. Additionally, fuzzy numbers for conceptual similarity computing are presented for concept clustering and taxonomic relation definitions. Moreover, concept attributes and operations can be extracted from episodes to construct a domain ontology, while non-taxonomic relations can be generated from episodes. The fuzzy inference mechanism is also applied to obtain new instances for ontology learning. Experimental results show that the proposed approach can effectively construct a Chinese domain ontology from unstructured text documents.

179 citations

Journal Article
01 Sep 2004
TL;DR: This survey identifies the pros and cons of current approaches and systems for integrating biological and genomic sources, and discusses what an integration system for biologists ought to be.
Abstract: This paper surveys the area of biological and genomic sources integration, which has recently become a major focus of the data integration research field. The challenges that an integration system for biological sources must face are due to several factors such as the variety and amount of data available, the representational heterogeneity of the data in the different sources, and the autonomy and differing capabilities of the sources. This survey describes the main integration approaches that have been adopted. They include warehouse integration, mediator-based integration, and navigational integration. Then we look at the four major existing integration systems that have been developed for the biological domain: SRS, BioKleisli, TAMBIS, and DiscoveryLink. After analyzing these systems and mentioning a few others, we identify the pros and cons of the current approaches and systems and discuss what an integration system for biologists ought to be.

178 citations


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
84% related
Graph (abstract data type)
69.9K papers, 1.2M citations
84% related
Software development
73.8K papers, 1.4M citations
84% related
User interface
85.4K papers, 1.7M citations
84% related
Support vector machine
73.6K papers, 1.7M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    37
2022    149
2021    11
2020    11
2019    19
2018    43