
Showing papers on "Ontology-based data integration" published in 2014


Journal ArticleDOI
TL;DR: A live catalogue of pitfalls that extends previous works on modeling errors with new pitfalls resulting from an empirical analysis of over 693 ontologies, and OOPS! (OntOlogy Pitfall Scanner!), a tool for detecting pitfalls in ontologies, targeted at newcomers and domain experts unfamiliar with description logics and ontology implementation languages.
Abstract: This paper presents two contributions to the field of Ontology Evaluation. First, a live catalogue of pitfalls that extends previous works on modeling errors with new pitfalls resulting from an empirical analysis of over 693 ontologies. The catalogue classifies pitfalls along the Structural, Functional and Usability-Profiling dimensions. For each pitfall, we indicate its importance level (critical, important or minor) and the number of ontologies in which it has been detected. Second, OOPS! (OntOlogy Pitfall Scanner!), a tool for detecting pitfalls in ontologies, targeted at newcomers and domain experts unfamiliar with description logics and ontology implementation languages. The tool operates independently of any ontology development platform and is available online. The evaluation of the system is provided both through a survey of user satisfaction and worldwide usage statistics. In addition, the system is compared with existing ontology evaluation tools in terms of coverage of the pitfalls detected.
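
For a concrete feel of what such a pitfall check involves, here is a minimal sketch (not the OOPS! implementation) that flags one simple pitfall, OWL classes lacking a human-readable label, using rdflib; the input file name is a placeholder.

```python
# Minimal sketch of a single pitfall check in the spirit of OOPS!:
# flag OWL classes without an rdfs:label (a typical "missing
# annotations" pitfall). Not the actual OOPS! implementation.
from rdflib import Graph, RDF, RDFS
from rdflib.namespace import OWL

g = Graph()
g.parse("my_ontology.owl", format="xml")  # hypothetical input file

for cls in g.subjects(RDF.type, OWL.Class):
    if (cls, RDFS.label, None) not in g:
        print(f"Possible pitfall (missing label): {cls}")
```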

227 citations


Journal ArticleDOI
TL;DR: It is demonstrated that the inclusion of fuzzy concepts and relations in the ontology provides benefits during the recognition process with respect to crisp approaches.
Abstract: We propose a fuzzy ontology for human activity representation, which allows us to model and reason about vague, incomplete, and uncertain knowledge. Some relevant subdomains found to be missing in previously proposed ontologies for this domain were modelled as well. The resulting fuzzy OWL 2 ontology is able to model uncertain knowledge and represent temporal relationships between activities using an underlying fuzzy state machine representation. We provide a proof of concept of the approach in work scenarios such as the office domain, and also run experiments to highlight the benefits of our approach with respect to crisp ontologies. As a result, we demonstrate that the inclusion of fuzzy concepts and relations in the ontology provides benefits during the recognition process with respect to crisp approaches.
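
To illustrate the difference from a crisp model, here is a toy sketch (not the authors' fuzzy OWL 2 ontology) of fuzzy class-membership degrees combined with the Goedel t-norm, the standard conjunction in fuzzy description logics; the degrees and concept names are invented.

```python
# Toy illustration of fuzzy class membership: an observed situation
# belongs to concepts with a degree in [0, 1], and conjunctions are
# evaluated with the Goedel t-norm (min). Names and degrees invented.
degrees = {
    "SittingPosture": 0.8,  # hypothetical sensor-derived degrees
    "KeyboardUse": 0.6,
}

def t_norm(a: float, b: float) -> float:
    """Goedel t-norm: conjunction of fuzzy degrees."""
    return min(a, b)

# Degree to which the situation matches "OfficeWork", defined as
# SittingPosture AND KeyboardUse:
office_work = t_norm(degrees["SittingPosture"], degrees["KeyboardUse"])
print(office_work)  # 0.6
```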

113 citations


Book ChapterDOI
01 Jan 2014
TL;DR: This chapter begins with an elaboration of the FBS ontology, followed by the situated FBS framework, which articulates a more detailed cognitive view, and demonstrates both the empirical support for the ontology and its applicability.
Abstract: This chapter commences by introducing the background to the development of the Function-Behaviour-Structure (FBS) ontology. It then proceeds with an elaboration of the FBS ontology, followed by the situated FBS framework, which articulates a more detailed cognitive view. A series of exemplary empirical studies that use a coding scheme based on the FBS ontology is presented, demonstrating both the empirical support for the ontology and its applicability. The chapter concludes with a brief discussion of the role of this ontology and possible developments.

109 citations


Book ChapterDOI
25 May 2014
TL;DR: This paper first analyzes real-world competency questions collected from two different domains, then employs the linguistic notion of presupposition to describe the ontology requirements implied by competency questions, and shows that these requirements can be tested automatically.
Abstract: Ontology authoring is a non-trivial task for authors who are not proficient in logic. It is difficult to either specify the requirements for an ontology, or test their satisfaction. In this paper, we propose a novel approach to address this problem by leveraging the ideas of competency questions and test-before software development. We first analyse real-world competency questions collected from two different domains. Analysis shows that many of them can be categorised into patterns that differ along a set of features. Then we employ the linguistic notion of presupposition to describe the ontology requirements implied by competency questions, and show that these requirements can be tested automatically.
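
A minimal sketch of what such an automatic test might look like: the competency question "Which software implements an algorithm?" presupposes that the ontology can relate something to an algorithm via an implements property, which a SPARQL ASK query can check. The vocabulary and file name are hypothetical, not from the paper.

```python
# Sketch of automatically testing one presupposition of a competency
# question: the ASK query checks whether the ontology can satisfy the
# pattern the question presupposes. Names and file are placeholders.
from rdflib import Graph

g = Graph()
g.parse("software_onto.ttl", format="turtle")  # hypothetical ontology

presupposition = """
PREFIX ex: <http://example.org/onto#>
ASK { ?s ex:implements ?a . ?a a ex:Algorithm . }
"""
satisfied = g.query(presupposition).askAnswer
print("CQ presupposition satisfied:", satisfied)
```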

100 citations


Patent
21 Mar 2014
TL;DR: A model of the ontology is obtained that comprises nodes representing concepts and edges representing relationships between associated nodes; a spreading activation operation matches a portion of a textual source to a node and identifies nodes related to it through the model's edges, generating an activation network.
Abstract: Mechanisms are provided for modifying an ontology for use with a natural language processing (NLP) task. A model of the ontology is obtained that comprises nodes representing concepts of the ontology and edges between nodes representing relationships between associated nodes of the ontology. A spreading activation operation is performed on the model of the ontology with the spreading activation operation matching a portion of a textual source to a matching node of the model and identifying related nodes to the matching node through edges of the model associated with the matching node to thereby generate an activation network. The activation network is evaluated with regard to a chosen NLP task to determine a performance metric for the NLP task associated with the nodes of the model. Based on the results, one of the model or a configuration of the activation network may be modified.
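
As a rough illustration of the spreading activation step (a generic sketch, not the patented mechanism): activation starts at the node matched in the text and decays as it propagates over the ontology's edges, and nodes that stay above a threshold form the activation network. The graph content is invented.

```python
# Minimal spreading-activation sketch over an ontology graph held as
# an adjacency list. Activation decays as it spreads from the matched
# node; nodes above a threshold form the "activation network".
from collections import deque

graph = {
    "Aspirin": ["NSAID", "Pain"],
    "NSAID": ["Drug"],
    "Pain": ["Symptom"],
    "Drug": [], "Symptom": [],
}

def spread(start, decay=0.5, threshold=0.1):
    activation = {start: 1.0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, []):
            a = activation[node] * decay
            if a > activation.get(nbr, 0.0) and a >= threshold:
                activation[nbr] = a
                queue.append(nbr)
    return activation

print(spread("Aspirin"))  # {'Aspirin': 1.0, 'NSAID': 0.5, 'Pain': 0.5, ...}
```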

93 citations


Journal ArticleDOI
TL;DR: The content of the Pathway Ontology has increased by over 75% since first presented, and the implementation of pipelines promotes an enriched provision of pathway data.
Abstract: Background The Pathway Ontology (PW) developed at the Rat Genome Database (RGD) covers all types of biological pathways, including altered and disease pathways, and captures the relationships between them within the hierarchical structure of a directed acyclic graph. The ontology allows for the standardized annotation of rat, human, and mouse genes to pathway terms. It also constitutes a vehicle for easy navigation between gene and ontology report pages, between reports and interactive pathway diagrams, between pathways directly connected within a diagram, and between those that are globally related in pathway suites and suite networks. Surveys of the literature and the development of the Pathway and Disease Portals are important sources for the ongoing development of the ontology. User requests and the mapping of pathways in other databases to terms in the ontology further contribute to increasing its content. Recently built automated pipelines use the mapped terms to make available the annotations generated by other groups.

90 citations


Proceedings ArticleDOI
01 Jan 2014
TL;DR: A formal ontological description of the Business Process Modelling Notation (BPMN), one of the most popular languages for business process modelling, and the modelling process followed for the creation of the BPMN Ontology are presented.
Abstract: In this paper we describe a formal ontological description of the Business Process Modelling Notation (BPMN), one of the most popular languages for business process modelling. The proposed ontology (the BPMN Ontology) provides a classification of all the elements of BPMN, together with a formal description of the attributes and conditions describing how the elements can be combined in a BPMN business process description. Using the classes and properties defined in the BPMN Ontology, any BPMN diagram can be represented as an A-box (i.e., a set of instances and assertions on them) of the ontology: this allows the exploitation of ontological reasoning services such as consistency checking and query answering to investigate the compliance of a process with the BPMN Specification, as well as other structural properties of the process. The paper also presents the modelling process followed for the creation of the BPMN Ontology, and describes some application scenarios exploiting it.
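
To make the A-box idea concrete, here is a sketch that encodes a trivial diagram (start event, one task, end event) as instance assertions and then queries it; the namespace and property names are illustrative, not the BPMN Ontology's actual vocabulary.

```python
# Sketch: a tiny BPMN diagram as an A-box over a hypothetical BPMN
# ontology namespace, queried with SPARQL. Names are illustrative.
from rdflib import Graph

abox = """
@prefix bpmn: <http://example.org/bpmn-ontology#> .
@prefix ex:   <http://example.org/process#> .

ex:start    a bpmn:StartEvent .
ex:checkOrd a bpmn:Task .
ex:end      a bpmn:EndEvent .
ex:f1 a bpmn:SequenceFlow ; bpmn:hasSource ex:start ;    bpmn:hasTarget ex:checkOrd .
ex:f2 a bpmn:SequenceFlow ; bpmn:hasSource ex:checkOrd ; bpmn:hasTarget ex:end .
"""
g = Graph()
g.parse(data=abox, format="turtle")

# Query answering over the A-box, e.g. every task that some flow targets:
for row in g.query("""
    PREFIX bpmn: <http://example.org/bpmn-ontology#>
    SELECT ?t WHERE { ?f bpmn:hasTarget ?t . ?t a bpmn:Task . }"""):
    print(row.t)
```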

84 citations


Journal ArticleDOI
TL;DR: This work addresses the lack of a formal representation of the relevant knowledge domain for neurodegenerative diseases such as Alzheimer's disease.
Abstract: Background Biomedical ontologies offer the capability to structure and represent domain-specific knowledge semantically. Disease-specific ontologies can facilitate knowledge exchange across multiple disciplines, and ontology-driven mining approaches can generate great value for modeling disease mechanisms. However, in the case of neurodegenerative diseases such as Alzheimer's disease, there is a lack of a formal representation of the relevant knowledge domain. Methods The Alzheimer's disease ontology (ADO) is constructed in accordance with the ontology building life cycle. The Protege OWL editor was used as a tool for building ADO in Ontology Web Language format. Results ADO was developed with the purpose of containing information relevant to four main biological views (preclinical, clinical, etiological, and molecular/cellular mechanisms) and was enriched by adding synonyms and references. Validation of the lexicalized ontology by means of named entity recognition-based methods showed a satisfactory performance (F score = 72%). In addition to structural and functional evaluation, a clinical expert in the field performed a manual evaluation and curation of ADO. Through integration of ADO into an information retrieval environment, we show that the ontology supports semantic search in scientific text. The usefulness of ADO is authenticated by dedicated use case scenarios. Conclusions The development of ADO as an open ontology is a first attempt to organize information related to Alzheimer's disease in a formalized, structured manner. We demonstrate that ADO is able to capture both established and scattered knowledge existing in scientific text.

79 citations


Journal ArticleDOI
TL;DR: The goal of this article is to survey several of the most outstanding methodologies, methods and techniques that have emerged in recent years, and to present the most popular development environments, which can be utilized to carry out, or facilitate, specific activities within the methodologies.
Abstract: Building ontologies in a collaborative and increasingly community-driven fashion has become a central paradigm of modern ontology engineering. This understanding of ontologies and ontology engineering processes is the result of intensive theoretical and empirical research within the Semantic Web community, supported by technology developments such as Web 2.0. Over six years after the publication of the first methodology for collaborative ontology engineering, it is generally acknowledged that, in order to be useful, but also economically feasible, ontologies should be developed and maintained in a community-driven manner, with the help of fully-fledged environments providing dedicated support for collaboration and user participation. Wikis and similar communication and collaboration platforms, which enable ontology stakeholders to exchange ideas and discuss modeling decisions, are probably the most important technological components of such environments. In addition, process-driven methodologies assist the ontology engineering team throughout the ontology life cycle, and provide empirically grounded best practices and guidelines for optimizing ontology development results in real-world projects. The goal of this article is to analyze the state of the art in the field of collaborative ontology engineering. We survey several of the most outstanding methodologies, methods and techniques that have emerged in recent years, and present the most popular development environments, which can be utilized to carry out, or facilitate, specific activities within the methodologies. A discussion of the open issues identified concludes the survey and provides a roadmap for future research and development in this lively and promising field.

78 citations


01 Jan 2014
TL;DR: Aber-OWL provides a framework for automatically accessing information that is annotated with ontologies or contains terms used to label classes in ontologies, enabling ontology-based semantic access to biological data and literature.
Abstract: Background Many ontologies have been developed in biology and these ontologies increasingly contain large volumes of formalized knowledge commonly expressed in the Web Ontology Language (OWL). Computational access to the knowledge contained within these ontologies relies on the use of automated reasoning.
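
The abstract's reference to automated reasoning can be illustrated with a short owlready2 sketch (this is not Aber-OWL's own code, and the ontology path is a placeholder): load an OWL ontology, classify it with the bundled HermiT reasoner, and inspect the inferred hierarchy.

```python
# Illustration of automated reasoning over an OWL ontology with
# owlready2 and its bundled HermiT reasoner (requires Java).
# Not Aber-OWL itself; the ontology path is a placeholder.
from owlready2 import get_ontology, sync_reasoner

onto = get_ontology("file:///tmp/bio_ontology.owl").load()
with onto:
    sync_reasoner()  # classification: inferred subsumptions become queryable

for cls in list(onto.classes())[:10]:
    print(cls, "superclasses:", cls.is_a)
```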

77 citations


Journal ArticleDOI
01 Sep 2014
TL;DR: A new approach called STROMA (SemanTic Refinement of Ontology MAppings) is presented for determining semantic ontology mappings; it follows a so-called enrichment strategy that refines the mappings determined with a state-of-the-art match tool.
Abstract: There is a large number of tools to match or align corresponding concepts between ontologies. Most tools are restricted to equality correspondences, although many concepts may be related differently, e.g. according to an is-a or part-of relationship. Supporting such additional semantic correspondences can greatly improve the expressiveness of ontology mappings and their usefulness for tasks such as ontology merging and ontology evolution. We present a new approach called STROMA (SemanTic Refinement of Ontology MAppings) to determine semantic ontology mappings. In contrast to previous approaches, it follows a so-called enrichment strategy that refines the mappings determined with a state-of-the-art match tool. The enrichment strategy employs several techniques including the use of background knowledge and linguistic approaches to identify the additional kinds of correspondences. We evaluate the approach in detail using several real-life benchmark tests. A comparison with different tools for semantic ontology matching confirms the viability of the proposed enrichment strategy.
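
One of the linguistic techniques such an enrichment strategy can use is easy to sketch: decide whether an equality correspondence should be refined into is-a by testing WordNet hypernymy between the concept labels. This is a simplification for illustration, not STROMA's actual code.

```python
# Sketch of refining a correspondence into is-a via WordNet hypernymy.
# Requires: import nltk; nltk.download('wordnet')
from nltk.corpus import wordnet as wn

def is_a(term_a: str, term_b: str) -> bool:
    """True if some sense of term_b is a hypernym of some sense of term_a."""
    for syn_a in wn.synsets(term_a):
        hypernyms = set(syn_a.closure(lambda s: s.hypernyms()))
        if any(syn_b in hypernyms for syn_b in wn.synsets(term_b)):
            return True
    return False

print(is_a("car", "vehicle"))  # True: 'car' is-a 'vehicle'
print(is_a("vehicle", "car"))  # False
```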

Patent
23 Jul 2014
TL;DR: In this paper, an example method for facilitating network control and management using semantic reasoners in a network environment is provided and includes generating a fully populated semantics model of the network from network data according to a base network ontology.
Abstract: An example method for facilitating network control and management using semantic reasoners in a network environment is provided and includes generating a fully populated semantics model of the network from network data according to a base network ontology of the network, mapping the fully populated semantics model to a network knowledge base, feeding contents of the network knowledge base to a semantic reasoner, and controlling and managing the network using the semantic reasoner. In specific embodiments, generating the model includes receiving the network data from the network, parsing the network data, loading the parsed network data into in-memory data structures, accessing a manifest specifying binding between a network data definition format and ontology components of the base network ontology, identifying ontology components associated with the network data based on the manifest, and populating the identified ontology components with individuals and properties from the corresponding data structures.
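
The manifest-driven population step described above can be sketched compactly: a manifest binds fields of the parsed network data to ontology classes and properties, and individuals are created accordingly. All names below are illustrative, not from the patent.

```python
# Sketch of manifest-driven ontology population from parsed network
# data: the manifest maps record fields to ontology properties.
from rdflib import Graph, Literal, Namespace, RDF

NET = Namespace("http://example.org/base-network-ontology#")
manifest = {"class": NET.Interface,  # binding: record type -> class
            "fields": {"mtu": NET.hasMTU, "state": NET.hasState}}

records = [{"id": "eth0", "mtu": 1500, "state": "up"}]  # parsed network data

g = Graph()
for rec in records:
    ind = NET[rec["id"]]
    g.add((ind, RDF.type, manifest["class"]))
    for field, prop in manifest["fields"].items():
        g.add((ind, prop, Literal(rec[field])))

print(g.serialize(format="turtle"))
```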

DOI
04 Feb 2014
TL;DR: A novel architecture for instance matching is proposed that takes into account the particularities of this heterogeneous and distributed setting and operates even when there is no overlap between schemas, apart from a key label that matching instances must share.
Abstract: Data integration is a broad area encompassing techniques to merge data between data sources. Although there are plenty of efficient and effective methods focusing on data integration over homogeneous data, where instances share the same schema and range of values, their applications over heterogeneous data are less clear. This thesis considers data integration within the environment of the Semantic Web. More particularly, we propose a novel architecture for instance matching that takes into account the particularities of this heterogeneous and distributed setting. Instead of assuming that instances share the same schema, the proposed method operates even when there is no overlap between schemas, apart from a key label that matching instances must share. Moreover, we have considered the distributed nature of the Semantic Web to propose a new architecture for general data integration, which operates on-the-fly and in a pay-as-you-go fashion. We show that our view and the view of the traditional data integration school each only partially address the problem, but together complement each other. We have observed that this unified view gives a better insight into their relative importance and how data integration methods can benefit from their combination. The results achieved in this work are particularly interesting for the Semantic Web and Data Integration communities.
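
A minimal sketch of the key-label idea (illustrative, not the thesis's architecture): instances from two sources are matched when they share the same normalized value for the key label, here taken to be rdfs:label; the input files are placeholders.

```python
# Sketch of key-label instance matching across two sources that share
# no schema beyond rdfs:label. Input files are placeholders.
from rdflib import Graph, RDFS

def labels(g: Graph):
    """Map normalized label -> subject for every labelled resource."""
    return {str(o).strip().lower(): s for s, o in g.subject_objects(RDFS.label)}

g1, g2 = Graph(), Graph()
g1.parse("source_a.ttl", format="turtle")
g2.parse("source_b.ttl", format="turtle")

l1, l2 = labels(g1), labels(g2)
matches = [(l1[k], l2[k]) for k in l1.keys() & l2.keys()]
print(matches)
```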

Journal ArticleDOI
TL;DR: OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer, and provides a representational framework for the description of mining structured data.
Abstract: In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend. OntoDM-core is available at http://www.ontodm.com .

Journal ArticleDOI
TL;DR: This work proposes a semi-automatic system, called the Framework for InTegrating Ontologies, that can reduce the heterogeneity of the ontologies and retrieve frequently used core properties for each class by analyzing the instances of linked data sets.
Abstract: The Linked Open Data cloud contains tremendous amounts of interlinked instances with abundant knowledge for retrieval. However, because the ontologies are large and heterogeneous, it is time-consuming to learn all the ontologies manually and it is difficult to learn the properties important for describing instances of a specific class. To construct an ontology that helps users to easily access various data sets, we propose a semi-automatic system, called the Framework for InTegrating Ontologies, that can reduce the heterogeneity of the ontologies and retrieve frequently used core properties for each class. The framework consists of three main components: graph-based ontology integration, machine-learning-based approach for finding the core ontology classes and properties, and integrated ontology constructor. By analyzing the instances of linked data sets, this framework constructs a high-quality integrated ontology, which is easily understandable and effective in knowledge acquisition from various data sets using simple SPARQL queries.
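
The "frequently used core properties" step can be approximated with a single aggregate SPARQL query over instance data: count how often each property occurs on instances of a class and keep the most frequent. The endpoint and class below are just a public example, not the paper's setup.

```python
# Sketch: find candidate "core" properties of a class by counting
# property usage over its instances on a public SPARQL endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
SELECT ?p (COUNT(*) AS ?uses) WHERE {
  ?s a <http://dbpedia.org/ontology/Film> ; ?p ?o .
} GROUP BY ?p ORDER BY DESC(?uses) LIMIT 10
""")
sparql.setReturnFormat(JSON)

for b in sparql.query().convert()["results"]["bindings"]:
    print(b["p"]["value"], b["uses"]["value"])
```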

Journal ArticleDOI
TL;DR: GOssTo is a user-friendly software system for calculating semantic similarities between gene products according to the Gene Ontology, and it allows the calculation of similarities on a genomic scale in a few minutes on a regular desktop machine.
Abstract: Summary: We present GOssTo, the Gene Ontology semantic similarity Tool, a user-friendly software system for calculating semantic similarities between gene products according to the Gene Ontology. GOssTo is bundled with six semantic similarity measures, including both term- and graph-based measures, and has extension capabilities to allow the user to add new similarities. Importantly, for any measure, GOssTo can also calculate the Random Walk Contribution that has been shown to greatly improve the accuracy of similarity measures. GOssTo is very fast, easy to use, and it allows the calculation of similarities on a genomic scale in a few minutes on a regular desktop machine. Contact: alberto@cs.rhul.ac.uk Availability: GOssTo is available both as a stand-alone application running on GNU/Linux, Windows and MacOS from www.paccanarolab.org/gossto and as a web application from www.paccanarolab.org/gosstoweb. The stand-alone application features a simple and concise command line interface for easy integration into high-throughput data processing pipelines.

Journal ArticleDOI
TL;DR: An ontology-based framework for IDA is presented that has a modular design facilitating the integration, exchange and reuse of its constitutive parts, and it is shown how complex temporal patterns that combine several variables and representation schemes can be used to infer process states and/or conditions.
Abstract: In the past years, the large availability of sensed data has highlighted the need for computer-aided systems that perform intelligent data analysis (IDA) over the obtained data streams. Temporal abstractions (TAs) are key to interpreting the principles encoded within the data, but their usefulness depends on an efficient management of domain knowledge. In this article, an ontology-based framework for IDA is presented. It is based on a knowledge model composed of two existing ontologies (the Semantic Sensor Network ontology (SSN) and the SWRL Temporal Ontology (SWRLTO)) and a newly developed one: the Temporal Abstractions Ontology (TAO). SSN conceptualizes sensor measurements, thus enabling a full integration with semantic sensor web (SSW) technologies. SWRLTO provides temporal modeling and reasoning. TAO has been designed to capture the semantics of TAs. These ontologies have been aligned through the DOLCE Ultra-Lite (DUL) upper ontology, boosting the integration with other domains. The resulting knowledge model has a modular design that facilitates the integration, exchange and reuse of its constitutive parts. The framework is sketched in a chemical plant case study. It is shown how complex temporal patterns that combine several variables and representation schemes can be used to infer process states and/or conditions.
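
For intuition, a basic temporal abstraction can be sketched in a few lines: map a numeric sensor stream onto qualitative state intervals, the kind of state TA the framework formalizes ontologically. The threshold and data are made up.

```python
# Toy illustration of a basic state temporal abstraction: turn a
# numeric sensor stream into qualitative "low"/"high" intervals.
def state_intervals(samples, threshold=80.0):
    """samples: list of (timestamp, value); returns (state, start, end)."""
    intervals, current = [], None
    for t, v in samples:
        state = "high" if v >= threshold else "low"
        if current and current[0] == state:
            current = (state, current[1], t)  # extend the open interval
        else:
            if current:
                intervals.append(current)
            current = (state, t, t)
    if current:
        intervals.append(current)
    return intervals

print(state_intervals([(0, 70), (1, 75), (2, 85), (3, 90), (4, 60)]))
# [('low', 0, 1), ('high', 2, 3), ('low', 4, 4)]
```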

Journal ArticleDOI
TL;DR: This study proposes the first considerably wide-coverage ontology for the wind energy domain; the ontology is built through a semi-automatic process that makes use of related Web resources, thereby reducing the overall cost of the ontology building process.

Journal ArticleDOI
TL;DR: The tensions that may emerge between ontology authors, including antagonistic ontology building styles (definition-driven vs. manually crafted hierarchies), are uncovered and mapped to a set of key design recommendations that should inform and guide future efforts for improving ontology authoring tool support, thus opening up ontology authoring to a new generation of users.
Abstract: The process of authoring ontologies appears to be fragmented across several tools and workarounds, and there exists no well accepted framework for common authoring tasks such as exploring ontologies, comparing versions, debugging, and testing. This lack of an adequate and seamless tool chain potentially hinders the broad uptake of ontologies, especially OWL, as a knowledge representation formalism. We start to address this situation by presenting insights from an interview-based study with 15 ontology experts. We uncover the tensions that may emerge between ontology authors including antagonistic ontology building styles (definition-driven vs. manually crafted hierarchies). We identify the problems reported by the ontology authors and the strategies they employ to solve them. These data are mapped to a set of key design recommendations, which should inform and guide future efforts for improving ontology authoring tool support, thus opening up ontology authoring to a new generation of users. We discuss future research avenues in light of these results.

Proceedings ArticleDOI
27 Jul 2014
TL;DR: The extension to an ontology-based data model supporting the design and performance evaluation of production systems is presented, and the formalization of the performance history of a production system and its components is addressed to capture both the spatial and state evolution of the objects.
Abstract: This paper presents the extension to an ontology-based data model supporting the design and performance evaluation of production systems. This extension aims to link the modeling of the spatial representation of physical objects with the characterization of their states and behavior. Furthermore, the formalization of the performance history of a production system and its components is addressed to capture both the spatial and state evolution of the objects. Such history can be provided by simulation runs or gathered from a monitoring system. A test case is described and then used to show how different software tools can be integrated to support the integrated design and evaluation of a production system.

Journal ArticleDOI
TL;DR: A novel measure named link weight is demonstrated that uses semantic characteristics of two entities and Google page counts to calculate an information distance similarity between them, and is able to create alignments between different lexical entities that denote the same concept.
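
The entry does not spell out the exact formula, so the sketch below uses the well-known Normalized Google Distance as a stand-in for a page-count-based information distance between two entities; the counts are invented.

```python
# Stand-in for a page-count-based information distance: the Normalized
# Google Distance (NGD). f_x, f_y: page counts for each term;
# f_xy: count for pages containing both; n: size of the indexed corpus.
import math

def ngd(f_x: float, f_y: float, f_xy: float, n: float) -> float:
    num = max(math.log(f_x), math.log(f_y)) - math.log(f_xy)
    den = math.log(n) - min(math.log(f_x), math.log(f_y))
    return num / den

# Smaller distance -> more related; a similarity can be taken as 1 - NGD.
print(ngd(f_x=1.2e7, f_y=8.0e6, f_xy=2.5e6, n=5.0e10))
```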

Proceedings ArticleDOI
16 Jun 2014
TL;DR: The high accuracy achieved in the tests demonstrates the effectiveness of the proposed method, as well as the applicability of Wikipedia for semantic text categorization; moreover, the approach allows dynamically changing the classification topics without retraining the classifier.
Abstract: We present a method for the automatic classification of text documents into a dynamically defined set of topics of interest. The proposed approach requires only a domain ontology and a set of user-defined classification topics, specified as contexts in the ontology. Our method is based on measuring the semantic similarity between the thematic graph created from a text document and the ontology sub-graphs resulting from the projection of the defined contexts. The domain ontology effectively becomes the classifier, where classification topics are expressed using the defined ontological contexts. In contrast to traditional supervised categorization methods, the proposed method does not require a training set of documents. More importantly, our approach allows dynamically changing the classification topics without retraining the classifier. In our experiments, we used the English language Wikipedia converted to an RDF ontology to categorize a corpus of current Web news documents into a selection of topics of interest. The high accuracy achieved in our tests demonstrates the effectiveness of the proposed method, as well as the applicability of Wikipedia for semantic text categorization purposes.
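
The similarity step can be caricatured with plain node overlap: score the document's thematic graph against each topic sub-graph and pick the best match. The real system uses a richer semantic similarity; the graphs here are toy data.

```python
# Toy sketch of topic assignment by overlap between a document's
# thematic graph and each topic's ontology sub-graph (Jaccard index).
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

thematic_graph = {"Election", "Parliament", "Vote", "Budget"}
topic_subgraphs = {
    "Politics": {"Election", "Parliament", "Vote", "Government"},
    "Sports":   {"Match", "Team", "Vote"},
}

best = max(topic_subgraphs, key=lambda t: jaccard(thematic_graph, topic_subgraphs[t]))
print(best)  # Politics
```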

Journal ArticleDOI
TL;DR: An ontology learning and population system that combines both statistical and semantic methodologies is presented that achieves good performances on standard datasets.
Abstract: The success of the Semantic Web will heavily rely on the availability of formal ontologies to structure machine-understandable data. However, there is still a lack of general methodologies for automatic ontology learning and population, i.e. the generation of domain ontologies from various kinds of resources by applying natural language processing and machine learning techniques. In this paper, the authors present an ontology learning and population system that combines both statistical and semantic methodologies. Several experiments have been carried out, demonstrating the effectiveness of the proposed approach. Highlights: A graph of terms can be effectively used for ontology building. Such a graph is extracted from documents thanks to an LDA-based methodology. Ontology learning involves the use of annotated lexicons (WordNet). The proposed method achieves good performances on standard datasets.
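
The highlights mention an LDA-based extraction of the graph of terms; the following gensim sketch shows such an LDA step on toy data (illustrative only, not the authors' pipeline).

```python
# Sketch of the LDA step: extract topic term-clusters from documents
# with gensim, as raw material for a graph of terms. Toy corpus.
from gensim import corpora, models

texts = [["ontology", "class", "property", "axiom"],
         ["neural", "network", "training", "loss"],
         ["ontology", "axiom", "reasoner", "class"]]

dictionary = corpora.Dictionary(texts)
bow = [dictionary.doc2bow(t) for t in texts]
lda = models.LdaModel(bow, num_topics=2, id2word=dictionary, random_state=0)

for topic_id in range(2):
    print(lda.show_topic(topic_id, topn=4))  # top terms per topic
```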

Journal ArticleDOI
TL;DR: A domain-independent process for the automatic population of ontologies from text is proposed; it applies natural language processing and information extraction techniques to acquire and classify ontology instances, and can extract and classify instances with high effectiveness while retaining the advantage of domain independence.

Journal ArticleDOI
TL;DR: A library of common ontology alignment patterns as reusable templates of recurring correspondences is developed, based on a detailed analysis of frequent ontology mismatches, and an application of ontology alignment patterns for an ontology transformation service is described.
Abstract: Interoperability between heterogeneous ontological descriptions can be achieved through ontology mediation techniques. At the heart of ontology mediation lies the alignment: a specification of correspondences between ontology entities. Ontology matching can bring some automation but is limited to finding simple correspondences. Design patterns have proven themselves useful for capturing experience in design problems. In this article, we introduce ontology alignment patterns as reusable templates of recurring correspondences. Based on a detailed analysis of frequent ontology mismatches, we develop a library of common patterns. Ontology alignment patterns can be used to refine correspondences, either by the alignment designer or via pattern detection algorithms. We distinguish three levels of abstraction for ontology alignment representation, going from executable transformation rules, to concrete correspondences between two ontologies, to ontology alignment patterns at the third level. We express patterns using an ontology alignment representation language, making them ready to use in practical mediation tasks. We extract mismatches from vocabularies associated with data sets published as linked open data, and we evaluate the ability of correspondence patterns to provide proper alignments for these mismatches. Finally, we describe an application of ontology alignment patterns for an ontology transformation service.

Journal ArticleDOI
TL;DR: TermGenie is a web-based class-generation system that complements traditional ontology development tools and is simple and intuitive and can be used by most biocurators without extensive training.
Abstract: Biological ontologies are continually growing and improving from requests for new classes (terms) by biocurators. These ontology requests can frequently create bottlenecks in the biocuration process, as ontology developers struggle to keep up while manually processing the requests and creating classes. TermGenie allows biocurators to generate new classes based on formally specified design patterns or templates. The system is web-based and can be accessed by any authorized curator through a web browser. Automated rules and reasoning engines are used to ensure validity, uniqueness and relationship to pre-existing classes. In the last 4 years the Gene Ontology TermGenie generated 4715 new classes, about 51.4% of all new classes created. The immediate generation of permanent identifiers proved not to be an issue, with only 70 (1.4%) obsoleted classes. TermGenie is a web-based class-generation system that complements traditional ontology development tools. All classes added through pre-defined templates are guaranteed to have OWL equivalence axioms that are used for automatic classification and, in some cases, inter-ontology linkage. At the same time, the system is simple and intuitive and can be used by most biocurators without extensive training.
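
Template-based generation is easy to picture: filling a design pattern yields both a label and an OWL equivalence axiom. The sketch below (illustrative, not TermGenie's actual template code) renders a "regulation of X" pattern in Manchester syntax for a real GO class IRI.

```python
# Sketch of template-based class generation: a "regulation of X"
# design pattern produces a label plus an OWL equivalence axiom
# (Manchester syntax). Template wording is illustrative.
def regulation_template(process_label: str, process_iri: str):
    label = f"regulation of {process_label}"
    axiom = (f"EquivalentTo: 'biological regulation' and "
             f"(regulates some <{process_iri}>)")
    return label, axiom

label, axiom = regulation_template(
    "apoptotic process", "http://purl.obolibrary.org/obo/GO_0006915")
print(label)
print(axiom)
```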

Journal ArticleDOI
TL;DR: This paper presents a method for building a bilingual domain ontology from textual and termino-ontological resources, intended for semantic annotation and information retrieval of textual documents; the method is sufficiently flexible to be applied to other domains.

Journal ArticleDOI
TL;DR: This paper presents a novel approach to alignment evaluation, based on the statistical sign test, that can be conveniently used to emphasise differences in the quality of designated mappings.
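
The idea behind a sign-test-based comparison is compact enough to sketch: count the mapping cases where one matcher beats the other, discard ties, and test whether the win counts are skewed. The scores below are invented; scipy's exact binomial test does the work.

```python
# Sketch of a sign-test comparison of two matchers over the same
# mapping cases: wins are counted, ties ignored, and an exact
# binomial test checks whether the win split departs from 50/50.
from scipy.stats import binomtest

scores_a = [0.9, 0.7, 0.8, 0.6, 0.9, 0.8]  # per-case quality, matcher A
scores_b = [0.8, 0.7, 0.6, 0.5, 0.7, 0.9]  # per-case quality, matcher B

wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
wins_b = sum(b > a for a, b in zip(scores_a, scores_b))

result = binomtest(wins_a, n=wins_a + wins_b, p=0.5)
print(f"A wins {wins_a}, B wins {wins_b}, p-value = {result.pvalue:.3f}")
```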

Journal ArticleDOI
TL;DR: This paper introduces a semantically based three-stage approach to assist developers in checking the consistency of requirements models and in choosing the most suitable and relevant ontology for their development project from a given repository.

Journal ArticleDOI
TL;DR: A new gradient learning model for ontology similarity measuring and ontology mapping in the multidividing setting is proposed, and the sample error in this setting is given by virtue of the hypothesis space and the trick of the ontology dividing operator.
Abstract: The gradient learning model has been attracting great attention in view of its promising perspectives for applications in statistics, data dimensionality reduction, and other specific fields. In this paper, we propose a new gradient learning model for ontology similarity measuring and ontology mapping in the multidividing setting. The sample error in this setting is given by virtue of the hypothesis space and the trick of the ontology dividing operator. Finally, two experiments on the plant and humanoid robotics fields verify the efficiency of the new computational model for ontology similarity measuring and ontology mapping applications in the multidividing setting.