Showing papers by "Natalya F. Noy" published in 2012


Journal Article
TL;DR: The centerpiece of the National Center for Biomedical Ontology is a web-based resource known as BioPortal, which makes available for research in computationally useful forms more than 270 of the world's biomedical ontologies and terminologies, and supports a wide range of web services that enable investigators to use the ontologies to annotate and retrieve data.

270 citations


Journal Article
TL;DR: A survey of the growing, and surprisingly diverse, landscape of ontology libraries is provided, identifying a core set of questions that ontology practitioners and users should consider when choosing an ontology library for finding ontologies or publishing their own.

178 citations


Book Chapter
11 Nov 2012
TL;DR: CrowdMap is introduced, a model to acquire human contributions via microtask crowdsourcing to improve the accuracy of existing ontology alignment solutions in a fast, scalable, and cost-effective manner.
Abstract: The last decade of research in ontology alignment has brought a variety of computational techniques to discover correspondences between ontologies. While the accuracy of automatic approaches has continuously improved, human contributions remain a key ingredient of the process: this input serves as a valuable source of domain knowledge that is used to train the algorithms and to validate and augment automatically computed alignments. In this paper, we introduce CrowdMap, a model to acquire such human contributions via microtask crowdsourcing. For a given pair of ontologies, CrowdMap translates the alignment problem into microtasks that address individual alignment questions, publishes the microtasks on an online labor market, and evaluates the quality of the results obtained from the crowd. We evaluated the current implementation of CrowdMap in a series of experiments using ontologies and reference alignments from the Ontology Alignment Evaluation Initiative and the crowdsourcing platform CrowdFlower. The experiments clearly demonstrated that the overall approach is feasible, and can improve the accuracy of existing ontology alignment solutions in a fast, scalable, and cost-effective manner.
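
The evaluation step described in this abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each microtask asks workers a yes/no question about one candidate correspondence, that answers are combined by simple majority vote, and that quality is measured as precision and recall against a reference alignment. The abstract does not state CrowdMap's actual aggregation rule, and all names and data below are made up for illustration.

```python
from collections import Counter


def majority_vote(answers):
    """Return the answer most workers gave, or None on a tie or no answers.

    Majority voting is an assumed aggregation rule, not one taken from the abstract.
    """
    counts = Counter(answers).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def precision_recall(found, reference):
    """Compare a set of accepted correspondences with a reference alignment."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical crowd answers: one list of worker votes per candidate pair.
crowd_answers = {
    ("Paper", "Article"): ["yes", "yes", "no"],
    ("Author", "Writer"): ["yes", "yes", "yes"],
    ("Review", "Chair"): ["no", "no", "yes"],
    ("Topic", "Subject"): ["yes", "no"],  # tie, so not accepted
}
# Hypothetical reference alignment for the same pair of ontologies.
reference_alignment = {
    ("Paper", "Article"),
    ("Author", "Writer"),
    ("Topic", "Subject"),
}

accepted = {
    pair for pair, votes in crowd_answers.items()
    if majority_vote(votes) == "yes"
}
precision, recall = precision_recall(accepted, reference_alignment)
```

In this toy data the tied pair is dropped, which keeps precision high at the cost of recall; a real deployment would need an explicit policy for ties and for unreliable workers.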

161 citations


Book Chapter
11 Nov 2012
TL;DR: BioPortal is a repository of biomedical ontologies, with more than 300 ontologies to date, including ontologies developed in OWL, OBO and other languages, as well as a large number of medical terminologies that the US National Library of Medicine distributes in its own proprietary format.
Abstract: BioPortal is a repository of biomedical ontologies, the largest such repository, with more than 300 ontologies to date. This set includes ontologies that were developed in OWL, OBO and other languages, as well as a large number of medical terminologies that the US National Library of Medicine distributes in its own proprietary format. We have published the RDF-based serializations of all these ontologies and their metadata at sparql.bioontology.org. This dataset contains 203M triples, representing both content and metadata for the 300+ ontologies, and 9M mappings between terms. This endpoint can be queried with SPARQL, which opens new usage scenarios for the biomedical domain. This paper presents lessons learned from having redesigned several applications that today use this SPARQL endpoint to consume ontological data.
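
As a rough illustration of what querying such an endpoint involves, the sketch below builds a small SPARQL query and the corresponding GET URL using the standard `query` parameter of the SPARQL protocol. The `/sparql` path, the use of plain HTTP, and the choice of `rdfs:label` are assumptions; the abstract names only the host, and the live service may require credentials or differ in other ways. The code only constructs the request and does not contact the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint URL: only the host name comes from the abstract.
ENDPOINT = "http://sparql.bioontology.org/sparql"


def build_label_query(limit=10):
    """Build a SPARQL query listing terms and their rdfs:label values."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?term ?label WHERE {\n"
        "  ?term rdfs:label ?label .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query, endpoint=ENDPOINT):
    """Encode the query for an HTTP GET request (SPARQL protocol 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


query = build_label_query(limit=5)
url = build_request_url(query)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since the authentication and result-format details of the service are not given in the abstract.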

37 citations


Proceedings Article
01 Jan 2012
TL;DR: While ontology design patterns provide a vehicle for formally capturing recurring models and best practices in ontology design, a case study of widely used biomedical ontologies shows that their use today is limited.
Abstract: Ontology design patterns (ODPs) are a proposed solution to facilitate ontology development, and to help users avoid some of the most frequent modeling mistakes. ODPs originate from similar approaches in software engineering, where software design patterns have become a critical aspect of software development. There is little empirical evidence for ODP prevalence or effectiveness thus far. In this work, we determine the use and applicability of ODPs in a case study of biomedical ontologies. We encoded ontology design patterns from two ODP catalogs. We then searched for these patterns in a set of eight ontologies. We found five patterns of the 69 patterns. Two of the eight ontologies contained these patterns. While ontology design patterns provide a vehicle for capturing formally reoccurring models and best practices in ontology design, we show that today their use in a case study of widely used biomedical ontologies is limited.

24 citations


Proceedings Article
03 Nov 2012
TL;DR: A methodology is presented for deriving a kind of abstraction network, called a partial-area taxonomy, for the Ontology of Clinical Research (OCRe), and the generalizability of the paradigm of the derivation methodology to various families of biomedical ontologies is discussed.
Abstract: An abstraction network is an auxiliary network of nodes and links that provides a compact, high-level view of an ontology. Such a view lends support to ontology orientation, comprehension, and quality-assurance efforts. A methodology is presented for deriving a kind of abstraction network, called a partial-area taxonomy, for the Ontology of Clinical Research (OCRe). OCRe was selected as a representative of ontologies implemented using the Web Ontology Language (OWL) based on shared domains. The derivation of the partial-area taxonomy for the Entity hierarchy of OCRe is described. Utilizing the visualization of the content and structure of the hierarchy provided by the taxonomy, the Entity hierarchy is audited, and several errors and inconsistencies in OCRe's modeling of its domain are exposed. After appropriate corrections are made to OCRe, a new partial-area taxonomy is derived. The generalizability of the paradigm of the derivation methodology to various families of biomedical ontologies is discussed.

22 citations


12 Nov 2012
TL;DR: The adoption of ODPs from two popular ODP libraries among the ontologies in BioPortal, a large ontology repository that contains over 300 biomedical ontologies, is determined, and it is suggested that ODPs may be developed in a bottom-up fashion, much like software-design patterns.
Abstract: Ontology Design Patterns (ODPs) provide a means to capture best practice, to prevent modeling errors, and to encode formally common modeling situations for use during ontology development. Despite the popularity of ODPs and supposed positive effects from their use, there is scant empirical evidence of their level of adoption in real world ontologies or on their effectiveness. Knowing the goals of ODPs, they may assist in the development of large-scale biomedical ontologies. Before studying ODP effectiveness and applicability, we ask the following questions to understand better the landscape of ODP use: Are ODPs used in biomedical ontologies? Which patterns do the ontology developers use? In which ontologies? How frequently are patterns used? To answer these questions, we determined the adoption of ODPs from two popular ODP libraries among the ontologies in BioPortal, a large ontology repository that contains over 300 biomedical ontologies. We encoded 68 ODPs from two online libraries in the Ontology Pre-Processor Language, and, using these encodings, determined ODP prevalence in BioPortal ontologies. We found modest use of ODPs, with 33% of the ontologies containing at least one pattern. Upper Level Ontology, Closure, and Value Partition were the three most commonly used patterns, occurring in 20%, 9%, and 6% of the BioPortal ontologies, respectively. The low prevalence of ODPs may be due to lack of proper tooling, lack of user knowledge of and education about them, the age of the ontologies in the repository, or the specificity of some ODPs. We noted that there is a tension between the high expressivity of many ODPs and the goal of maintaining low expressivity of some biomedical ontologies. Additional tooling is necessary to make ODPs more accessible to domain experts. Furthermore, we suggest that ODPs may be developed in a bottom-up fashion, much like software-design patterns.
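
The prevalence figures reported in this abstract are shares of ontologies, which can be computed as in the minimal sketch below. It assumes that pattern detection has already produced, for each ontology, the set of pattern names found in it. The ontology names and detection results are toy data, not results from the study.

```python
from collections import Counter


def pattern_prevalence(matches):
    """Compute ODP prevalence from per-ontology detection results.

    matches maps an ontology name to the set of pattern names found in it.
    Returns (share of ontologies with at least one pattern,
             {pattern name: share of ontologies containing it}).
    """
    total = len(matches)
    if total == 0:
        return 0.0, {}
    with_any = sum(1 for patterns in matches.values() if patterns)
    per_pattern = Counter(
        name for patterns in matches.values() for name in patterns
    )
    shares = {name: count / total for name, count in per_pattern.items()}
    return with_any / total, shares


# Toy detection results for five hypothetical ontologies.
toy_matches = {
    "ontology_a": {"Upper Level Ontology", "Closure"},
    "ontology_b": {"Upper Level Ontology"},
    "ontology_c": set(),
    "ontology_d": set(),
    "ontology_e": {"Value Partition"},
}

any_share, pattern_shares = pattern_prevalence(toy_matches)
```

Because each ontology contributes a set, a pattern that occurs many times in one ontology is still counted once, which matches a "percentage of ontologies" reading of the reported numbers.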

16 citations


Proceedings Article
23 Mar 2012
TL;DR: iCAT Analytics is a novel web-based tool for systematically investigating crowd-based processes in knowledge-production systems; it supports interactive exploration of pragmatic aspects of ontology engineering, such as how a given ontology evolved and the nature of the changes, discussions and interactions that took place during its production process.
Abstract: While in the past taxonomic and ontological knowledge was traditionally produced by small groups of co-located experts, today the production of such knowledge has a radically different shape and form. For example, potentially thousands of health professionals, scientists, and ontology experts will collaboratively construct, evaluate and maintain the most recent version of the International Classification of Diseases (ICD-11), a large ontology of diseases and causes of deaths managed by the World Health Organization. In this work, we present a novel web-based tool, iCAT Analytics, that allows systematic investigation of crowd-based processes in knowledge-production systems. To enable such investigation, the tool supports interactive exploration of pragmatic aspects of ontology engineering, such as how a given ontology evolved and the nature of the changes, discussions and interactions that took place during its production process. While iCAT Analytics was motivated by ICD-11, it could potentially be applied to any crowd-based ontology-engineering project. We give an introduction to the features of iCAT Analytics and present some insights specifically for ICD-11.

15 citations


12 Nov 2012
TL;DR: A survey of the language expressivity required to express the ODPs contained in the two main ODP catalogs, ODP.org and ODPS.sf.net, found that most of the ODPs cannot be incorporated into ontologies that are constrained to fit into one of the OWL 2 profiles.
Abstract: In recent years there has been a large amount of research into capturing, publishing and analysing Ontology Design Patterns (ODPs). However, there has not been any analysis into the typical language expressivity required to represent ODPs and how these requirements sit with lightweight fragments of the widely used ontology language OWL. In this paper we therefore present a survey on the language expressivity required to express the ODPs contained in the two main ODP catalogs: ODP.org and ODPS.sf.net. We surveyed a total of 104 machine processable ODPs and found that the OWL representations of these patterns typically require highly expressive fragments of the OWL language such as ALCHIN, SHOIN, SHOIQ and SROIQ. We observed that most ODPs required the use of inverse properties, cardinality restrictions and universal restrictions, and that 10 patterns require OWL 2 constructs such as property chains, disjoint properties and qualified cardinality restrictions that are not available in OWL 1. Moreover, we found that most of the ODPs cannot be incorporated into ontologies that are constrained to fit into one of the OWL 2 profiles. Specifically, only 12 out of the 104 ODPs surveyed can be represented in OWL2EL, 13 in OWL2RL and 23 in OWL2QL. Despite this, we conjecture that it may be possible to rewrite and weaken some of them so that modellers using lightweight fragments of OWL can incorporate ODPs into their ontologies.

14 citations


01 Jan 2012
TL;DR: This work introduces a principled computational framework and methodology for automated discovery of context-specific functional links between ontologies and proposes a heuristic pruning technique as an efficient algorithm for inferring such links.
Abstract: We introduce a principled computational framework and methodology for automated discovery of context-specific functional links between ontologies. Our model leverages disparate free-text literature resources to score the model of dependency linking two terms under a context against their model of independence. We identify linked terms as those having a significant Bayes factor (p < 0.01). To scale our algorithm over massive ontologies, we propose a heuristic pruning technique as an efficient algorithm for inferring such links. We have applied this method to translationalize Gene Ontology to all other ontologies available at the National Center for Biomedical Ontology (NCBO) BioPortal under the context of the Human Disease ontology. Our results show that, in addition to broadening the scope of hypotheses for researchers, our work can potentially be used to explore the continuum of relationships among ontologies to guide various biological experiments.
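
The abstract does not spell out the two models being compared, so the following is only a stand-in illustration of scoring dependence against independence with a Bayes factor. It uses a textbook Dirichlet-multinomial comparison on a 2x2 co-occurrence table (documents within the context, cross-classified by whether each of two terms is mentioned), with symmetric Dirichlet priors as an assumed choice. It is not the authors' model, and the counts are invented.

```python
from math import lgamma


def log_multivariate_beta(alphas):
    """Log of the multivariate beta function B(alpha_1, ..., alpha_k)."""
    return sum(lgamma(a) for a in alphas) - lgamma(sum(alphas))


def log_marginal(counts, prior):
    """Log marginal likelihood of counts under a symmetric Dirichlet prior.

    The multinomial coefficient is omitted; it is identical in both models
    and cancels in the Bayes factor.
    """
    posterior = [n + prior for n in counts]
    return log_multivariate_beta(posterior) - log_multivariate_beta(
        [prior] * len(counts)
    )


def log_bayes_factor(table, prior=1.0):
    """Log Bayes factor of dependence over independence for a 2x2 table.

    table = [[both terms, only term A], [only term B, neither term]].
    Positive values favour dependence, negative values favour independence.
    """
    (a, b), (c, d) = table
    dependent = log_marginal([a, b, c, d], prior)
    independent = (
        log_marginal([a + b, c + d], prior)    # row margins (term A)
        + log_marginal([a + c, b + d], prior)  # column margins (term B)
    )
    return dependent - independent


# Invented counts: terms that always co-occur versus terms that are unrelated.
linked = log_bayes_factor([[50, 0], [0, 50]])
unlinked = log_bayes_factor([[25, 25], [25, 25]])
```

With these counts the perfectly co-occurring pair gets a positive log Bayes factor and the balanced table a negative one. How the authors turn such a score into the reported significance threshold is not described in the abstract.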

5 citations


01 Jan 2012
TL;DR: This paper will discuss different recommendation techniques from the literature, map and apply these categories to the domain of collaboratively engineered biomedical ontologies and present prototypical implementations of selected recommendation techniques.
Abstract: Biomedical ontologies such as the 11th revision of the International Classification of Diseases and others are increasingly produced with the help of collaborative ontology engineering platforms that facilitate cooperation and coordination among a large number of users and contributors. While collaborative approaches to engineering biomedical ontologies can be expected to yield a number of advantages, such as increased participation and coverage, they come with a number of novel challenges and risks. For example, they might suffer from low participation, lack of coordination, lack of control or other related problems that are neither well understood nor addressed by the current state of research. In this paper, we aim to tackle some of these problems by exploring techniques for recommending concepts to experts on collaborative ontology engineering platforms. In detail, this paper will (i) discuss different recommendation techniques from the literature (ii) map and apply these categories to the domain of collaboratively engineered biomedical ontologies and (iii) present prototypical implementations of selected recommendation techniques as