
Showing papers on "Ontology (information science) published in 2002"


01 Jan 2002
TL;DR: An ontology defines a common vocabulary for researchers who need to share information in a domain that includes machine-interpretable definitions of basic concepts in the domain and relations among them.
Abstract: Why develop an ontology? In recent years the development of ontologies—explicit formal specifications of the terms in the domain and relations among them (Gruber 1993)—has been moving from the realm of Artificial Intelligence laboratories to the desktops of domain experts. Ontologies have become common on the World-Wide Web. The ontologies on the Web range from large taxonomies categorizing Web sites (such as on Yahoo!) to categorizations of products for sale and their features (such as on Amazon.com). The WWW Consortium (W3C) is developing the Resource Description Framework (Brickley and Guha 1999), a language for encoding knowledge on Web pages to make it understandable to electronic agents searching for information. The Defense Advanced Research Projects Agency (DARPA), in conjunction with the W3C, is developing DARPA Agent Markup Language (DAML) by extending RDF with more expressive constructs aimed at facilitating agent interaction on the Web (Hendler and McGuinness 2000). Many disciplines now develop standardized ontologies that domain experts can use to share and annotate information in their fields. Medicine, for example, has produced large, standardized, structured vocabularies such as SNOMED (Price and Spackman 2000) and the semantic network of the Unified Medical Language System (Humphreys and Lindberg 1993). Broad general-purpose ontologies are emerging as well. For example, the United Nations Development Program and Dun & Bradstreet combined their efforts to develop the UNSPSC ontology which provides terminology for products and services (www.unspsc.org). An ontology defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them. Why would someone want to develop an ontology? Some of the reasons are:

4,838 citations


Book
28 Feb 2002
TL;DR: The authors present an ontology learning framework that extends typical ontology engineering environments by using semiautomatic ontology construction tools and encompasses ontology import, extraction, pruning, refinement and evaluation.
Abstract: The Semantic Web relies heavily on formal ontologies to structure data for comprehensive and transportable machine understanding. Thus, the proliferation of ontologies factors largely in the Semantic Web's success. The authors present an ontology learning framework that extends typical ontology engineering environments by using semiautomatic ontology construction tools. The framework encompasses ontology import, extraction, pruning, refinement and evaluation.

2,061 citations


Book ChapterDOI
01 Oct 2002
TL;DR: This paper introduces the DOLCE upper level ontology, the first module of a Foundational Ontologies Library being developed within the WonderWeb project, and suggests that a comparison with WordNet's top-level taxonomy could lead to an "ontologically sweetened" WordNet.
Abstract: In this paper we introduce the DOLCE upper level ontology, the first module of a Foundational Ontologies Library being developed within the WonderWeb project. DOLCE is presented here in an intuitive way; the reader should refer to the project deliverable for a detailed axiomatization. A comparison with WordNet's top-level taxonomy of nouns is also provided, which shows how DOLCE, used in addition to the OntoClean methodology, helps isolate and understand some of WordNet's major semantic limitations. We suggest that such analysis could hopefully lead to an "ontologically sweetened" WordNet, meant to be conceptually more rigorous, cognitively transparent, and efficiently exploitable in several applications.

1,100 citations


Proceedings ArticleDOI
07 May 2002
TL;DR: Glue is described, a system that employs machine learning techniques to find semantic mappings between ontologies and is distinguished in that it works with a variety of well-defined similarity notions and that it efficiently incorporates multiple types of knowledge.
Abstract: Ontologies play a prominent role on the Semantic Web. They make possible the widespread publication of machine understandable data, opening myriad opportunities for automated information processing. However, because of the Semantic Web's distributed nature, data on it will inevitably come from many different ontologies. Information processing across ontologies is not possible without knowing the semantic mappings between their elements. Manually finding such mappings is tedious, error-prone, and clearly not possible at the Web scale. Hence, the development of tools to assist in the ontology mapping process is crucial to the success of the Semantic Web. We describe glue, a system that employs machine learning techniques to find such mappings. Given two ontologies, for each concept in one ontology glue finds the most similar concept in the other ontology. We give well-founded probabilistic definitions to several practical similarity measures, and show that glue can work with all of them. This is in contrast to most existing approaches, which deal with a single similarity measure. Another key feature of glue is that it uses multiple learning strategies, each of which exploits a different type of information either in the data instances or in the taxonomic structure of the ontologies. To further improve matching accuracy, we extend glue to incorporate commonsense knowledge and domain constraints into the matching process. For this purpose, we show that relaxation labeling, a well-known constraint optimization technique used in computer vision and other fields, can be adapted to work efficiently in our context. Our approach is thus distinguished in that it works with a variety of well-defined similarity notions and that it efficiently incorporates multiple types of knowledge. We describe a set of experiments on several real-world domains, and show that glue proposes highly accurate semantic mappings.
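
The instance-based flavor of similarity described above can be sketched as follows. This is a minimal illustration only: the concept names and instance sets are invented, and the real system learns classifiers to estimate the joint probability distribution rather than intersecting observed instance sets directly.

```python
# Sketch: instance-based concept similarity between two ontologies.
# Each concept is approximated by the set of instances labeled with it;
# similarity is the Jaccard coefficient P(A and B) / P(A or B).

def jaccard_similarity(instances_a, instances_b):
    """Estimate sim(A, B) = P(A ∩ B) / P(A ∪ B) from instance sets."""
    a, b = set(instances_a), set(instances_b)
    if not a | b:
        return 0.0
    return len(a & b) / len(a | b)

def best_match(concept, ontology_a, ontology_b):
    """For a concept in ontology A, return the most similar concept in B."""
    return max(ontology_b,
               key=lambda c: jaccard_similarity(ontology_a[concept],
                                                ontology_b[c]))

# Toy ontologies mapping concept names to observed instances.
ont_a = {"Faculty": {"smith", "jones", "lee"}, "Course": {"cs101", "cs202"}}
ont_b = {"Staff": {"smith", "jones", "kim"}, "Class": {"cs101", "cs202", "cs303"}}

print(best_match("Faculty", ont_a, ont_b))  # Staff
```

In this toy setting "Faculty" maps to "Staff" because they share two of four distinct instances, while "Faculty" and "Class" share none.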

1,027 citations


Proceedings ArticleDOI
07 May 2002
TL;DR: This paper defines the semantics for a relevant subset of DAML-S in terms of a first-order logical language and provides decision procedures for Web service simulation, verification and composition.
Abstract: Web services -- Web-accessible programs and devices - are a key application area for the Semantic Web. With the proliferation of Web services and the evolution towards the Semantic Web comes the opportunity to automate various Web services tasks. Our objective is to enable markup and automated reasoning technology to describe, simulate, compose, test, and verify compositions of Web services. We take as our starting point the DAML-S DAML+OIL ontology for describing the capabilities of Web services. We define the semantics for a relevant subset of DAML-S in terms of a first-order logical language. With the semantics in hand, we encode our service descriptions in a Petri Net formalism and provide decision procedures for Web service simulation, verification and composition. We also provide an analysis of the complexity of these tasks under different restrictions to the DAML-S composite services we can describe. Finally, we present an implementation of our analysis techniques. This implementation takes as input a DAML-S description of a Web service, automatically generates a Petri Net and performs the desired analysis. Such a tool has broad applicability both as a back end to existing manual Web service composition tools, and as a stand-alone tool for Web service developers.

953 citations


Book ChapterDOI
01 Oct 2002
TL;DR: A set of ontology similarity measures and a multiple-phase empirical evaluation are presented for measuring the similarity between ontologies for the task of detecting and retrieving relevant ontologies.
Abstract: Ontologies now play an important role for many knowledge-intensive applications for which they provide a source of precisely defined terms. However, with their wide-spread usage there come problems concerning their proliferation. Ontology engineers or users frequently have a core ontology that they use, e.g., for browsing or querying data, but they need to extend it with, adapt it to, or compare it with the large set of other ontologies. For the task of detecting and retrieving relevant ontologies, one needs means for measuring the similarity between ontologies. We present a set of ontology similarity measures and a multiple-phase empirical evaluation.
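
One simple building block for such inter-ontology measures is lexical overlap between the two term sets. The sketch below is illustrative only: the threshold, term lists, and use of edit-distance-style matching are assumptions, not the authors' exact definitions.

```python
# Sketch: lexical-overlap similarity between two ontologies' term sets.
from difflib import SequenceMatcher

def string_similarity(a, b):
    """Edit-distance-based similarity of two concept labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def lexical_overlap(terms_a, terms_b, threshold=0.8):
    """Fraction of terms in A that have a close lexical match in B."""
    if not terms_a:
        return 0.0
    hits = sum(1 for t in terms_a
               if any(string_similarity(t, u) >= threshold for u in terms_b))
    return hits / len(terms_a)

core = ["Person", "Project", "Publication"]
other = ["Person", "Projects", "Article"]
print(lexical_overlap(core, other))  # 2 of 3 core terms find a close match
```

Here "Person" matches exactly and "Project" matches "Projects" above the threshold, while "Publication" finds no close label, giving an overlap of 2/3.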

847 citations


Journal ArticleDOI
01 Jun 2002
TL;DR: A taxonomy for characterizing Web data extraction tools is proposed, major Web data extraction tools described in the literature are briefly surveyed, and a qualitative analysis of them is provided.
Abstract: In the last few years, several works in the literature have addressed the problem of data extraction from Web pages. The importance of this problem derives from the fact that, once extracted, the data can be handled in a way similar to instances of a traditional database. The approaches proposed in the literature to address the problem of Web data extraction use techniques borrowed from areas such as natural language processing, languages and grammars, machine learning, information retrieval, databases, and ontologies. As a consequence, they present very distinct features and capabilities which make a direct comparison difficult. In this paper, we propose a taxonomy for characterizing Web data extraction tools, briefly survey major Web data extraction tools described in the literature, and provide a qualitative analysis of them. Hopefully, this work will stimulate other studies aimed at a more comprehensive analysis of data extraction approaches and tools for Web data.

760 citations


Journal ArticleDOI
01 Jul 2002
TL;DR: This paper provides an overview of the four main components of the Pathway Tools: the PathoLogic component supports creation of new PGDBs from the annotated genome of an organism, and the Pathway/Genome Navigator provides query, visualization, and Web-publishing services for PGDBs.
Abstract: Motivation: Bioinformatics requires reusable software tools for creating model-organism databases (MODs). Results: The Pathway Tools is a reusable, production-quality software environment for creating a type of MOD called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc (see http://ecocyc.org) integrates our evolving understanding of the genes, proteins, metabolic network, and genetic network of an organism. This paper provides an overview of the four main components of the Pathway Tools: The PathoLogic component supports creation of new PGDBs from the annotated genome of an organism. The Pathway/Genome Navigator provides query, visualization, and Web-publishing services for PGDBs. The Pathway/Genome Editors support interactive updating of PGDBs. The Pathway Tools ontology defines the schema of PGDBs. The Pathway Tools makes use of the Ocelot object database system for data management services for PGDBs. The Pathway Tools has been used to build PGDBs for 13 organisms within SRI and by external

665 citations


Book
01 Dec 2002
TL;DR: Towards the Semantic Web focuses on the application of Semantic Web technology, and ontologies in particular, to electronically available information to improve the quality of knowledge management in large and distributed organizations.
Abstract: From the Publisher: "Towards the Semantic Web focuses on the application of Semantic Web technology and ontologies in particular to electronically available information to improve the quality of knowledge management in large and distributed organizations. Covering the key technologies for the next generation of the WWW, this book is a mixture of theory, tools and applications in an important area of WWW research." Aimed primarily at researchers and developers in the area of WWW-based knowledge management and information retrieval. It will also be a useful reference for students in computer science at the postgraduate level, academic and industrial researchers in the field, business managers who are aiming to increase the corporations' information infrastructure and industrial personnel who are tracking WWW technology developments in order to understand the business implications.

647 citations


Book ChapterDOI
09 Jun 2002
TL;DR: This paper focuses on collaborative development of ontologies with OntoEdit which is guided by a comprehensive methodology.
Abstract: Ontologies now play an important role for enabling the semantic web. They provide a source of precisely defined terms e.g. for knowledge-intensive applications. The terms are used for concise communication across people and applications. Typically the development of ontologies involves collaborative efforts of multiple persons. OntoEdit is an ontology editor that integrates numerous aspects of ontology engineering. This paper focuses on collaborative development of ontologies with OntoEdit which is guided by a comprehensive methodology.

422 citations


Journal ArticleDOI
TL;DR: The creation of a general ontology characterizing the conduct of knowledge management is described.
Abstract: Creating a general ontology characterizing the conduct of knowledge management.

Book ChapterDOI
01 Oct 2002
TL;DR: MAFRA is presented, an interactive, incremental and dynamic framework for mapping distributed ontologies in the Semantic Web, and aims to balance the autonomy of each community with the need for interoperability.
Abstract: Ontologies as means for conceptualizing and structuring domain knowledge within a community of interest are seen as a key to realize the Semantic Web vision. However, the decentralized nature of the Web makes achieving this consensus across communities difficult, thus, hampering efficient knowledge sharing between them. In order to balance the autonomy of each community with the need for interoperability, mapping mechanisms between distributed ontologies in the Semantic Web are required. In this paper we present MAFRA, an interactive, incremental and dynamic framework for mapping distributed ontologies.

Book ChapterDOI
01 Oct 2002
TL;DR: This paper identifies a possible six-phase evolution process and introduces the concept of an evolution strategy encapsulating policy for evolution with respect to the user's requirements, focusing on providing the user with capabilities to control and customize the process.
Abstract: With rising importance of knowledge interchange, many industrial and academic applications have adopted ontologies as their conceptual backbone. However, industrial and academic environments are very dynamic, thus inducing changes to application requirements. To fulfill these changes, often the underlying ontology must be evolved as well. As ontologies grow in size, the complexity of change management increases, thus requiring a well-structured ontology evolution process. In this paper we identify a possible six-phase evolution process and focus on providing the user with capabilities to control and customize it. We introduce the concept of an evolution strategy encapsulating policy for evolution with respect to the user's requirements.

Journal ArticleDOI
01 Dec 2002
TL;DR: The DOGMA ontology engineering approach is introduced that separates "atomic" conceptual relations from "predicative" domain rules and a layer of "relatively generic" ontological commitments that hold the domain rules.
Abstract: Ontologies in current computer science parlance are computer based resources that represent agreed domain semantics. Unlike data models, the fundamental asset of ontologies is their relative independence of particular applications, i.e. an ontology consists of relatively generic knowledge that can be reused by different kinds of applications/tasks. The first part of this paper concerns some aspects that help to understand the differences and similarities between ontologies and data models. In the second part we present an ontology engineering framework that supports and favours the genericity of an ontology. We introduce the DOGMA ontology engineering approach that separates "atomic" conceptual relations from "predicative" domain rules. A DOGMA ontology consists of an ontology base that holds sets of intuitive context-specific conceptual relations and a layer of "relatively generic" ontological commitments that hold the domain rules. This constitutes what we shall call the double articulation of a DOGMA ontology.

01 Jan 2002
TL;DR: The development and application of a large formal ontology to the Semantic Web is discussed; this upper ontology is extremely broad in scope and can serve as a semantic foundation for search, interoperation, and communication on the Semantic Web.
Abstract: In this paper we discuss the development and application of a large formal ontology to the semantic web. The Suggested Upper Merged Ontology (SUMO) (Niles & Pease, 2001) (SUMO, 2002) is a “starter document” in the IEEE Standard Upper Ontology effort. This upper ontology is extremely broad in scope and can serve as a semantic foundation for search, interoperation, and communication on the semantic web.

Book ChapterDOI
01 Oct 2002
TL;DR: MnM is presented, an annotation tool which provides both automated and semi-automated support for annotating web pages with semantic contents; it integrates a web browser with an ontology editor and provides open APIs to link to ontology servers and to integrate information extraction tools.
Abstract: An important precondition for realizing the goal of a semantic web is the ability to annotate web resources with semantic information. In order to carry out this task, users need appropriate representation languages, ontologies, and support tools. In this paper we present MnM, an annotation tool which provides both automated and semi-automated support for annotating web pages with semantic contents. MnM integrates a web browser with an ontology editor and provides open APIs to link to ontology servers and for integrating information extraction tools. MnM can be seen as an early example of the next generation of ontology editors, being web-based, oriented to semantic markup and providing mechanisms for large-scale automatic markup of web pages.

Patent
24 Dec 2002
TL;DR: In this article, a system including software components that efficiently and dynamically analyzes changes to data sources, including application programs, is disclosed; the system simultaneously re-codes dynamic adapters between the data sources.
Abstract: A system, including software components, that efficiently and dynamically analyzes changes to data sources, including application programs, within an integration environment and simultaneously re-codes dynamic adapters between the data sources is disclosed. The system also monitors at least two of said data sources to detect similarities within the data structures of said data sources and generates new dynamic adapters to integrate said at least two of said data sources. The system also provides real-time error validation of dynamic adapters as well as performance optimization of newly created dynamic adapters that have been generated under changing environmental conditions.


Journal Article
TL;DR: DAML+OIL is an ontology language specifically designed for use on the Web; it exploits existing Web standards (XML and RDF), adding the familiar ontological primitives of object oriented and frame based systems, and the formal rigor of a very expressive description logic.
Abstract: Ontologies are set to play a key role in the “Semantic Web”, extending syntactic interoperability to semantic interoperability by providing a source of shared and precisely defined terms. DAML+OIL is an ontology language specifically designed for use on the Web; it exploits existing Web standards (XML and RDF), adding the familiar ontological primitives of object oriented and frame based systems, and the formal rigor of a very expressive description logic. The logical basis of the language means that reasoning services can be provided, both to support ontology design and to make DAML+OIL described Web resources more accessible to automated processes.

Book ChapterDOI
01 Oct 2002
TL;DR: The topic of ontology versioning in the context of the Web is analyzed by looking at the characteristics of the version relation between ontologies and at the identification of online ontologies, and the design of a web-based system is described.
Abstract: To effectively use ontologies on the Web, it is essential that changes in ontologies are managed well. This paper analyzes the topic of ontology versioning in the context of the Web by looking at the characteristics of the version relation between ontologies and at the identification of online ontologies. Then, it describes the design of a web-based system that helps users to manage changes in ontologies. The system helps to keep different versions of web-based ontologies interoperable, by maintaining not only the transformations between ontologies, but also the conceptual relation between concepts in different versions. The system allows ontology engineers to compare versions of an ontology and to specify these conceptual relations. For the visualization of differences, it uses an adaptable rule-based mechanism that finds and classifies changes in RDF-based ontologies.

Proceedings ArticleDOI
28 Jul 2002
TL;DR: The PROMPTDIFF algorithm is developed, which integrates different heuristic matchers for comparing ontology versions in a fixed-point manner, using the results of one matcher as an input for others until the matchers produce no more changes.
Abstract: As ontology development becomes a more ubiquitous and collaborative process, the developers face the problem of maintaining versions of ontologies akin to maintaining versions of software code in large software projects. Versioning systems for software code provide mechanisms for tracking versions, checking out versions for editing, comparing different versions, and so on. We can directly reuse many of these mechanisms for ontology versioning. However, version comparison for code is based on comparing text files--an approach that does not work for comparing ontologies. Two ontologies can be identical but have different text representation. We have developed the PROMPTDIFF algorithm, which integrates different heuristic matchers for comparing ontology versions. We combine these matchers in a fixed-point manner, using the results of one matcher as an input for others until the matchers produce no more changes. The current implementation includes ten matchers but the approach is easily extendable to an arbitrary number of matchers. Our evaluation showed that PROMPTDIFF correctly identified 96% of the matches in ontology versions from large projects.
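
The fixed-point combination of matchers described above can be sketched as follows: each matcher sees the matches accumulated so far and may add new ones, and all matchers are rerun until an iteration produces no change. The matcher internals below are illustrative stand-ins, not the actual PROMPTDIFF heuristics.

```python
# Sketch: fixed-point combination of heuristic matchers over two
# ontology versions, each represented as a {frame: parent} map.

def name_matcher(v1, v2, matches):
    """Match frames whose names are identical in both versions."""
    return {(a, a) for a in v1 if a in v2}

def parent_matcher(v1, v2, matches):
    """Illustrative heuristic: match frames whose parents are already
    matched and whose names differ only by case."""
    new = set()
    matched = dict(matches)
    for a, pa in v1.items():
        for b, pb in v2.items():
            if matched.get(pa) == pb and a.lower() == b.lower():
                new.add((a, b))
    return new

def fixed_point_diff(v1, v2, matchers):
    """Rerun all matchers on the accumulated matches until nothing changes."""
    matches = set()
    changed = True
    while changed:
        changed = False
        for m in matchers:
            found = m(v1, v2, matches)
            if not found <= matches:
                matches |= found
                changed = True
    return matches

old_version = {"Thing": None, "Animal": "Thing", "dog": "Animal"}
new_version = {"Thing": None, "Animal": "Thing", "Dog": "Animal"}
print(sorted(fixed_point_diff(old_version, new_version,
                              [name_matcher, parent_matcher])))
```

Note that the name matcher alone cannot pair "dog" with "Dog"; only after it has matched their parents can the second matcher use that result, which is precisely what the fixed-point loop enables.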

Book ChapterDOI
15 Jul 2002
TL;DR: This paper proposes an approach to semantic search that matches conceptual graphs, calculating semantic similarities between concepts, relations, and whole conceptual graphs from detailed definitions of semantic similarity.
Abstract: Semantic search becomes a research hotspot. The combined use of linguistic ontologies and structured semantic matching is one of the promising ways to improve both recall and precision. In this paper, we propose an approach for semantic search by matching conceptual graphs. The detailed definitions of semantic similarities between concepts, relations and conceptual graphs are given. According to these definitions of semantic similarity, we propose our conceptual graph matching algorithm that calculates the semantic similarity. The computation complexity of this algorithm is constrained to be polynomial. A prototype of our approach is currently under development with IBM China Research Lab.
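
A common building block for such concept-level similarities is a path-based measure over the type hierarchy of the linguistic ontology, e.g. in the Wu-Palmer style. The taxonomy and formula below are an illustrative example, not the paper's own definitions.

```python
# Sketch: path-based concept similarity in a type hierarchy
# (Wu-Palmer style): sim = 2 * depth(lcs) / (depth(c1) + depth(c2)).

TAXONOMY = {  # child -> parent; a toy type hierarchy
    "Entity": None, "Animal": "Entity", "Dog": "Animal",
    "Cat": "Animal", "Machine": "Entity",
}

def ancestors(c):
    """Path from concept c up to the root, inclusive."""
    path = []
    while c is not None:
        path.append(c)
        c = TAXONOMY[c]
    return path

def wu_palmer(c1, c2):
    """Similarity via the lowest common subsumer; depths counted from 1."""
    p1, p2 = ancestors(c1), ancestors(c2)
    lcs = next(a for a in p1 if a in p2)   # lowest common subsumer
    return 2 * len(ancestors(lcs)) / (len(p1) + len(p2))

print(wu_palmer("Dog", "Cat"))      # close: shared parent Animal
print(wu_palmer("Dog", "Machine"))  # distant: only root Entity shared
```

"Dog" and "Cat" score higher than "Dog" and "Machine" because their lowest common subsumer sits deeper in the hierarchy; relation and graph similarities in such approaches are then aggregated from these concept scores.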

Proceedings ArticleDOI
01 Dec 2002
TL;DR: A detailed investigation of the properties of these information-content-based measures is presented, and various properties of GO are examined, which may have implications for its future design.
Abstract: Many bioinformatics resources hold data in the form of sequences. Often this sequence data is associated with a large amount of annotation. In many cases this data has been hard to model, and has been represented as scientific natural language, which is not readily computationally amenable. The development of the Gene Ontology provides us with a more accessible representation of some of this data. However it is not clear how this data can best be searched, or queried. Recently we have adapted information content based measures for use with the Gene Ontology (GO). In this paper we present detailed investigation of the properties of these measures, and examine various properties of GO, which may have implications for its future design.
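
The information-content idea adapted here is that a term's informativeness is the negative log of its annotation probability, and two terms are as similar as their most informative common ancestor (Resnik's measure). The sketch below uses an invented, tree-shaped fragment of a GO-like hierarchy with made-up annotation counts.

```python
# Sketch: information-content similarity over a GO-like hierarchy.
# IC(c) = -log p(c); sim(t1, t2) = IC of the most informative common ancestor.
import math

PARENTS = {  # simplified term -> parent (the real GO is a DAG)
    "root": None, "metabolism": "root", "glycolysis": "metabolism",
    "transport": "root",
}
COUNTS = {"root": 100, "metabolism": 40, "glycolysis": 10, "transport": 25}

def ic(term):
    """Information content: -log of the term's annotation probability."""
    return -math.log(COUNTS[term] / COUNTS["root"])

def ancestors(term):
    """All ancestors of a term, including the term itself."""
    out = set()
    while term is not None:
        out.add(term)
        term = PARENTS[term]
    return out

def resnik(t1, t2):
    """IC of the most informative common ancestor of t1 and t2."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic(a) for a in common)

print(resnik("glycolysis", "transport"))   # only 'root' shared, so IC = 0
print(resnik("glycolysis", "metabolism"))  # -log(0.4), 'metabolism' dominates
```

Terms whose only common ancestor is the root score zero, reflecting that the root subsumes every annotation and carries no information.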

Journal ArticleDOI
TL;DR: The survey identifies that shallow information extraction and natural language processing techniques are deployed to extract concepts or classes from free-text or semi-structured data, but relation extraction is a very complex and difficult issue to resolve and has turned out to be the main impediment to ontology learning and applicability.
Abstract: Ontology is an important emerging discipline that has the huge potential to improve information organization, management and understanding. It has a crucial role to play in enabling content-based access, interoperability, communications, and providing qualitatively new levels of services on the next wave of web transformation in the form of the Semantic Web. The issues pertaining to ontology generation, mapping and maintenance are critical key areas that need to be understood and addressed. This survey is presented in two parts. The first part reviews the state-of-the-art techniques and work done on semi-automatic and automatic ontology generation, as well as the problems facing such research. The second complementary survey is dedicated to ontology mapping and ontology ‘evolving’. Through this survey, we have identified that shallow information extraction and natural language processing techniques are deployed to extract concepts or classes from free-text or semi-structured data. However, relation extrac...

Journal ArticleDOI
TL;DR: This survey earmarked several application classes that benefit from using ontologies, including natural language processing, intelligent information retrieval, virtual organizations, and simulation and modeling.
Abstract: Ontological engineering has garnered increasing attention over the last few years, as researchers have recognized ontologies are not just for knowledge-based systems---all software needs models of the world, and hence can make use of ontologies at design time [1]. A recent survey of the field [4] suggests developers of practical AI systems may especially benefit from their use. This survey earmarked several application classes that benefit from using ontologies, including natural language processing, intelligent information retrieval (especially from the Internet), virtual organizations, and simulation and modeling.

01 Jan 2002
TL;DR: This paper presents the process by which over the last 15 years several ontologies of varying complexity have been mapped or integrated with Cyc, a large commonsense knowledge base.
Abstract: The advent of Web services, and the Semantic Web described by domain ontologies, highlight the bottleneck to their growth: ontology mapping, merging, and integration. In this paper we present the process by which over the last 15 years several ontologies of varying complexity have been mapped or integrated with Cyc, a large commonsense knowledge base. These include SENSUS, FIPS 10-4, several large (300k-term) pharmaceutical thesauri, large portions of WordNet, MeSH/Snomed/UMLS, and the CIA World Factbook. This has to date required trained ontologists talking with subject matter experts. To break that bottleneck – to enable subject matter experts to directly map/merge/integrate their ontologies – we have been developing interactive clarification-dialog-based tools.

Proceedings ArticleDOI
24 Aug 2002
TL;DR: A supervised learning method is presented that considers the local context surrounding the entity as well as more global semantic information derived from topic signatures and WordNet, and reinforces this method with an algorithm that takes advantage of the presence of entities in multiple contexts.
Abstract: While Named Entity extraction is useful in many natural language applications, the coarse categories that most NE extractors work with prove insufficient for complex applications such as Question Answering and Ontology generation. We examine one coarse category of named entities, persons, and describe a method for automatically classifying person instances into eight finer-grained subcategories. We present a supervised learning method that considers the local context surrounding the entity as well as more global semantic information derived from topic signatures and WordNet. We reinforce this method with an algorithm that takes advantage of the presence of entities in multiple contexts.

Patent
26 Mar 2002
TL;DR: In this paper, a distributed ontology system including a central computer comprising a global ontology directory, a plurality of ontology server computers, each including a repository of class and relation definitions, and a server for responding to queries relating to class and relations definitions in the repository, is described.
Abstract: A distributed ontology system including a central computer comprising a global ontology directory, a plurality of ontology server computers, each including a repository of class and relation definitions, and a server for responding to queries relating to class and relation definitions in the repository, and a computer network connecting the central computer with the plurality of ontology server computers. A method is also described and claimed.

Book ChapterDOI
19 Aug 2002
TL;DR: The definition of a set of similarity measures for comparing ontology-based metadata and an application study using these measures within a hierarchical clustering algorithm are proposed.
Abstract: The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Recently, different applications based on this vision have been designed, e.g. in the fields of knowledge management, community web portals, e-learning, multimedia retrieval, etc. It is obvious that the complex metadata descriptions generated on the basis of pre-defined ontologies serve as perfect input data for machine learning techniques. In this paper we propose an approach for clustering ontology-based metadata. Main contributions of this paper are the definition of a set of similarity measures for comparing ontology-based metadata and an application study using these measures within a hierarchical clustering algorithm.
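
Plugging a metadata similarity measure into hierarchical clustering can be sketched as below. The data, the set-overlap measure, and the single-link merge rule are illustrative assumptions, not the measures the paper defines.

```python
# Sketch: single-link agglomerative clustering over instances described
# by sets of ontology terms, using set overlap as the similarity measure.
from itertools import combinations

def sim(a, b):
    """Jaccard overlap of two term sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def agglomerate(items):
    """Repeatedly merge the most similar pair of clusters; return merges."""
    clusters = {name: {name} for name in items}
    history = []
    while len(clusters) > 1:
        x, y = max(combinations(clusters, 2),
                   key=lambda p: max(sim(items[i], items[j])
                                     for i in clusters[p[0]]
                                     for j in clusters[p[1]]))
        clusters[x] |= clusters.pop(y)
        history.append((x, y))
    return history

docs = {  # hypothetical metadata: instance -> ontology terms
    "d1": {"Person", "Project"},
    "d2": {"Person", "Publication"},
    "d3": {"Event", "Venue"},
}
print(agglomerate(docs))  # d1 and d2 merge first (shared term 'Person')
```

The interesting design point is that the clustering algorithm is generic; all domain knowledge enters through the similarity measure, which is exactly where ontology-based measures are substituted.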

Journal ArticleDOI
TL;DR: The potential for information retrieval at different levels of granularity inside the framework of information systems based on ontologies, which leads to ontology-driven geographic information systems.
Abstract: The integration of information of different kinds, such as spatial and alphanumeric at different levels of detail, is a challenge. While a complete solution has not been reached, it is widely recognized that the need to integrate information is so pressing that it does not matter if detail is lost, as long as integration is achieved. This paper shows the potential for information retrieval at different levels of granularity inside the framework of information systems based on ontologies. Ontologies are theories that use a specific vocabulary to describe entities, classes, properties and functions related to a certain view of the world. The use of an ontology, translated into an active information system component, leads to ontology-driven information systems and, in the specific case of GIS, leads to what we call ontology-driven geographic information systems.