Ontology-based data integration

About: Ontology-based data integration is a research topic. Over its lifetime, 11,065 publications have been published within this topic, receiving 216,888 citations.


Papers
Journal Article
TL;DR: The focus of this work is on cross ontology methods which are capable of computing the semantic similarity between terms stemming from different ontologies (WordNet and MeSH in this work).
Abstract: Semantic similarity relates to computing the similarity between concepts (terms) which are not necessarily lexically similar. We investigate approaches for computing semantic similarity by mapping terms to an ontology and by examining their relationships in that ontology. More specifically, we investigate approaches to computing the semantic similarity between natural language terms (using WordNet as the underlying reference ontology) and between medical terms (using the MeSH ontology of medical and biomedical terms). The most popular semantic similarity methods are implemented and evaluated using WordNet and MeSH. The focus of this work is also on cross-ontology methods which are capable of computing the semantic similarity between terms stemming from different ontologies (WordNet and MeSH in this work). This is a far more difficult problem than the single-ontology one referred to above, and it has not been investigated adequately in the literature. X-Similarity, a novel cross-ontology similarity method, is also a contribution of this work. All methods examined in this work are integrated into a semantic similarity system which is accessible on the Web.
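
As a rough illustration of "mapping terms to an ontology and examining their relationships in that ontology", the sketch below scores two terms by the number of is-a edges between them in a single hierarchy. It is a generic path-length measure, not X-Similarity and not necessarily one of the methods evaluated in the paper. The toy hierarchy and the names `PARENT`, `ancestors`, `path_length`, and `path_similarity` are assumptions made for this example and are not taken from WordNet or MeSH.

```python
from typing import Dict, List, Optional

# Hypothetical toy is-a hierarchy (child -> parent); NOT real WordNet or MeSH data.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def ancestors(term: str) -> List[str]:
    """Return the chain from `term` up to the root, starting with `term` itself."""
    chain: List[str] = []
    node: Optional[str] = term
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def path_length(a: str, b: str) -> int:
    """Count the is-a edges between a and b via their lowest common ancestor."""
    steps_from_b = {node: i for i, node in enumerate(ancestors(b))}
    for steps_from_a, node in enumerate(ancestors(a)):
        if node in steps_from_b:
            # First shared node on the way up from `a` is the lowest common ancestor.
            return steps_from_a + steps_from_b[node]
    raise ValueError("terms share no common ancestor")


def path_similarity(a: str, b: str) -> float:
    """Map path length to (0, 1]: identical terms score 1.0, distant terms approach 0."""
    return 1.0 / (1.0 + path_length(a, b))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, path_similarity(*pair))
```

In this toy hierarchy, "dog" and "cat" are two edges apart while "dog" and "car" are four edges apart, so the first pair scores higher. A cross-ontology method has no such shared hierarchy to walk, which is one way to see why the abstract calls that setting harder.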

143 citations

Journal Article
TL;DR: The GSA framework incorporates the ability to automatically and semi-automatically extract metadata from syntactically and semantically heterogeneous and multimodal data from diverse sources, with integral support for what the authors term spatiotemporal thematic proximity (STTP) reasoning and interactive visualization capabilities.
Abstract: Geospatial ontology development and semantic knowledge discovery addresses the need for modeling, analyzing and visualizing multimodal information, and is unique in offering integrated analytics that encompasses spatial, temporal and thematic dimensions of information and knowledge. The comprehensive ability to provide integrated analysis from multiple forms of information and use of explicit knowledge make this approach unique. This also involves specification of spatiotemporal thematic ontologies and populating such ontologies with high quality knowledge. Such ontologies form the basis for defining the meaning of important relation terms, such as near or surrounded by, and enable computation of spatiotemporal thematic proximity measures we define. SWETO (Semantic Web Technology Evaluation Ontology) and geospatial extension SWETO-GS are examples of these ontologies. The Geospatial Semantics Analytics (GSA) framework incorporates: (1) the ability to automatically and semi-automatically extract metadata from syntactically (including unstructured, semi-structured and structured data) and semantically heterogeneous and multimodal data from diverse sources; and (2) analytical processing that exploits these ontologies and associated knowledge bases, with integral support for what we term spatiotemporal thematic proximity (STTP) reasoning and interactive visualization capabilities. This paper discusses the results of our geospatial ontology development efforts as well as some new semantic analytics methods on this ontology such as STTP.

142 citations

Journal Article
TL;DR: The Gene Ontology (GO) and the Mouse Genome Informatics (MGI) database are used as use cases to illustrate the impact of bio-ontologies on data integration and for comparative genomics.

142 citations

01 Jan 2002
TL;DR: Criteria for evaluating ontology-development tools and tools for mapping, aligning, or merging ontologies are presented, and the resources that the community needs to develop in order to make performance comparisons within each group of merging and mapping tools useful and effective are discussed.
Abstract: The appearance of a large number of ontology tools may leave a user looking for an appropriate tool overwhelmed and uncertain on which tool to choose. Thus evaluation and comparison of these tools is important to help users determine which tool is best suited for their tasks. However, there is no “one size fits all” comparison framework for ontology tools: different classes of tools require very different comparison frameworks. For example, ontology-development tools can easily be compared to one another since they all serve the same task: define concepts, instances, and relations in a domain. Tools for ontology merging, mapping, and alignment however are so different from one another that direct comparison may not be possible. They differ in the type of input they require (e.g., instance data or no instance data), the type of output they produce (e.g., one merged ontology, pairs of related terms, articulation rules), modes of interaction and so on. This diversity makes comparing the performance of mapping tools to one another largely meaningless. We present criteria that partition the set of such tools in smaller groups allowing users to choose the set of tools that best fits their tasks. We discuss what resources we as a community need to develop in order to make performance comparisons within each group of merging and mapping tools useful and effective. These resources will most likely come as results of evaluation experiments of stand-alone tools. As an example of such an experiment, we discuss our experiences and results in evaluating PROMPT, an interactive ontology-merging tool. Our experiment produced some of the resources that we can use in more general evaluation. However, it has also shown that comparing the performance of different tools can be difficult since human experts do not agree on how ontologies should be merged, and we do not yet have a good enough metric for comparing ontologies. 
1 Ontology-Mapping Tools Versus Ontology-Development Tools

Consider two types of ontology tools: (1) tools for developing ontologies and (2) tools for mapping, aligning, or merging ontologies. By ontology-development tools (which we will call development tools in the paper) we mean ontology editors that allow users to define new concepts, relations, and instances. These tools usually have capabilities for importing and extending existing ontologies. Development tools may include graphical browsers, search capabilities, and constraint checking. Protégé-2000 [17], OntoEdit [19], OilEd [2], WebODE [1], and Ontolingua [7] are some examples of development tools.

Tools for mapping, aligning, and merging ontologies (which we will call mapping tools) are the tools that help users find similarities and differences between source ontologies. Mapping tools either identify potential correspondences automatically or provide the environment for the users to find and define these correspondences, or both. Mapping tools are often extensions of development tools. Mapping tool and algorithm examples include PROMPT [16], ONION [13], Chimaera [11], FCA-Merge [18], GLUE [5], and OBSERVER [12].

Even though theories on how to evaluate either type of tool are not well articulated at this point, there are already several frameworks for evaluating ontology-development tools. For example, Duineveld and colleagues [6] in their comparison experiment used different development tools to represent the same domain ontology. Members of the Ontology-environments SIG in the OntoWeb initiative designed an extensive set of criteria for evaluating ontology-development tools and applied these criteria to compare a number of projects.
Some of the aspects that these frameworks compare include:
– interoperability with other tools and the ability to import and export ontologies in different representation languages;
– expressiveness of the knowledge model;
– scalability and extensibility;
– availability and capabilities of inference services;
– usability of the tools.

Let us turn to the second class of ontology tools: tools for mapping, aligning, or merging ontologies. It is tempting to reuse many of the criteria from evaluation of development tools. For example, expressiveness of the underlying language is important and so is scalability and extensibility. We need to know if a mapping tool can work with ontologies from different languages. However, if we look at the mapping tools more closely, we see that their comparison and evaluation must be very different from the comparison and evaluation of development tools. All the ontology-development tools have very similar inputs and desired outputs: we have a domain, possibly a set of ontologies to reuse, and a set of requirements for the ontology, and we need to use a tool to produce an ontology of that domain satisfying the requirements. Unlike the ontology-development tools, the ontology-mapping tools vary with respect to the precise task that they perform, the inputs on which they operate and the outputs that they produce.

1 http://delicias.dia.fi.upm.es/ontoweb/sig-tools/

First, the tasks for which the mapping tools are designed differ greatly. On the one hand, all the tools are designed to find similarities and differences between source ontologies in one way or another. In fact, researchers have suggested a uniform framework for describing and analyzing this information regardless of what the final task is [3, 10]. On the other hand, from the user’s point of view the tools differ greatly in what tasks this analysis of similarities and differences supports.
For example, Chimaera and PROMPT allow users to merge source ontologies into a new ontology that includes concepts from both sources. The output of ONION is a set of articulation rules between two ontologies; these rules define what the similarities and differences are. The articulation rules can later be used for querying and other tasks. The task of GLUE, AnchorPROMPT [14] and FCA-Merge is to provide a set of pairs of related concepts with some certainty factor associated with each pair.

Second, different mapping tools rely on different inputs: Some tools deal only with class hierarchies of the sources and are agnostic in their merging algorithms about slots or instances (e.g., Chimaera). Other tools use not only classes but also slots and value restrictions in their analysis (e.g., PROMPT). Other tools rely in their algorithms on the existence of instances in each of the source ontologies (e.g., GLUE). Yet another set of tools requires not only that instances are present, but also that source ontologies share a set of instances (e.g., FCA-Merge). Some tools work independently and produce suggestions to the user at the end, allowing the user to analyze the suggestions (e.g., GLUE, FCA-Merge). Some tools expect that the source ontologies follow a specific knowledge-representation paradigm (e.g., Description Logic for OBSERVER). Other tools rely heavily on interaction with the user and base their analysis not only on the source ontologies themselves but also on the merging or alignment steps that the user performs (e.g., PROMPT, Chimaera).

Third, since the tasks that the mapping tools support differ greatly, the interaction between a user and a tool is very different from one tool to another.
Some tools provide a graphical interface which allows users to compare the source ontologies visually, and accept or reject the results of the tool analysis (e.g., PROMPT, Chimaera, ONION); the goal of other tools is to run the algorithms which find correlations between the source ontologies and output the results to the user in a text file or on the terminal. The users must then use the results outside the tool itself.

The goal of this paper is to start a discussion on a framework for evaluating ontology-mapping tools that would account for this great variety in underlying assumptions and requirements. We argue that many of the tools cannot be compared directly with one another because they are so different in the tasks that they support. We identify the criteria for determining the groups of tools that can be compared directly, define what resources we need to develop to make such comparison possible and discuss our experiences in evaluating our merging tool, PROMPT, as well as the results of this evaluation.

2 Requirements for Evaluating Mapping Tools

Before we discuss the evaluation requirements for mapping tools, we must answer the following question which will certainly affect the requirements: what is the goal of such potential evaluation? It is tempting to say “find the best tool.” However, as we have just discussed, given the diversity in the tasks that the tools support, their modes of interaction, and the input data they rely on, it is impossible to compare the tools to one another and to find one or even several measures to identify the “best” tool. Therefore, we suggest that the questions driving such evaluation must be user-oriented. A user may ask either what is the best tool for his task or whether a particular tool is good enough for his task. Depending on what the user’s source ontologies are, how much manual work he is willing to put in, and how important the precision of the results is, one or another tool will be more appropriate.
Therefore, the first set of evaluation criteria are pragmatic criteria. These criteria include but are not limited to the following:

Input requirements: What elements from the source ontologies does the tool use? Which of these elements does the tool require? This information may include: concept names, class hierarchy, slot definitions, facet values, slot values, instances. Does the tool require that source ontologies use a particular knowledge-representation paradigm?

Level of user interaction: Does the tool perform the comparison in a “batch mode,” presenting the results at the end, or is it an interactive tool where intermediate results are analyzed by the user, and the tool uses the feedback for further analysis?

Type o
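
The pragmatic criteria in the excerpt above (what a tool requires as input, how it interacts with the user, what it outputs) can be read as a filter over tool profiles. The following is a minimal sketch under that reading, not the authors' framework: the `ToolProfile` fields, the `candidate_tools` function, and the output labels are hypothetical names, and the four profiles are a simplified transcription of how the excerpt describes Chimaera, PROMPT, GLUE, and FCA-Merge.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToolProfile:
    """Pragmatic profile of a mapping tool (field names are hypothetical)."""
    name: str
    needs_instances: bool         # relies on instances in each source ontology
    needs_shared_instances: bool  # additionally needs instances shared by the sources
    interactive: bool             # analysis driven by user steps, vs. batch suggestions
    output: str                   # "merged ontology" or "related pairs"


# Simplified transcription of the excerpt's descriptions; not an official catalogue.
TOOLS: List[ToolProfile] = [
    ToolProfile("Chimaera", False, False, True, "merged ontology"),
    ToolProfile("PROMPT", False, False, True, "merged ontology"),
    ToolProfile("GLUE", True, False, False, "related pairs"),
    ToolProfile("FCA-Merge", True, True, False, "related pairs"),
]


def candidate_tools(
    tools: List[ToolProfile],
    have_instances: bool,
    have_shared_instances: bool,
    wanted_output: str,
) -> List[str]:
    """Return names of tools whose input requirements the user's ontologies can
    satisfy and whose output type matches the user's task."""
    names: List[str] = []
    for tool in tools:
        if tool.needs_instances and not have_instances:
            continue
        if tool.needs_shared_instances and not have_shared_instances:
            continue
        if tool.output != wanted_output:
            continue
        names.append(tool.name)
    return names


if __name__ == "__main__":
    # A user whose ontologies have instances, but none in common, and who wants pairs.
    print(candidate_tools(TOOLS, True, False, "related pairs"))
```

Only tools that land in the same filtered group are candidates for a direct performance comparison, which mirrors the excerpt's argument that tools with different inputs and outputs cannot be ranked against one another.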

142 citations

Journal Article
TL;DR: Five different case studies that illustrate the use of ontologies in metadata representation, in global conceptualization, in high-level querying, in declarative mediation, and in mapping support are discussed.
Abstract: In this paper, we discuss the use of ontologies for data integration. We consider two different settings depending on the system architecture: central and peer-to-peer data integration. Within those settings, we discuss five different case studies that illustrate the use of ontologies in metadata representation, in global conceptualization, in high-level querying, in declarative mediation, and in mapping support. Each case study is described in detail and accompanied by examples.

141 citations


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
84% related
Graph (abstract data type)
69.9K papers, 1.2M citations
84% related
Software development
73.8K papers, 1.4M citations
84% related
User interface
85.4K papers, 1.7M citations
84% related
Support vector machine
73.6K papers, 1.7M citations
83% related
Performance Metrics
No. of papers in the topic in previous years
Year  Papers
2023  37
2022  149
2021  11
2020  11
2019  19
2018  43