
Showing papers in "Journal on Data Semantics in 2008"


Book ChapterDOI
TL;DR: This paper presents a new ontology language, based on Description Logics, that is particularly suited to reasoning over large numbers of instances, and a novel mapping language that is able to deal with the so-called impedance mismatch problem.
Abstract: Many organizations nowadays face the problem of accessing existing data sources by means of flexible mechanisms that are both powerful and efficient. Ontologies are widely considered a suitable formal tool for sophisticated data access. The ontology expresses the domain of interest of the information system at a high level of abstraction, and the relationship between data at the sources and instances of concepts and roles in the ontology is expressed by means of mappings. In this paper we present a solution to the problem of designing effective systems for ontology-based data access. Our solution is based on three main ingredients. First, we present a new ontology language, based on Description Logics, that is particularly suited to reasoning over large numbers of instances. The second ingredient is a novel mapping language that is able to deal with the so-called impedance mismatch problem, i.e., the problem arising from the difference between the basic elements managed by the sources, namely data, and the elements managed by the ontology, namely objects. The third ingredient is the query answering method, which combines reasoning at the level of the ontology with specific mechanisms for both taking into account the mappings and efficiently accessing the data at the sources.
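
To make the impedance mismatch concrete: sources store values, while the ontology speaks about objects, so a mapping must construct object identifiers out of data. The following sketch is a hypothetical illustration (the table, concept, and role names are invented, and this is not the paper's mapping language):

```python
# Hypothetical sketch: constructing ontology-level objects from source values.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])

assertions = []
for emp_id, name in conn.execute("SELECT id, name FROM emp"):
    obj = f"emp({emp_id})"                     # value -> object identifier
    assertions.append(("Employee", obj))       # concept assertion
    assertions.append(("hasName", obj, name))  # role linking object to value
print(assertions)
```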

884 citations


Book ChapterDOI
TL;DR: This paradigm, based on the idea of harvesting the Semantic Web, i.e., automatically finding and exploring multiple and heterogeneous online knowledge sources to derive mappings, has a promising baseline precision of 70% and is complementary to existing techniques.
Abstract: In this paper we propose an ontology matching paradigm based on the idea of harvesting the Semantic Web, i.e., automatically finding and exploring multiple and heterogeneous online knowledge sources to derive mappings. We adopt an experimental approach in the context of matching two real-life, large-scale ontologies to investigate the potential of this paradigm, its limitations, and its relation to other techniques. Our experiments yielded a promising baseline precision of 70% and identified a set of critical issues that need to be considered to achieve the full potential of the paradigm. Besides providing good performance as a stand-alone matcher, our paradigm is complementary to existing techniques and could therefore be used in hybrid tools that would further advance the state of the art in the ontology matching field.
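
The core intuition, reduced to a toy: a correspondence between two concepts is accepted when some independently found online ontology already relates them. The sketch below uses tiny in-memory stand-ins for harvested ontologies and is not the authors' system:

```python
# Toy stand-ins for background ontologies harvested from the Semantic Web.
HARVESTED = [
    {("Duck", "Bird"): "subClassOf"},
    {("Bird", "Animal"): "subClassOf"},
]

def derive_mapping(a, b):
    """Accept a mapping between a and b if some harvested source relates them."""
    for onto in HARVESTED:
        rel = onto.get((a, b))
        if rel:
            return (a, rel, b)
    return None  # no online evidence found

print(derive_mapping("Duck", "Bird"))  # ('Duck', 'subClassOf', 'Bird')
print(derive_mapping("Duck", "Tool"))  # None
```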

146 citations


Book ChapterDOI
TL;DR: Flexible querying of RDF data is explored, with the aim of making it possible to return data satisfying query conditions with varying degrees of exactness, and to rank the results of a query depending on how "closely" they satisfy the query conditions.
Abstract: We explore flexible querying of RDF data, with the aim of making it possible to return data satisfying query conditions with varying degrees of exactness, and also to rank the results of a query depending on how "closely" they satisfy the query conditions. We make queries more flexible by logical relaxation of their conditions based on RDFS entailment and RDFS ontologies. We develop a notion of ranking of query answers, and present a query processing algorithm for incrementally computing the relaxed answer of a query. Our approach has application in scenarios where there is a lack of understanding of the ontology underlying the data, or where the data objects have heterogeneous sets of properties or irregular structures.
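
A toy rendering of the ranking idea (not the paper's algorithm): a class condition is progressively relaxed up an RDFS subclass chain, and each answer is ranked by how many relaxation steps it needed. The class hierarchy and data below are invented:

```python
# Invented RDFS subclass chain and instance data.
SUBCLASS = {"Professor": "AcademicStaff", "AcademicStaff": "Person"}
DATA = {"alice": "Professor", "bob": "AcademicStaff", "carol": "Person"}

def relaxations(cls):
    """Yield (class, cost): the class itself, then ever-broader superclasses."""
    cost = 0
    while cls is not None:
        yield cls, cost
        cls, cost = SUBCLASS.get(cls), cost + 1

def flexible_answers(query_class):
    """Answers ranked by relaxation distance, most exact first."""
    ranked = []
    for cls, cost in relaxations(query_class):
        ranked += [(who, cost) for who, c in DATA.items() if c == cls]
    return ranked

print(flexible_answers("Professor"))
# [('alice', 0), ('bob', 1), ('carol', 2)]
```

A full treatment would rely on RDFS entailment rather than on the asserted types matched here.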

66 citations


Book ChapterDOI
TL;DR: In this paper, a simple and expressive model that honors the time dependence of the road network is proposed to support the design of efficient algorithms for computing frequent queries on the network.
Abstract: Given applications such as location-based services and the spatio-temporal queries they may pose on a spatial network (e.g., road networks), the goal is to develop a simple and expressive model that honors the time dependence of the road network. The model must support the design of efficient algorithms for computing the frequent queries on the network. This problem is challenging due to the potentially conflicting requirements of model simplicity and support for efficient algorithms. Time-expanded networks, which have been used to model dynamic networks, employ replication of the network across time instants, resulting in high storage overhead and algorithms that are computationally expensive. In contrast, the proposed time-aggregated graphs do not replicate nodes and edges across time; rather, they allow the properties of edges and nodes to be modeled as a time series. Since the model does not replicate the entire graph for every instant of time, it uses less memory, and the algorithms for common operations are computationally more efficient than for time-expanded networks. One important query on spatio-temporal networks is the computation of shortest paths. Shortest paths can be computed either for a given start time or to find the start time and the path that lead to least-travel-time journeys (best start time journeys). Developing efficient algorithms for computing shortest paths in a time-variant spatial network is challenging because these journeys do not always display the optimal prefix property, making techniques like dynamic programming inapplicable. In this paper, we propose algorithms for shortest path computation for a fixed start time. We present an analytical cost model for the algorithm and compare its performance with that of existing algorithms.
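
A hedged sketch of both ideas (an illustration, not the authors' algorithm): each edge stores a travel-time series instead of the graph being replicated per time instant, and a fixed-start-time shortest path then falls out of a standard time-dependent Dijkstra that looks up each edge's cost at the entry time, assuming FIFO edge behavior:

```python
import heapq

# TRAVEL[(u, v)][t] = travel time of edge (u, v) when entered at time slot t.
TRAVEL = {
    ("A", "B"): [5, 9, 9],  # congested after slot 0
    ("A", "C"): [7, 7, 7],
    ("B", "D"): [4, 4, 4],
    ("C", "D"): [3, 8, 8],
}

def edge_cost(u, v, t):
    series = TRAVEL[(u, v)]
    return series[min(int(t), len(series) - 1)]  # clamp beyond the series

def earliest_arrival(src, dst, start):
    """Time-dependent Dijkstra for a fixed start time."""
    best = {src: start}
    pq = [(start, src)]
    while pq:
        t, u = heapq.heappop(pq)
        if u == dst:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale queue entry
        for (a, b) in TRAVEL:
            if a != u:
                continue
            arrive = t + edge_cost(a, b, t)
            if arrive < best.get(b, float("inf")):
                best[b] = arrive
                heapq.heappush(pq, (arrive, b))
    return float("inf")

print(earliest_arrival("A", "D", start=0))  # 9: A->B at cost 5, B->D at cost 4
```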

63 citations


Book ChapterDOI
TL;DR: The goal of this paper is to identify various aspects of context-awareness needed to facilitate semantic integration of data, and to discuss how this knowledge may be represented within ontologies.
Abstract: The goal of this paper is to identify various aspects of context-awareness needed to facilitate semantic integration of data, and to discuss how this knowledge may be represented within ontologies. We first present a taxonomy of ontologies and show how various kinds of ontologies may cooperate. Then, we compare ontologies and conceptual models. We claim that their main difference is the consensual nature of ontologies, whereas conceptual models are specifically designed for one particular target system. Reaching consensus, in turn, requires specific models in which context dependency has been represented and minimized. We identify five principles for making ontologies less contextual than models and suitable for data integration, and we show, as an example, how these principles have been implemented in the PLIB ontology model developed for industrial data integration. Finally, we suggest a road map for switching from conventional databases to ontology-based databases without waiting until standard ontologies are available in every domain.

55 citations


Book ChapterDOI
TL;DR: This paper presents an extended classification of automated ontology matching, proposes an automatic composite solution to the matching problem based on cooperation, and compares the model with three state-of-the-art matching systems.
Abstract: This paper proposes a cooperative approach to composite ontology mapping. We first present an extended classification of automated ontology matching and propose an automatic composite solution to the matching problem based on cooperation. In our proposal, agents apply individual mapping algorithms and cooperate in order to change their individual results. We assume that the approaches are complementary to each other and that their combination produces better results than the individual ones. Next, we compare our model with three state-of-the-art matching systems. The results are promising, especially with respect to precision and recall. Finally, we propose an argumentation formalism as an extension of our initial model. We compare our argumentation model with the matching systems, showing improvements in the results.
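
The composition idea in miniature (the two matchers and the averaging rule below are trivial stand-ins, not the paper's agents or their negotiation protocol):

```python
from difflib import SequenceMatcher

def string_matcher(a, b):
    """Character-level similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_matcher(a, b):
    """Jaccard overlap of underscore-separated tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)

MATCHERS = [string_matcher, token_matcher]

def composite_score(a, b):
    """Combine complementary matchers; averaging stands in for cooperation."""
    scores = [m(a, b) for m in MATCHERS]
    return sum(scores) / len(scores)

for pair in [("author_name", "name_of_author"), ("author_name", "isbn")]:
    print(pair, round(composite_score(*pair), 2))
```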

49 citations


Journal Article
TL;DR: This paper proposes an approach to resolving unsatisfiable ontologies which is fine-grained in the sense that it allows parts of axioms to be changed, and revises the axiom tracing technique first proposed by Baader and Hollunder so as to track which parts of the problematic axioms cause the unsatisfiability.
Abstract: The ability to deal with inconsistencies and to evaluate the impact of possible solutions for resolving them is of the utmost importance in real-world ontology applications. The common approaches either identify the minimally unsatisfiable sub-ontologies or the maximally satisfiable sub-ontologies. However, there is little work that addresses the issue of rewriting the ontology: it is not clear which axioms or which parts of axioms should be repaired, nor how to repair those axioms. In this paper, we address these limitations by proposing an approach to resolving unsatisfiable ontologies which is fine-grained in the sense that it allows parts of axioms to be changed. We revise the axiom tracing technique first proposed by Baader and Hollunder so as to track which parts of the problematic axioms cause the unsatisfiability. Moreover, we have developed a tool to support the ontology user in rewriting problematic axioms. In order to minimise the impact of changes and prevent unintended entailment loss, both harmful and helpful changes are identified and reported to the user. Finally, we present an evaluation of our interactive debugging tool and demonstrate its applicability in practice.
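
For contrast with the paper's fine-grained tracing, the classic black-box alternative works at the granularity of whole axioms: repeatedly drop an axiom and keep the reduction whenever unsatisfiability persists. The sketch below uses a toy stand-in for a DL reasoner and only illustrates the shape of the problem:

```python
def minimal_unsat_subset(axioms, is_unsatisfiable):
    """Shrink an axiom set to a minimal subset preserving unsatisfiability."""
    core = list(axioms)
    for ax in list(core):
        trial = [a for a in core if a != ax]
        if is_unsatisfiable(trial):  # still broken without `ax`?
            core = trial             # then `ax` is not part of the conflict
    return core

# Toy "reasoner": C is unsatisfiable iff both conflicting axioms are present.
axioms = ["C subClassOf A", "C subClassOf not A", "B subClassOf A"]
unsat = lambda axs: {"C subClassOf A", "C subClassOf not A"} <= set(axs)
print(minimal_unsat_subset(axioms, unsat))
# ['C subClassOf A', 'C subClassOf not A']
```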

46 citations


Book ChapterDOI
Peter Mork, Len Seligman, Arnon Rosenthal, Joel G. Korb, Chris Wolf
TL;DR: This paper provides a task model for schema integration, breaking the task down by the relationships between the source schemata and the target schema, and uses this breakdown to motivate a workbench for schema integration in which multiple tools share a common knowledge repository.
Abstract: A key aspect of any data integration endeavor is determining the relationships between the source schemata and the target schema. This schema integration task must be tackled regardless of the integration architecture or mapping formalism. In this paper, we provide a task model for schema integration and use this breakdown to motivate a workbench for schema integration in which multiple tools share a common knowledge repository. In particular, the workbench facilitates the interoperation of research prototypes for schema matching (which automatically identify likely semantic correspondences) with commercial schema mapping tools (which help produce instance-level transformations). Currently, each of these tools provides its own ad hoc representation of schemata and mappings; combining the tools requires aligning these representations. The workbench provides a common representation so that these tools can more rapidly be combined.
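
A minimal sketch of what a shared record in such a repository might look like (the field names are invented, not the workbench's actual schema): matchers write candidate correspondences, and mapping tools later enrich the same records with instance-level transformations:

```python
from dataclasses import dataclass

@dataclass
class Correspondence:
    source_path: str      # element in the source schema
    target_path: str      # element in the target schema
    confidence: float     # produced by a schema matcher
    transform: str = ""   # filled in later by a mapping tool

repo = [Correspondence("emp/salary", "staff/pay", 0.92)]  # from a matcher
repo[0].transform = "pay = salary * 12"                   # from a mapping tool
print(repo[0])
```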

36 citations


Book ChapterDOI
TL;DR: The proposed similarity measure combines and extends several existing similarities and is explicitly parameterised according to the criteria induced by the context.
Abstract: In this paper we propose an asymmetric semantic similarity among instances within an ontology. We aim to define a measure of semantic similarity that exploits as much as possible the knowledge stored in the ontology, taking into account different hints hidden in the ontology definition. The proposed similarity measure combines and extends several existing similarities. Moreover, the similarity assessment is explicitly parameterised according to the criteria induced by the context. The parameterisation aims to assist the user in decision making pertaining to similarity evaluation, as the criteria can be refined according to user needs. Experiments and an evaluation of the similarity assessment are presented, showing the efficiency of the method.
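
One generic way to obtain such an asymmetric, parameterised measure is a Tversky-style comparison of feature sets, shown here purely as an illustration (it is not the paper's formula): the weights alpha and beta play the role of context-induced criteria, and swapping the arguments changes the score:

```python
def asymmetric_sim(props_a, props_b, alpha=0.8, beta=0.2):
    """Tversky-style similarity; alpha/beta weight each side's own features."""
    common = len(props_a & props_b)
    only_a = len(props_a - props_b)
    only_b = len(props_b - props_a)
    return common / (common + alpha * only_a + beta * only_b)

espresso = {"drink", "coffee", "hot", "small"}
coffee = {"drink", "coffee", "hot"}
print(asymmetric_sim(espresso, coffee))  # ~0.79
print(asymmetric_sim(coffee, espresso))  # ~0.94: asymmetry by design
```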

31 citations


Book ChapterDOI
TL;DR: This paper presents a novel approach using the capabilities of semantic technologies in order to improve cross-organisational modelling by automatic generation and evolution of model transformations.
Abstract: Model-driven software development facilitates faster and more flexible integration of information and communication systems. It divides system descriptions into models of different viewpoints and abstraction levels. To effectively realize cross-organisational collaborations, an important prerequisite is the ability to exchange models between different modelling languages and tools. Knowledge is captured in model transformations, which are continuously adjusted to new modelling formats and tools. However, interoperability problems in modelling can hardly be overcome by solutions that essentially operate at the syntactic level. This paper presents a novel approach that uses the capabilities of semantic technologies to improve cross-organisational modelling through the automatic generation and evolution of model transformations.

28 citations


Book ChapterDOI
TL;DR: This paper analyzes motivations, requirements, and expected results before proposing a reusable SWS-based framework, and demonstrates the application of this framework by showing how integration and interoperability emerge from the model through a cooperative and multi-viewpoint methodology.
Abstract: Joining up services in e-Government usually implies governmental agencies acting in concert without a central control regime. This requires the sharing of scattered and heterogeneous data. Semantic Web Service (SWS) technology can help to integrate, mediate, and reason over these datasets. However, since few real-world applications have been developed, it is still unclear what the actual benefits and issues of adopting such a technology in the e-Government domain are. In this paper, we contribute to raising awareness of the potential benefits in the e-Government community by analyzing motivations, requirements, and expected results, before proposing a reusable SWS-based framework. We demonstrate the application of this framework by showing how integration and interoperability emerge from this model through a cooperative and multi-viewpoint methodology. Finally, we illustrate the added value and lessons learned through two compelling case studies: a change-of-circumstances notification system and a GIS-based emergency planning system, and describe key challenges which remain to be addressed.

Book ChapterDOI
TL;DR: This work describes and evaluates the XTREEM-SG approach to finding sibling semantics in semi-structured Web documents, and investigates how variations in input, parameters, and gold standard influence the obtained results on structuring a closed vocabulary into semantic sibling groups.
Abstract: The acquisition of explicit semantics is still a research challenge. Approaches for the extraction of semantics focus mostly on learning subordination relations. The extraction of coordination relations, also called "sibling relations", is studied much less, though they are no less important in ontology engineering. We describe and evaluate the XTREEM-SG approach to finding sibling semantics in semi-structured Web documents. XTREEM-SG stands for "Xhtml TREE Mining - for Sibling Groups". It uses the XHTML markup that is available in Web content to group together terms that are in a sibling relation to each other. Our approach has the advantage that it is domain- and language-independent; it does not rely on background knowledge, NLP software, or training. We evaluate XTREEM-SG against two gold standard ontologies. We investigate how variations in input, parameters, and gold standard influence the obtained results on structuring a closed vocabulary into semantic sibling groups. Earlier methods that evaluate sibling relations against a gold standard report a 14.18% F-measure on average sibling overlap. Our method improves this number to 22.93%.
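
The markup intuition fits in a few lines (a minimal sketch, not the authors' implementation): terms appearing as items of the same XHTML list are collected as a candidate sibling group:

```python
from html.parser import HTMLParser

class SiblingCollector(HTMLParser):
    """Group the text of <li> items that share the same <ul>/<ol> parent."""
    def __init__(self):
        super().__init__()
        self.groups, self._stack, self._text = [], [], None

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._stack.append([])
        elif tag == "li":
            self._text = ""

    def handle_data(self, data):
        if self._text is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag == "li" and self._stack and self._text is not None:
            self._stack[-1].append(self._text.strip())
            self._text = None
        elif tag in ("ul", "ol") and self._stack:
            self.groups.append(self._stack.pop())

collector = SiblingCollector()
collector.feed("<ul><li>oak</li><li>birch</li><li>maple</li></ul>")
print(collector.groups)  # [['oak', 'birch', 'maple']]
```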

Book ChapterDOI
TL;DR: This paper presents the authors' experience of participating in the Semantic Web Service (SWS) Challenge 2006, where the proposed approach achieved very good results in solving the proposed problems.
Abstract: Although Semantic Web Services are expected to produce a revolution in the development of Web-based systems, very few concrete design experiences are available; only recently have Software Engineering methods and tools started to embrace the deployment of Semantic Web applications. In this paper, we show how classical Software Engineering methods (i.e., formal business process development, computer-aided and component-based software design, and automatic code generation) combine with semantic methods and tools (i.e., ontology engineering, semantic service annotation and discovery) to forge a new approach to software development for the Semantic Web. In particular, we present our experience of participating in the Semantic Web Service (SWS) Challenge 2006, where the proposed approach achieved very good results in solving the proposed problems.

Book ChapterDOI
TL;DR: The intent of the paper is to explain how ORM can be successfully used in the eventual implementation of a data quality firewall, including the details of the architecture of the data quality firewall in an enterprise data warehouse to enable data quality assurance.
Abstract: Data warehouses typically represent data integrated from multiple source systems. There are inherent data quality problems when data is consolidated, in terms of data semantics, master data integration, cross-functional business rule conflicts, data cleansing, etc. This use case demonstrates how multiple Object Role Models were successfully used in the establishment of a Data Quality Firewall architecture to define an Advanced Generation Data Warehouse. The ORM models represented the realization of the 100% principle of the ISO TR9007 Report on Conceptual Schemas, and were then successfully transformed into attribute-based models to generate SQL DBMS schemas. These were subsequently used in RDBMS code generation for a 100% automated implementation of the Data Quality Firewall checks based on the described Advanced Generation Data Warehouse architecture. This same Data Quality Firewall approach has also been successfully used in implementing multiple web-based applications, characteristically yielding representative savings of 35-40% in development costs. The intent of the paper is to explain how ORM can be successfully used in the eventual implementation of a data quality firewall, including the details of the architecture of the data quality firewall in an enterprise data warehouse to enable data quality assurance. It is not within the scope of this paper to address the use or merits of alternative modelling paradigms in this regard.

Book ChapterDOI
TL;DR: This paper introduces a universal architecture for emergent semantics using a central repository within a multi-user environment, based on solid linguistic theories, and implements an information retrieval system supporting term queries on standard information retrieval corpora.
Abstract: Emergent Semantics is a new paradigm for inferring semantic meaning from implicit feedback by a sufficiently large number of users of an object retrieval system. In this paper, we introduce a universal architecture for emergent semantics using a central repository within a multi-user environment, based on solid linguistic theories. On top of this architecture, we have implemented an information retrieval system supporting term queries on standard information retrieval corpora. In contrast to existing query refinement strategies, feedback on the retrieval results is incorporated directly into the actual document representations, improving future retrievals. An evaluation yields higher precision values at the standard recall levels and thus demonstrates the effectiveness of the emergent semantics approach for typical information retrieval problems.
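
The key mechanism, reduced to a sketch (a generic Rocchio-style update standing in for the paper's linguistically grounded model): positive feedback blends the query's terms directly into the matched document's term vector, so the representation drifts toward the vocabulary users actually search with:

```python
from collections import Counter

def update_on_feedback(doc_vec, query_terms, weight=0.5):
    """Fold confirmed-relevant query terms into the document representation."""
    for term in query_terms:
        doc_vec[term] += weight
    return doc_vec

doc = Counter({"jaguar": 3.0, "engine": 2.0})
update_on_feedback(doc, ["car", "speed"])  # user confirmed relevance
print(doc)  # the query terms are now part of the document's representation
```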

Book ChapterDOI
TL;DR: This paper introduces an ontology-based approach to processing parallel colour descriptions from botanical documents, based on a semantic model that takes advantage of ontologies to represent the semantics of colour descriptions precisely, to integrate parallel descriptions according to their semantic distances, and to answer colour-related species identification queries.
Abstract: Information integration and retrieval are useful tasks in many information systems. In these systems, it is far from easy to directly integrate information from natural language (NL) sources, because precisely capturing NL semantics is not a trivial issue in the first place. In this paper, we choose the botanical domain to investigate this issue. While most existing systems in this domain support only keyword-based search, this paper introduces an ontology-based approach to processing parallel colour descriptions from botanical documents. Based on a semantic model, it takes advantage of ontologies so as to represent the semantics of colour descriptions precisely, to integrate parallel descriptions according to their semantic distances, and to answer colour-related species identification queries. To evaluate this approach, we implement a colour reasoner based on the FaCT-DG Description Logic reasoner, and present some results of our experiments on integrating parallel descriptions and answering species identification queries. From this highly specialised domain, we derive a set of more general methodological rules.
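
To illustrate the role of semantic distance in integrating parallel descriptions, here is a numeric toy (standing in for the ontology-and-reasoner machinery the paper actually uses): two colour terms are considered compatible when they lie close enough in a colour space:

```python
# Invented RGB anchors for a few colour terms.
COLOURS = {"red": (255, 0, 0), "crimson": (220, 20, 60), "blue": (0, 0, 255)}

def distance(c1, c2):
    """Euclidean distance between two colour terms in RGB space."""
    return sum((a - b) ** 2 for a, b in zip(COLOURS[c1], COLOURS[c2])) ** 0.5

def compatible(desc1, desc2, threshold=120):
    """Could two parallel descriptions denote the same flower colour?"""
    return distance(desc1, desc2) <= threshold

print(compatible("red", "crimson"))  # True: integrate the two descriptions
print(compatible("red", "blue"))     # False: keep them separate
```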