Minimizing conservativity violations in ontology alignments: algorithms and evaluation

Abstract
In order to enable interoperability between ontology-based systems, ontology matching techniques have been proposed. However, when the generated mappings lead to undesired logical consequences, their usefulness may be diminished. In this paper, we present an approach to detect and minimize the violations of the so-called conservativity principle where novel subsumption entailments between named concepts in one of the input ontologies are considered as unwanted. The practical applicability of the proposed approach is experimentally demonstrated on the datasets from the Ontology Alignment Evaluation Initiative.


City, University of London Institutional Repository
Citation: Solimando, A., Jimenez-Ruiz, E. and Guerrini, G. (2017). Minimizing
conservativity violations in ontology alignments: algorithms and evaluation. Knowledge and
Information Systems, 51(3), pp. 775-819. doi: 10.1007/s10115-016-0983-3
This is the accepted version of the paper.
This version of the publication may differ from the final published
version.
Permanent repository link: https://openaccess.city.ac.uk/id/eprint/22961/
Link to published version: http://dx.doi.org/10.1007/s10115-016-0983-3

Under consideration for publication in Knowledge and Information Systems
Minimizing Conservativity Violations in Ontology Alignments: Algorithms and Evaluation
Alessandro Solimando¹, Ernesto Jiménez-Ruiz², Giovanna Guerrini¹
¹ DIBRIS, Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi; University of Genova; Italy
² Department of Computer Science; University of Oxford; United Kingdom
1. Introduction
Ontologies play a key role in the development of the Semantic Web and are being used in many diverse application domains, ranging from biomedicine to the energy industry. An application domain may have been modeled from different points of view and for different purposes. This situation usually leads to the development of different ontologies that intuitively overlap but use different naming and modeling conventions.
The problem of (semi-)automatically computing mappings between independently developed ontologies is usually referred to as the ontology matching problem. A number of sophisticated ontology matching systems have been developed in recent years [16, 71]. Ontology matching systems, however, rely on lexical and structural heuristics, and the integration of the input ontologies and the mappings may lead to many undesired logical consequences. In [36] three principles were proposed to minimize the number of potentially unintended consequences, namely: (i) consistency principle, the mappings should not lead to unsatisfiable concepts in the integrated ontology, (ii) conservativity
principle, the mappings should not introduce new semantic relationships between concepts from one of the input ontologies, (iii) locality principle, the mappings should link entities that have similar neighbourhoods.
These alignment principles have been actively investigated in recent years (e.g., [32, 33, 36, 54, 56, 57, 66]). Violations of these principles are frequent, even in the reference mapping sets and the alignments generated by the best performing matchers of the Ontology Alignment Evaluation Initiative¹ (OAEI). Manually curated alignments, such as the UMLS Metathesaurus [5] (UMLS),² a comprehensive effort for integrating biomedical knowledge bases, also suffer from these violations [36]. The occurrence of these violations may hinder the usefulness of ontology mappings. Their practical effect is clearly evident when ontology alignments are involved in complex tasks such as query answering [54, 78]. The undesired logical consequences caused by violations can either prevent query answering or cause incorrect results. In order to reduce existing violations, alignment repair methods typically remove a subset of the alignment, given that input ontologies are considered immutable, a common setting in ontology alignment repair scenarios.
Note, however, that the alignment principles differ in nature. Violations of the consistency principle, unlike violations of the conservativity and locality principles, always lead to an undesired logical consequence (i.e., unsatisfiability of a concept) and they should always be avoided. Conservativity and locality violations may also lead to undesired logical consequences; however, they may also represent false positives and reveal incompleteness in one of the input ontologies. In Section 8 we discuss alternative approaches that suggest fixing the input ontologies instead of repairing the alignment (e.g., [7, 48]).
In this paper we focus on conservativity violations and we follow a “better safe than sorry” approach (i.e., we treat violations as undesired consequences caused by the mappings). Conservativity violations come in two flavours, namely subsumption violations and equivalence violations. The potentially large number of conservativity violations requires exploiting the intrinsic characteristics of these two flavours, which results in different approaches for their repair. The detection and correction of subsumption violations relies on the assumption of disjointness [67] and is reduced to a consistency principle violation problem, while equivalence violations are addressed using a combination of graph theory and logic programming. These two methods are combined into a multi-strategy approach addressing both types of violations. Our extensive evaluation supports the effectiveness of the individual and combined approaches in the detection and correction of conservativity violations.
The present paper extends [74, 75] in the following respects: all the experimental evaluations provided here cover both reference alignments and alignments computed by participating systems of the OAEI 2012–2014 campaigns, whereas the previous papers covered only the reference alignments of the OAEI. Compared to [75], the present article fully details the proposed method, including a correctness proof of the technique for adding disjointness clauses to Horn propositional formulas, on which our technique heavily relies. Furthermore, [75] only dealt with the subsumption violations flavour, while in this paper we also cover in detail the equivalence violations flavour. Concerning [74],³ all the technical details and proofs are now provided. In addition, the results of the evaluation of the two possible variants of our combined repair approach are now analyzed, as well as the results for the independent techniques in isolation, which can be used as baseline results. Finally, an empirical assessment of the impact of our repair methods on the alignment quality (in terms of precision, recall and f-measure) is now provided.
¹ http://oaei.ontologymatching.org/
² Alignments from UMLS are extracted according to the method defined in [36].
³ This paper was presented in a workshop without formal proceedings.
The remainder of the paper is organised as follows. Section 2 summarises the basic concepts and definitions we rely on throughout the paper. In Section 3 we introduce our motivating scenario. Section 4 formally states the problem of computing repairs for equivalence violations and presents an algorithm to solve such violations. Section 5 describes the method and algorithm to solve subsumption violations. Section 6 details additional properties of the proposed methods. In Section 7 we present the conducted evaluation. A comparison with relevant related work is provided in Section 8. Finally, Section 9 gives some conclusions and directions for future work.
2. Preliminaries
In this section, we provide the necessary definitions and notions that will be used in the subsequent sections. Section 2.1 briefly introduces OWL 2 and the main elements in an ontology. In Section 2.2 we give a formal definition of ontology mapping and ontology alignment (adapted from [17]) with their semantics. In Section 2.3 we precisely define the semantic consequences imposed by ontology alignments, and we formalize the consistency and conservativity principles. Finally, Section 2.4 covers the necessary preliminaries about graph theory.
2.1. Ontologies and OWL 2
Ontologies play a key role in the development of the Semantic Web and are being used in many diverse application domains, ranging from biomedicine to the energy industry. The most widely used ontology modelling language is the OWL 2 Web Ontology Language [11], which is a World Wide Web Consortium (W3C) recommendation [84]. Description Logics (DL) are the formal underpinning of OWL 2 [3, 30].
An OWL 2 ontology $O$ is equipped with a signature $Sig(O)$, that is, a vocabulary of legal names for the entities appearing in the ontology. $Sig(O)$ is composed of the disjoint union of four finite sets: (i) $N_C$, a set of unary symbols called named concepts, (ii) $N_R$, a set of binary symbols called named object properties, (iii) $N_D$, a set of binary symbols called data properties, (iv) $N_I$, a set of constant symbols called named individuals.
OWL 2 ontologies can be seen as a set of axioms that are conformant to the syntactic rules and constraints imposed by their underlying DL language [30], and built using the elements of the signature. The classification of $O$, denoted as $Cl(O)$, corresponds to the result of the computation, performed using an OWL 2 reasoner, of the full subsumption/subconcept relation between its named concepts (i.e., elements of $N_C$). Classification is therefore the subset of the logical closure of an ontology $O$ s.t. each axiom is of the form $A \sqsubseteq B$, where $A, B \in N_C(O)$ and $O \models A \sqsubseteq B$.
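As a rough illustration of what $Cl(O)$ contains, the following sketch computes it for a toy ontology whose axioms are all atomic subsumptions $A \sqsubseteq B$ between named concepts. This is a strong simplifying assumption: a real OWL 2 reasoner handles far richer axioms, and the function name `classify` and the concept names are hypothetical, not taken from the paper.

```python
from itertools import product


def classify(named_concepts, told_subsumptions):
    """Return every pair (A, B) such that A ⊑ B is entailed.

    Assumption: the ontology contains only atomic subsumptions, so entailment
    reduces to the reflexive-transitive closure of the told axioms.
    """
    # A ⊑ A holds trivially for every named concept.
    entailed = {(a, a) for a in named_concepts} | set(told_subsumptions)
    changed = True
    while changed:
        changed = False
        # Chain A ⊑ B and B ⊑ D into A ⊑ D until nothing new appears.
        for (a, b), (c, d) in product(list(entailed), repeat=2):
            if b == c and (a, d) not in entailed:
                entailed.add((a, d))
                changed = True
    return entailed


concepts = {"Cat", "Mammal", "Animal"}
axioms = {("Cat", "Mammal"), ("Mammal", "Animal")}
cl = classify(concepts, axioms)
```

Here `cl` contains the derived subsumption `("Cat", "Animal")` in addition to the told and reflexive ones. The quadratic fixpoint loop is chosen for readability only and would not scale to large ontologies.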
2.2. Ontology Mappings and Alignments
Ontology Mappings. In Definition 2.1 we provide the definition of ontology mapping
(also called match or correspondence).
Definition 2.1. Consider two input ontologies $O_1$, $O_2$, and their respective signatures $Sig(O_1)$ and $Sig(O_2)$. A mapping between entities of $O_1$, $O_2$ is a 4-tuple $\langle e, e', r, c \rangle$ such that $e \in Sig(O_1)$ and $e' \in Sig(O_2)$, $r \in \{\sqsubseteq, \sqsupseteq, \equiv\}$⁴ is a semantic relation, and $c$ is a confidence value. Usually, the real number unit interval $(0 \ldots 1]$ is employed for representing confidence values. Mapping confidence intuitively reflects how reliable a mapping is (i.e., 1 = very reliable, 0 = not reliable).
Ontology Alignment. Definition 2.2 introduces the notion of alignment.
Definition 2.2. An alignment $M$ between two ontologies, namely $O_1$, $O_2$, is a set of mappings between $O_1$ and $O_2$.
The main format for representing mappings was proposed in the context of the Alignment API, and it is called RDF Alignment [12]. This format is the standard for the well-known OAEI campaign. In addition, mappings are also represented as standard subclass and equivalence DL axioms. When mappings are expressed through OWL 2 axioms, confidence values are represented as OWL 2 axiom annotations [35]. The representation through standard OWL 2 axioms enables the reuse of the extensive range of OWL 2 reasoning infrastructure that is currently available. We adopt this representation, and in the remainder of the paper we consider alignments as sets of OWL 2 axioms.
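A minimal sketch of Definition 2.1 and of the translation of mappings into subsumption axioms may help fix ideas. The class `Mapping`, the helper `to_subsumptions`, the string encodings `"<="`, `">="`, `"="` for the three semantic relations, and the entity names are all hypothetical choices made for this illustration. The sketch also drops the confidence value during translation, whereas the text above attaches it to the axiom as an annotation.

```python
from dataclasses import dataclass

# Hypothetical encodings of the relations ⊑, ⊒ and ≡.
RELATIONS = {"<=", ">=", "="}


@dataclass(frozen=True)
class Mapping:
    source: str        # entity e from Sig(O1)
    target: str        # entity e' from Sig(O2)
    relation: str      # one of RELATIONS
    confidence: float  # value in the interval (0, 1]

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")
        if not 0.0 < self.confidence <= 1.0:
            raise ValueError("confidence must lie in (0, 1]")


def to_subsumptions(mapping):
    """Translate a mapping into pairs (A, B), each read as the axiom A ⊑ B."""
    e, f = mapping.source, mapping.target
    if mapping.relation == "<=":
        return {(e, f)}
    if mapping.relation == ">=":
        return {(f, e)}
    # An equivalence is the conjunction of both subsumptions.
    return {(e, f), (f, e)}


equivalence = Mapping("o1:Cat", "o2:Feline", "=", 0.9)
axioms = to_subsumptions(equivalence)
```

Treating an equivalence as two subsumptions keeps the later graph-style processing uniform, at the cost of losing the information that both axioms came from a single mapping.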
Definition 2.3 introduces the notion of aligned ontology, resulting from the integration of two input ontologies, through an alignment between them.
Definition 2.3. Let $O_1$, $O_2$ be two (input) ontologies, and let $M$ be an alignment between them. The ontology $O^{M}_{O_1,O_2} = O_1 \cup O_2 \cup M$ is called the aligned ontology w.r.t. $O_1$, $O_2$, and $M$.
$O^{M}_{O_1,O_2}$ is simply called the aligned ontology when no confusion arises. Note that we assume that the signature of the aligned ontology is always the union of the signatures of the input ontologies. When the input ontologies are clear from the context we employ the abbreviated notation $O^{M}$.
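Definition 2.3 can be mirrored by a small sketch in which an ontology is a pair of a signature and a set of axioms, and the aligned ontology is their union with the alignment. The `Ontology` class, the `align` function, and the restriction of axioms to pairs `(A, B)` read as $A \sqsubseteq B$ are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ontology:
    signature: frozenset  # entity names
    axioms: frozenset     # pairs (A, B), each read as A ⊑ B


def align(o1, o2, alignment):
    """Build the aligned ontology O1 ∪ O2 ∪ M.

    The signature of the result is the union of the input signatures, so
    alignment axioms may only mention entities of the two input ontologies.
    """
    signature = o1.signature | o2.signature
    for a, b in alignment:
        if a not in signature or b not in signature:
            raise ValueError(f"axiom ({a}, {b}) uses an unknown entity")
    return Ontology(signature, o1.axioms | o2.axioms | frozenset(alignment))


o1 = Ontology(frozenset({"o1:Cat", "o1:Animal"}),
              frozenset({("o1:Cat", "o1:Animal")}))
o2 = Ontology(frozenset({"o2:Feline", "o2:Pet"}),
              frozenset({("o2:Feline", "o2:Pet")}))
m = {("o1:Cat", "o2:Feline"), ("o2:Feline", "o1:Cat")}
aligned = align(o1, o2, m)
```

The input ontologies are left untouched, which matches the setting in which they are treated as immutable and only the alignment can be repaired.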
Given that each mapping is translated into an OWL 2 axiom, the aligned ontology is again an OWL 2 ontology. Note that alternative formal semantics for ontology mappings have been proposed in the literature, such as those proposed by Zimmermann et al. in [87], and the semantics associated with the so-called bridge rules, in the context of distributed description logics [6, 55].
2.3. Semantics of the Integration and Principles for Ontology Alignments
This section introduces the semantics of the integration, and provides a formal characterization of the consistency and conservativity principles in ontology alignment.
Semantic Consequences of the Integration. The ontology resulting from the integration of two ontologies $O_1$ and $O_2$ via an alignment $M$ may entail axioms that do not follow from $O_1$, $O_2$, or $M$ alone. These new semantic consequences can be captured by the notion of deductive difference [46, 47].
Intuitively, the deductive difference between $O$ and $O'$, w.r.t. a signature $\Sigma$, is the set of entailments constructed over $\Sigma$ that do not hold in $O$, but do hold in $O'$. The notion
⁴ We exclude disjointness from the semantic relations given that most of the available systems do not compute this relation. Negative constraints are typically harder to identify and assess than positive ones [20].
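The intuition behind the deductive difference can be sketched on a toy example, under the assumption that only atomic subsumptions between named concepts are considered. That is a restriction of the general notion, which ranges over all entailments over $\Sigma$. The function names `closure` and `atomic_difference` and the concept names are hypothetical.

```python
def closure(axioms):
    """Transitive closure of pairs (A, B) read as A ⊑ B (trivial A ⊑ A omitted)."""
    entailed = set(axioms)
    changed = True
    while changed:
        changed = False
        for a, b in list(entailed):
            for c, d in list(entailed):
                if b == c and a != d and (a, d) not in entailed:
                    entailed.add((a, d))
                    changed = True
    return entailed


def atomic_difference(before, after, signature):
    """Atomic subsumptions over `signature` entailed by `after` but not by `before`."""
    new = closure(after) - closure(before)
    return {(a, b) for (a, b) in new if a in signature and b in signature}


# Toy input ontologies; in O1, Cat and Pet are unrelated.
o1 = {("o1:Cat", "o1:Animal")}
sig1 = {"o1:Cat", "o1:Animal", "o1:Pet"}
o2 = {("o2:Feline", "o2:Pet")}
# Alignment: o1:Cat ≡ o2:Feline and o1:Pet ≡ o2:Pet, as pairs of subsumptions.
m = {("o1:Cat", "o2:Feline"), ("o2:Feline", "o1:Cat"),
     ("o1:Pet", "o2:Pet"), ("o2:Pet", "o1:Pet")}

novel = atomic_difference(o1, o1 | o2 | m, sig1)
```

In this example `novel` holds the single pair `("o1:Cat", "o1:Pet")`: a subsumption between two named concepts of the first ontology that only follows once the mappings are added, which is the kind of entailment the conservativity principle treats as unwanted.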
