Similarity flooding: a versatile graph matching algorithm and its application to schema matching

doi:10.1109/ICDE.2002.994702

Open Access, Proceedings Article

Sergey Melnik, +2 more

- pp 117-128

TLDR

This paper presents a matching algorithm based on a fixpoint computation that is usable across different scenarios. In a user study, the paper's accuracy metric was used to estimate the labor savings that users could obtain by starting from the algorithm's initial matching.

Abstract:

Matching elements of two data schemas or two data instances plays a key role in data warehousing, e-business, or even biochemical applications. In this paper we present a matching algorithm based on a fixpoint computation that is usable across different scenarios. The algorithm takes two graphs (schemas, catalogs, or other data structures) as input, and produces as output a mapping between corresponding nodes of the graphs. Depending on the matching goal, a subset of the mapping is chosen using filters. After our algorithm runs, we expect a human to check and if necessary adjust the results. As a matter of fact, we evaluate the 'accuracy' of the algorithm by counting the number of needed adjustments. We conducted a user study, in which our accuracy metric was used to estimate the labor savings that the users could obtain by utilizing our algorithm to obtain an initial matching. Finally, we illustrate how our matching algorithm is deployed as one of several high-level operators in an implemented testbed for managing information models and mappings.
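The abstract names the ingredients (two input graphs, a fixpoint computation, a node mapping, and a filter) but not the update rule. The Python sketch below is one possible reading, not the authors' implementation. It assumes that graphs are lists of `(source, label, target)` triples, that all node pairs start with similarity 1.0, that similarity flows in both directions between node pairs joined by same-labeled edges with weights split evenly among same-labeled neighbors, and that values are rescaled by the maximum after each round. The function names `similarity_flooding` and `best_match_filter` and the parameters `max_iter` and `eps` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def similarity_flooding(edges_a, edges_b, max_iter=100, eps=1e-4):
    """Return {(node_a, node_b): similarity} for two labeled directed graphs.

    Each graph is a list of (source, label, target) triples.
    """
    nodes_a = {n for s, _, t in edges_a for n in (s, t)}
    nodes_b = {n for s, _, t in edges_b for n in (s, t)}

    # Assumption: every candidate pair starts with the same similarity.
    sigma = {(a, b): 1.0 for a in nodes_a for b in nodes_b}
    if not sigma:
        return {}

    # Pair (s1, s2) is linked to pair (t1, t2) when both graphs contain an
    # edge with the same label between the corresponding nodes.
    pair_edges = [((s1, s2), l1, (t1, t2))
                  for s1, l1, t1 in edges_a
                  for s2, l2, t2 in edges_b if l1 == l2]

    out_count, in_count = defaultdict(int), defaultdict(int)
    for src, label, dst in pair_edges:
        out_count[src, label] += 1
        in_count[dst, label] += 1

    # incoming[p] holds (neighbor, weight) entries. Similarity flows along
    # each pair edge and back, split evenly among same-labeled neighbors.
    incoming = defaultdict(list)
    for src, label, dst in pair_edges:
        incoming[dst].append((src, 1.0 / out_count[src, label]))
        incoming[src].append((dst, 1.0 / in_count[dst, label]))

    for _ in range(max_iter):
        # One round: keep the old value and add what the neighbors send.
        raw = {p: sigma[p] + sum(w * sigma[q] for q, w in incoming.get(p, ()))
               for p in sigma}
        top = max(raw.values())
        nxt = {p: v / top for p, v in raw.items()}
        # Stop once the values barely change (approximate fixpoint).
        delta = sqrt(sum((nxt[p] - sigma[p]) ** 2 for p in sigma))
        sigma = nxt
        if delta < eps:
            break
    return sigma


def best_match_filter(sigma):
    """A simple filter (an assumption): keep the top candidate per node of A."""
    best = {}
    for (a, b), score in sorted(sigma.items()):
        if a not in best or score > best[a][1]:
            best[a] = (b, score)
    return {a: b for a, (b, _) in best.items()}


# Two toy graphs with edge labels l1 and l2 (illustrative data only).
graph_a = [("a", "l1", "a1"), ("a", "l1", "a2"), ("a1", "l2", "a2")]
graph_b = [("b", "l1", "b1"), ("b", "l2", "b2"), ("b2", "l2", "b1")]

sigma = similarity_flooding(graph_a, graph_b)
mapping = best_match_filter(sigma)
print(mapping)
```

The filter here keeps only the best candidate for each node of the first graph. The abstract says the filter depends on the matching goal, so a threshold or a one-to-one assignment could be substituted, and a human would still be expected to check and adjust the resulting mapping.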

Citations

Journal Article

A survey of approaches to automatic schema matching

Erhard Rahm, +1 more

TL;DR: A taxonomy is presented that distinguishes between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers and is intended to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.

Book

Ontology Matching

Jérôme Euzenat, +1 more

TL;DR: The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content.

Proceedings Article

SimRank: a measure of structural-context similarity

Glen Jeh, +1 more

TL;DR: A complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects is proposed.

Book

Mining of Massive Datasets

Anand Rajaraman, +1 more

TL;DR: This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets, and explains the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing.

Book Chapter

A survey of schema-based matching approaches

Pavel Shvaiko, +1 more

- 01 Jan 2005 -

Journal on Data Semantics

TL;DR: This paper presents a new classification of schema-based matching techniques that builds on the state of the art in both schema and ontology matching. It distinguishes between approximate and exact techniques at the schema level, and between syntactic, semantic, and external techniques at the element and structure levels.

References

Journal Article

The anatomy of a large-scale hypertextual Web search engine

Sergey Brin, +1 more

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

Journal Article

The Anatomy of a Large-Scale Hypertextual Web Search Engine.

Sergey Brin, +1 more

- 01 Jan 1998 -

Computer Networks

TL;DR: Google is a prototype of a large-scale search engine that makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.

Book

Randomized Algorithms

Rajeev Motwani, +1 more

TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.

Journal Article

A survey of approaches to automatic schema matching

Erhard Rahm, +1 more

TL;DR: A taxonomy is presented that distinguishes between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers and is intended to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.

Proceedings Article

Generic Schema Matching with Cupid

Jayant Madhavan, +2 more

TL;DR: This paper proposes a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches.

Related Papers (5)

A survey of approaches to automatic schema matching

Generic Schema Matching with Cupid

COMA: a system for flexible combination of schema matching approaches

Reconciling schemas of disparate data sources: a machine-learning approach

A survey of schema-based matching approaches