Topic

Schema matching

About: Schema matching is a research topic. Over its lifetime, 929 publications have been published within this topic, receiving 30,914 citations.


Papers

Open access | Journal Article | DOI: 10.1007/S007780100057
Erhard Rahm, Philip A. Bernstein (2 institutions)
01 Dec 2001
Abstract: Schema matching is a basic problem in many database application domains, such as data integration, E-business, data warehousing, and semantic query processing. In current implementations, schema matching is typically performed manually, which has significant limitations. On the other hand, previous research papers have proposed many techniques to achieve a partial automation of the match operation for specific application domains. We present a taxonomy that covers many of these existing approaches, and we describe the approaches in some detail. In particular, we distinguish between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers. Based on our classification we review some previous match implementations thereby indicating which part of the solution space they cover. We intend our taxonomy and review of past work to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.
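The distinction between schema-level and instance-level matchers in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the use of `difflib` for name comparison, the Jaccard overlap for instance comparison, and the sample columns are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def _clean(name):
    # Normalize an element name before comparing it (assumed convention).
    return name.lower().replace("_", "")


def name_matcher(name1, name2):
    """Schema-level, element-level, language-based: compare element names."""
    return SequenceMatcher(None, _clean(name1), _clean(name2)).ratio()


def instance_matcher(values1, values2):
    """Instance-level: Jaccard overlap of the data values seen in two columns."""
    set1, set2 = set(values1), set(values2)
    union = set1 | set2
    return len(set1 & set2) / len(union) if union else 0.0


# Hypothetical columns: (element name, sample values).
zip_a = ("PostalCode", ["04109", "10115", "80331"])
zip_b = ("ZIP", ["10115", "80331", "50667"])

name_score = name_matcher(zip_a[0], zip_b[0])
instance_score = instance_matcher(zip_a[1], zip_b[1])
```

In this toy input the two names share almost no characters, so the name-based score is low, while the overlapping values give the instance-based matcher evidence that the columns correspond. This is one reason a taxonomy that separates the two kinds of evidence is useful.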


  • Fig. 3. Equivalence pattern
  • Table 3. Match cardinalities (Examples)
  • Table 5. Characteristics of proposed schema match approaches
  • Table 1. Sample input schemas
  • Table 4. Constraint-based matching (example)

Topics: Schema matching (72%), Star schema (66%), Conceptual schema (62%)

3,611 Citations


Open access | Proceedings Article | DOI: 10.1109/ICDE.2002.994702
26 Feb 2002
Abstract: Matching elements of two data schemas or two data instances plays a key role in data warehousing, e-business, or even biochemical applications. In this paper we present a matching algorithm based on a fixpoint computation that is usable across different scenarios. The algorithm takes two graphs (schemas, catalogs, or other data structures) as input, and produces as output a mapping between corresponding nodes of the graphs. Depending on the matching goal, a subset of the mapping is chosen using filters. After our algorithm runs, we expect a human to check and if necessary adjust the results. As a matter of fact, we evaluate the 'accuracy' of the algorithm by counting the number of needed adjustments. We conducted a user study, in which our accuracy metric was used to estimate the labor savings that the users could obtain by utilizing our algorithm to obtain an initial matching. Finally, we illustrate how our matching algorithm is deployed as one of several high-level operators in an implemented testbed for managing information models and mappings.
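The fixpoint idea in the abstract above (two graphs in, a node-to-node mapping out) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function name `flood_similarity`, the equal-share propagation weights, the choice to re-add the starting scores in every round, and the sample graphs are all hypothetical.

```python
from itertools import product


def flood_similarity(edges_a, edges_b, initial, max_iter=100, eps=1e-6):
    """Propagate similarity between node pairs of two labeled graphs.

    edges_a, edges_b: lists of (source, label, target) triples.
    initial: dict mapping (node_a, node_b) to a starting similarity.
    """
    # Two node pairs become neighbors when both graphs connect them
    # with an edge carrying the same label.
    neighbors = {}
    for (a1, label_a, a2), (b1, label_b, b2) in product(edges_a, edges_b):
        if label_a == label_b:
            neighbors.setdefault((a1, b1), set()).add((a2, b2))
            neighbors.setdefault((a2, b2), set()).add((a1, b1))

    start = {pair: initial.get(pair, 0.0) for pair in neighbors}
    sigma = dict(start)
    for _ in range(max_iter):
        # Each pair keeps its score, regains its starting score, and
        # receives an equal share of every neighbor's current score
        # (an assumed weighting, chosen for simplicity).
        raw = {
            pair: start[pair] + sigma[pair]
            + sum(sigma[n] / len(neighbors[n]) for n in neighbors[pair])
            for pair in neighbors
        }
        top = max(raw.values(), default=0.0)
        if top == 0.0:
            break
        nxt = {pair: value / top for pair, value in raw.items()}  # normalize
        delta = max(abs(nxt[pair] - sigma[pair]) for pair in neighbors)
        sigma = nxt
        if delta < eps:  # scores stopped changing: fixpoint reached
            break
    return sigma


# Hypothetical toy schemas expressed as labeled edges.
graph_a = [("Person", "has", "Name"), ("Person", "has", "Address")]
graph_b = [("Customer", "has", "CName"), ("Customer", "has", "CAddr")]
# Assumed starting scores, e.g. from a string matcher.
seed = {("Name", "CName"): 1.0, ("Address", "CAddr"): 1.0}

scores = flood_similarity(graph_a, graph_b, seed)
```

In this toy input the pair (Person, Customer) starts with no score of its own but gains similarity from its children, and (Name, CName) ends up scoring higher than (Name, CAddr). A filter such as "keep the best partner per node" would then select the final mapping for a human to review.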


Topics: 3-dimensional matching (65%), Schema matching (64%), Optimal matching (64%)

1,581 Citations


Open access | Proceedings Article
11 Sep 2001
Abstract: Schema matching is a critical step in many applications, such as XML message mapping, data warehouse loading, and schema integration. In this paper, we investigate algorithms for generic schema matching, outside of any particular data model or application. We first present a taxonomy for past solutions, showing that a rich range of techniques is available. We then propose a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches. Some of our innovations are the integrated use of linguistic and structural matching, context-dependent matching of shared types, and a bias toward leaf structure where much of the schema content resides. After describing our algorithm, we present experimental results that compare Cupid to two other schema matching systems.
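The combination of linguistic and structural evidence with a bias toward leaves, as described in the abstract above, might look roughly like the following Python sketch. It is an illustration only and not Cupid itself: the weighting scheme, the type-compatibility table, the threshold, and every name in the code are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical type-compatibility scores in [0, 1]; not taken from the paper.
TYPE_COMPAT = {
    ("int", "int"): 1.0,
    ("string", "string"): 1.0,
    ("int", "string"): 0.3,
    ("string", "int"): 0.3,
}


def lsim(name1, name2):
    """Linguistic similarity of two element names (plain string ratio)."""
    return SequenceMatcher(None, name1.lower(), name2.lower()).ratio()


def leaf_sim(leaf1, leaf2, w_struct=0.5):
    """Weighted mix of data-type compatibility and name similarity."""
    (name1, type1), (name2, type2) = leaf1, leaf2
    ssim = TYPE_COMPAT.get((type1, type2), 0.0)
    return w_struct * ssim + (1 - w_struct) * lsim(name1, name2)


def subtree_sim(leaves1, leaves2, threshold=0.6):
    """Similarity of two inner nodes, judged only by their leaves:
    the fraction of leaves with at least one strong link to the other side."""
    total = len(leaves1) + len(leaves2)
    if total == 0:
        return 0.0
    strong1 = sum(any(leaf_sim(a, b) >= threshold for b in leaves2) for a in leaves1)
    strong2 = sum(any(leaf_sim(a, b) >= threshold for a in leaves1) for b in leaves2)
    return (strong1 + strong2) / total


# Hypothetical subtrees, each given as a list of (leaf name, data type).
po_lines = [("Qty", "int"), ("ItemNumber", "string")]
order_items = [("Quantity", "int"), ("ItemNo", "string")]
address = [("Street", "string")]

related = subtree_sim(po_lines, order_items)
unrelated = subtree_sim(address, [("Qty", "int")])
```

Judging inner nodes by their leaves means two containers can match even when their own names differ, as long as the content beneath them lines up.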


Topics: Schema matching (78%), Star schema (71%), Document Structure Description (66%)

1,512 Citations


Open access | Book Chapter | DOI: 10.1007/11603412_5
Pavel Shvaiko, Jérôme Euzenat (2 institutions)
Abstract: Schema and ontology matching is a critical problem in many application domains, such as semantic web, schema/ontology integration, data warehouses, e-commerce, etc. Many different matching solutions have been proposed so far. In this paper we present a new classification of schema-based matching techniques that builds on the top of state of the art in both schema and ontology matching. Some innovations are in introducing new criteria which are based on (i) general properties of matching techniques, (ii) interpretation of input information, and (iii) the kind of input information. In particular, we distinguish between approximate and exact techniques at schema-level; and syntactic, semantic, and external techniques at element- and structure-level. Based on the classification proposed we overview some of the recent schema/ontology matching systems pointing which part of the solution space they cover. The proposed classification provides a common conceptual basis, and, hence, can be used for comparing different existing schema/ontology matching techniques and systems as well as for designing new ones, taking advantages of state of the art solutions.


  • Fig. 4. Characteristics of state of the art matching approaches
  • Fig. 2. Matching: Syntactic vs. Semantic
  • Fig. 1. Two XML schemas
  • Fig. 3. A revised classification of schema-based matching approaches

Topics: Schema matching (75%), Star schema (67%), Conceptual schema (63%)

1,276 Citations


Open access | Book Chapter | DOI: 10.1016/B978-155860869-6/50060-3
Hong-Hai Do, Erhard Rahm (1 institution)
20 Aug 2002
Abstract: Schema matching is the task of finding semantic correspondences between elements of two schemas. It is needed in many database applications, such as integration of web data sources, data warehouse loading and XML message mapping. To reduce the amount of user effort as much as possible, automatic approaches combining several match techniques are required. While such match approaches have found considerable interest recently, the problem of how to best combine different match algorithms still requires further work. We have thus developed the COMA schema matching system as a platform to combine multiple matchers in a flexible way. We provide a large spectrum of individual matchers, in particular a novel approach aiming at reusing results from previous match operations, and several mechanisms to combine the results of matcher executions. We use COMA as a framework to comprehensively evaluate the effectiveness of different matchers and their combinations for real-world schemas. The results obtained so far show the superiority of combined match approaches and indicate the high value of reuse-oriented strategies.
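The idea of running several matchers and combining their results, as described in the abstract above, can be sketched in Python as follows. This is a minimal illustration and not the COMA system: the two matchers, the `aggregate` and `match` functions, the aggregation options, the threshold, and the sample schemas are all hypothetical.

```python
from difflib import SequenceMatcher


def name_matcher(elem1, elem2):
    """Score two elements by the string similarity of their names."""
    return SequenceMatcher(None, elem1["name"].lower(), elem2["name"].lower()).ratio()


def type_matcher(elem1, elem2):
    """Score two elements by whether their declared types are equal."""
    return 1.0 if elem1["type"] == elem2["type"] else 0.0


def aggregate(scores, how="average"):
    """Combine the scores that several matchers gave to one element pair."""
    if how == "max":
        return max(scores)
    if how == "min":
        return min(scores)
    return sum(scores) / len(scores)


def match(schema1, schema2, matchers, how="average", threshold=0.5):
    """For each element of schema1, keep the best-scoring candidate from
    schema2 if its combined score reaches the threshold."""
    result = {}
    for elem1 in schema1:
        scored = [
            (aggregate([m(elem1, elem2) for m in matchers], how), elem2["name"])
            for elem2 in schema2
        ]
        if not scored:
            continue
        best_score, best_name = max(scored)
        if best_score >= threshold:
            result[elem1["name"]] = (best_name, best_score)
    return result


# Hypothetical input schemas.
schema1 = [{"name": "ShipToCity", "type": "string"}, {"name": "OrderDate", "type": "date"}]
schema2 = [{"name": "City", "type": "string"}, {"name": "Date", "type": "date"}]

mapping = match(schema1, schema2, [name_matcher, type_matcher])
```

Swapping the `how` argument or the list of matchers changes the combination strategy without touching the individual matchers, which is the kind of flexibility a platform for comparing matcher combinations needs.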


  • Figure 8. Problem size in schema matching tasks
  • Figure 5. Schema-level reuse in the Schema matcher
  • Figure 6. Combination of match results
  • Figure 12. Quality of best matcher combinations. Among the no-reuse combinations, All performs best because many aspects are examined at the same time to ...
  • Figure 11. Quality of single matchers

Topics: Schema matching (63%), Data warehouse (53%)

1,163 Citations


Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    2
2021    27
2020    38
2019    34
2018    14
2017    34

Top Attributes


Topic's top 5 most impactful authors

Avigdor Gal

24 papers, 575 citations

Gunter Saake

15 papers, 226 citations

Fabien Duchateau

15 papers, 306 citations

Erhard Rahm

15 papers, 6.9K citations

Karl Aberer

11 papers, 243 citations

Network Information
Related Topics (5)
Social Semantic Web

14.6K papers, 350.2K citations

84% related
Semantic computing

11.1K papers, 241.3K citations

83% related
Query optimization

17.6K papers, 474.4K citations

83% related
Web query classification

11.9K papers, 339.3K citations

82% related
Business rule

11.1K papers, 270.5K citations

82% related