Author

Hong-Hai Do

Bio: Hong-Hai Do is an academic researcher from Leipzig University. The author has contributed to research on the topics of schema matching and schema migration. The author has an h-index of 7 and has co-authored 7 publications receiving 2,359 citations.

Papers

Book Chapter

COMA: a system for flexible combination of schema matching approaches

Hong-Hai Do¹, Erhard Rahm¹

Leipzig University¹

20 Aug 2002

TL;DR: This work develops the COMA schema matching system as a platform to combine multiple matchers in a flexible way and uses COMA as a framework to comprehensively evaluate the effectiveness of different matchers and their combinations for real-world schemas.

Abstract: Schema matching is the task of finding semantic correspondences between elements of two schemas. It is needed in many database applications, such as integration of web data sources, data warehouse loading and XML message mapping. To reduce the amount of user effort as much as possible, automatic approaches combining several match techniques are required. While such match approaches have found considerable interest recently, the problem of how to best combine different match algorithms still requires further work. We have thus developed the COMA schema matching system as a platform to combine multiple matchers in a flexible way. We provide a large spectrum of individual matchers, in particular a novel approach aiming at reusing results from previous match operations, and several mechanisms to combine the results of matcher executions. We use COMA as a framework to comprehensively evaluate the effectiveness of different matchers and their combinations for real-world schemas. The results obtained so far show the superiority of combined match approaches and indicate the high value of reuse-oriented strategies.
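
The combination idea in this abstract can be illustrated with a minimal sketch, assuming each matcher is a function that scores a pair of element names in [0, 1]. The two matchers, the function names, the averaging step, and the 0.6 threshold are illustrative assumptions and are not taken from COMA itself.

```python
from difflib import SequenceMatcher
from statistics import mean


def name_matcher(a: str, b: str) -> float:
    """Sequence-based string similarity of two element names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def trigram_matcher(a: str, b: str) -> float:
    """Dice coefficient over the character trigrams of two element names."""
    def grams(s: str) -> set:
        s = s.lower()
        # Names shorter than three characters yield a single (short) gram.
        return {s[i:i + 3] for i in range(max(len(s) - 2, 1))}

    ga, gb = grams(a), grams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def combine(source, target, matchers, aggregate=mean, threshold=0.6):
    """Score every element pair with every matcher, aggregate the scores
    per pair, and keep the pairs whose aggregate reaches the threshold.

    `aggregate` and `threshold` are hypothetical knobs for this sketch.
    """
    result = {}
    for s in source:
        for t in target:
            score = aggregate([m(s, t) for m in matchers])
            if score >= threshold:
                result[(s, t)] = score
    return result


if __name__ == "__main__":
    matches = combine(
        ["shipToCity", "custName"],
        ["ShipCity", "CustomerName"],
        [name_matcher, trigram_matcher],
    )
    for pair, score in sorted(matches.items()):
        print(pair, round(score, 2))
```

Passing `aggregate=max` or `aggregate=min` instead of the average changes how optimistic the combination is, which is the kind of flexibility the abstract refers to.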

1,199 citations

Proceedings Article

Schema and ontology matching with COMA++

David Aumueller¹, Hong-Hai Do¹, Sabine Massmann¹, Erhard Rahm¹

Leipzig University¹

14 Jun 2005

TL;DR: Different match strategies can be applied including various forms of reusing previously determined match results and a so-called fragment-based match approach which decomposes a large match problem into smaller problems.

Abstract: We demonstrate the schema and ontology matching tool COMA++. It extends our previous prototype COMA utilizing a composite approach to combine different match algorithms [3]. COMA++ implements significant improvements and offers a comprehensive infrastructure to solve large real-world match problems. It comes with a graphical interface enabling a variety of user interactions. Using a generic data representation, COMA++ uniformly supports schemas and ontologies, e.g. the powerful standard languages W3C XML Schema and OWL. COMA++ includes new approaches for ontology matching, in particular the utilization of shared taxonomies. Furthermore, different match strategies can be applied including various forms of reusing previously determined match results and a so-called fragment-based match approach which decomposes a large match problem into smaller problems. Finally, COMA++ can be used not only to solve match problems but also to comparatively evaluate the effectiveness of different match algorithms and strategies.

683 citations

Journal Article

Matching large schemas: Approaches and evaluation

Hong-Hai Do¹, Erhard Rahm¹

Leipzig University¹

01 Sep 2007-Information Systems

TL;DR: This work has developed a new generic schema matching tool, COMA++, providing a library of individual matchers and a flexible infrastructure to combine the matchers and refine their results, and conducted a comprehensive evaluation of the match strategies using large e-Business standard schemas.

252 citations

Journal Article

Matching large XML schemas

Erhard Rahm¹, Hong-Hai Do¹, Sabine Maßmann¹

Leipzig University¹

01 Dec 2004

TL;DR: A fragment-oriented match approach is proposed to decompose a large match problem into several smaller ones and to reuse previous match results at the level of schema fragments.

Abstract: Current schema matching approaches still have to improve for very large and complex schemas. Such schemas are increasingly written in the standard language W3C XML Schema, especially in e-business applications. The high expressive power and versatility of this schema language, in particular its type system and support for distributed schemas and namespaces, introduce new issues. In this paper, we study some of the important problems in matching such large XML schemas. We propose a fragment-oriented match approach to decompose a large match problem into several smaller ones and to reuse previous match results at the level of schema fragments.

145 citations

Proceedings Article

QuickMig: automatic schema matching for data migration projects

Christian Drumm, Matthias Schmitt, Hong-Hai Do, Erhard Rahm¹

Leipzig University¹

06 Nov 2007

TL;DR: QuickMig is described, a new semi-automatic approach to determining semantic correspondences between schema elements for data migration applications that advances the state of the art with a set of new techniques exploiting sample instances, domain ontologies, and reuse of existing mappings to detect not only element correspondences but also their mapping expressions.

Abstract: A common task in many database applications is the migration of legacy data from multiple sources into a new one. This requires identifying semantically related elements of the source and target systems and the creation of mapping expressions to transform instances of those elements from the source format to the target format. Currently, data migration is typically done manually, a tedious and time-consuming process, which is difficult to scale to a high number of data sources. In this paper, we describe QuickMig, a new semi-automatic approach to determining semantic correspondences between schema elements for data migration applications. QuickMig advances the state of the art with a set of new techniques exploiting sample instances, domain ontologies, and reuse of existing mappings to detect not only element correspondences but also their mapping expressions. QuickMig further includes new mechanisms to effectively incorporate domain knowledge of users into the matching process. The results from a comprehensive evaluation using real-world schemas and data indicate the high quality and practicability of the overall approach.

100 citations

Cited by

Book

Ontology Matching

Jérôme Euzenat, Pavel Shvaiko

05 Jun 2007

TL;DR: The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content.

Abstract: Ontologies tend to be found everywhere. They are viewed as the silver bullet for many applications, such as database integration, peer-to-peer systems, e-commerce, semantic web services, or social networks. However, in open or evolving systems, such as the semantic web, different parties would, in general, adopt different ontologies. Thus, merely using ontologies, like using XML, does not reduce heterogeneity: it just raises heterogeneity problems to a higher level. Euzenat and Shvaiko's book is devoted to ontology matching as a solution to the semantic heterogeneity problem faced by computer systems. Ontology matching aims at finding correspondences between semantically related entities of different ontologies. These correspondences may stand for equivalence as well as other relations, such as consequence, subsumption, or disjointness, between ontology entities. Many different matching solutions have been proposed so far from various viewpoints, e.g., databases, information systems, and artificial intelligence. The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content. In particular, the book includes a new chapter dedicated to the methodology for performing ontology matching. It also covers emerging topics, such as data interlinking, ontology partitioning and pruning, context-based matching, matcher tuning, alignment debugging, and user involvement in matching, to mention a few. More than 100 state-of-the-art matching systems and frameworks were reviewed. With Ontology Matching, researchers and practitioners will find a reference book that presents currently available work in a uniform framework. In particular, the work and the techniques presented in this book can be equally applied to database schema matching, catalog integration, XML schema matching and other related problems.
The objectives of the book include presenting (i) the state of the art and (ii) the latest research results in ontology matching by providing a systematic and detailed account of matching techniques and matching systems from theoretical, practical and application perspectives.

2,579 citations

Book Chapter

A survey of schema-based matching approaches

Pavel Shvaiko¹, Jérôme Euzenat²

University of Trento¹, French Institute for Research in Computer Science and Automation²

01 Jan 2005-Journal on Data Semantics

TL;DR: This paper presents a new classification of schema-based matching techniques that builds on top of the state of the art in both schema and ontology matching and distinguishes between approximate and exact techniques at schema-level; and syntactic, semantic, and external techniques at element- and structure-level.

Abstract: Schema and ontology matching is a critical problem in many application domains, such as semantic web, schema/ontology integration, data warehouses, e-commerce, etc. Many different matching solutions have been proposed so far. In this paper we present a new classification of schema-based matching techniques that builds on top of the state of the art in both schema and ontology matching. Some innovations are in introducing new criteria which are based on (i) general properties of matching techniques, (ii) interpretation of input information, and (iii) the kind of input information. In particular, we distinguish between approximate and exact techniques at schema-level; and syntactic, semantic, and external techniques at element- and structure-level. Based on the proposed classification, we give an overview of some of the recent schema/ontology matching systems, pointing out which part of the solution space they cover. The proposed classification provides a common conceptual basis, and, hence, can be used for comparing different existing schema/ontology matching techniques and systems as well as for designing new ones, taking advantage of state-of-the-art solutions.

1,285 citations

Journal Article

Ontology Matching: State of the Art and Future Challenges

Pavel Shvaiko, Jérôme Euzenat¹

French Institute for Research in Computer Science and Automation¹

01 Jan 2013-IEEE Transactions on Knowledge and Data Engineering

TL;DR: It is conjectured that significant improvements can be obtained only by addressing important challenges for ontology matching; the paper presents such challenges with insights on how to approach them, thereby aiming to direct research into the most promising tracks and to facilitate the progress of the field.

Abstract: After years of research on ontology matching, it is reasonable to consider several questions: is the field of ontology matching still making progress? Is this progress significant enough to pursue further research? If so, what are the particularly promising directions? To answer these questions, we review the state of the art of ontology matching and analyze the results of recent ontology matching evaluations. These results show a measurable improvement in the field, although its pace is slowing down. We conjecture that significant improvements can be obtained only by addressing important challenges for ontology matching. We present such challenges with insights on how to approach them, thereby aiming to direct research into the most promising tracks and to facilitate the progress of the field.

1,215 citations

Book Chapter

Similarity search for web services

Xin Dong¹, Alon Halevy¹, Jayant Madhavan¹, Ema Nemes¹, Jun Zhang¹

University of Washington¹

31 Aug 2004

TL;DR: Woogle supports similarity search for web services, such as finding similar web-service operations and finding operations that compose with a given one, and novel techniques to support these types of searches are described.

Abstract: Web services are loosely coupled software components, published, located, and invoked across the web. The growing number of web services available within an organization and on the Web raises a new and challenging search problem: locating desired web services. Traditional keyword search is insufficient in this context: the specific types of queries users require are not captured, the very small text fragments in web services are unsuitable for keyword search, and the underlying structure and semantics of the web services are not exploited. We describe the algorithms underlying the Woogle search engine for web services. Woogle supports similarity search for web services, such as finding similar web-service operations and finding operations that compose with a given one. We describe novel techniques to support these types of searches, and an experimental study on a collection of over 1500 web-service operations that shows the high recall and precision of our algorithms.

828 citations
