scispace - formally typeset
Search or ask a question
Author

Fabien Duchateau

Bio: Fabien Duchateau is an academic researcher from University of Lyon. The author has contributed to research in topics: Schema matching & Matching (statistics). The author has an hindex of 13, co-authored 53 publications receiving 554 citations. Previous affiliations of Fabien Duchateau include University of Montpellier & Centre national de la recherche scientifique.


Papers
More filters
Book ChapterDOI
09 Nov 2008
TL;DR: This article presents a novel method for selecting the most appropriate schema matching algorithms that makes use of a decision tree to combine the most suitable match algorithms for a given domain.
Abstract: Most of the schema matching tools are assembled from multiple match algorithms, each employing a particular technique to improve matching accuracy and making matching systems extensible and customizable to a particular domain. The solutions provided by current schema matching tools consist in aggregating the results obtained by several match algorithms to improve the quality of the discovered matches. However, aggregation entails several drawbacks. Recently, it has been pointed out that the main issue is how to select the most suitable match algorithms to execute for a given domain and how to adjust the multiple knobs (e.g. threshold, performance, quality, etc.). In this article, we present a novel method for selecting the most appropriate schema matching algorithms. The matching engine makes use of a decision tree to combine the most appropriate match algorithms. As a first consequence of using the decision tree, the performance of the system is improved since the complexity is bounded by the height of the decision tree. Thus, only a subset of these match algorithms is used during the matching process. The second advantage is the improvement of the quality of matches. Indeed, for a given domain, only the most suitable match algorithms are used. The experiments show the effectiveness of our approach w.r.t. other matching tools.

61 citations

Book ChapterDOI
30 Mar 2011
TL;DR: This work provides a generic overview of the existing efforts on benchmarking schema matching and mapping tasks, offers a comprehensive description of the problem, lists the basic comparison criteria and techniques, and provides a description ofThe main functionalities and characteristics of existing systems.
Abstract: The increasing demand of matching and mapping tasks in modern integration scenarios has led to a plethora of tools for facilitating these tasks. While the plethora made these tools available to a broader audience, it led to some form of confusion regarding the exact nature, goals, core functionalities, expected features, and basic capabilities of these tools. Above all, it made performance measurements of these systems and their distinction a difficult task. The need for design and development of comparison standards that will allow the evaluation of these tools is becoming apparent. These standards are particularly important to mapping and matching system users, since they allow them to evaluate the relative merits of the systems and take the right business decisions. They are also important to mapping system developers, since they offer a way of comparing the system against competitors, and motivating improvements and further development. Finally, they are important to researchers as they serve as illustrations of the existing system limitations, triggering further research in the area. In this work, we provide a generic overview of the existing efforts on benchmarking schema matching and mapping tasks. We offer a comprehensive description of the problem, list the basic comparison criteria and techniques, and provide a description of the main functionalities and characteristics of existing systems.

59 citations

Proceedings Article
23 Sep 2007
TL;DR: XBenchMatch is presented, a benchmark which uses as input the result of a schema matching algorithm (set of mappings and/or an integrated schema) and generates statistics about the quality of this input and the performance of the matching tool.
Abstract: We present XBenchMatch, a benchmark which uses as input the result of a schema matching algorithm (set of mappings and/or an integrated schema) and generates statistics about the quality of this input and the performance of the matching tool.

58 citations

Proceedings Article
20 Oct 2006
TL;DR: An automatic method to calculate the similarity measure between two schema elements and it appears that Approxivect, when its parameters are tuned in optimum configurations, discovers most of the relevant couples in the top ranking while COMA++ only finds half of the mappings.
Abstract: The possibility to query heterogeneous and semantically linked data sources depends on the ability to find correspondences between their structure and/or their content. Unfortunately, most of the tools used nowadays to discover those mappings are either manual or semi-automatic. In this article we present an automatic method to calculate the similarity measure between two schema elements. Furthermore, a tool has been implemented, Approxivect, based on the approximation of terminological methods and on the cosine measure between context vectors. Another important feature of our tool is that our method does not use any dictionary or language-based knowledge and works in specialized domain areas. Finally, we have performed experiments showing that our tool provides good results regarding those provided by COMA++. More precisely, it appears that Approxivect, when its parameters are tuned in optimum configurations, discovers most of the relevant couples in the top ranking while COMA++ only finds half of the mappings.

35 citations

Proceedings ArticleDOI
02 Nov 2009
TL;DR: YAM (Yet Another Matcher), which is a schema matcher factory, enables the generation of a dedicated matcher for a given schema matching scenario, according to user inputs, based on machine learning.
Abstract: Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semi-automatic, e.g. an expert must tune some parameters (thresholds, weights, etc.). They mainly use several methods to combine and aggregate similarity measures. However, their quality results often decrease when one requires to integrate a new similarity measure or when matching particular domain schemas. This paper describes YAM (Yet Another Matcher), which is a schema matcher factory. Indeed, it enables the generation of a dedicated matcher for a given schema matching scenario, according to user inputs. Our approach is based on machine learning since schema matchers can be seen as classifiers. Several bunches of experiments run against matchers generated by YAM and traditional matching tools show how our approach is able to generate the best matcher for a given scenario.

34 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Book
05 Jun 2007
TL;DR: The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content.
Abstract: Ontologies tend to be found everywhere. They are viewed as the silver bullet for many applications, such as database integration, peer-to-peer systems, e-commerce, semantic web services, or social networks. However, in open or evolving systems, such as the semantic web, different parties would, in general, adopt different ontologies. Thus, merely using ontologies, like using XML, does not reduce heterogeneity: it just raises heterogeneity problems to a higher level. Euzenat and Shvaikos book is devoted to ontology matching as a solution to the semantic heterogeneity problem faced by computer systems. Ontology matching aims at finding correspondences between semantically related entities of different ontologies. These correspondences may stand for equivalence as well as other relations, such as consequence, subsumption, or disjointness, between ontology entities. Many different matching solutions have been proposed so far from various viewpoints, e.g., databases, information systems, and artificial intelligence. The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content. In particular, the book includes a new chapter dedicated to the methodology for performing ontology matching. It also covers emerging topics, such as data interlinking, ontology partitioning and pruning, context-based matching, matcher tuning, alignment debugging, and user involvement in matching, to mention a few. More than 100 state-of-the-art matching systems and frameworks were reviewed. With Ontology Matching, researchers and practitioners will find a reference book that presents currently available work in a uniform framework. In particular, the work and the techniques presented in this book can be equally applied to database schema matching, catalog integration, XML schema matching and other related problems. The objectives of the book include presenting (i) the state of the art and (ii) the latest research results in ontology matching by providing a systematic and detailed account of matching techniques and matching systems from theoretical, practical and application perspectives.

2,579 citations

01 Jan 2016
TL;DR: The remote sensing and image interpretation is universally compatible with any devices to read and is available in the digital library an online access to it is set as public so you can get it instantly.
Abstract: Thank you very much for downloading remote sensing and image interpretation. As you may know, people have look hundreds times for their favorite novels like this remote sensing and image interpretation, but end up in malicious downloads. Rather than reading a good book with a cup of tea in the afternoon, instead they are facing with some malicious virus inside their computer. remote sensing and image interpretation is available in our digital library an online access to it is set as public so you can get it instantly. Our book servers spans in multiple countries, allowing you to get the most less latency time to download any of our books like this one. Merely said, the remote sensing and image interpretation is universally compatible with any devices to read.

1,802 citations

Journal ArticleDOI
TL;DR: It is conjecture that significant improvements can be obtained only by addressing important challenges for ontology matching and presents such challenges with insights on how to approach them, thereby aiming to direct research into the most promising tracks and to facilitate the progress of the field.
Abstract: After years of research on ontology matching, it is reasonable to consider several questions: is the field of ontology matching still making progress? Is this progress significant enough to pursue further research? If so, what are the particularly promising directions? To answer these questions, we review the state of the art of ontology matching and analyze the results of recent ontology matching evaluations. These results show a measurable improvement in the field, the speed of which is albeit slowing down. We conjecture that significant improvements can be obtained only by addressing important challenges for ontology matching. We present such challenges with insights on how to approach them, thereby aiming to direct research into the most promising tracks and to facilitate the progress of the field.

1,215 citations

01 Jan 2016
TL;DR: The the uses of argument is universally compatible with any devices to read, and is available in the digital library an online access to it is set as public so you can download it instantly.
Abstract: Thank you very much for downloading the uses of argument. Maybe you have knowledge that, people have search numerous times for their chosen novels like this the uses of argument, but end up in infectious downloads. Rather than reading a good book with a cup of tea in the afternoon, instead they juggled with some malicious bugs inside their computer. the uses of argument is available in our digital library an online access to it is set as public so you can download it instantly. Our digital library hosts in multiple locations, allowing you to get the most less latency time to download any of our books like this one. Merely said, the the uses of argument is universally compatible with any devices to read.

1,180 citations