Author

Monica Scannapieco

Other affiliations: Sapienza University of Rome
Bio: Monica Scannapieco is an academic researcher at the National Institute of Statistics. Her research focuses on the topics of data quality and information quality. She has an h-index of 24 and has co-authored 90 publications receiving 2,678 citations. Previous affiliations of Monica Scannapieco include Sapienza University of Rome.


Papers
Book
06 Sep 2006
TL;DR: The book's extensive description of techniques and methodologies from core data quality research as well as from related fields like data mining, probability theory, statistical data analysis, and machine learning gives an excellent overview of the current state of the art.
Abstract: Poor data quality can seriously hinder or damage the efficiency and effectiveness of organizations and businesses. The growing awareness of such repercussions has led to major public initiatives like the "Data Quality Act" in the USA and Directive 2003/98/EC of the European Parliament. Batini and Scannapieco present a comprehensive and systematic introduction to the wide set of issues related to data quality. They start with a detailed description of different data quality dimensions, like accuracy, completeness, and consistency, and their importance in different types of data, like federated data, web data, or time-dependent data, and in different data categories classified according to frequency of change, like stable, long-term, and frequently changing data. The book's extensive description of techniques and methodologies from core data quality research as well as from related fields like data mining, probability theory, statistical data analysis, and machine learning gives an excellent overview of the current state of the art. The presentation is completed by a short description and critical comparison of tools and practical methodologies, which will help readers to resolve their own quality problems. This book is an ideal combination of sound theoretical foundations and applicable practical approaches. It is ideally suited for everyone, whether researchers, students, or professionals, interested in a comprehensive overview of data quality issues. In addition, it will serve as the basis for an introductory course or for self-study on this topic.
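
A simple illustration of how such dimensions can be quantified, not taken from the book: completeness can be measured as the fraction of non-missing values in a table. The Python sketch below uses illustrative records and field names.

def completeness(records, fields):
    """Fraction of non-missing values across the given fields."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0  # an empty table is vacuously complete
    filled = sum(1 for r in records for f in fields
                 if r.get(f) not in (None, ""))
    return filled / total

rows = [
    {"name": "Ada", "city": "Rome"},
    {"name": "Bob", "city": None},  # one missing value out of four cells
]
print(completeness(rows, ["name", "city"]))  # 0.75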

687 citations

Proceedings ArticleDOI
11 Jun 2007
TL;DR: A protocol for record matching that preserves privacy both at the data level and at the schema level is proposed; by running the protocol, two sources can compute the matching of their datasets without sharing their data in the clear, exchanging only the result of the matching.
Abstract: In many business scenarios, record matching is performed across different data sources with the aim of identifying common information shared among these sources. However, this need often conflicts with privacy requirements concerning the data stored by the sources. In this paper, we propose a protocol for record matching that preserves privacy both at the data level and at the schema level. Specifically, if two sources need to identify their common data, by running the protocol they can compute the matching of their datasets without sharing their data in the clear, sharing only the result of the matching. The protocol uses a third party and maps records into a vector space in order to preserve their privacy. Experimental results show the effectiveness of the matching protocol in terms of precision and recall, as well as its good computational performance.
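
To give a rough sense of the vector-space idea, the Python sketch below maps each record to a character-bigram count vector and compares only the vectors, so a third party never sees the raw values. The embedding here is a plain bag of bigrams chosen for illustration; the paper's actual privacy-preserving embedding and protocol details differ.

from collections import Counter
from math import sqrt

def embed(value, n=2):
    """Map a string to a bag-of-n-grams vector (a Counter)."""
    grams = [value[i:i + n] for i in range(len(value) - n + 1)]
    return Counter(grams)

def cosine(u, v):
    dot = sum(u[g] * v[g] for g in u)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# The third party receives only the vectors, never the original records.
a = embed("monica scannapieco")
b = embed("m. scannapieco")
print(cosine(a, b) > 0.5)  # True: high similarity marks a candidate match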

198 citations

BookDOI
01 Jan 2016
C. Batini, M. Scannapieco: Data and Information Quality

186 citations

Journal ArticleDOI
01 Sep 2004
TL;DR: This paper presents an architecture for managing data quality in cooperative information systems, focusing on two specific modules, the Data Quality Broker and the Quality Notification Service.
Abstract: In cooperative information systems, the quality of data exchanged and provided by different data sources is extremely important. A lack of attention to data quality can allow low-quality data to spread across the cooperative system. At the same time, quality can be improved by comparing data across sources, correcting them, and thus disseminating high-quality data. In this paper, we present an architecture for managing data quality in cooperative information systems, focusing on two specific modules: the Data Quality Broker and the Quality Notification Service. The Data Quality Broker allows for querying and improving data quality values. The Quality Notification Service is specifically targeted at the dissemination of changes in data quality values.
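
A minimal Python sketch of how the two modules could fit together is given below; all class and method names are hypothetical, not the paper's actual API. The broker answers a query with the best-quality value among the sources and can push that value back to lower-quality copies, while the notification service disseminates the resulting quality changes.

class QualityNotificationService:
    """Disseminates changes in data quality values to subscribers."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, item_id, new_quality):
        for notify in self._subscribers:
            notify(item_id, new_quality)

class DataQualityBroker:
    """Queries and improves data quality values across sources."""
    def __init__(self, sources, notifier):
        self.sources = sources  # {source: {item_id: (value, quality)}}
        self.notifier = notifier

    def query(self, item_id):
        candidates = [s[item_id] for s in self.sources.values() if item_id in s]
        return max(candidates, key=lambda vq: vq[1]) if candidates else None

    def improve(self, item_id):
        best = self.query(item_id)
        if best is None:
            return
        for s in self.sources.values():
            if item_id in s and s[item_id][1] < best[1]:
                s[item_id] = best  # correct the lower-quality copy
        self.notifier.publish(item_id, best[1])

qns = QualityNotificationService()
qns.subscribe(lambda item, q: print(item, "now has quality", q))
broker = DataQualityBroker(
    {"org_a": {"addr_42": ("Via Roma 1", 0.9)},
     "org_b": {"addr_42": ("V. Roma 1", 0.6)}},
    qns,
)
broker.improve("addr_42")  # org_b's copy is replaced and subscribers notified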

118 citations

Book ChapterDOI
17 Mar 2004
TL;DR: In this article, the authors summarize the conditions under which this revolution would take place and attempt to outline its main properties, limitations, and possible applications.
Abstract: Information and communication infrastructures underwent a rapid and extreme decentralization process over the past decade: From a world of statically and partially connected central servers rose an intricate web of millions of information sources loosely connecting one to another. Today, we expect to witness the extension of this revolution with the wide adoption of meta-data standards like RDF or OWL underpinning the creation of a semantic web. Again, we hope for global properties to emerge from a multiplicity of pair-wise, local interactions, resulting eventually in a self-stabilizing semantic infrastructure. This paper represents an effort to summarize the conditions under which this revolution would take place as well as an attempt to underline its main properties, limitations and possible applications.

118 citations


Cited by

Journal ArticleDOI
TL;DR: This paper presents a middleware platform which addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service.
Abstract: The paradigmatic shift from a Web of manual interactions to a Web of programmatic interactions driven by Web services is creating unprecedented opportunities for the formation of online business-to-business (B2B) collaborations. In particular, the creation of value-added services by composition of existing ones is gaining significant momentum. Since many available Web services provide overlapping or identical functionality, albeit with different quality of service (QoS), a choice needs to be made to determine which services are to participate in a given composite service. This paper presents a middleware platform which addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service. Two selection approaches are described and compared: one based on local (task-level) selection of services and the other based on global allocation of tasks to services using integer programming.
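
The local, task-level strategy can be sketched in a few lines of Python: each candidate service for a task is scored by a weighted utility over normalized QoS attributes, and the highest scorer wins. All attribute names, weights, and values below are illustrative; the global strategy instead optimizes across all tasks at once under end-to-end constraints, which the paper formulates as an integer program.

# Candidate services for one task: (name, price, latency_ms, availability)
candidates = [
    ("svc_a", 0.05, 120, 0.999),
    ("svc_b", 0.02, 300, 0.990),
    ("svc_c", 0.08, 80, 0.995),
]
weights = {"price": 0.3, "latency": 0.3, "availability": 0.4}

def normalize(values, lower_is_better):
    """Scale values to [0, 1] so that 1 is always the best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(hi - v) / (hi - lo) if lower_is_better else (v - lo) / (hi - lo)
            for v in values]

prices = normalize([c[1] for c in candidates], lower_is_better=True)
latencies = normalize([c[2] for c in candidates], lower_is_better=True)
avails = normalize([c[3] for c in candidates], lower_is_better=False)

scores = [weights["price"] * p + weights["latency"] * l
          + weights["availability"] * a
          for p, l, a in zip(prices, latencies, avails)]
best, _ = max(zip(candidates, scores), key=lambda t: t[1])
print(best[0])  # the service with the highest utility for this task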

2,872 citations

Journal ArticleDOI
TL;DR: This work introduces a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a survey of existing works on this subject.
Abstract: Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allowing knowledge to be shared without compromising user privacy.
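
As a rough illustration of the horizontal setting, the Python sketch below runs generic federated averaging on a toy linear-regression task: each client trains on its own data and a server averages the resulting parameters, so raw data never leaves the clients. This is a textbook-style sketch, not the paper's specific framework.

import random

def local_step(w, data, lr=0.01):
    """One epoch of SGD on (x, y) pairs for the model y ~ w * x."""
    for x, y in data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

# Four clients, each holding its own samples of the same task (slope 3).
clients = [[(x, 3 * x + random.gauss(0, 0.1)) for x in range(10)]
           for _ in range(4)]

w_global = 0.0
for _ in range(20):
    # Each client starts from the global model and trains locally.
    local_ws = [local_step(w_global, data) for data in clients]
    # The server aggregates by averaging (weighted by data size in practice).
    w_global = sum(local_ws) / len(local_ws)

print(round(w_global, 2))  # close to the true slope 3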

2,593 citations

Book
05 Jun 2007
TL;DR: The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content.
Abstract: Ontologies tend to be found everywhere. They are viewed as the silver bullet for many applications, such as database integration, peer-to-peer systems, e-commerce, semantic web services, or social networks. However, in open or evolving systems, such as the semantic web, different parties would, in general, adopt different ontologies. Thus, merely using ontologies, like using XML, does not reduce heterogeneity: it just raises heterogeneity problems to a higher level. Euzenat and Shvaiko's book is devoted to ontology matching as a solution to the semantic heterogeneity problem faced by computer systems. Ontology matching aims at finding correspondences between semantically related entities of different ontologies. These correspondences may stand for equivalence as well as other relations, such as consequence, subsumption, or disjointness, between ontology entities. Many different matching solutions have been proposed so far from various viewpoints, e.g., databases, information systems, and artificial intelligence. The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content. In particular, the book includes a new chapter dedicated to the methodology for performing ontology matching. It also covers emerging topics, such as data interlinking, ontology partitioning and pruning, context-based matching, matcher tuning, alignment debugging, and user involvement in matching, to mention a few. More than 100 state-of-the-art matching systems and frameworks were reviewed. With Ontology Matching, researchers and practitioners will find a reference book that presents currently available work in a uniform framework. In particular, the work and the techniques presented in this book can be equally applied to database schema matching, catalog integration, XML schema matching and other related problems. The objectives of the book include presenting (i) the state of the art and (ii) the latest research results in ontology matching by providing a systematic and detailed account of matching techniques and matching systems from theoretical, practical and application perspectives.
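
As a taste of the simplest family of techniques the book surveys, the Python sketch below aligns two toy ontologies purely by string similarity of entity labels. Real matchers combine many string-based, structural, and semantic measures; the label lists and threshold here are illustrative.

from difflib import SequenceMatcher

onto_a = ["Person", "Author", "PublishedWork", "Organization"]
onto_b = ["Human", "Writer", "Publication", "Organisation"]

def similarity(a, b):
    """Ratcliff/Obershelp similarity of the lowercased labels (0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

threshold = 0.8
alignment = [(a, b, round(similarity(a, b), 2))
             for a in onto_a for b in onto_b
             if similarity(a, b) >= threshold]
print(alignment)  # [('Organization', 'Organisation', 0.92)]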

2,579 citations