Author

Serge Abiteboul

Bio: Serge Abiteboul is an academic researcher from the French Institute for Research in Computer Science and Automation. The author has contributed to research in the topics of XML and query languages. The author has an h-index of 73 and has co-authored 278 publications receiving 24,576 citations. Previous affiliations of Serge Abiteboul include the University of Southern California and PSL Research University.


Papers
Proceedings Article
03 Sep 1996
TL;DR: This paper shows how many common fusion operations can be specified non-procedurally and succinctly and presents key optimization techniques that significantly reduce the processing costs associated with information fusion.
Abstract: One of the main tasks of mediators is to fuse information from heterogeneous information sources. This may involve, for example, removing redundancies, and resolving inconsistencies in favor of the most reliable source. The problem becomes harder when the sources are unstructured/semistructured and we do not have complete knowledge of their contents and structure. In this paper we show how many common fusion operations can be specified non-procedurally and succinctly. The key to our approach is to assign semantically meaningful object ids to objects as they are “imported” into the mediator. These semantic ids can then be used to specify how various objects are combined or merged into objects “exported” by the mediator. In this paper we also discuss the implementation of a mediation system based on these principles. In particular, we present key optimization techniques that significantly reduce the processing costs associated with information fusion.
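
The abstract describes the approach only in prose; the following is a minimal, hypothetical sketch of fusion keyed on semantic object ids, assuming simple record-like objects. The attribute names, the semantic_id key, and the reliability ordering are illustrative assumptions, not the paper's actual mediator specification language.

```python
def semantic_id(obj):
    # Hypothetical key built from semantically meaningful attributes
    # (title and year here) rather than a source-local surrogate key,
    # so the "same" object imported from different sources gets the same id.
    return (obj["title"].strip().lower(), obj["year"])

def fuse(sources, reliability):
    """Fuse objects imported from heterogeneous sources into the objects
    exported by the mediator: objects with the same semantic id are merged,
    and conflicting attribute values are resolved in favour of the most
    reliable source.

    sources: dict mapping a source name to a list of record-like objects.
    reliability: source names ordered from most to least reliable.
    """
    exported = {}
    for name in reliability:                    # most reliable source first
        for obj in sources.get(name, []):
            slot = exported.setdefault(semantic_id(obj), {})
            for attr, value in obj.items():
                # a value already set by a more reliable source wins
                slot.setdefault(attr, value)
    return exported
```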

265 citations

Proceedings ArticleDOI
02 Apr 1984
TL;DR: A new, formally defined database model is introduced which combines fundamental principles of "semantic" database modeling in a coherent fashion and can serve as the foundation for a theoretical investigation into a wide variety of fundamental issues concerning the logical representation of data in databases.
Abstract: A new, formally defined database model is introduced which combines fundamental principles of "semantic" database modeling in a coherent fashion. The model provides mechanisms for representing structured objects and functional and ISA relationships between them. It is anticipated that the model can serve as the foundation for a theoretical investigation into a wide variety of fundamental issues concerning the logical representation of data in databases. Preliminary applications of the model include an efficient algorithm for computing the set of object types which can occur in a given entity set, even in the presence of a complex set of ISA relationships. The model can also be applied to precisely articulate "good" design policies.
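
As a rough illustration of the type computation mentioned above (not the paper's actual algorithm), here is a simplified sketch that treats the ISA relationships as a plain subtype graph and computes, by transitive closure, the object types that can occur in an entity set of a given type; the paper's model is richer than this assumption.

```python
def possible_types(entity_type, isa):
    """Simplified sketch: the object types that can occur in an entity set
    of `entity_type` are that type plus every type reachable below it in
    the ISA hierarchy.

    isa: dict mapping a type to the list of its direct subtypes.
    """
    seen, stack = set(), [entity_type]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(isa.get(t, []))
    return seen

# Example:
#   isa = {"person": ["student", "employee"], "student": ["phd_student"]}
#   possible_types("person", isa) == {"person", "student", "employee", "phd_student"}
```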

262 citations

Proceedings ArticleDOI
20 May 2003
TL;DR: A new on-line algorithm, OPIC, is introduced that uses far fewer resources, does not require storing the link matrix, and can be used to focus crawling on the most interesting pages.
Abstract: The computation of page importance in a huge dynamic graph has recently attracted a lot of attention because of the web. Page importance, or page rank, is defined as the fixpoint of a matrix equation. Previous algorithms compute it off-line and require a lot of extra CPU as well as disk resources (e.g. to store, maintain, and read the link matrix). We introduce a new algorithm, OPIC, that works on-line and uses far fewer resources. In particular, it does not require storing the link matrix. It is on-line in that it continuously refines its estimate of page importance while the web/graph is visited. Thus it can be used to focus crawling on the most interesting pages. We prove the correctness of OPIC. We present Adaptive OPIC, which also works on-line but adapts dynamically to changes of the web. A variant of this algorithm is now used by Xyleme. We report on experiments with synthetic data. In particular, we study the convergence and adaptiveness of the algorithms for various scheduling strategies for the pages to visit. We also report on experiments based on crawls of significant portions of the web.
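
As a hedged sketch of the on-line idea described above (each page carries "cash" that is distributed along its out-links when the page is visited, and the accumulated history estimates importance), the following toy implementation follows the published description of OPIC; crawling, page scheduling, and the adaptive variant are omitted.

```python
import random

def opic(graph, steps=100_000):
    """Toy sketch of On-line Page Importance Computation (OPIC).

    graph: dict mapping each page to the list of pages it links to
    (every link target must itself be a key of the dict).
    Every page starts with an equal share of cash; visiting a page adds
    its cash to its history and redistributes that cash to its successors.
    The importance estimate is each page's share of (history + cash).
    """
    pages = list(graph)
    cash = {p: 1.0 / len(pages) for p in pages}
    history = {p: 0.0 for p in pages}

    for _ in range(steps):
        p = random.choice(pages)        # any fair page-selection policy works
        amount = cash[p]
        history[p] += amount
        cash[p] = 0.0
        succ = graph[p] or pages        # a page with no out-links spreads everywhere
        share = amount / len(succ)
        for q in succ:
            cash[q] += share

    total = sum(history.values()) + 1.0  # +1: the cash still in circulation
    return {p: (history[p] + cash[p]) / total for p in pages}

# Example: opic({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```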

258 citations

Proceedings Article
11 Sep 2001
TL;DR: The foundations of the logical representation and some aspects of the physical storage policy are presented, and the implementation of the change-centric method to manage versions in a web warehouse of XML data is discussed.
Abstract: We present a change-centric method to manage versions in a web warehouse of XML data. The starting point is a sequence of snapshots of XML documents we obtain from the web. By running a diff algorithm, we compute the changes between two consecutive versions. We then represent the sequence using a novel representation of changes based on completed deltas and persistent identifiers. We present the foundations of the logical representation and some aspects of the physical storage policy. The work presented here was developed in the context of the Xyleme project, a massive warehouse for XML data from the web. It has been implemented and tested. We briefly discuss the implementation.
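
The abstract describes the representation only at a high level; as a minimal sketch (assuming each node already carries a persistent identifier, and ignoring the tree structure and move operations handled by the real diff algorithm), a delta between two snapshots could be recorded as follows.

```python
def delta(old, new):
    """Sketch of a change-centric delta between two document snapshots.

    old, new: dicts mapping a persistent node identifier to that node's
    content.  The delta records inserts, deletes, and updates so that
    applying it to `old` yields `new`, and its inverse reconstructs `old`.
    """
    return {
        "insert": {i: new[i] for i in new.keys() - old.keys()},
        "delete": sorted(old.keys() - new.keys()),
        "update": {i: (old[i], new[i])
                   for i in old.keys() & new.keys() if old[i] != new[i]},
    }
```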

236 citations

Proceedings ArticleDOI
01 May 1997
TL;DR: The evaluation of path expression queries on semi-structured data in a distributed asynchronous environment is considered, and decidability and complexity results on the implication problem for path constraints are established.
Abstract: The evaluation of path expression queries on semi-structured data in a distributed asynchronous environment is considered. The focus is on the use of local information expressed in the form of path constraints in the optimization of path expression queries. In particular, decidability and complexity results on the implication problem for path constraints are established.
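
For concreteness, a path expression over semi-structured data can be read as a sequence of edge labels navigated from a starting node; the sketch below is a simplified, hypothetical evaluator (not the paper's distributed algorithm) showing the basic evaluation that path constraints are used to optimize, e.g. by pruning which remote sources need to be contacted.

```python
def eval_path(graph, start, path):
    """Evaluate a simple path expression over edge-labelled data.

    graph: dict mapping a node to a list of (label, target) pairs.
    path:  sequence of edge labels, e.g. ["member", "affiliation"].
    Returns the set of nodes reachable from `start` along `path`.
    """
    frontier = {start}
    for label in path:
        frontier = {t for v in frontier
                      for (l, t) in graph.get(v, [])
                      if l == label}
    return frontier
```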

225 citations


Cited by
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining data streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
* Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
* Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Journal ArticleDOI
01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
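
The abstract refers to exploiting the link structure of hypertext without spelling out the computation; as a hedged illustration, here is a generic PageRank-style power-iteration sketch of link-based importance, not the engineered system described in the paper.

```python
def link_rank(graph, damping=0.85, iters=50):
    """Generic power-iteration sketch of link-based page importance.

    graph: dict mapping each page to the list of pages it links to
    (every link target must itself be a key of the dict).
    damping: probability of following a link rather than jumping to a
    random page.  Returns a score per page; the scores sum to 1.
    """
    pages = list(graph)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = graph[p] or pages     # dangling pages spread evenly
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank
```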

14,696 citations

Journal Article
TL;DR: Google, as discussed by the authors, is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext; it is designed to crawl and index the Web efficiently and to produce much more satisfying search results than existing systems.

13,327 citations

01 Jan 2002

9,314 citations