Author

Alon Y. Levy

Other affiliations: Stanford University, Bell Labs, AT&T ...
Bio: Alon Y. Levy is an academic researcher from the University of Washington. The author has contributed to research in the topics of query languages and query optimization. The author has an h-index of 56 and has co-authored 107 publications receiving 12,717 citations. Previous affiliations of Alon Y. Levy include Stanford University and Bell Labs.


Papers
Proceedings Article
03 Sep 1996
TL;DR: Describes the Information Manifold, an implemented system that provides uniform access to a heterogeneous collection of more than 100 information sources, many of them on the WWW, along with algorithms that use declarative source descriptions to efficiently prune the set of information sources relevant to a given query.
Abstract: We witness a rapid increase in the number of structured information sources that are available online, especially on the WWW. These sources include commercial databases on product information, stock market information, real estate, automobiles, and entertainment. We would like to use the data stored in these databases to answer complex queries that go beyond keyword searches. We face the following challenges: (1) Several information sources store interrelated data, and any query-answering system must understand the relationships between their contents. (2) Many sources are not full-featured database systems and can answer only a small set of queries over their data (for example, forms on the WWW restrict the set of queries one can ask). (3) Since the number of sources is very large, effective techniques are needed to prune the set of information sources accessed to answer a query. (4) The details of interacting with each source vary greatly. We describe the Information Manifold, an implemented system that provides uniform access to a heterogeneous collection of more than 100 information sources, many of them on the WWW. IM tackles the above problems by providing a mechanism to declaratively describe the contents and query capabilities of the available information sources. There is a clean separation between the declarative source descriptions and the actual details of interacting with an information source. We describe algorithms that use the source descriptions to efficiently prune the set of information sources for a given query, and practical algorithms to generate executable query plans. The query plans we generate can involve querying several information sources and combining their answers. We also present experimental studies that indicate that the architecture and algorithms used in the Information Manifold scale up well to several hundred information sources.
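To make the idea of declarative source descriptions and source pruning concrete, here is a minimal Python sketch. It is a drastic simplification: the Information Manifold's actual formalism describes sources as views over a mediated schema together with their query capabilities, whereas this sketch only records which global relations each source covers. All source names and data structures below are invented for illustration.

```python
# Illustrative sketch only, not the system's actual formalism: each source
# is described by the global relations it covers, and a query is routed
# only to sources whose descriptions overlap the relations it mentions.

SOURCE_DESCRIPTIONS = {
    "car-listings": {"relations": {"car"},   "answers_selections_on": {"model", "year"}},
    "stock-quotes": {"relations": {"stock"}, "answers_selections_on": {"ticker"}},
    "realty-site":  {"relations": {"house"}, "answers_selections_on": {"city"}},
}

def prune_sources(query_relations):
    """Keep only sources whose description mentions a relation the query uses."""
    return [name for name, desc in SOURCE_DESCRIPTIONS.items()
            if desc["relations"] & query_relations]

# A query joining cars with stock quotes is routed to two of the three sources.
print(prune_sources({"car", "stock"}))  # ['car-listings', 'stock-quotes']
```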

1,302 citations

Journal ArticleDOI
17 May 1999
TL;DR: Presents a query language for XML, called XML-QL, which is argued to be suitable for data extraction, conversion, transformation, and integration tasks, and which can extract data from existing XML documents and construct new XML documents.
Abstract: An important application of XML is the interchange of electronic data (EDI) between multiple data sources on the Web. As XML data proliferates on the Web, applications will need to integrate and aggregate data from multiple sources and to clean and transform data to facilitate exchange. Data extraction, conversion, transformation, and integration are all well-understood database problems, and their solutions rely on a query language. We present a query language for XML, called XML-QL, which we argue is suitable for performing the above tasks. XML-QL is a declarative, 'relationally complete' query language and is simple enough that it can be optimized. XML-QL can extract data from existing XML documents and construct new XML documents.
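XML-QL's own syntax is not reproduced in the abstract; as a rough analogue of its extract-and-construct style, the following Python sketch pulls bindings out of one XML document and builds a new one using the standard library's xml.etree.ElementTree. The bibliography schema here is invented for illustration, not taken from the paper.

```python
# Rough analogue of an XML-QL style extract-and-construct task in plain Python.
import xml.etree.ElementTree as ET

SOURCE = """
<bib>
  <book year="1999"><title>XML-QL</title><author>Deutsch</author></book>
  <book year="1996"><title>Information Manifold</title><author>Levy</author></book>
</bib>
"""

# Extract: select books published after 1997 (the pattern-matching part).
tree = ET.fromstring(SOURCE)
recent = [b for b in tree.findall("book") if int(b.get("year")) > 1997]

# Construct: build a new XML document from the bindings.
result = ET.Element("results")
for b in recent:
    entry = ET.SubElement(result, "entry")
    entry.text = b.findtext("title")

print(ET.tostring(result, encoding="unicode"))
# <results><entry>XML-QL</entry></results>
```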

649 citations

Journal ArticleDOI
01 Sep 1998
TL;DR: The primary goal of this survey is to classify the different tasks to which database concepts have been applied, and to emphasize the technical innovations that were required to do so.
Abstract: The popularity of the World-Wide Web (WWW) has made it a prime vehicle for disseminating information. The relevance of database concepts to the problems of managing and querying this information has led to a significant body of recent research addressing these problems. Even though the underlying challenge is the one that has traditionally been addressed by the database community, namely how to manage large volumes of data, the novel context of the WWW forces us to significantly extend previous techniques. The primary goal of this survey is to classify the different tasks to which database concepts have been applied, and to emphasize the technical innovations that were required to do so.

642 citations

Proceedings Article
01 Jan 1995

615 citations

Journal ArticleDOI
01 Jun 1999
TL;DR: It is demonstrated that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and experimental evidence that the techniques result in behavior desirable for a data integration system is presented.
Abstract: Query processing in data integration occurs over network-bound, autonomous data sources. This requires extensions to traditional optimization and execution techniques for three reasons: there is an absence of quality statistics about the data, data transfer rates are unpredictable and bursty, and slow or unavailable data sources can often be replaced by overlapping or mirrored sources. This paper presents the Tukwila data integration system, designed to support adaptivity at its core using a two-pronged approach. Interleaved planning and execution with partial optimization allows Tukwila to quickly recover from decisions based on inaccurate estimates. During execution, Tukwila uses adaptive query operators such as the double pipelined hash join, which produces answers quickly, and the dynamic collector, which robustly and efficiently computes unions across overlapping data sources. We demonstrate that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and we present experimental evidence that our techniques result in behavior desirable for a data integration system.
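The double pipelined hash join mentioned above is also known as the symmetric hash join: both inputs are hashed incrementally, and each arriving tuple first enters its own table and then probes the other side's, so join results stream out before either source is exhausted. A minimal sketch follows; the tuple shapes and the round-robin arrival order are invented for illustration, and a real engine would also handle memory overflow.

```python
# Minimal sketch of a double pipelined (symmetric) hash join over two
# in-memory inputs standing in for network-bound sources.
from collections import defaultdict
from itertools import zip_longest

def symmetric_hash_join(left, right, key_left, key_right):
    left_ht, right_ht = defaultdict(list), defaultdict(list)
    # zip_longest interleaves the inputs, mimicking tuples arriving
    # unpredictably from two remote sources.
    for l, r in zip_longest(left, right):
        if l is not None:
            left_ht[key_left(l)].append(l)    # insert into own table...
            for m in right_ht[key_left(l)]:   # ...then probe the other side
                yield (l, m)
        if r is not None:
            right_ht[key_right(r)].append(r)
            for m in left_ht[key_right(r)]:
                yield (m, r)

users  = [("u1", "Ann"), ("u2", "Bob")]
orders = [("o1", "u2"), ("o2", "u1"), ("o3", "u2")]
for pair in symmetric_hash_join(users, orders,
                                key_left=lambda u: u[0],
                                key_right=lambda o: o[1]):
    print(pair)  # each pair is emitted as soon as its partner has arrived
```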

477 citations


Cited by
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving, and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining data streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
* Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
* Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Journal ArticleDOI
TL;DR: As discussed by the authors, agent theory is concerned with the question of what an agent is and with mathematical formalisms for representing and reasoning about the properties of agents; agent architectures can be thought of as software engineering models of agents; and agent languages are software systems for programming and experimenting with agents.
Abstract: The concept of an agent has become important in both Artificial Intelligence (AI) and mainstream computer science. Our aim in this paper is to point the reader at what we perceive to be the most important theoretical and practical issues associated with the design and construction of intelligent agents. For convenience, we divide these issues into three areas (though as the reader will see, the divisions are at times somewhat arbitrary). Agent theory is concerned with the question of what an agent is, and the use of mathematical formalisms for representing and reasoning about the properties of agents. Agent architectures can be thought of as software engineering models of agents; researchers in this area are primarily concerned with the problem of designing software or hardware systems that will satisfy the properties specified by agent theorists. Finally, agent languages are software systems for programming and experimenting with agents; these languages may embody principles proposed by theorists. The paper is not intended to serve as a tutorial introduction to all the issues mentioned; we hope instead simply to identify the most important issues, and point to work that elaborates on them. The article includes a short review of current and potential applications of agent technology.

6,714 citations

Book
01 Nov 2001
TL;DR: A multi-agent system (MAS) as discussed by the authors is a distributed computing system with autonomous interacting intelligent agents that coordinate their actions so as to achieve their goal(s) jointly or competitively.
Abstract: From the Publisher: An agent is an entity with domain knowledge, goals, and actions. A multi-agent system is a set of agents that interact in a common environment. Multi-agent systems deal with the construction of complex systems involving multiple agents and their coordination. A multi-agent system (MAS) is a distributed computing system with autonomous, interacting, intelligent agents that coordinate their actions so as to achieve their goal(s) jointly or competitively.

3,003 citations

Proceedings ArticleDOI
03 Jun 2002
TL;DR: Motivates the need for, and the research issues arising from, a new model of data processing in which data does not take the form of persistent relations but rather arrives in multiple continuous, rapid, time-varying data streams.
Abstract: In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work relevant to data stream systems and current projects in the area, the paper explores topics in stream query languages, new requirements and challenges in query processing, and algorithmic issues.
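As a toy illustration of the contrast with one-shot queries over persistent relations, a continuous query maintains running state and emits a new answer as each element arrives. The sliding-window average below is a hedged sketch of that idea only; the window size and readings are invented, and real stream engines add query languages, approximation, and resource management on top.

```python
# Toy continuous query: a sliding-window average over an unbounded stream,
# emitting one result per arriving element instead of one result per query.
from collections import deque

def windowed_avg(stream, window=3):
    """Yield the average of the last `window` readings after each arrival."""
    buf = deque(maxlen=window)  # old readings fall out automatically
    for reading in stream:
        buf.append(reading)
        yield sum(buf) / len(buf)

for avg in windowed_avg(iter([10, 20, 60, 20]), window=3):
    print(avg)  # 10.0, 15.0, 30.0, then 33.33... over the last three readings
```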

2,933 citations