Showing papers by "Jiawei Han" published in 1992


Proceedings Article
23 Aug 1992
TL;DR: An attribute-oriented induction method has been developed for knowledge discovery in databases that integrates a machine learning paradigm with set-oriented database operations and extracts generalized data from actual data in databases.
Abstract: Knowledge discovery in databases, or data mining, is an important issue in the development of data- and knowledge-base systems. An attribute-oriented induction method has been developed for knowledge discovery in databases. The method integrates a machine learning paradigm, especially learning-from-examples techniques, with set-oriented database operations and extracts generalized data from actual data in databases. An attribute-oriented concept tree ascension technique is applied in generalization, which substantially reduces the computational complexity of database learning processes. Different kinds of knowledge rules, including characteristic rules, discrimination rules, quantitative rules, and data evolution regularities can be discovered efficiently using the attribute-oriented approach. In addition to learning in relational databases, the approach can be applied to knowledge discovery in nested relational and deductive databases. Learning can also be performed with databases containing noisy data and exceptional cases using database statistics. Furthermore, the rules discovered can be used to query database knowledge, answer cooperative queries and facilitate semantic query optimization. Based upon these principles, a prototyped database learning system, DBLEARN, has been constructed for experimentation.

432 citations
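
As a rough illustration of the concept tree ascension described in the abstract above, the following Python sketch generalizes a small relation attribute by attribute and merges identical tuples while keeping a count. The concept hierarchies, the sample student data, the `threshold` parameter, and all function names are assumptions made for this example; they are not taken from DBLEARN or the paper.

```python
from collections import Counter

# Hypothetical concept trees (child value -> parent concept); illustration only.
HIERARCHIES = {
    "major": {"physics": "science", "biology": "science",
              "music": "art", "history": "art",
              "science": "ANY", "art": "ANY"},
    "birthplace": {"Vancouver": "Canada", "Toronto": "Canada",
                   "Shanghai": "foreign", "Bombay": "foreign",
                   "Canada": "ANY", "foreign": "ANY"},
}


def merge(relation, transform):
    """Apply transform to every tuple and merge identical results, summing counts."""
    merged = Counter()
    for tup, count in relation.items():
        merged[transform(tup)] += count
    return merged


def attribute_oriented_induction(rows, attributes, hierarchies, threshold):
    """Generalize each attribute until it has at most `threshold` distinct values."""
    relation = Counter(tuple(row[a] for a in attributes) for row in rows)
    kept = list(attributes)
    for attr in attributes:
        idx = kept.index(attr)
        while len({t[idx] for t in relation}) > threshold:
            tree = hierarchies.get(attr)
            if tree is None:
                # No concept tree for this attribute: remove it from the relation.
                relation = merge(relation, lambda t: t[:idx] + t[idx + 1:])
                kept.pop(idx)
                break
            # Concept tree ascension: replace each value by its parent concept.
            climbed = merge(
                relation,
                lambda t: t[:idx] + (tree.get(t[idx], t[idx]),) + t[idx + 1:],
            )
            if climbed == relation:
                break  # already at the top of the tree
            relation = climbed
    return kept, relation


students = [
    {"name": "Ann", "major": "physics", "birthplace": "Vancouver"},
    {"name": "Bob", "major": "biology", "birthplace": "Toronto"},
    {"name": "Cai", "major": "music", "birthplace": "Shanghai"},
    {"name": "Dev", "major": "history", "birthplace": "Bombay"},
    {"name": "Eve", "major": "physics", "birthplace": "Toronto"},
]
kept, relation = attribute_oriented_induction(
    students, ["name", "major", "birthplace"], HIERARCHIES, threshold=2
)
for tup, count in relation.items():
    print(dict(zip(kept, tup)), "count =", count)
```

With a threshold of 2, `name` is dropped because it has no concept tree, and the five tuples collapse into two generalized tuples, (science, Canada) with count 3 and (art, foreign) with count 2.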


Proceedings ArticleDOI
03 Feb 1992
TL;DR: The analysis and performance study shows that distance-associated spatial join indices substantially improve the performance of spatial queries, and different structures are best suited for different applications.
Abstract: A distance-associated join index structure is developed to speed up spatial queries, especially spatial range queries. Three distance-associated join indexing mechanisms (basic, ring-structured, and hierarchical) are presented and studied. The analysis and performance study show that distance-associated spatial join indices substantially improve the performance of spatial queries, and that different structures are best suited for different applications.

47 citations
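
The general idea of a join index that stores distances can be sketched as follows: distances between pairs of spatial objects are precomputed once, so that a range query becomes a lookup instead of a geometric computation. This is only a minimal sketch of that idea under assumed details; the layout of the index, the `max_distance` cutoff, the sample coordinates, and the function names are hypothetical, and the ring-structured and hierarchical variants named in the abstract are not reproduced here.

```python
import math
from bisect import bisect_right


def build_join_index(objects_r, objects_s, max_distance):
    """Precompute, for each object in R, the S objects within max_distance.

    Each entry holds two parallel lists sorted by distance:
    (distances, s_ids). This layout is an assumption for illustration.
    """
    index = {}
    for r_id, (rx, ry) in objects_r.items():
        entries = []
        for s_id, (sx, sy) in objects_s.items():
            d = math.hypot(rx - sx, ry - sy)
            if d <= max_distance:
                entries.append((d, s_id))
        entries.sort()
        index[r_id] = ([d for d, _ in entries], [s for _, s in entries])
    return index


def within(index, r_id, radius):
    """Range query: S objects within `radius` of r_id, nearest first."""
    distances, s_ids = index.get(r_id, ([], []))
    cut = bisect_right(distances, radius)  # binary search on stored distances
    return s_ids[:cut]


schools = {"school_a": (0, 0), "school_b": (10, 0)}
parks = {"park_1": (3, 4), "park_2": (6, 8), "park_3": (10, 1)}

index = build_join_index(schools, parks, max_distance=12)
print(within(index, "school_a", 6))
print(within(index, "school_b", 9))
```

The index only answers queries whose radius does not exceed the `max_distance` used when it was built; larger radii would need a rebuild or a fallback scan.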


Journal ArticleDOI
TL;DR: This study shows that linear recursions can be compiled into highly regular compiled forms by the V-matrix expansion technique and such compiled forms can be generated automatically.

16 citations


Journal ArticleDOI
TL;DR: It is demonstrated that based on the graph model all the linear recursive formulas can be classified into a taxonomy of classes and each class shares common characteristics in query compilation and query processing.
Abstract: The authors present a graph model which is powerful in classifying and compiling linear recursive formulas in deductive databases. The graph model consists of two kinds of graphs: the I-graph and the resolution graph. Essential properties of a recursive formula can be extracted from its I-graph, and the compiled formula and the query evaluation plan of the recursive formula can be determined from its resolution graph. It is demonstrated that, based on the graph model, all the linear recursive formulas can be classified into a taxonomy of classes, and each class shares common characteristics in query compilation and query processing. The compiled formulas and the corresponding query evaluation plans can be derived based on the study of the compilation of each class.

15 citations


Book ChapterDOI
23 Mar 1992
TL;DR: A chain-based query evaluation method is developed, which selects an efficient query evaluation algorithm based on the analysis of compiled forms and finiteness, termination and query constraints.
Abstract: List functions occur frequently in deductive database applications. We study efficient evaluation of linear recursions with list functions in deductive databases. Since most linear recursions can be compiled into chain forms, a chain-based query evaluation method is developed, which selects an efficient query evaluation algorithm based on the analysis of compiled forms and finiteness, termination and query constraints. Interesting techniques, such as chain-split, existence checking and constraint-based evaluation, are developed to improve the performance. Moreover, chain-based evaluation can be generalized to the complex recursions compilable to chain forms.

8 citations


Proceedings ArticleDOI
03 Feb 1992
TL;DR: Three chain-split evaluation techniques: magic sets, buffered evaluation, and partial evaluation, are developed and the first one is applicable to the evaluation of function-free recursions.
Abstract: A chain-split evaluation technique for the efficient evaluation of recursions in deductive databases is described. Three chain-split evaluation techniques (magic sets, buffered evaluation, and partial evaluation) are developed. The first one is applicable to the evaluation of function-free recursions. The latter two are applicable to both function-free and functional recursions. Partial evaluation refines buffered evaluation by evaluating the buffered functional predicates as much as possible, which reduces the cost of maintaining the sequences of buffered values and facilitates termination judgment and constraint pushing. Chain-split evaluation is an important recursive query processing technique that can be implemented efficiently in deductive databases by extending the available recursive query evaluation techniques.

5 citations
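
One way to picture the buffered evaluation mentioned in the abstract above is the list `append` recursion, `append([], L, L)` and `append([X|L1], L2, [X|L3]) :- append(L1, L2, L3)`, queried with the first two arguments bound. The choice of `append` as the example, the two-phase structure, and the function name below are assumptions made for illustration, not a reproduction of the paper's algorithm: the part of the chain that decomposes the bound list is evaluated first and its head values are buffered, and the delayed part that builds the result is evaluated only after the exit rule is reached.

```python
def append_buffered(u, v):
    """Evaluate append(u, v, W) with u and v bound, buffering head values."""
    buffer = []
    rest = list(u)

    # Forward phase: evaluate the evaluable portion of the chain.
    # Each step decomposes the bound list [X|L1] and buffers X.
    while rest:
        head, rest = rest[0], rest[1:]
        buffer.append(head)

    # Exit rule append([], L, L): the result starts as a copy of v.
    w = list(v)

    # Backward phase: evaluate the delayed portion W = [X|L3],
    # consuming the buffered X values in reverse order.
    while buffer:
        w = [buffer.pop()] + w
    return w


print(append_buffered(["a", "b"], ["c"]))
```

The forward phase always terminates here because the bound list shrinks at every step, which is the kind of finiteness argument that makes the split worthwhile.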


Proceedings ArticleDOI
02 Feb 1992
TL;DR: This paper shows that chain-based evaluation facilitates quantitative analysis of recursive queries based on the available chain information, database statistics and other quantitative measurements, and is promising at bridging recursive and nonrecursive database query evaluation.
Abstract: Many recursive query analysis techniques are qualitative in nature. This contrasts sharply with relational query optimization, which relies heavily on quantitative analysis. This paper shows that chain-based evaluation facilitates quantitative analysis of recursive queries based on the available chain information, database statistics and other quantitative measurements. Chain-based evaluation not only facilitates binding propagation, constraint pushing and the selection of recursive query evaluation algorithms but also provides precise compiled chain forms in relational expressions. Since most recursions in database applications can be compiled into highly regular chain forms, chain-based evaluation is promising at bridging recursive and nonrecursive database query evaluation.

3 citations
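
To make the notion of a chain form concrete, consider the single-chain recursion `ancestor(X, Y) :- parent(X, Y)` and `ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y)`, whose answers are the union of `parent` joined with itself one, two, three, or more times. The sketch below evaluates a query `ancestor(start, Y)` by iterating along that chain from the bound constant, one set-oriented step per chain element. The `ancestor` program, the sample data, and the function name are hypothetical; this is a generic illustration of binding propagation along a chain, not the evaluation method of the paper.

```python
def evaluate_chain(edges, start):
    """Answer ancestor(start, Y) by iterating the chain from the bound constant."""
    # Index the base relation by its first argument for set-oriented lookups.
    successors = {}
    for x, y in edges:
        successors.setdefault(x, set()).add(y)

    answers = set()
    frontier = {start}
    while frontier:
        # One chain step: join the current frontier with the base relation.
        next_level = set()
        for node in frontier:
            next_level |= successors.get(node, set())
        # Keep only newly derived values so cyclic data still terminates.
        frontier = next_level - answers
        answers |= frontier
    return answers


parent = [("ann", "bob"), ("bob", "cy"), ("cy", "dee"), ("eve", "fay")]
print(sorted(evaluate_chain(parent, "ann")))
```

Because each iteration touches only tuples reachable from the query constant, the size of each frontier is the sort of quantity that database statistics could be used to estimate.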



Journal ArticleDOI
TL;DR: It is demonstrated that the rule / goal graph cannot capture the binding propagation information for certain kinds of linear recursions, and hence the Magic rule rewriting technique cannot derive the minimal Magic Sets for such recursions.

2 citations


Journal ArticleDOI
TL;DR: This study analyzes the power of query-independent compilation and shows that it captures more binding information than other methods for irregular linear recursions, provides succinct information for selection of efficient query processing methods, and facilitates constraint-based processing of complex queries.
Abstract: Recursive query processing techniques can be classified into three categories: interpretation, query-dependent compilation and query-independent compilation. Query-dependent compilation compiles IDB (Intensional Data Base) programs based on possible query instantiations into query-specific programs, while query-independent compilation compiles IDB programs into query-independent and easily analyzable relational expressions. Previous studies show that linear recursions can be query-independently compiled into highly regular forms. This study analyzes the power of query-independent compilation and shows that (i) query-independent compilation captures more binding information than other methods for irregular linear recursions; (ii) the compilation provides succinct information for selection of efficient query processing methods; and (iii) it facilitates constraint-based processing of complex queries. Finally, query-independent compilation can be applied to more complex recursions as well.

1 citation



Book ChapterDOI
15 May 1992
TL;DR: This study proposes an efficient deduction method which applies query-independent compilation and set-oriented, chain-based evaluation in deductive databases, and an efficient attribute-oriented induction method for knowledge discovery in databases.
Abstract: The development of powerful and efficient deduction and induction mechanisms is the key to the success of Very Large Knowledge-Base systems (VLKBs). Based on our study, we propose (1) an efficient deduction method which applies query-independent compilation and set-oriented, chain-based evaluation in deductive databases, and (2) an efficient attribute-oriented induction method for knowledge discovery in databases. A large knowledge-base system should support both mechanisms and their integration.