
Showing papers on "Tuple published in 1998"


Journal ArticleDOI
TL;DR: This work investigates the issue of designing a kernel programming language for mobile computing and describes KLAIM, a language that supports a programming paradigm where processes, like data, can be moved from one computing environment to another.
Abstract: We investigate the issue of designing a kernel programming language for mobile computing and describe KLAIM, a language that supports a programming paradigm where processes, like data, can be moved from one computing environment to another. The language consists of a core Linda with multiple tuple spaces and of a set of operators for building processes. KLAIM naturally supports programming with explicit localities. Localities are first-class data (they can be manipulated like any other data), but the language provides coordination mechanisms to control the interaction protocols among located processes. The formal operational semantics is useful for discussing the design of the language and provides guidelines for implementations. KLAIM is equipped with a type system that statically checks access right violations of mobile agents. Types are used to describe the intentions (read, write, execute, etc.) of processes in relation to the various localities. The type system is used to determine the operations that processes want to perform at each locality, and to check whether they comply with the declared intentions and whether they have the necessary rights to perform the intended operations at the specific localities. Via a series of examples, we show that many mobile code programming paradigms can be naturally implemented in our kernel language. We also present a prototype implementation of KLAIM in Java.
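
To make the coordination model concrete, the sketch below shows Linda-like operations on multiple tuple spaces addressed by explicit localities. It is a minimal illustration in plain Java, not the KLAIM syntax or its Java prototype; the class and method names are invented, and the type system for access rights is omitted.

```java
import java.util.*;
import java.util.concurrent.*;

// Minimal sketch of Linda-like multiple tuple spaces addressed by locality.
// This is NOT the KLAIM prototype's API; names are invented for illustration.
public class LocatedTupleSpaces {
    // One tuple store per locality name.
    private final Map<String, Queue<List<Object>>> spaces = new ConcurrentHashMap<>();

    private Queue<List<Object>> space(String locality) {
        return spaces.computeIfAbsent(locality, l -> new ConcurrentLinkedQueue<>());
    }

    // out(t)@l : emit a tuple at locality l.
    public void out(String locality, Object... fields) {
        space(locality).add(Arrays.asList(fields));
    }

    // in(template)@l : withdraw a matching tuple (null fields are wildcards).
    public List<Object> in(String locality, Object... template) throws InterruptedException {
        Queue<List<Object>> s = space(locality);
        while (true) {
            for (List<Object> t : s) {
                if (matches(t, template) && s.remove(t)) return t;
            }
            Thread.sleep(10); // naive polling; a real kernel would block on a condition
        }
    }

    private static boolean matches(List<Object> tuple, Object[] template) {
        if (tuple.size() != template.length) return false;
        for (int i = 0; i < template.length; i++)
            if (template[i] != null && !template[i].equals(tuple.get(i))) return false;
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        LocatedTupleSpaces net = new LocatedTupleSpaces();
        net.out("siteA", "job", 42);                      // a process at siteA publishes work
        System.out.println(net.in("siteA", "job", null)); // another process withdraws it
    }
}
```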

557 citations


Proceedings ArticleDOI
01 May 1998
TL;DR: The complexity of the problem of answering queries using materialized views is studied and it is shown that the complexity depends on whether views are assumed to store all the tuples that satisfy the view definition, or only a subset of it.
Abstract: We study the complexity of the problem of answering queries using materialized views. This problem has attracted a lot of attention recently because of its relevance in data integration. Previous work considered only conjunctive view definitions. We examine the consequences of allowing more expressive view definition languages. The languages we consider for view definitions and user queries are: conjunctive queries with inequality, positive queries, datalog, and first-order logic. We show that the complexity of the problem depends on whether views are assumed to store all the tuples that satisfy the view definition, or only a subset of it. Finally, we apply the results to the view consistency and view self-maintainability problems which arise in data warehousing.

526 citations


Journal ArticleDOI
TL;DR: This paper presents SoftMealy, a novel wrapper representation formalism based on a finite-state transducer and contextual rules that can wrap a wide range of semistructured Web pages because FSTs can encode each different attribute permutation as a path.

476 citations


Proceedings ArticleDOI
01 Jun 1998
TL;DR: Two new spatial join operations, distance join and distance semi-join, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples.
Abstract: Two new spatial join operations, distance join and distance semi-join, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples. Incremental algorithms are presented for computing these operations, which can be used in a pipelined fashion, thereby obviating the need to wait for their completion when only a few tuples are needed. The algorithms can be used with a large class of hierarchical spatial data structures and arbitrary spatial data types in any dimensions. In addition, any distance metric may be employed. A performance study using R-trees shows that the incremental algorithms outperform non-incremental approaches by an order of magnitude if only a small part of the result is needed, while the penalty, if any, for the incremental processing is modest if the entire join result is required.
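
The incremental flavour of these operations can be illustrated with a priority queue ordered by distance: consumers pull the next-closest pair and may stop at any time. The sketch below works directly over point sets rather than R-tree nodes, so it keeps only the pipelined, distance-ordered behaviour described above; names and structure are illustrative assumptions.

```java
import java.util.*;

// Sketch of an *incremental* distance join over two point sets: pairs are
// reported in increasing distance order and can be consumed one at a time.
// The published algorithms traverse hierarchical indexes (e.g., R-trees);
// here every pair is enqueued directly for brevity.
public class IncrementalDistanceJoin {
    record Pair(double[] a, double[] b, double dist) {}

    public static Iterator<Pair> distanceJoin(List<double[]> r, List<double[]> s) {
        PriorityQueue<Pair> pq = new PriorityQueue<>(Comparator.comparingDouble(Pair::dist));
        for (double[] a : r)
            for (double[] b : s)
                pq.add(new Pair(a, b, dist(a, b)));
        return new Iterator<>() {
            public boolean hasNext() { return !pq.isEmpty(); }
            public Pair next() { return pq.poll(); } // pipelined: callers may stop early
        };
    }

    static double dist(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        List<double[]> r = List.of(new double[]{0, 0}, new double[]{5, 5});
        List<double[]> s = List.of(new double[]{1, 0}, new double[]{9, 9});
        Iterator<Pair> it = distanceJoin(r, s);
        System.out.println(it.next().dist()); // closest pair first (1.0)
    }
}
```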

259 citations


Proceedings ArticleDOI
23 Feb 1998
TL;DR: A new compression algorithm that is tailored to database applications that can be applied to a collection of records, and is especially effective for records with many low to medium cardinality fields and numeric fields, is proposed.
Abstract: We propose a new compression algorithm that is tailored to database applications. It can be applied to a collection of records, and is especially effective for records with many low to medium cardinality fields and numeric fields. In addition, this new technique supports very fast decompression. Promising application domains include decision support systems (DSS), since fact tables, which are by far the largest tables in these applications, contain many low and medium cardinality fields and typically no text fields. Further, our decompression rates are faster than typical disk throughputs for sequential scans; in contrast, gzip is slower. This is important in DSS applications, which often scan large ranges of records. An important distinguishing characteristic of our algorithm, in contrast to compression algorithms proposed earlier, is that we can decompress individual tuples (even individual fields), rather than a full page (or an entire relation) at a time. Also, all the information needed for tuple decompression resides on the same page with the tuple. This means that a page can be stored in the buffer pool and used in compressed form, simplifying the job of the buffer manager and improving memory utilization. Our compression algorithm also improves index structures such as B-trees and R-trees significantly by reducing the number of leaf pages and compressing index entries, which greatly increases the fan-out. We can also use lossy compression on the internal nodes of an index.
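
A minimal sketch of the kind of page-local dictionary coding the abstract alludes to is given below: the dictionary travels with the page, so an individual tuple's field can be decoded without touching the rest of the page. The layout and names are assumptions for illustration, not the paper's actual format.

```java
import java.util.*;

// Sketch of per-page dictionary coding for a low-cardinality column: each page
// carries the dictionary needed to decode its own tuples, so a single tuple
// (or field) can be decompressed in isolation.
public class PageDictionaryCompression {
    static class Page {
        final List<String> dictionary = new ArrayList<>(); // stored on the page itself
        final List<Integer> codes = new ArrayList<>();     // one small code per tuple

        void insert(String value) {
            int code = dictionary.indexOf(value);
            if (code < 0) { code = dictionary.size(); dictionary.add(value); }
            codes.add(code);
        }

        // Decompress one tuple's field without touching the others.
        String decode(int tupleIndex) { return dictionary.get(codes.get(tupleIndex)); }
    }

    public static void main(String[] args) {
        Page page = new Page();
        for (String country : List.of("US", "DE", "US", "JP", "DE")) page.insert(country);
        System.out.println(page.decode(3)); // "JP"; no full-page decompression needed
        System.out.println("distinct values stored once: " + page.dictionary);
    }
}
```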

240 citations


Book ChapterDOI
TL;DR: This paper presents the design and implementation of the MARS system, a coordination tool for Java-based mobile agents that defines Linda-like tuple spaces that can be programmed to react with specific actions to the accesses made by mobile agents.
Abstract: The paper surveys several coordination models for mobile agent applications and outlines the advantages of uncoupled coordination models based on reactive blackboards. On this base, the paper presents the design and the implementation of the MARS system, a coordination tool for Java-based mobile agents. MARS defines Linda-like tuple spaces that can be programmed to react with specific actions to the accesses made by mobile agents.
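
The reactive-blackboard idea can be sketched as a tuple space that fires registered reactions whenever agents access it. The code below is an illustration of that idea only; it is not the MARS API, and the operation set is deliberately minimal.

```java
import java.util.*;
import java.util.function.BiConsumer;

// Sketch of a "reactive" tuple space: reactions attached to the space are
// triggered by the accesses agents perform on it. Illustrative names only.
public class ReactiveTupleSpace {
    private final List<List<Object>> tuples = new ArrayList<>();
    // Reactions receive the operation name ("out", "read", ...) and the tuple involved.
    private final List<BiConsumer<String, List<Object>>> reactions = new ArrayList<>();

    public void addReaction(BiConsumer<String, List<Object>> reaction) { reactions.add(reaction); }

    public synchronized void out(Object... fields) {
        List<Object> t = Arrays.asList(fields);
        tuples.add(t);
        fire("out", t);
    }

    public synchronized List<Object> read(Object... template) {
        for (List<Object> t : tuples) {
            if (t.size() == template.length) {
                boolean ok = true;
                for (int i = 0; i < template.length; i++)
                    if (template[i] != null && !template[i].equals(t.get(i))) ok = false;
                if (ok) { fire("read", t); return t; }
            }
        }
        return null; // non-blocking variant, for brevity
    }

    private void fire(String op, List<Object> t) {
        for (var r : reactions) r.accept(op, t);
    }

    public static void main(String[] args) {
        ReactiveTupleSpace space = new ReactiveTupleSpace();
        // Example reaction: log every access made by (mobile) agents.
        space.addReaction((op, t) -> System.out.println("reaction: " + op + " " + t));
        space.out("service", "printer");
        space.read("service", null);
    }
}
```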

120 citations


Proceedings ArticleDOI
23 Feb 1998
TL;DR: The paper identifies the notion of time fragment preservation as the essential defining property of an interval based data model, thus providing a new formal basis for characterizing temporal data models and obtaining new insights into the properties of their query languages.
Abstract: The association of timestamps with various data items such as tuples or attribute values is fundamental to the management of time-varying information. Using intervals in timestamps, as do most data models, leaves a data model with a variety of choices for giving a meaning to timestamps. Specifically, some such data models claim to be point-based while other data models claim to be interval-based. The meaning chosen for timestamps is important: it has a pervasive effect on most aspects of a data model, including database design, a variety of query language properties, and query processing techniques, e.g., the availability of query optimization opportunities. The paper precisely defines the notions of point-based and interval-based temporal data models, thus providing a new formal basis for characterizing temporal data models and obtaining new insights into the properties of their query languages. Queries in point-based models treat snapshot-equivalent argument relations identically. This renders point-based models insensitive to coalescing. In contrast, queries in interval-based models give significance to the actual intervals used in the timestamps, thus generally treating non-identical, but possibly snapshot-equivalent, relations differently. The paper identifies the notion of time fragment preservation as the essential defining property of an interval-based data model.
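
The distinction between point-based and interval-based semantics can be made concrete with a small sketch: under the point-based view, an interval timestamp is shorthand for the time points it covers, so coalesced and uncoalesced relations with the same snapshots are indistinguishable. The code below is illustrative only, with invented names and integer time points.

```java
import java.util.*;

// Sketch of the point-based view of interval timestamps: an interval-stamped
// fact stands for the set of time points it covers, so two relations are
// "snapshot equivalent" when they agree on every snapshot.
public class SnapshotEquivalence {
    record Fact(String value, int from, int to) {} // closed interval [from, to]

    // The snapshot of a relation at time point t.
    static Set<String> snapshot(List<Fact> rel, int t) {
        Set<String> s = new TreeSet<>();
        for (Fact f : rel) if (f.from() <= t && t <= f.to()) s.add(f.value());
        return s;
    }

    static boolean snapshotEquivalent(List<Fact> r1, List<Fact> r2, int minT, int maxT) {
        for (int t = minT; t <= maxT; t++)
            if (!snapshot(r1, t).equals(snapshot(r2, t))) return false;
        return true;
    }

    public static void main(String[] args) {
        // Same information, different timestamps: one coalesced fact vs. two adjacent ones.
        List<Fact> coalesced = List.of(new Fact("employed", 1, 10));
        List<Fact> split     = List.of(new Fact("employed", 1, 5), new Fact("employed", 6, 10));
        // A point-based query cannot tell them apart...
        System.out.println(snapshotEquivalent(coalesced, split, 0, 12)); // true
        // ...whereas an interval-based query may, because the stored intervals differ.
        System.out.println(coalesced.equals(split)); // false
    }
}
```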

98 citations


Proceedings Article
24 Aug 1998
TL;DR: A framework for incrementally removing warehouse data (without the need to fully recompute) is presented, and it is shown how the system should compensate when data is expired or other parameters change.

Abstract: Data warehouses collect data into materialized views for analysis. After some time, some of the data may no longer be needed or may not be of interest. In this paper, we handle this by expiring or removing unneeded materialized view tuples. A framework supporting such expiration is presented. Within it, a user or administrator can declaratively request expirations and can specify what types of modifications are expected from external sources. The latter can significantly increase the amount of data that can be expired. We present efficient algorithms for determining what data can be expired (data not needed for maintenance of other views), taking into account the types of updates that may occur.

94 citations


Proceedings ArticleDOI
01 May 1998
TL;DR: Overall, the authors believe they are the first to demonstrate by implementation experience that it is practical to build a compiler for HPF using a general and powerful integer-set framework.
Abstract: In this paper, we describe our experience with using an abstract integer-set framework to develop the Rice dHPF compiler, a compiler for High Performance Fortran. We present simple, yet general formulations of the major computation partitioning and communication analysis tasks as well as a number of important optimizations in terms of abstract operations on sets of integer tuples. This approach has made it possible to implement a comprehensive collection of advanced optimizations in dHPF, and to do so in the context of a more general computation partitioning model than previous compilers. One potential limitation of the approach is that the underlying class of integer set problems is fundamentally unable to represent HPF data distributions on a symbolic number of processors. We describe how we extend the approach to compile codes for a symbolic number of processors, without requiring any changes to the set formulations for the above optimizations. We show experimentally that the set representation is not a dominant factor in compile times on both small and large codes. Finally, we present preliminary performance measurements to show that the generated code achieves good speedups for a few benchmarks. Overall, we believe we are the first to demonstrate by implementation experience that it is practical to build a compiler for HPF using a general and powerful integer-set framework.

70 citations


Journal ArticleDOI
TL;DR: The paper motivates the inclusion of many of the primitives and demonstrates how a well-designed set of primitives provides performance and efficiency; the JavaSpace primitives are used as an example of how the choice of primitives can detrimentally affect the efficiency of the language.
Abstract: In this paper a tuple space based co-ordination language, and a run-time system which supports it, is described. The co-ordination language is called WCL, and it is designed to support agent co-ordination over the Internet between agents which are geographically distributed. WCL uses tuple spaces as used in Linda. WCL provides a richer set of primitives than traditional tuple space based systems, and provides asynchronous and synchronous tuple space access, bulk tuple primitives, and streaming primitives which, as a whole, provide a complete framework more suited to co-ordination over the Internet compared with the Linda primitives. The primitives emphasise efficiency and location transparency (of data and agents) and this is exploited in the current run-time system used to support WCL. The run-time system described in this paper is distributed and uses location transparency and dynamic analysis of tuple space usage to migrate tuple spaces around the distributed system. Some initial experimental results are given which demonstrate the performance gains of using the tuple space migration. The paper motivates the inclusion of many of the primitives, and demonstrates how a well designed set of primitives provides performance and efficiency. The JavaSpace primitives are used as an example of how the choice of primitives can detrimentally affect the efficiency of the language, and exclude required co-ordination constructs.

69 citations


01 Jan 1998
TL;DR: The TuCSoN coordination model for Internet applications based on mobile information agents is discussed; the model is based on the notion of tuple centre, a tuple-based interaction space associated with each site and used both for inter-agent cooperation and for accessing local information sources.
Abstract: The increasing need to access and elaborate dynamic and heterogeneous information sources distributed over the Internet calls for new models and paradigms for application design and development. The mobile agent paradigm promotes the design of applications where agents roam through Internet sites to locally access and elaborate information and resources, possibly cooperating with each other. This paper focuses on mobile agent coordination, and discusses the TuCSoN coordination model for Internet applications based on mobile information agents. The model is based on the notion of tuple centre, a tuple-based interaction space associated with each site and to be used both for inter-agent cooperation and for accessing local information sources. TuCSoN tuple centres enhance tuple spaces because their behaviour in response to communication events can be programmed. This can be used to deal with heterogeneity and dynamicity of the information sources, as well as to ensure some degree of global data integrity. The effectiveness of the TuCSoN model is shown by means of an application example in the area of Internet information retrieval.

Patent
16 Apr 1998
TL;DR: In this paper, a method and system for generating a decision-tree classifier in parallel in a shared-memory multiprocessor system is disclosed, where the processors independently determine the best splits for their respective assigned lists, and cooperatively determine a global best split for all attribute lists.
Abstract: A method and system for generating a decision-tree classifier in parallel in a shared-memory multiprocessor system is disclosed. The processors first generate in the shared memory an attribute list for each record attribute. Each attribute list is assigned to a processor. The processors independently determine the best splits for their respective assigned lists, and cooperatively determine a global best split for all attribute lists. The attribute lists are reassigned to the processors and split according to the global best split into the lists for child nodes. The split attribute lists are again assigned to the processors and the process is repeated for each new child node until each attribute list for the new child nodes includes only tuples of the same record class or a fixed number of tuples.

Book
01 Jan 1998
TL;DR: The Third Manifesto, as discussed by the authors, proposes a foundation for future database systems that reconciles objects and relations, including a new relational algebra, the Tutorial D language, and a model of type inheritance.
Abstract: Preface. I. PRELIMINARIES. 1. Background and Overview. What is The Third Manifesto? Why did we write it? Back to the relational future. Some guiding principles. Some crucial logical differences. Topics deliberately omitted. The Third Manifesto: A summary. 2. Objects and Relations. Introduction. What problem are we trying to solve? Relations vs. relvars. Domains vs. object classes. Relvars vs. object classes. A note on inheritance. Concluding remarks. II. FORMAL SPECIFICATIONS. 3. The Third Manifesto. RM Prescriptions. RM Proscriptions. OO Prescriptions. OO Proscriptions. RM Very Strong Suggestions. OO Very Strong Suggestions. 4. A New Relational Algebra. Introduction. Motivation and justification. BREMOVE(c), BRENAME(c), and BCOMPOSE(c). Treating operators as relations. Formal definitions. Transitive closure. 5. Tutorial D. Introduction. Types and expressions. Scalar definitions. Tuple definitions. Relational definitions. Scalar operations. Tuple operations. Relational operations. Relations and arrays. Statements. Syntax summary. Mapping the relational operations. III. INFORMAL DISCUSSIONS AND EXPLANATIONS. 6. RM Prescriptions. RM Prescription 1: Scalar types. RM Prescription 2: Scalar values are typed. RM Prescription 3: Scalar operators. RM Prescription 4: Actual vs. possible representations. RM Prescription 5: Expose possible representations. RM Prescription 6: Type generator TUPLE. RM Prescription 7: Type generator RELATION. RM Prescription 8: Equality. RM Prescription 9: Tuples. RM Prescription 10: Relations. RM Prescription 11: Scalar variables. RM Prescription 12: Tuple variables. RM Prescription 13: Relation variables (relvars). RM Prescription 14: Real vs. virtual relvars. RM Prescription 15: Candidate keys. RM Prescription 16: Databases. RM Prescription 17: Transactions. RM Prescription 18: Relational algebra. RM Prescription 19: Relvar names, relation selectors, and recursion. RM Prescription 20: Relation-valued operators. RM Prescription 21: Assignments. RM Prescription 22: Comparisons. RM Prescription 23: Integrity constraints. RM Prescription 24: Relvar and database predicates. RM Prescription 25: Catalog. RM Prescription 26: Language design. 7. RM Proscriptions. RM Proscription 1: No attribute ordering. RM Proscription 2: No tuple ordering. RM Proscription 3: No duplicate tuples. RM Proscription 4: No nulls. RM Proscription 5: No nullological mistakes. RM Proscription 6: No internal-level constructs. RM Proscription 7: No tuple-level operations. RM Proscription 8: No composite attributes. RM Proscription 9: No domain check override. RM Proscription 10: Not SQL. 8. OO Prescriptions. OO Prescription 1: Compile-time type checking. OO Prescription 2: Single inheritance (conditional). OO Prescription 3: Multiple inheritance (conditional). OO Prescription 4: Computational completeness. OO Prescription 5: Explicit transaction boundaries. OO Prescription 6: Nested transactions. OO Prescription 7: Aggregates and empty sets. 9. OO Proscriptions. OO Proscription 1: Relvars are not domains. OO Proscription 2: No object IDs. 10. RM Very Strong Suggestions. RM Very Strong Suggestion 1: System keys. RM Very Strong Suggestion 2: Foreign keys. RM Very Strong Suggestion 3: Candidate key inference. RM Very Strong Suggestion 4: Transition constraints. RM Very Strong Suggestion 5: Quota queries. RM Very Strong Suggestion 6: Generalized transitive closure. RM Very Strong Suggestion 7: Tuple and relation parameters. RM Very Strong Suggestion 8: Special ("default") values. 
RM Very Strong Suggestion 9: SQL migration. 11. OO Very Strong Suggestions. OO Very Strong Suggestion 1: Type inheritance. OO Very Strong Suggestion 2: Types and operators unbundled. OO Very Strong Suggestion 3: Collection type generators OO Very Strong Suggestion 4: Conversions to/from relations OO Very Strong Suggestion 5: Single-level store IV. SUBTYPING AND INHERITANCE. 12. Preliminaries Introduction. Toward a type inheritance model. Single vs. multiple inheritance. Scalars, tuples, and relations. Summary. 13. Formal Specifications. Introduction. IM Proposals. 14. Informal Discussions and Explanations. Introduction. IM Proposal 1: Types are sets. IM Proposal 2: Subtypes are subsets. IM Proposal 3: "Subtype of" is reflexive. IM Proposal 4: Proper subtypes. IM Proposal 5: "Subtype of" is transitive. IM Proposal 6: Immediate subtypes. IM Proposal 7: Single inheritance only. IM Proposal 8: Global root types. IM Proposal 9: Type hierarchies. IM Proposal 10: Subtypes can be proper subsets. IM Proposal 11: Types disjoint unless one a subtype of the other. IM Proposal 12: Scalar values (extended definition). IM Proposal 13: Scalar variables (extended definition). IM Proposal 14: Assignment with inheritance. IM Proposal 15: Comparison with inheritance. IM Proposal 16: Join etc. with inheritance. IM Proposal 17: TREAT DOWN. IM Proposal 18: TREAT UP. IM Proposal 19: Logical operator IS_T(SX). IM Proposal 20: Relational operator RX:IS_T(A). IM Proposal 21: Logical operator IS_MS_T(SX). IM Proposal 22: Relational operator RX:IS_MS_T(A). IM Proposal 23: THE_ pseudovariables. IM Proposal 24: Read-only operator inheritance and value substitutability. IM Proposal 25: Read-only parameters to update operators. IM Proposal 26: Update operator inheritance and variable substitutability. What about specialization by constraint? 15. Multiple Inheritance. Introduction. The running example. IM Proposals 1-26 revisited. Many supertypes per subtype. Type graphs. Least specific types unique. Most specific types unique. Comparison with multiple inheritance. Operator inheritance. 16. Tuple and Relation Types. Introduction. Tuple and relation subtypes and supertypes. IM Proposals 1-11 still apply. Tuple and relation values (extended definitions). Tuple and relation most specific types. Tuple and relation variables (extended definitions). Tuple and relation assignment. Tuple and relation comparison. Tuple and relation TREAT DOWN. IM Proposals 18-26 revisited. Appendixes. Appendix A. A Relational Calculus Version of Tutorial D. Introduction. Boolean expressions. Builtin relation operator invocations. Free and bound range variable references. Relation UPDATE and DELETE operators. Examples. Appendix B. The Database Design Dilemma. Introduction. Encapsulation. Discussion. Further considerations. Appendix C. Specialization by Constraint. Introduction. A closer look. The "3 out of 4" rule. Can the idea be rescued? Appendix D. Subtables and Supertables. Introduction. Some general observations. The terminology is extremely bad. The concept is not type inheritance. Why? Appendix E. A Comparison with SQL3. Introduction. RM Prescriptions. RM Proscriptions. OO Prescriptions. OO Proscriptions. RM Very Strong Suggestions. OO Very Strong Suggestions. IM Proposals (scalar types, single inheritance). IM Proposals (scalar types, multiple inheritance). IM Proposals (tuple and relation types). History of the wrong equation in SQL3. Appendix F. A Comparison with ODMG. Introduction. Overview. RM Prescriptions. RM Proscriptions. 
OO Prescriptions. OO Proscriptions. RM Very Strong Suggestions. OO Very Strong Suggestions. IM Proposals (scalar types, single inheritance). IM Proposals (scalar types, multiple inheritance). IM Proposals (tuple and relation types). Appendix G. The Next 25 Years of the Relational Model? Remarks on republication. Introduction. Background. The Third Manifesto and SQL. Technical content. More on SQL. Miscellaneous questions. Appendix H. References and Bibliography. Index.

Journal ArticleDOI
TL;DR: The effectiveness of the TuCSoN model is first shown by means of an application example in the area of Internet information retrieval, then discussed in the context of workflow management and electronic commerce.
Abstract: The increasing need to access and elaborate dynamic and heterogeneous information sources distributed over the Internet calls for new models and paradigms for application design and development. The mobile agent paradigm promotes the design of applications where agents roam through Internet sites to locally access and elaborate information and resources, possibly co‐operating with each other. Focuses on mobile agent co‐ordination, and presents the TuCSoN co‐ordination model for Internet applications based on mobile information agents. TuCSoN exploits a notion of local tuple‐based interaction space, called a tuple centre. A tuple centre is a tuple space enhanced with the capability of programming its behaviour in response to communication events. This enables properties to be embedded into the interaction space, and a mobile agent to be designed independently of the peculiarities of the information sources. Several issues critical to Internet applications can then be charged on tuple centres transparently to agents. The effectiveness of the TuCSoN model is first shown by means of an application example in the area of Internet information retrieval, then discussed in the context of workflow management and electronic commerce.

Proceedings ArticleDOI
Motomichi Toyama1
01 Jun 1998
TL;DR: This demonstration shows how TFE reorganizes the query results into various media in a universal way, first by grouping tuples according to an arbitrary tree-structured schema, and then by translating them with the constructors available in the target media.
Abstract: SuperSQL is an extension of SQL that allows query results to be presented in various media for publishing and presentations with simple but sophisticated formatting capabilities. A SuperSQL query can generate various kinds of materials, for example, a LaTeX source file to publish query results in a nested table, HTML or Java source files to present the result on WWW browsers, and other media including MS-Excel worksheets, Tcl/Tk, O2C, etc. O2C is a data manipulation language of O2 and is thus useful for migrating data in a relational database to an object-oriented database. SuperSQL is meant to provide a theoretical and practical foundation for 4GL-type applications such as report writers and DB/WWW coordinators. In this demonstration, we show how TFE reorganizes the query results into various media in a universal way, first by grouping tuples according to an arbitrary tree-structured schema, and then by translating them with the constructors available in the target media.
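
The general mechanism described above, grouping flat tuples by a tree-structured schema and then rendering them with the target medium's constructors, can be sketched as follows. This is not SuperSQL or TFE itself; the grouping is hard-coded to a two-level tree and HTML output for brevity.

```java
import java.util.*;

// Sketch of tree-structured output generation: flat query results are grouped
// according to a tree schema and rendered with the constructors of the target
// medium (HTML here). Illustration only.
public class TreeStructuredOutput {
    // Group rows by their first column, then render each group as a nested HTML list.
    static String toHtml(List<String[]> rows) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String[] row : rows)
            groups.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);

        StringBuilder html = new StringBuilder("<ul>\n");
        for (var e : groups.entrySet()) {
            html.append("  <li>").append(e.getKey()).append("<ul>\n");
            for (String child : e.getValue())
                html.append("    <li>").append(child).append("</li>\n");
            html.append("  </ul></li>\n");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        List<String[]> result = List.of(
            new String[]{"Sales", "Alice"},
            new String[]{"Sales", "Bob"},
            new String[]{"R&D", "Carol"});
        System.out.println(toHtml(result)); // departments with nested employee lists
    }
}
```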

Journal ArticleDOI
Mengchi Liu1
TL;DR: Relationlog stands in the same relationship to the nested relational and complex value models as Datalog stands to the relational model, and has a well-defined Herbrand model-theoretic semantics, which captures the intended semantics of nested sets, tuples and relations.
Abstract: This paper presents a novel logic programming based language for nested relational and complex value models called Relationlog. It stands in the same relationship to the nested relational and complex value models as Datalog stands to the relational model. The main novelty of the language is the introduction of powerful mechanisms, namely, partial and complete set terms, for representing and manipulating both partial and complete information on nested sets, tuples and relations. They generalize the set grouping and set enumeration mechanisms of LDL and allow the user to directly encode the open and closed world assumptions on nested sets, tuples, and relations. They allow direct inference and access to deeply embedded values in a complex value relation as if the relation is normalized, which greatly increases the ease of use of the language. As a result, the extended relational algebra operations can be represented in Relationlog directly, and more importantly, recursively in a way similar to Datalog. Like Datalog, Relationlog has a well-defined Herbrand model-theoretic semantics, which captures the intended semantics of nested sets, tuples and relations, and also a well-defined proof-theoretic semantics which coincides with its model-theoretic semantics.

Journal ArticleDOI
TL;DR: This work shows that the relational model is not the only possible semantic reference model for constraint relational databases and it shows how constraint relations can be interpreted under the nested relational model, and introduces two distinct classes of constraint algebras.
Abstract: Constraint relational databases use constraints to both model and query data. A constraint relation contains a finite set of generalized tuples. Each generalized tuple is represented by a conjunction of constraints on a given logical theory and, depending on the logical theory and the specific conjunction of constraints, it may possibly represent an infinite set of relational tuples. For their characteristics, constraint databases are well suited to model multidimensional and structured data, like spatial and temporal data. The definition of an algebra for constraint relational databases is important in order to make constraint databases a practical technology. We extend the previously defined constraint algebra (called generalized relational algebra). First, we show that the relational model is not the only possible semantic reference model for constraint relational databases and we show how constraint relations can be interpreted under the nested relational model. Then, we introduce two distinct classes of constraint algebras, one based on the relational algebra, and one based on the nested relational algebra, and we present an algebra of the latter type. The algebra is proved equivalent to the generalized relational algebra when input relations are modified by introducing generalized tuple identifiers. However, from a user point of view, it is more suitable. Thus, the difference existing between such algebras is similar to the difference existing between the relational algebra and the nested relational algebra, dealing with only one level of nesting. We also show how external functions can be added to the proposed algebra.
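
A generalized tuple can be pictured as a conjunction of constraints that finitely represents a possibly infinite set of ordinary tuples. The sketch below shows that representation and a membership test over a single attribute; it is an illustration only and does not reproduce the algebras defined in the paper.

```java
import java.util.*;
import java.util.function.DoublePredicate;

// Sketch of a "generalized tuple": a conjunction of constraints over the
// attributes, possibly denoting infinitely many ordinary tuples. Here a
// constraint is simply a predicate over one variable x.
public class GeneralizedTuple {
    private final List<DoublePredicate> constraints = new ArrayList<>();

    public GeneralizedTuple and(DoublePredicate c) { constraints.add(c); return this; }

    // Does the ordinary (relational) tuple x satisfy every constraint?
    public boolean contains(double x) {
        return constraints.stream().allMatch(c -> c.test(x));
    }

    public static void main(String[] args) {
        // The generalized tuple  (x >= 0) AND (x < 10)  represents the infinite
        // set of ordinary tuples {x | 0 <= x < 10}.
        GeneralizedTuple gt = new GeneralizedTuple().and(x -> x >= 0).and(x -> x < 10);
        System.out.println(gt.contains(3.5));  // true
        System.out.println(gt.contains(12.0)); // false
    }
}
```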

Proceedings ArticleDOI
01 Oct 1998
TL;DR: A new, simple, and orthogonal way to add multimethods to single-dispatch object-oriented languages, without affecting existing code is described.
Abstract: Many popular object-oriented programming languages, such as C++, Smalltalk-80, Java, and Eiffel, do not support multiple dispatch. Yet without multiple dispatch, programmers find it difficult to express binary methods and design patterns such as the "visitor" pattern. We describe a new, simple, and orthogonal way to add multimethods to single-dispatch object-oriented languages, without affecting existing code. The new mechanism also clarifies many differences between single and multiple dispatch.
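
The problem being solved can be seen in a short example: a binary method whose behaviour should depend on the run-time types of both arguments. The sketch below shows the conventional double-dispatch workaround in a single-dispatch language; it is not the mechanism proposed in the paper, only the situation that motivates it.

```java
// Illustration of the problem the paper addresses: Java dispatches only on the
// receiver, so a binary method such as collideWith(a, b) that should depend on
// the run-time types of BOTH arguments must be emulated with double dispatch.
public class DoubleDispatchDemo {
    interface Shape {
        String collideWith(Shape other);    // first dispatch: on the receiver
        String collideWithCircle(Circle c); // second dispatch: on the argument
        String collideWithBox(Box b);
    }

    static class Circle implements Shape {
        public String collideWith(Shape other) { return other.collideWithCircle(this); }
        public String collideWithCircle(Circle c) { return "circle/circle"; }
        public String collideWithBox(Box b) { return "box/circle"; }
    }

    static class Box implements Shape {
        public String collideWith(Shape other) { return other.collideWithBox(this); }
        public String collideWithCircle(Circle c) { return "circle/box"; }
        public String collideWithBox(Box b) { return "box/box"; }
    }

    public static void main(String[] args) {
        Shape a = new Circle(), b = new Box();
        System.out.println(a.collideWith(b)); // "circle/box": both run-time types matter
    }
}
```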

Proceedings Article
27 Aug 1998
TL;DR: It is shown that the lub may not qualify as a reduced version of the given set of tuples, but the interior cover - the subset of internal elements covered by the lub - does qualify, and the theoretical result that such an interior cover exists is established.
Abstract: Data reduction makes datasets smaller but preserves classification structures of interest. In this paper we present a novel approach to data reduction based on lattice and hyper relations. Hyper relations are a generalization of conventional database relations in the sense that we allow sets of values as tuple entries. The advantage of this is that raw data and reduced data can both be represented by hyper relations. The collection of hyper relations can be naturally made into a complete Boolean algebra, and so for any collection of hyper tuples we can find its unique least upper bound (lub) as a reduction of it. We show that the lub may not qualify as a reduced version of the given set of tuples, but the interior cover - the subset of internal elements covered by the lub - does qualify. We establish the theoretical result that such an interior cover exists, and show how to find it. The proposed method was evaluated using 7 real world datasets. The results were quite remarkable compared with those obtained by C4.5, and the datasets were reduced with reduction ratios up to 99%.
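
A hyper tuple, a tuple whose entries are sets of values, and the lub of a collection of tuples (the attribute-wise union) can be sketched directly. The code below shows only that reduction step; interior-cover computation is omitted, and all names are illustrative.

```java
import java.util.*;

// Sketch of hyper tuples: tuple entries are *sets* of values, so a collection
// of ordinary tuples can be summarized by their least upper bound (lub), the
// attribute-wise union.
public class HyperTuples {
    // A hyper tuple is a list of value sets, one per attribute.
    static List<Set<String>> lub(List<List<Set<String>>> hyperTuples) {
        int arity = hyperTuples.get(0).size();
        List<Set<String>> result = new ArrayList<>();
        for (int i = 0; i < arity; i++) {
            Set<String> union = new TreeSet<>();
            for (List<Set<String>> t : hyperTuples) union.addAll(t.get(i));
            result.add(union);
        }
        return result;
    }

    // Lift an ordinary tuple to a hyper tuple of singleton sets.
    static List<Set<String>> lift(String... values) {
        List<Set<String>> t = new ArrayList<>();
        for (String v : values) t.add(new TreeSet<>(Set.of(v)));
        return t;
    }

    public static void main(String[] args) {
        // Two raw tuples reduced to a single hyper tuple covering both.
        var reduced = lub(List.of(lift("sunny", "play"), lift("overcast", "play")));
        System.out.println(reduced); // [[overcast, sunny], [play]]
    }
}
```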

Journal ArticleDOI
TL;DR: The paper sets out to prove that the exponential complexity of key sets and sets of functional dependencies is rather unusual, and that almost all minimal keys in a relation have a length which depends mainly on the size of the relation.

Patent
25 Nov 1998
TL;DR: In this article, a machine-implementable method and apparatus for automatic extension of results obtained by querying a database of relationally organized data and expressed in tabular row and column format is presented.
Abstract: A machine-implementable method and apparatus for automatic extension of results obtained by querying a database of relationally organized data and expressed in tabular row and column format. The method involves modifying the query by adding column variables to the query that show a high association with the initial query designated variables. The modified query is then used to access the table. This repeats until a stop condition is sensed. Tuples of values elicited responsive to the modified query are included in an extended response if they are significantly similar to tuples elicited by the original query. Several association and similarity modes are described by which the number of variables and tuples can be reiteratively extended.

Proceedings ArticleDOI
01 Nov 1998
TL;DR: MobiS, a coordination language based on multiple tuple spaces, is introduced, and it is shown through an example how it can be used for the specification of software architectures containing mobile components.
Abstract: Modern software architectures often have to deal with mobile components. Therefore, the structure of these systems is dynamic and continuously changing. We introduce MobiS, a coordination language based on multiple tuple spaces, and show through an example how it can be used for the specification of software architectures containing mobile components. The flexibility of the language encodes mobility in the model, so the specification of mobile components and of reconfigurable systems is easy. Due to the non-determinism of the coordination model, the context-dependent behaviors of components can be specified and used to make assumptions about the kind of architecture the component can be put into.

Book ChapterDOI
24 Aug 1998
TL;DR: It is proved that all queries expressible by order-invariant first-order formulas are local, that is, whether a tuple in a structure satisfies this query only depends on a small neighborhood of the tuple.
Abstract: A query is local if the decision of whether a tuple in a structure satisfies this query only depends on a small neighborhood of the tuple. We prove that all queries expressible by order-invariant first-order formulas are local.

Patent
Tobin J. Lehman1, Stephen McLaughry1
27 Jan 1998
TL;DR: In this article, a method, apparatus, and article of manufacture for exchanging information in a computer-implemented database system using a new operator known as a Rhonda operator is described.
Abstract: A method, apparatus, and article of manufacture for exchanging information in a computer-implemented database system. The present invention implements this exchange using a new operator known as a Rhonda operator. A Rhonda operator includes a tuple and template as arguments and, when performed, atomically swaps its tuple with a tuple from another Rhonda operator when both their templates match. More specifically, if two processes perform Rhonda operations, and each process' template argument matches the other process' tuple argument, then each process receives the other process' tuple as a result. This atomic synchronization can be performed for two or more Rhonda operators at a time.
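
The core of the operator, an atomic pairwise swap of tuples between two processes, can be approximated in Java with the standard Exchanger utility. The sketch below omits the template-matching step (both parties are assumed to have already matched), so it illustrates only the exchange itself, not the patented method.

```java
import java.util.concurrent.Exchanger;
import java.util.List;

// Sketch of the pairwise-swap idea: two processes each offer a tuple and
// atomically receive the other's. Java's Exchanger provides the atomic
// exchange; the Rhonda operator's template matching is simplified away.
public class TupleSwapDemo {
    public static void main(String[] args) throws InterruptedException {
        Exchanger<List<Object>> rendezvous = new Exchanger<>();

        Thread a = new Thread(() -> {
            try {
                List<Object> received = rendezvous.exchange(List.of("offerA", 1));
                System.out.println("A got " + received);
            } catch (InterruptedException ignored) { }
        });
        Thread b = new Thread(() -> {
            try {
                List<Object> received = rendezvous.exchange(List.of("offerB", 2));
                System.out.println("B got " + received);
            } catch (InterruptedException ignored) { }
        });

        a.start(); b.start();
        a.join(); b.join(); // each thread ends up holding the other's tuple
    }
}
```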

Book ChapterDOI
23 Sep 1998
TL;DR: The summaries proposed here allow for a qualitative description of data (instead of the quantitative description given by a probabilistic approach) and they involve linguistic terms to obtain a wider coverage than Boolean summaries.
Abstract: This paper is concerned with knowledge discovery in databases and linguistic summaries of data. The summaries proposed here allow for a qualitative description of data (instead of the quantitative description given by a probabilistic approach) and they involve linguistic terms to obtain a wider coverage than Boolean summaries. They are based on extended functional dependencies and are situated in the framework of the relational model of data. Such summaries express a meta-knowledge about the database content according to the pattern “for any tuple t in relation R: the more A, the more B” (for instance: the taller the player, the higher his score in the NBA championship) where A and B are two linguistic terms. In addition, an algorithm to implement the discovery process (which takes advantage of properties of extended functional dependencies) is given. This algorithm is iterative and each tuple is successively considered in order to refine the set of valid summaries.
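
One simplified way to picture such a summary is to treat "the more A, the more B" as a pairwise monotonicity check over fuzzy membership degrees. The sketch below uses invented membership functions and a deliberately crude validity criterion; the paper's definitions are based on extended functional dependencies and are more refined.

```java
import java.util.*;
import java.util.function.ToDoubleFunction;

// Sketch of checking a gradual summary "the more A, the more B" over tuples:
// the rule is taken to hold if, whenever one tuple is A-er than another, it is
// also at least as B. Membership functions and criterion are illustrative.
public class GradualSummary {
    record Player(double heightCm, double score) {}

    static double tall(Player p)      { return clamp((p.heightCm() - 170) / 40); } // fuzzy "tall"
    static double highScore(Player p) { return clamp(p.score() / 30); }            // fuzzy "high score"
    static double clamp(double x)     { return Math.max(0, Math.min(1, x)); }

    // "The more a(t), the more b(t)" for all pairs of tuples in the relation.
    static boolean holds(List<Player> tuples, ToDoubleFunction<Player> a, ToDoubleFunction<Player> b) {
        for (Player t1 : tuples)
            for (Player t2 : tuples)
                if (a.applyAsDouble(t1) > a.applyAsDouble(t2)
                        && b.applyAsDouble(t1) < b.applyAsDouble(t2)) return false;
        return true;
    }

    public static void main(String[] args) {
        List<Player> relation = List.of(new Player(180, 12), new Player(200, 25), new Player(190, 18));
        System.out.println(holds(relation, GradualSummary::tall, GradualSummary::highScore)); // true
    }
}
```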

Journal ArticleDOI
TL;DR: A theory of multi-relations, which are similar to normal mathematical relations except for the fact that each tuple has a given multiplicity, is proposed, and it is shown that most of the set-oriented operations on relations can be generalised.
Abstract: This report proposes a theory of multi-relations, which are similar to normal mathematical relations, except for the fact that each tuple has a given multiplicity. It is shown that most of the set-oriented operations on relations, such as union and intersection, can be generalised (in the same way in which sets can be generalised to multisets). The typical relational operations of composition and transposition and the theory of 'lifting' can be generalised too. Several alternative representations are discussed, including ternary relations and multisets of tuples. Multi-relations can be visualised as directed graphs where each edge is labeled with a number. Alternatively, the multiplicity could be visualised by giving the edge a certain thickness. The approach is helpful in situations where one is not satisfied with the knowledge that there is a certain connection ('uses', 'calls', etc.) between two units (components, modules, processes), but where one wants to have quantitative information on how many sub-c...
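
A minimal sketch of a multi-relation, a map from tuples to multiplicities with union and intersection generalized accordingly, is given below. The representation and operation choices are illustrative assumptions, not the report's formal theory.

```java
import java.util.*;

// Sketch of a multi-relation: a relation in which each tuple carries a
// multiplicity. Union adds multiplicities and intersection takes the minimum,
// mirroring the generalization from sets to multisets described above.
public class MultiRelation {
    final Map<List<String>, Integer> tuples = new HashMap<>();

    void add(int multiplicity, String... fields) {
        tuples.merge(List.of(fields), multiplicity, Integer::sum);
    }

    static MultiRelation union(MultiRelation r, MultiRelation s) {
        MultiRelation out = new MultiRelation();
        r.tuples.forEach((t, m) -> out.tuples.merge(t, m, Integer::sum));
        s.tuples.forEach((t, m) -> out.tuples.merge(t, m, Integer::sum));
        return out;
    }

    static MultiRelation intersection(MultiRelation r, MultiRelation s) {
        MultiRelation out = new MultiRelation();
        r.tuples.forEach((t, m) -> {
            Integer n = s.tuples.get(t);
            if (n != null) out.tuples.put(t, Math.min(m, n));
        });
        return out;
    }

    public static void main(String[] args) {
        MultiRelation calls = new MultiRelation(), uses = new MultiRelation();
        calls.add(3, "A", "B");   // module A calls module B three times
        uses.add(1, "A", "B");
        System.out.println(union(calls, uses).tuples);        // {[A, B]=4}
        System.out.println(intersection(calls, uses).tuples); // {[A, B]=1}
    }
}
```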

Journal ArticleDOI
TL;DR: This paper applies constraint-based data models to the problem of shape management in multimedia databases, and presents the constraint model and some constraint languages, and shows how constraints can be used to model general shapes.
Abstract: Shape management is an important functionality in multimedia databases. Shape information can be used in both image acquisition and image retrieval. Several approaches have been proposed to deal with shape representation and matching. Among them, the data-driven approach supports searches for shapes based on indexing techniques. Unfortunately, efficient data-driven approaches are often defined only for specific types of shape. This is not sufficient in contexts in which arbitrary shapes should be represented. Constraint databases use mathematical theories to finitely represent infinite sets of relational tuples. They have been proved to be very useful in modeling spatial objects. In this paper, we apply constraint-based data models to the problem of shape management in multimedia databases. We first present the constraint model and some constraint languages. Then, we show how constraints can be used to model general shapes. The use of a constraint language as an internal specification and execution language for querying shapes is also discussed. Finally, we show how a constraint database system can be used to efficiently retrieve shapes, retaining the advantages of the already defined approaches.

Journal ArticleDOI
TL;DR: Two basic view maintenance algorithms using tags are proposed, the use of tags is combined with the counting algorithm to derive a tagged counting algorithm that further reduces the communication cost, and the performance analysis identifies the situations where a particular algorithm is superior to others.
Abstract: The incremental view maintenance problem deals with the efficient updating of materialized views in response to updates to base relations. This paper considers the problem in a distributed database environment, with communication cost minimization as the primary objective. The views considered are defined based on the relational join operation. The approach is to use "yes"/"no" tags as auxiliary data on tuples in the base relations to indicate whether the tuples participate in joins. These tags will help avoid sending irrelevant data over the network and thus reduce the communication cost. Two basic view maintenance algorithms are proposed using the tags. In addition to reducing communication costs, an important feature of these two basic algorithms is that they derive the "exact change" to views without looking at the old views. This feature allows us to maintain certain aggregates on views without actually materializing the views themselves; this feature is useful in applications such as active databases where many conditions or constraints must be tested whenever updates occur, since a condition is true exactly when some corresponding view has a nonzero number of tuples. The paper then combines the use of tags with the counting algorithm to derive a tagged counting algorithm that further reduces the communication cost. The paper illustrates the algorithms by examples and studies their performance via a statistical analysis. The illustrating examples and the performance analysis show that, under uniform distribution with reasonable join participation rates, the use of tags significantly improves the efficiency of view maintenance over similar algorithms without tags. The performance analysis also identifies the situations where a particular algorithm is superior to others. The use of tags for memoing values of subexpressions in a view definition is also explored in the paper.
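
The bookkeeping role of the tags can be sketched as follows: each base tuple of R carries a yes/no flag recording whether it currently joins with some tuple of S, and the exact view delta for an update is computed from the matching tuples alone. Distribution and communication are abstracted away; the code is an illustration of the idea, not the paper's algorithms.

```java
import java.util.*;

// Sketch of the tagging idea for maintaining a join view R JOIN S: each R tuple
// carries a yes/no tag recording whether it currently joins with some S tuple,
// so only relevant tuples need to be considered (or shipped) when S is updated.
public class TaggedViewMaintenance {
    record RTuple(String key, String payload) {}

    static final List<RTuple> R = new ArrayList<>();
    static final Map<RTuple, Boolean> tag = new HashMap<>();   // "yes"/"no" as a boolean
    static final List<String[]> view = new ArrayList<>();      // materialized R JOIN S on key

    // Process an insertion into S: derive the exact change to the view from R
    // alone and refresh the tags of the R tuples that now participate.
    static void insertIntoS(String sKey, String sPayload) {
        for (RTuple r : R) {
            if (r.key().equals(sKey)) {
                view.add(new String[]{r.key(), r.payload(), sPayload}); // exact delta
                tag.put(r, true);                                       // now participates
            }
        }
    }

    public static void main(String[] args) {
        RTuple r1 = new RTuple("k1", "a"), r2 = new RTuple("k2", "b");
        R.addAll(List.of(r1, r2));
        tag.put(r1, false); tag.put(r2, false);

        insertIntoS("k1", "x");
        System.out.println("view size = " + view.size());                // 1: only the matching tuple joined
        System.out.println("tags = " + tag.get(r1) + ", " + tag.get(r2)); // true, false
    }
}
```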

ReportDOI
18 Nov 1998
TL;DR: This paper presents an online warehouse update algorithm that stores multiple versions of data as separate rows (vertical redundancy), and compares it to another online algorithm that stores multiple versions within each tuple by extending the table schema (horizontal redundancy).
Abstract: Data warehouse maintenance algorithms usually work off-line, making the warehouse unavailable to users. However, since most organizations require continuous operation, we need to be able to perform the updates online, concurrently with user queries. To guarantee that user queries access a consistent view of the warehouse, online update algorithms introduce redundancy in order to store multiple versions of the data objects that are being changed. In this paper, we present an online warehouse update algorithm that stores multiple versions of data as separate rows (vertical redundancy). We compare our algorithm to another online algorithm that stores multiple versions within each tuple by extending the table schema (horizontal redundancy). We have implemented both algorithms on top of an Informix Dynamic Server and measured their performance under varying workloads, focusing on their impact on query response times. Our experiments show that, except for a limited number of cases, vertical redundancy is a better choice, with respect to storage, implementation overhead, and query performance.

Journal ArticleDOI
TL;DR: In this article, it was shown that for every r and d >= 2 there is a C such that, for most choices of d permutations p_1, p_2, ..., p_d of S_n, the following holds: for any two r-tuples of distinct ...
Abstract: We prove that for every r and d >= 2 there is a C such that, for most choices of d permutations p_1, p_2, ..., p_d of S_n, the following holds: for any two r-tuples of distinct ...