
Showing papers in "IEEE Transactions on Knowledge and Data Engineering in 1998"


Journal Article•DOI•
TL;DR: The experimental results show that the asynchronous weak-commitment search algorithm is by far more efficient than the asynchronous backtracking algorithm and can solve fairly large-scale problems.
Abstract: We develop a formalism called a distributed constraint satisfaction problem (distributed CSP) and algorithms for solving distributed CSPs. A distributed CSP is a constraint satisfaction problem in which variables and constraints are distributed among multiple agents. Various application problems in distributed artificial intelligence can be formalized as distributed CSPs. We present our newly developed technique called asynchronous backtracking that allows agents to act asynchronously and concurrently without any global control, while guaranteeing the completeness of the algorithm. Furthermore, we describe how the asynchronous backtracking algorithm can be modified into a more efficient algorithm called asynchronous weak-commitment search, which can revise a bad decision without exhaustive search by changing the priority order of agents dynamically. The experimental results on various example problems show that the asynchronous weak-commitment search algorithm is by far more efficient than the asynchronous backtracking algorithm and can solve fairly large-scale problems.
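
The abstract stays at the level of the formalism, so a minimal, sequential sketch of the core rule that distinguishes weak-commitment search from backtracking may help: pick a min-conflict value consistent with all higher-priority neighbours, and raise the agent's priority at a dead end instead of backtracking exhaustively. This is run on a toy graph-colouring CSP; it is not the asynchronous, message-passing algorithm of the paper, and the Agent/solve names and synchronous control loop are illustrative assumptions.

```python
# Minimal sketch of the weak-commitment value-selection rule on a toy
# graph-colouring CSP (illustrative only; not the paper's asynchronous code).

class Agent:
    def __init__(self, name, domain):
        self.name, self.domain = name, domain
        self.priority = 0              # raised whenever the agent hits a dead end
        self.value = domain[0]

def consistent(v1, v2):
    return v1 != v2                    # graph-colouring "not equal" constraints

def solve(agents, neighbours, max_steps=1000):
    for _ in range(max_steps):
        # an agent whose value conflicts with some neighbour of >= priority
        culprit = next((a for a in agents
                        if any(n.priority >= a.priority and not consistent(a.value, n.value)
                               for n in neighbours[a.name])), None)
        if culprit is None:
            return {a.name: a.value for a in agents}     # all constraints satisfied
        higher = [n for n in neighbours[culprit.name] if n.priority >= culprit.priority]
        ok = [v for v in culprit.domain
              if all(consistent(v, n.value) for n in higher)]
        if ok:
            culprit.value = ok[0]                        # consistent min-conflict choice
        else:
            # dead end: raise priority and keep a least-conflicting value
            culprit.priority = max(a.priority for a in agents) + 1
            culprit.value = min(culprit.domain,
                                key=lambda v: sum(not consistent(v, n.value)
                                                  for n in neighbours[culprit.name]))
    return None

# 3-colouring of a triangle: x1-x2, x2-x3, x1-x3
a1, a2, a3 = (Agent(f"x{i}", ["r", "g", "b"]) for i in (1, 2, 3))
neighbours = {"x1": [a2, a3], "x2": [a1, a3], "x3": [a1, a2]}
print(solve([a1, a2, a3], neighbours))    # e.g. {'x1': 'g', 'x2': 'b', 'x3': 'r'}
```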

736 citations


Journal Article•DOI•
TL;DR: The authors explore a new data mining capability that involves mining path traversal patterns in a distributed information-providing environment where documents or objects are linked together to facilitate interactive access and show that the option of selective scan is very advantageous and can lead to prominent performance improvement.
Abstract: The authors explore a new data mining capability that involves mining path traversal patterns in a distributed information-providing environment where documents or objects are linked together to facilitate interactive access. The solution procedure consists of two steps. First, they derive an algorithm to convert the original sequence of log data into a set of maximal forward references. By doing so, one can filter out the effect of some backward references, which are mainly made for ease of traveling and concentrate on mining meaningful user access sequences. Second, they derive algorithms to determine the frequent traversal patterns-i.e., large reference sequences-from the maximal forward references obtained. Two algorithms are devised for determining large reference sequences; one is based on some hashing and pruning techniques, and the other is further improved with the option of determining large reference sequences in batch so as to reduce the number of database scans required. Performance of these two methods is comparatively analyzed. It is shown that the option of selective scan is very advantageous and can lead to prominent performance improvement. Sensitivity analysis on various parameters is conducted.
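
The first step described above, converting a traversal log into maximal forward references, admits a compact sketch. The function below filters out backward references and emits a forward path each time the traversal turns back; names and the example log are mine, not the paper's.

```python
# Illustrative sketch: convert one user traversal into maximal forward
# references by filtering backward references.

def maximal_forward_references(traversal):
    path, result, moving_forward = [], [], True
    for page in traversal:
        if page in path:
            if moving_forward:                 # emit only at forward -> backward turns
                result.append(list(path))
            path = path[:path.index(page) + 1] # backtrack to the revisited page
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        result.append(path)
    return result

# Log A -> B -> C -> back to B -> D yields two maximal forward references:
print(maximal_forward_references(["A", "B", "C", "B", "D"]))
# [['A', 'B', 'C'], ['A', 'B', 'D']]
```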

565 citations


Journal Article•DOI•
TL;DR: The distributed mediator architecture of Disco, the data model and its modeling of data source connections, the interface to underlying data sources and the query rewriting process, and the query processing semantics are described.
Abstract: Accessing many data sources aggravates problems for users of heterogeneous distributed databases. Database administrators must deal with fragile mediators, that is, mediators with schemas and views that must be significantly changed to incorporate a new data source. When implementing translators of queries from mediators to data sources, database implementers must deal with data sources that do not support all the functionality required by mediators. Application programmers must deal with graceless failures for unavailable data sources. Queries simply return failure and no further information when data sources are unavailable for query processing. The Distributed Information Search COmponent (Disco) addresses these problems. Data modeling techniques manage the connections to data sources, and sources can be added transparently to the users and applications. The interface between mediators and data sources flexibly handles different query languages and different data source functionality. Query rewriting and optimization techniques rewrite queries so they are efficiently evaluated by sources. Query processing and evaluation semantics are developed to process queries over unavailable data sources. In this article, we describe: 1) the distributed mediator architecture of Disco; 2) the data model and its modeling of data source connections; 3) the interface to underlying data sources and the query rewriting process; and 4) query processing semantics. We describe several advantages of our system.

243 citations


Journal Article•DOI•
TL;DR: This paper proposes a hierarchical encoded path view (HEPV) model, and presents complete solutions for all phases of the HEPV approach, including graph partitioning, hierarchy generation, path view encoding and updating, and path retrieval.
Abstract: Efficient path computation is essential for applications such as intelligent transportation systems (ITS) and network routing. In ITS navigation systems, many path requests can be submitted over the same, typically huge, transportation network within a small time window. While path precomputation (path view) would provide an efficient path query response, it raises three problems which must be addressed: 1) precomputed paths exceed the current computer main memory capacity for large networks; 2) disk-based solutions are too inefficient to meet the stringent requirements of these target applications; and 3) path views become too costly to update for large graphs (resulting in out-of-date query results). We propose a hierarchical encoded path view (HEPV) model that addresses all three problems. By hierarchically encoding partial paths, HEPV reduces the view encoding time, updating time and storage requirements beyond previously known path precomputation techniques, while significantly minimizing path retrieval time. We prove that paths retrieved over HEPV are optimal. We present complete solutions for all phases of the HEPV approach, including graph partitioning, hierarchy generation, path view encoding and updating, and path retrieval. In this paper, we also present an in-depth experimental evaluation of HEPV based on both synthetic and real GIS networks. Our results confirm that HEPV offers advantages over alternative path finding approaches in terms of performance and space efficiency.
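
As a point of reference for the precomputation/retrieval trade-off discussed above, here is a sketch of a flat (non-hierarchical) path view: an all-pairs next-hop table built with Floyd-Warshall, from which optimal paths are retrieved by lookups alone. HEPV itself partitions the graph and layers such encoded views hierarchically; this sketch, with invented names, only shows the underlying idea.

```python
# Flat path-view sketch: precompute next-hop and distance tables, then answer
# path queries with table lookups only (no search at query time).

INF = float("inf")

def build_path_view(nodes, edges):
    """edges: dict {(u, v): cost}.  Returns (dist, next_hop) tables."""
    dist = {(u, v): 0 if u == v else edges.get((u, v), INF) for u in nodes for v in nodes}
    nxt = {(u, v): v for (u, v) in edges}
    for k in nodes:                              # Floyd-Warshall
        for u in nodes:
            for v in nodes:
                if dist[u, k] + dist[k, v] < dist[u, v]:
                    dist[u, v] = dist[u, k] + dist[k, v]
                    nxt[u, v] = nxt[u, k]
    return dist, nxt

def retrieve_path(nxt, source, target):
    """Reconstruct an optimal path from the encoded view."""
    if source != target and (source, target) not in nxt:
        return None                              # unreachable
    path = [source]
    while path[-1] != target:
        path.append(nxt[path[-1], target])
    return path

nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 5, ("c", "d"): 2}
dist, nxt = build_path_view(nodes, edges)
print(retrieve_path(nxt, "a", "d"), dist["a", "d"])   # ['a', 'b', 'c', 'd'] 4
```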

218 citations


Journal Article•DOI•
TL;DR: The ranking and retrieval algorithms developed in MARS based on the Boolean retrieval model are discussed and the results of the experiments that demonstrate the effectiveness of the developed model for image retrieval are described.
Abstract: To address the emerging needs of applications that require access to and retrieval of multimedia objects, we are developing the Multimedia Analysis and Retrieval System (MARS). In this paper, we concentrate on the retrieval subsystem of MARS and its support for content-based queries over image databases. Content-based retrieval techniques have been extensively studied for textual documents in the area of automatic information retrieval. This paper describes how these techniques can be adapted for ranked retrieval over image databases. Specifically, we discuss the ranking and retrieval algorithms developed in MARS based on the Boolean retrieval model and describe the results of our experiments that demonstrate the effectiveness of the developed model for image retrieval.
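
A hedged sketch of ranked Boolean retrieval over per-feature similarity scores follows, reading AND as min and OR as max (one common fuzzy-Boolean interpretation). The exact ranking functions in MARS may weight and normalize terms differently; the feature names and scores below are made up.

```python
# Ranked retrieval for a Boolean query over image-feature similarities,
# with AND = min and OR = max (illustrative interpretation, not MARS's exact model).

scores = {                      # per-image similarity of each atomic predicate
    "img1": {"color~red": 0.9, "texture~rough": 0.4},
    "img2": {"color~red": 0.5, "texture~rough": 0.8},
    "img3": {"color~red": 0.2, "texture~rough": 0.3},
}

def evaluate(query, feats):
    """query: atom string, or ('and'|'or', left, right) nested tuples."""
    if isinstance(query, str):
        return feats.get(query, 0.0)
    op, left, right = query
    l, r = evaluate(left, feats), evaluate(right, feats)
    return min(l, r) if op == "and" else max(l, r)

query = ("and", "color~red", "texture~rough")
ranked = sorted(scores, key=lambda img: evaluate(query, scores[img]), reverse=True)
print([(img, round(evaluate(query, scores[img]), 2)) for img in ranked])
# [('img2', 0.5), ('img1', 0.4), ('img3', 0.2)]
```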

204 citations


Journal Article•DOI•
TL;DR: A natural similarity function for shape matching is used, based on concepts from mathematical morphology, and it is shown how it can be lower-bounded by a set of shape features for safely pruning candidates, thus giving fast and correct output.
Abstract: Investigates the problem of retrieving similar shapes from a large database; in particular, we focus on medical tumor shapes (finding tumors that are similar to a given pattern). We use a natural similarity function for shape matching, based on concepts from mathematical morphology, and we show how it can be lower-bounded by a set of shape features for safely pruning candidates, thus giving fast and correct output. These features can be organized in a spatial access method, leading to fast indexing for range queries and nearest-neighbor queries. In addition to the lower-bounding, our second contribution is the design of a fast algorithm for nearest-neighbor searching, achieving significant speedup while provably guaranteeing correctness. Our experiments demonstrate that roughly 90% of the candidates can be pruned using these techniques, resulting in up to 27 times better performance compared to sequential scanning.
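
The pruning argument above follows the standard filter-and-refine pattern: a cheap feature distance that provably lower-bounds the exact distance can discard candidates without false dismissals. The sketch below substitutes a toy distance and a single aggregate feature for the paper's morphological distance and shape features.

```python
# Filter-and-refine with a lower-bounding feature distance (toy example).
# Since |sum(a) - sum(b)| <= sum|a_i - b_i|, the filter never causes false dismissals.

def d_exact(a, b):
    """Stand-in for the expensive shape distance."""
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b)))

def feature(a):
    return sum(a)                       # one aggregate shape feature

def d_feature(fa, fb):
    return abs(fa - fb)                 # cheap lower bound of d_exact

def range_query(db, query, eps):
    fq = feature(query)
    pruned, answers = 0, []
    for shape in db:
        if d_feature(feature(shape), fq) > eps:   # cheap filter
            pruned += 1
            continue
        if d_exact(shape, query) <= eps:          # refine survivors only
            answers.append(shape)
    return answers, pruned

db = [[1, 2, 3], [10, 11, 12], [2, 2, 4], [50, 60, 70]]
print(range_query(db, [1, 2, 4], eps=2))
# ([[1, 2, 3], [2, 2, 4]], 2)  -> two candidates pruned by the filter alone
```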

204 citations


Journal Article•DOI•
TL;DR: The properties that any arbitration operator should satisfy are investigated in the style of Alchourron, Gardenfors, and Makinson, and actual operators for arbitration are proposed.
Abstract: Knowledge-based systems must be able to "intelligently" manage a large amount of information coming from different sources and at different moments in time. Intelligent systems must be able to cope with a changing world by adopting a "principled" strategy. Many formalisms have been put forward in the artificial intelligence (AI) and database (DB) literature to address this problem. Among them, belief revision is one of the most successful frameworks to deal with dynamically changing worlds. Formal properties of belief revision have been investigated by Alchourron, Gardenfors, and Makinson, who put forward a set of postulates stating the properties that a belief revision operator should satisfy. Among these properties, a basic assumption of revision is that the new piece of information is totally reliable and, therefore, must be in the revised knowledge base. Different principles must be applied when there are two different sources of information and each one has a different view of the situation-the two views contradicting each other. If we do not have any reason to consider any of the sources completely unreliable, the best we can do is to "merge" the two views in a new and consistent one, trying to preserve as much information as possible. We call this merging process arbitration. In this paper, we investigate the properties that any arbitration operator should satisfy. In the style of Alchourron, Gardenfors, and Makinson we propose a set of postulates, analyze their properties, and propose actual operators for arbitration.
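
To make the "merge while preserving as much information as possible" intuition concrete, here is an illustrative model-based merge of two propositional belief bases that keeps the interpretations minimizing the larger of the two Hamming distances to the sources. This is only one simple way to realize that intuition; it is not the arbitration operators defined in the paper.

```python
# Illustrative model-based merge of two contradictory propositional bases
# (NOT the paper's operators): keep worlds minimizing the worse Hamming distance.

from itertools import product

ATOMS = ["a", "b", "c"]

def models(formula):
    """formula: callable over a dict atom -> bool."""
    return [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))
            if formula(dict(zip(ATOMS, vals)))]

def hamming(w, v):
    return sum(w[x] != v[x] for x in ATOMS)

def dist_to_base(w, base_models):
    return min(hamming(w, v) for v in base_models)

def arbitrate(k1, k2):
    m1, m2 = models(k1), models(k2)
    worlds = [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))]
    score = lambda w: max(dist_to_base(w, m1), dist_to_base(w, m2))
    best = min(score(w) for w in worlds)
    return [w for w in worlds if score(w) == best]

# Source 1 believes (a AND b); source 2 believes (NOT a AND c): the merge keeps
# the worlds that are "fairest" to both contradictory views.
k1 = lambda w: w["a"] and w["b"]
k2 = lambda w: (not w["a"]) and w["c"]
for w in arbitrate(k1, k2):
    print(w)
```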

194 citations


Journal Article•DOI•
TL;DR: A knowledge-based approach to retrieve medical images by feature and content with spatial and temporal constructs is developed and the KMeD (Knowledge-based Medical Database) system is implemented using these concepts.
Abstract: A knowledge-based approach to retrieve medical images by feature and content with spatial and temporal constructs is developed. Selected objects of interest in an image are segmented and contours are generated. Features and content are extracted and stored in a database. Knowledge about image features can be expressed as a type abstraction hierarchy (TAH), the high-level nodes of which represent the most general concepts. Traversing TAH nodes allows approximate matching by feature and content if an exact match is not available. TAHs can be generated automatically by clustering algorithms based on feature values in the databases and hence are scalable to large collections of image features. Since TAHs are generated based on user classes and applications, they are context- and user-sensitive. A knowledge-based semantic image model is proposed to represent the various aspects of an image object's characteristics. The model provides a mechanism for accessing and processing spatial, evolutionary and temporal queries. A knowledge-based spatial temporal query language (KSTL) has been developed that extends ODMG's OQL and supports approximate matching of features and content, conceptual terms and temporal logic predicates. Further, a visual query language has been developed that accepts point-click-and-drag visual iconic input on the screen that is then translated into KSTL. User models are introduced to provide default parameter values for specifying query conditions. We have implemented the KMeD (Knowledge-based Medical Database) system using these concepts.
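
The approximate-matching step over a type abstraction hierarchy (TAH) can be sketched as follows: if an exact match on a feature value fails, climb one level up the hierarchy and accept any value covered by the more general node. The toy hierarchy and image records below are illustrative, not KMeD's.

```python
# Query relaxation over a toy type abstraction hierarchy (TAH).

TAH = {                                   # child -> parent
    "micro_lesion": "small_lesion",
    "small_lesion": "lesion",
    "medium_lesion": "lesion",
    "large_lesion": "lesion",
}

def covered_values(node):
    """All leaf values subsumed by a TAH node."""
    leaves = {c for c in TAH if c not in TAH.values()}
    def ancestors(x):
        while x in TAH:
            x = TAH[x]
            yield x
    return {leaf for leaf in leaves if leaf == node or node in ancestors(leaf)}

def approximate_match(images, feature, wanted):
    node = wanted
    while True:
        hits = [img for img in images if img[feature] in covered_values(node)]
        if hits or node not in TAH:
            return node, hits             # report the generality level that was used
        node = TAH[node]                  # relax: climb one level up the TAH

images = [{"id": 1, "size_class": "medium_lesion"}, {"id": 2, "size_class": "large_lesion"}]
print(approximate_match(images, "size_class", "micro_lesion"))
# no micro or small lesions exist, so the query relaxes up to 'lesion'
```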

180 citations


Journal Article•DOI•
TL;DR: In this paper, the authors introduce event structures that have temporal constraints with multiple granularities, define the pattern discovery problem with these structures, and study effective algorithms to solve it.
Abstract: An important usage of time sequences is to discover temporal patterns. The discovery process usually starts with a user-specified skeleton, called an event structure, which consists of a number of variables representing events and temporal constraints among these variables; the goal of the discovery is to find temporal patterns, i.e., instantiations of the variables in the structure that appear frequently in the time sequence. The paper introduces event structures that have temporal constraints with multiple granularities, defines the pattern discovery problem with these structures, and studies effective algorithms to solve it. The basic components of the algorithms include timed automata with granularities (TAGs) and a number of heuristics. The TAGs are for testing whether a specific temporal pattern, called a candidate complex event type, appears frequently in a time sequence. Since there are often a huge number of candidate event types for a usual event structure, heuristics are presented aiming at reducing the number of candidate event types and reducing the time spent by the TAGs testing whether a candidate type does appear frequently in the sequence. These heuristics exploit the information provided by explicit and implicit temporal constraints with granularity in the given event structure. The paper also gives the results of an experiment to show the effectiveness of the heuristics on a real data set.
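
The frequency test for a single candidate pattern can be sketched very simply; the example below checks "an A is followed by a B within 2 calendar days" over a timestamped sequence. The paper's timed automata with granularities handle general event structures and multiple granularities; this only shows the flavor, with invented data.

```python
# Frequency test for one candidate pattern at day granularity (illustrative).

from datetime import datetime

events = [                                   # (type, timestamp) time sequence
    ("A", datetime(1998, 3, 1, 9)),  ("B", datetime(1998, 3, 2, 18)),
    ("A", datetime(1998, 3, 5, 8)),  ("C", datetime(1998, 3, 5, 12)),
    ("A", datetime(1998, 3, 9, 7)),  ("B", datetime(1998, 3, 10, 7)),
]

def count_occurrences(events, first, second, max_days):
    """Count 'first ... second' pairs whose day-granularity distance <= max_days."""
    count = 0
    for i, (t1, ts1) in enumerate(events):
        if t1 != first:
            continue
        for t2, ts2 in events[i + 1:]:
            if t2 == second and (ts2.date() - ts1.date()).days <= max_days:
                count += 1
                break                        # match each A with its earliest B
    return count

occurrences = count_occurrences(events, "A", "B", max_days=2)
print(occurrences, "occurrence(s); frequent?", occurrences >= 2)
```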

161 citations


Journal Article•DOI•
TL;DR: The authors examine the issues involved in designing efficient access methods for bitemporal databases and propose the partial-persistence and double-tree methodologies; experiments show that the partial-persistence methodology provides the best overall performance, especially for transaction timeslice queries.
Abstract: By supporting the valid and transaction time dimensions, bitemporal databases represent reality more accurately than conventional databases. The authors examine the issues involved in designing efficient access methods for bitemporal databases, and propose the partial-persistence and the double-tree methodologies. The partial-persistence methodology reduces bitemporal queries to partial persistence problems for which an efficient access method is then designed. The double-tree methodology "sees" each bitemporal data object as consisting of two intervals (a valid-time and a transaction-time interval) and divides objects into two categories according to whether the right endpoint of the transaction time interval is already known. A common characteristic of both methodologies is that they take into account the properties of each time dimension. Their performance is compared with a straightforward approach that "sees" the intervals associated with a bitemporal object as composing one rectangle, which is stored in a single multidimensional access method. Given that some limited additional space is available, the experimental results show that the partial-persistence methodology provides the best overall performance, especially for transaction timeslice queries. For those applications that require ready, off-the-shelf, access methods, the double-tree methodology is a good alternative.
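
The double-tree classification step can be illustrated with a few lines: bitemporal objects whose transaction-time interval is still open (right endpoint unknown, "until changed") go to one structure, closed ones to another, and each set is queried with the appropriate interval test. The real methodologies index these sets; here a linear scan and made-up records stand in.

```python
# Sketch of the double-tree split of bitemporal records (illustrative only).

NOW = float("inf")

# (key, valid_start, valid_end, tx_start, tx_end)
records = [
    ("r1", 1, 10, 2, 6),        # both intervals closed
    ("r2", 3, 8,  4, NOW),      # transaction interval still open
    ("r3", 5, 12, 7, NOW),
]

closed_tree = [r for r in records if r[4] != NOW]   # transaction end known
open_tree   = [r for r in records if r[4] == NOW]   # transaction end unknown

def transaction_timeslice(tx_time):
    """All records current in the database at transaction time tx_time."""
    hits  = [r for r in closed_tree if r[3] <= tx_time < r[4]]
    hits += [r for r in open_tree   if r[3] <= tx_time]   # open: only a start test
    return hits

print(transaction_timeslice(5))   # r1 (2 <= 5 < 6) and r2 (started at 4, still open)
```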

143 citations


Journal Article•DOI•
TL;DR: The design method is illustrated by an example involving ontologies of pure substances at several levels of detail, and is intended to enrich the ontology developer's toolkit.
Abstract: Presents a particular way of building ontologies that proceeds in a bottom-up fashion. Concepts are defined in a way that mirrors the way their instances are composed out of smaller objects. The smaller objects themselves may also be modeled as being composed. Bottom-up ontologies are flexible through the use of implicit and, hence, parsimonious part-whole and subconcept-superconcept relations. The bottom-up method complements current practice, where, as a rule, ontologies are built top-down. The design method is illustrated by an example involving ontologies of pure substances at several levels of detail. It is not claimed that bottom-up construction is a generally valid recipe; indeed, such recipes are deemed uninformative or impossible. Rather, the approach is intended to enrich the ontology developer's toolkit.

Journal Article•DOI•
TL;DR: This paper proposes techniques that prune update propagation by exploiting knowledge of the subsumption relationships between classes to identify branches of classes to which updates need not be propagated, and by using derivation ordering to eliminate self-cancelling propagation.
Abstract: View materialization is a promising technique for achieving the data sharing and virtual restructuring capabilities needed by advanced applications such as data warehousing and workflow management systems. Much existing work addresses the problem of how to maintain the consistency of materialized relational views under update operations. However, little progress has been made thus far regarding the topic of view materialization in object-oriented databases (OODBs). In this paper, we demonstrate that there are several significant differences between the relational and object-oriented paradigms that can be exploited when addressing the object-oriented view materialization problem. First, we propose techniques that prune update propagation by exploiting knowledge of the subsumption relationships between classes to identify branches of classes to which we do not need to propagate updates and by using derivation ordering to eliminate self-cancelling propagation. Second, we use encapsulated interfaces, combined with the fact that any unique database property is inherited from a single location, to provide a "registration service" by which virtual classes can register their interest in specific properties and be notified upon modification of those properties. Third, we introduce the notion of hierarchical registrations that further optimizes update propagation by organizing the registration structures according to the class generalization hierarchy, thereby pruning the set of classes that are notified of updates. We have successfully implemented all proposed techniques in the MultiView system on top of the GemStone OODBMS. To the best of our knowledge, MultiView is the first OODB view system to provide updatable materialized virtual classes and virtual schemata. In this paper, we also present a cost model for our update algorithms, and we report results from the experimental studies we have run on the MultiView system, measuring the impact of various optimization strategies incorporated into our materialization update algorithms.
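
A minimal registration-service sketch may clarify the second technique above: materialized virtual classes register interest in specific base-class properties and are notified only when one of those properties changes, so unrelated updates are never propagated. Class names are illustrative assumptions, not MultiView's API.

```python
# Registration service for materialized views, observer-style (illustrative).

from collections import defaultdict

class BaseClass:
    def __init__(self):
        self._registry = defaultdict(list)     # property name -> interested views
        self._values = {}

    def register(self, prop, view):
        self._registry[prop].append(view)

    def update(self, prop, value):
        self._values[prop] = value
        for view in self._registry[prop]:      # propagate only to registered views
            view.refresh(prop, value)

class MaterializedView:
    def __init__(self, name):
        self.name, self.cache = name, {}

    def refresh(self, prop, value):
        self.cache[prop] = value               # incremental maintenance step
        print(f"{self.name}: recomputed after change of '{prop}'")

person = BaseClass()
adults = MaterializedView("AdultsView")        # a derived class selecting on age
person.register("age", adults)                 # interested in 'age' only

person.update("age", 42)         # AdultsView is refreshed
person.update("address", "x")    # no registered view -> no propagation at all
```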

Journal Article•DOI•
TL;DR: An approach is described that combines static analysis of a rule set at compile time with detection of endless loops during rule processing at runtime, allowing the identification of rule sets that can be independently monitored and the optimal selection of cycle monitors.
Abstract: Active rules may interact in complex and sometimes unpredictable ways, thus possibly yielding infinite rule executions by triggering each other indefinitely. This paper presents analysis techniques focused on detecting termination of rule execution. We describe an approach which combines static analysis of a rule set at compile-time and detection of endless loops during rule processing at runtime. The compile-time analysis technique is based on the distinction between mutual triggering and mutual activation of rules. This distinction motivates the introduction of two graphs defining rule interaction, called Triggering and Activation Graphs, respectively. This analysis technique allows us to identify reactive behaviors which are guaranteed to terminate and reactive behaviors which may lead to infinite rule processing. When termination cannot be guaranteed at compile-time, it is crucial to detect infinite rule executions at runtime. We propose a technique for identifying loops which is based on recognizing that a given situation has already occurred in the past and, therefore, will occur an infinite number of times in the future. This technique is potentially very expensive, therefore, we explain how it can be implemented in practice with limited computational effort. A particular use of this technique allows us to develop cycle monitors, which check that critical rule sequences, detected at compile time, do not repeat forever We bridge compile-time analysis to runtime monitoring by showing techniques, based on the result of rule analysis, for the identification of rule sets that can be independently monitored and for the optimal selection of cycle monitors.
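
The compile-time side can be sketched directly: build a triggering graph (an edge from r1 to r2 if an action of r1 may generate an event that triggers r2) and report the rules lying on cycles, which are the only candidates for non-termination and thus for runtime cycle monitors. The rule definitions below are toy examples, not the paper's notation.

```python
# Triggering graph construction and cycle detection (illustrative rules).

def triggering_graph(rules):
    """rules: dict name -> {'events': set, 'generates': set}."""
    return {r1: {r2 for r2 in rules
                 if rules[r1]["generates"] & rules[r2]["events"]}
            for r1 in rules}

def rules_on_cycles(graph):
    """Return every rule reachable from itself (simple DFS reachability)."""
    def reachable(start):
        seen, stack = set(), list(graph[start])
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(graph[n])
        return seen
    return {r for r in graph if r in reachable(r)}

rules = {
    "r1": {"events": {"upd_salary"}, "generates": {"upd_bonus"}},
    "r2": {"events": {"upd_bonus"},  "generates": {"upd_salary"}},   # r1 <-> r2
    "r3": {"events": {"upd_salary"}, "generates": set()},            # terminates
}
tg = triggering_graph(rules)
print(tg)
print(rules_on_cycles(tg))    # {'r1', 'r2'}: candidates for runtime cycle monitors
```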

Journal Article•DOI•
TL;DR: A unified data model that represents multimedia, timeline, and simulation data utilizing a single set of related data modeling constructs is described, giving multimedia schemas and queries a degree of data independence even for these complex data types.
Abstract: This paper describes a unified data model that represents multimedia, timeline, and simulation data utilizing a single set of related data modeling constructs. A uniform model for multimedia types structures image, sound, video, and long text data in a consistent way, giving multimedia schemas and queries a degree of data independence even for these complex data types. Information that possesses an intrinsic temporal element can all be represented using a construct called a stream. Streams can be aggregated into parallel multistreams, thus providing a structure for viewing multiple sets of time-based information. The unified stream construct permits real-time measurements, numerical simulation data, and visualizations of that data to be aggregated and manipulated using the same set of operators. Prototypes based on the model have been implemented for two medical application domains: thoracic oncology and thermal ablation therapy of brain tumors. Sample schemas, queries, and screenshots from these domains are provided. Finally, a set of examples is included for an accompanying visual query language discussed in detail in another document.
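
A small sketch of the stream construct: time-stamped elements with a uniform set of operators, so measurements, simulation output, and annotations can be manipulated alike and aligned into a multistream. The class and function names are illustrative assumptions, not the paper's model.

```python
# Stream and multistream sketch (illustrative names and data).

class Stream:
    def __init__(self, samples):
        # samples: list of (timestamp, value), kept sorted by timestamp
        self.samples = sorted(samples)

    def window(self, t0, t1):
        """Sub-stream restricted to the time interval [t0, t1]."""
        return Stream([(t, v) for t, v in self.samples if t0 <= t <= t1])

    def map(self, f):
        return Stream([(t, f(v)) for t, v in self.samples])

def multistream(**streams):
    """Align several streams on the union of their timestamps (None where absent)."""
    times = sorted({t for s in streams.values() for t, _ in s.samples})
    lookup = {name: dict(s.samples) for name, s in streams.items()}
    return [(t, {name: lookup[name].get(t) for name in streams}) for t in times]

temperature = Stream([(0, 36.5), (10, 37.1), (20, 38.0)])
dose        = Stream([(10, 5.0), (20, 5.0)])
print(multistream(temperature=temperature, dose=dose))
# [(0, {'temperature': 36.5, 'dose': None}),
#  (10, {'temperature': 37.1, 'dose': 5.0}), (20, {'temperature': 38.0, 'dose': 5.0})]
```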

Journal Article•DOI•
TL;DR: GDBR and FIGR, two O(n) and hence optimal enhancements of Attribute-Oriented Generalization, a well-known knowledge discovery from databases technique, are presented and compared to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively.
Abstract: We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute Oriented Generalization, a well known knowledge discovery from databases technique. GDBR and FIGR are both O(n) and, as such, are optimal. GDBR is an online algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space that is generally reasonable, although under certain circumstances, may grow large. FIGR is incremental, allowing changes to the database to be reflected in the generalization results without rereading input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading input. We compare GDBR and FIGR to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p the number of tuples in the generalized relation. Both require O(n) space that, for large input, causes memory problems. We implemented all four algorithms and ran empirical tests, and we found that GDBR and FIGR are faster. In addition, their runtimes increase only linearly as input size increases, while the runtimes of LCHR and AOI increase greatly when input size exceeds memory limitations.
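
A single O(n) generalization pass in the spirit of the algorithms above can be sketched as follows: each tuple is read once, every attribute value is replaced by its ancestor at the chosen level of a concept hierarchy, and identical generalized tuples are merged with a count. The hierarchy and data are toy examples, not the paper's.

```python
# One-pass attribute-oriented generalization sketch (illustrative data).

from collections import Counter

hierarchy = {                      # value -> more general concept
    "Calgary": "Alberta", "Edmonton": "Alberta",
    "Vancouver": "B.C.", "Alberta": "Canada", "B.C.": "Canada",
}

def generalize_value(value, levels):
    for _ in range(levels):
        value = hierarchy.get(value, value)   # stop at the top of the hierarchy
    return value

def generalize_relation(tuples, levels):
    counts = Counter(tuple(generalize_value(v, levels) for v in t) for t in tuples)
    return [(*gt, n) for gt, n in counts.items()]   # generalized tuple + count

data = [("Calgary", "grad"), ("Edmonton", "grad"), ("Vancouver", "undergrad")]
print(generalize_relation(data, levels=1))
# [('Alberta', 'grad', 2), ('B.C.', 'undergrad', 1)]
```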

Journal Article•DOI•
TL;DR: The results of such an evaluation show that the quality of automatic composition is comparable to-and in some cases, better than-broadcast news video composition.
Abstract: Video production involves the process of capturing, editing and composing video segments for delivery to a consumer. A composition must yield a coherent presentation of an event or narrative. This process can be automated if appropriate domain-specific metadata are associated with video segments and composition techniques are established. Automation leads to the support of dynamic composition and customization for applications such as news on demand. In this paper, we present techniques to achieve dynamic, real-time and cohesive video composition and customization. We also identify metrics for evaluating our techniques with respect to existing manually produced video-based news. The results of such an evaluation show that the quality of automatic composition is comparable to-and in some cases, better than-broadcast news video composition. The results also validate the assertions on which the automatic composition techniques are based.

Journal Article•DOI•
TL;DR: In this article, the authors investigate the characteristics that a model must possess to properly express the timing relationships among multimedia data, and provide a classification for the various models proposed in the literature.
Abstract: Multimedia information systems are considerably more complex than traditional ones in that they deal with very heterogeneous data such as text, video, and audio-characterized by different characteristics and requirements. One of the central characteristics of multimedia data is that of being heavily time-dependent, in that they are usually related by temporal relationships that must be maintained during playout. We discuss problems related to modeling temporal synchronization specifications for multimedia data. We investigate the characteristics that a model must possess to properly express the timing relationships among multimedia data, and we provide a classification for the various models proposed in the literature. For each devised category, several examples are presented, whereas the most representative models of each category are illustrated in detail. Then, the presented models are compared with respect to the devised requirements, and future research issues are discussed.

Journal Article•DOI•
TL;DR: A multigraph structure for representing and facilitating the processing of multiple queries is developed, and a performance study shows its viability when compared to an earlier multigraph approach.
Abstract: The efficiency of common subexpression identification is critical to the performance of multiple-query processing. In this paper, we develop a multigraph for representing and facilitating the processing of multiple queries. In addition to the traditional multiple-query processing approaches in exploiting common subexpressions for identical and subsumption cases, the proposed multigraph processing also covers the overlap case. A performance study shows the viability of this technique when compared to an earlier multigraph approach.

Journal Article•DOI•
TL;DR: A set of topological invariants for relations between lines embedded in the 2-dimensional Euclidean space is given and is proven to be necessary and sufficient to characterize topological equivalence classes of binary relations between simple lines.
Abstract: A set of topological invariants for relations between lines embedded in the 2-dimensional Euclidean space is given. The set of invariants is proven to be necessary and sufficient to characterize topological equivalence classes of binary relations between simple lines. The topology of arbitrarily complex geometric scenes is described with a variation of the same set of invariants. Polynomial time algorithms are given to assess topological equivalence of two scenes. The interest in such invariants and efficient algorithms stems from application areas such as spatial database systems, where a model for describing topological relations between planar features is sought.

Journal Article•DOI•
TL;DR: ADOME provides versatile role facilities that serve as "dynamic binders" between data objects and production rules, thereby facilitating flexible data and knowledge management integration.
Abstract: ADOME, which stands for ADvanced Object Modeling Environment, is an approach to integrating data and knowledge management based on object oriented technology. Next generation information systems will require more flexible data modeling capabilities than those provided by current object oriented DBMSs. In particular, integration of data and knowledge management capabilities will become increasingly important. In this context, ADOME provides versatile role facilities that serve as "dynamic binders" between data objects and production rules, thereby facilitating flexible data and knowledge management integration. A prototype that implements this mechanism and the associated operators has been constructed on top of a commercial object oriented DBMS and a rule base system.

Journal Article•DOI•
TL;DR: The authors derive formulas describing the scalability of two popular declustering methods, Disk Modulo and Fieldwise XOR, for range queries, which are the most common type of queries.
Abstract: Efficient storage and retrieval of multi-attribute data sets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multi-attribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files across multiple disks to obtain high performance for disk accesses. Although the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. The authors derive formulas describing the scalability of two popular declustering methods, Disk Modulo and Fieldwise XOR, for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods, and this is corroborated by extensive simulation experiments. From the practical point of view, the formulas given in the paper provide a simple measure that can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions.
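
The two declustering functions themselves are simple enough to state in code: Disk Modulo sums the bucket indices and Fieldwise XOR xors them, each taken modulo the number of disks M; a range query then touches a sub-grid of buckets and its response time is driven by the most heavily loaded disk. The query, M = 3, and the enumeration helper below are illustrative.

```python
# Disk Modulo and Fieldwise XOR declustering of a Cartesian product file,
# plus a per-disk load count for one range query (illustrative parameters).

def disk_modulo(indices, m):
    return sum(indices) % m

def fieldwise_xor(indices, m):
    acc = 0
    for i in indices:
        acc ^= i
    return acc % m

def load_per_disk(method, ranges, m):
    """Count how the buckets of a range query spread over the m disks."""
    load = [0] * m
    def rec(prefix, rest):
        if not rest:
            load[method(prefix, m)] += 1
            return
        for i in range(rest[0][0], rest[0][1] + 1):
            rec(prefix + [i], rest[1:])
    rec([], ranges)
    return load

query = [(0, 3), (2, 5)]            # a 4 x 4 range over two attributes
print("DM :", load_per_disk(disk_modulo,   query, m=3))   # [5, 5, 6]
print("FX :", load_per_disk(fieldwise_xor, query, m=3))   # [6, 6, 4]
```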

Journal Article•DOI•
H. Jiang, A.K. Elmagarmid
TL;DR: The design and implementation of the WVTDB (Web-based VideoText DataBase) system are described, demonstrating research on video data modeling, semantic content-based video querying, and video database system architectures, and addressing adaptivity, data access control, and user profile issues.
Abstract: Describes the design and implementation of the WVTDB (Web-based VideoText DataBase) system that demonstrates our research on video data modeling, semantic content-based video querying and video database system architectures. The video data model of WVTDB is based on multi-level video data abstractions and annotation layering, thus allowing dynamic and incremental video annotation and indexing, multi-user view sharing and video data reuse. Users can query, retrieve and browse video data based on their semantic content descriptions and temporal constraints on the video segments. WVTDB employs a modular system architecture that supports distributed video query processing and subquery caching. Several techniques, such as video wrappers and lazy delivery, are also proposed specifically to address the network bandwidth limitations for this kind of Web-based system. We also address adaptivity, data access control and user profile issues.

Journal Article•DOI•
TL;DR: In this article, the authors propose two alternative formalisms to express a relevant set of state integrity constraints with a declarative style, and two specialized reasoners, based on the tableaux calculus, to check the consistency of complex object database schemata expressed with the two formalisms.
Abstract: Integrity constraints are rules that should guarantee the integrity of a database. Provided an adequate mechanism to express them is available, the following question arises: is there any way to populate a database which satisfies the constraints supplied by a database designer? That is, does the database schema, including constraints, admit at least a nonempty model? This work answers the above question in a complex object database environment, providing a theoretical framework, including the following ingredients: (1) two alternative formalisms, able to express a relevant set of state integrity constraints with a declarative style; (2) two specialized reasoners, based on the tableaux calculus, able to check the consistency of complex objects database schemata expressed with the two formalisms. The proposed formalisms share a common kernel, which supports complex objects and object identifiers, and which allow the expression of acyclic descriptions of: classes, nested relations and views, built up by means of the recursive use of record, quantified set, and object type constructors and by the intersection, union, and complement operators. Furthermore, the kernel formalism allows the declarative formulation of typing constraints and integrity rules. In order to improve the expressiveness and maintain the decidability of the reasoning activities, we extend the kernel formalism into two alternative directions. The first formalism, OLCP, introduces the capability of expressing path relations. Because cyclic schemas are extremely useful, we introduce a second formalism, OLCD, with the capability of expressing cyclic descriptions but disallowing the expression of path relations. In fact, we show that the reasoning activity in OLCDP (i.e., OLCP with cycles) is undecidable.

Journal Article•DOI•
TL;DR: The algorithm to reduce the dimensionality of quadratic form-based similarity queries results in a lower-bounding distance function that is proven to provide an optimal filter selectivity.
Abstract: Shape similarity searching is a crucial task in image databases, particularly in the presence of errors induced by segmentation or scanning images. The resulting slight displacements or rotations have not been considered so far in the literature. We present a new similarity model that flexibly addresses this problem. By specifying neighborhood influence weights, the user may adapt the similarity distance functions to his or her own requirements or preferences. Technically, the new similarity model is based on quadratic forms for which we present a multi-step query processing architecture, particularly for high dimensions as they occur in image databases. Our algorithm to reduce the dimensionality of quadratic form-based similarity queries results in a lower-bounding distance function that is proven to provide an optimal filter selectivity. Experiments on our test database of 10,000 images demonstrate the applicability and the performance of our approach, even in dimensions as high as 1,024.
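
The filter-and-refine pattern behind the multi-step architecture above can be sketched with a cruder bound than the paper's: since (x-y)^T A (x-y) >= lambda_min * ||x-y||^2 for a symmetric positive definite similarity matrix A, a scaled Euclidean distance is a valid lower bound of the quadratic form distance. The reduced-dimension bound derived in the paper is much tighter; matrix, data, and threshold below are synthetic.

```python
# Quadratic-form similarity with a simple eigenvalue-based lower-bound filter
# (synthetic data; cruder than the paper's reduced-dimension bound).

import numpy as np

rng = np.random.default_rng(0)

dim = 16
B = rng.normal(size=(dim, dim))
A = B @ B.T + np.eye(dim)            # symmetric positive definite similarity matrix
lam_min = np.linalg.eigvalsh(A)[0]   # smallest eigenvalue of A

def d_quadratic(x, y):
    d = x - y
    return float(np.sqrt(d @ A @ d))

def d_lower(x, y):
    return float(np.sqrt(lam_min) * np.linalg.norm(x - y))   # <= d_quadratic(x, y)

db = rng.normal(size=(1000, dim))
query, eps = rng.normal(size=dim), 4.0

candidates = [x for x in db if d_lower(x, query) <= eps]           # cheap filter
answers = [x for x in candidates if d_quadratic(x, query) <= eps]  # exact refinement
print(len(db), "->", len(candidates), "candidates ->", len(answers), "answers")
```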

Journal Article•DOI•
TL;DR: A modified form of the fuzzy reasoning algorithm of S. Chen et al. and a concept of hierarchical fuzzy Petri nets for data abstraction are proposed.
Abstract: In the paper by S. Chen et al. (see ibid., vol.2, no.3, p.311-19, 1990), the authors proposed an algorithm which determines whether there exists an antecedent-consequence relationship from a fuzzy proposition d_s to proposition d_j and, if the degree of truth of proposition d_s is given, evaluates the degree of truth of proposition d_j. The fuzzy reasoning algorithm proposed by S. Chen et al. (1990) was found not to work for all types of data. We propose: (1) a modified form of the algorithm; and (2) a concept of hierarchical fuzzy Petri nets for data abstraction.
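
For readers unfamiliar with the setting, here is a generic fuzzy-Petri-net reasoning step: a transition (fuzzy rule) fires once all of its input propositions have truth degrees, assigning the consequent min(inputs) times the rule's certainty factor. This illustrates the general framework only, not the specific modified algorithm proposed in this comment.

```python
# Generic fuzzy-Petri-net style truth propagation (illustrative rules and values).

rules = [
    # (antecedent propositions, consequent, certainty factor)
    ({"d1"},       "d3", 0.9),
    ({"d2", "d3"}, "d4", 0.8),
]

def reason(truth, rules):
    truth = dict(truth)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent, cf in rules:
            if antecedents <= truth.keys():
                degree = min(truth[p] for p in antecedents) * cf
                if degree > truth.get(consequent, 0.0):
                    truth[consequent] = degree
                    changed = True
    return truth

print(reason({"d1": 0.7, "d2": 0.95}, rules))
# d3 = 0.7 * 0.9 = 0.63,  d4 = min(0.95, 0.63) * 0.8 ≈ 0.504
```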

Journal Article•DOI•
TL;DR: This paper proposes a geometry-based structure for representing the spatial relationships in images and an associated spatial similarity algorithm that recognizes translation, scale, and rotation variants of an image, as well as variants generated by an arbitrary composition of translation, scale, and rotation transformations.
Abstract: A spatial similarity algorithm assesses the degree to which the spatial relationships among the domain objects in a database image conform to those specified in the query image. In this paper, we propose a geometry-based structure for representing the spatial relationships in the images and an associated spatial similarity algorithm. The proposed algorithm recognizes translation, scale, and rotation variants of an image, as well as variants of the image generated by an arbitrary composition of translation, scale, and rotation transformations. The algorithm has Θ(n log n) time complexity in terms of the number of objects common to the database and query images. The retrieval effectiveness of the proposed algorithm is evaluated using the TESSA image collection.

Journal Article•DOI•
TL;DR: A faceted requirement classification scheme for analyzing heterogeneous requirements provides a framework for formally analyzing and modeling conflicts between requirements, and for users to better understand relationships among their requirements.
Abstract: We propose a faceted requirement classification scheme for analyzing heterogeneous requirements. The representation of vague requirements is based on L.A. Zadeh's (1986) canonical form in test score semantics and an extension of the notion of soft conditions. The trade-off among vague requirements is analyzed by identifying the relationship between requirements, which could be either conflicting, irrelevant, cooperative, counterbalance, or independent. Parameterized aggregation operators, fuzzy and/or, are selected to combine individual requirements. An extended hierarchical aggregation structure is proposed to establish a four-level requirements hierarchy to facilitate requirements and criticalities aggregation through the fuzzy and/or. A compromise overall requirement can be obtained through the aggregation of individual requirements based on the requirements hierarchy. The proposed approach provides a framework for formally analyzing and modeling conflicts between requirements, and for users to better understand relationships among their requirements.
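
The fuzzy and/or aggregation can be illustrated with one common parameterization, where gamma slides between the pure min/max and the plain average; the paper's exact operators and four-level requirements hierarchy are richer than this sketch, and the requirement names and satisfaction degrees are invented.

```python
# Parameterized fuzzy and/or aggregation of requirement satisfaction degrees
# (one common parameterization, used here only for illustration).

def fuzzy_and(degrees, gamma):
    """gamma = 1 -> strict min; gamma = 0 -> plain average."""
    return gamma * min(degrees) + (1 - gamma) * sum(degrees) / len(degrees)

def fuzzy_or(degrees, gamma):
    """gamma = 1 -> strict max; gamma = 0 -> plain average."""
    return gamma * max(degrees) + (1 - gamma) * sum(degrees) / len(degrees)

# Two conflicting vague requirements ("fast response" vs "low resource use")
# judged against one candidate design, plus an independent third requirement.
satisfaction = {"fast_response": 0.4, "low_resource": 0.9, "portable": 0.7}

trade_off = fuzzy_and([satisfaction["fast_response"], satisfaction["low_resource"]],
                      gamma=0.6)                      # a compromise, not a hard min
overall = fuzzy_and([trade_off, satisfaction["portable"]], gamma=0.9)
print(round(trade_off, 3), round(overall, 3))         # 0.5 0.51
```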

Journal Article•DOI•
TL;DR: The paper presents techniques on how assumptions on specific sets of attributes can be automatically derived from the specification of interpolation and conversion functions, and how a user query can be converted into a system query such that the answer of this system query over the explicit data is the same as that of the user queryover the explicit and the implicit data.
Abstract: Data explicitly stored in a temporal database are often associated with certain semantic assumptions. Each assumption can be viewed as a way of deriving implicit information from explicitly stored data. Rather than leaving the task of deriving (possibly infinite) implicit data to application programs, as is the case currently, it is desirable that this be handled by the database management system. To achieve this, the paper formalizes and studies two types of semantic assumptions: point based and interval based. The point based assumptions include those assumptions that use interpolation methods over values at different time instants, while the interval based assumptions include those that involve the conversion of values across different time granularities. The paper presents techniques on: (1) how assumptions on specific sets of attributes can be automatically derived from the specification of interpolation and conversion functions; and (2) given the representation of assumptions, how a user query can be converted into a system query such that the answer of this system query over the explicit data is the same as that of the user query over the explicit and the implicit data. To precisely illustrate concepts and algorithms, the paper uses a logic based abstract query language. The paper also shows how the same concepts can be applied to concrete temporal query languages.
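
A point-based assumption is easy to illustrate: values stored at explicit time points are extended to implicit time points by an interpolation method, here simple persistence (a value holds until the next recorded change), so a query over implicit data is answered by applying the assumption to the explicit tuples. The relation and query are made up for the example.

```python
# Point-based semantic assumption: stepwise-constant ("persistence") interpolation.

explicit = [            # (time, salary) tuples explicitly stored
    (1, 1000),
    (5, 1200),
    (9, 1500),
]

def persistence(explicit, t):
    """Implicit value at time t under the stepwise-constant assumption."""
    value = None
    for ti, vi in explicit:
        if ti <= t:
            value = vi
        else:
            break
    return value

# "What was the salary at time 7?"  There is no explicit tuple for t = 7, but
# the assumption derives the implicit answer from the tuple stored at t = 5.
print(persistence(explicit, 7))    # 1200
print(persistence(explicit, 0))    # None: before any explicit information
```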

Journal Article•DOI•
TL;DR: This work shows that the relational model is not the only possible semantic reference model for constraint relational databases, shows how constraint relations can be interpreted under the nested relational model, and introduces two distinct classes of constraint algebras.
Abstract: Constraint relational databases use constraints to both model and query data. A constraint relation contains a finite set of generalized tuples. Each generalized tuple is represented by a conjunction of constraints on a given logical theory and, depending on the logical theory and the specific conjunction of constraints, it may possibly represent an infinite set of relational tuples. For their characteristics, constraint databases are well suited to model multidimensional and structured data, like spatial and temporal data. The definition of an algebra for constraint relational databases is important in order to make constraint databases a practical technology. We extend the previously defined constraint algebra (called generalized relational algebra). First, we show that the relational model is not the only possible semantic reference model for constraint relational databases and we show how constraint relations can be interpreted under the nested relational model. Then, we introduce two distinct classes of constraint algebras, one based on the relational algebra, and one based on the nested relational algebra, and we present an algebra of the latter type. The algebra is proved equivalent to the generalized relational algebra when input relations are modified by introducing generalized tuple identifiers. However, from a user point of view, it is more suitable. Thus, the difference existing between such algebras is similar to the difference existing between the relational algebra and the nested relational algebra, dealing with only one level of nesting. We also show how external functions can be added to the proposed algebra.

Journal Article•DOI•
TL;DR: The result shows that the path dictionary index method is significantly better than the path index method over a wide range of parameters in terms of retrieval and update costs and that the storage overhead grows slowly with the number of indexed attributes.
Abstract: We present a new access method, called the path dictionary index (PDI) method, for supporting nested queries on object-oriented databases. PDI supports object traversal and associative search, respectively, with a path dictionary and a set of attribute indexes built on top of the path dictionary. We discuss issues on indexing and query processing in object-oriented databases; describe the operations of the new mechanism; develop cost models for its storage overhead and query and update costs; and compare the new mechanism to the path index method. The result shows that the path dictionary index method is significantly better than the path index method over a wide range of parameters in terms of retrieval and update costs and that the storage overhead grows slowly with the number of indexed attributes.