Journal Article
Efficient knowledge discovery through the integration of heterogeneous data
Bryan Scotney, Sally McClean +1 more
TLDR
By integrating heterogeneous data, it is often possible to discover new information at a finer level of granularity than is available in any of the contributing data sources, so that a system can induce new rules that would not be possible without integration.
Abstract:
It is commonly the case that a distributed database holds data originating from a number of different sources. These heterogeneous data sources may provide different views of the same data, or they may be different samples from the same population. In each case, a methodology is provided for combining data that may be held at different levels of granularity. Variations in granularity arise from the use of different levels within a concept hierarchy or from the use of different concept hierarchies. Data integration is accomplished using the intersection hypergraph to produce the integrated universal classification scheme and to determine the cardinality of each category within the universal table. For the commonly occurring case of continuous or ordinal data, an explicit and efficient computational algorithm is presented. By integrating heterogeneous data, it is often possible to discover new information at a finer level of granularity than is available in any of the contributing data sources. A system may thus be enabled to induce new rules based on information at a finer level of granularity than would be possible without integration.
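The idea of combining ordinal data held at different granularities can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each source classifies the same numeric range into contiguous half-open intervals, takes the union of all breakpoints as a stand-in for the universal classification scheme, and uses a plain EM iteration (an assumption chosen for illustration) to estimate the proportion in each universal category. The names `universal_intervals`, `membership`, and `em_proportions` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # half-open interval [lo, hi)


def universal_intervals(schemes: List[List[Interval]]) -> List[Interval]:
    """Finest partition induced by the breakpoints of every source scheme."""
    cuts = sorted({b for scheme in schemes for iv in scheme for b in iv})
    return list(zip(cuts[:-1], cuts[1:]))


def membership(scheme: List[Interval], universal: List[Interval]) -> List[List[int]]:
    """For each source category, the indices of the universal categories it contains."""
    return [
        [k for k, (lo, hi) in enumerate(universal) if a <= lo and hi <= b]
        for (a, b) in scheme
    ]


def em_proportions(
    schemes: List[List[Interval]],
    counts: List[List[float]],
    iters: int = 200,
) -> Tuple[List[Interval], List[float]]:
    """Estimate universal-category proportions from coarse counts.

    Assumption: every scheme partitions the same overall range, and each
    source is a sample from the same population.
    """
    universal = universal_intervals(schemes)
    m = len(universal)
    p = [1.0 / m] * m  # start from a uniform guess
    total = sum(sum(cs) for cs in counts)
    cells_per_scheme = [membership(s, universal) for s in schemes]
    for _ in range(iters):
        expected = [0.0] * m
        for cells_list, cs in zip(cells_per_scheme, counts):
            for cells, n in zip(cells_list, cs):
                mass = sum(p[k] for k in cells)
                if mass == 0:
                    continue
                # E-step: share each coarse count among its universal cells
                for k in cells:
                    expected[k] += n * p[k] / mass
        # M-step: renormalise the expected counts into proportions
        p = [e / total for e in expected]
    return universal, p


# Hypothetical example: two sources bin the range 0-60 differently.
scheme_a = [(0, 20), (20, 60)]
scheme_b = [(0, 40), (40, 60)]
universal, p = em_proportions([scheme_a, scheme_b], [[30, 70], [50, 50]])
```

In this made-up example neither source distinguishes the 20-40 band, yet the combined estimate assigns it its own proportion (about 0.2, alongside about 0.3 and 0.5 for the outer bands), which is the kind of finer-grained information the abstract describes.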
Citations
Journal Article
Database aggregation of imprecise and uncertain evidence
Bryan Scotney, Sally McClean +1 more
TL;DR: This paper shows first how this mechanism can be used to resolve inconsistencies and hence provide an essential database capability to perform the operations necessary to respond to queries on imprecise and uncertain data.
Journal Article
Optimal and efficient integration of heterogeneous summary tables in a distributed database
TL;DR: In this paper, the classification schemes are described using a matrix representation of the intersection hypergraph, and efficient numerical algorithms are proposed to determine the optimal granularity of the integrated summary data.
Journal Article
A scalable approach to integrating heterogeneous aggregate views of distributed databases
TL;DR: This paper develops an approach that can handle data inconsistencies and is thus inherently much more scalable; the authors first construct a dynamic shared ontology by analyzing the correspondence graph that relates the heterogeneous classification schemes.
Journal Article
Knowledge discovery by probabilistic clustering of distributed databases
TL;DR: This work clusters databases that hold aggregate count data on categorical attributes that have been classified according to homogeneous or heterogeneous classification schemes, of which the most efficient avoid the need to compute a dynamic shared ontology to homogenise the classification schemes prior to clustering.
Journal Article
Conceptual Clustering of Heterogeneous Gene Expression Sequences
TL;DR: A model-based approach that uses a Hidden Markov Model (HMM) that has as states the stages of the underlying process that generates the gene sequences, thus allowing us to handle complex and heterogeneous data.
References
Journal Article
Maximum likelihood from incomplete data via the EM algorithm
Book
Knowledge Discovery in Databases
Gregory Piatetsky-Shapiro, William Frawley +1 more
TL;DR: Knowledge Discovery in Databases brings together current research on the exciting problem of discovering useful and interesting knowledge in databases, spanning many different approaches to discovery, including inductive learning, Bayesian statistics, semantic query optimization, knowledge acquisition for expert systems, information theory, and fuzzy sets.
Knowledge Discovery in Databases: An Overview
TL;DR: In the 1990s, the AAAI Press book Knowledge Discovery in Databases was published; its contributors discussed the potential benefits of this research, and the authors hope that some of this excitement will communicate itself to AI Magazine readers of this article.