
Showing papers on "Knowledge extraction published in 2001"


Book ChapterDOI
03 Sep 2001
TL;DR: This work uses KDD to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae to predict novel gene functions, and learns rules which are accurate and biologically meaningful.
Abstract: The biological sciences are undergoing an explosion in the amount of available data. New data analysis methods are needed to deal with the data. We present work using KDD to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae to predict novel gene functions. The analysis of the data presented a number of challenges: multi-class labels, a large number of sparsely populated classes, the need to learn a set of accurate rules (not a complete classification), and a very large amount of missing values. We developed resampling strategies and modified the algorithm C4.5 to deal with these problems. Rules were learnt which are accurate and biologically meaningful. The rules predict function of 83 putative genes of currently unknown function at an estimated accuracy of ≥ 80%.
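The resampling strategies and C4.5 modifications are described only at a high level above. The following is a minimal illustrative sketch (not the authors' modified C4.5) of the general idea: oversample sparsely populated classes, impute the many missing values, and fit a decision tree; the scikit-learn names and thresholds are assumptions of this sketch.

```python
# Illustrative sketch (not the authors' modified C4.5): oversample sparse
# classes and impute missing values before fitting a decision tree.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def fit_with_resampling(X, y, min_class_size=50, random_state=0):
    """Upsample sparsely populated classes, impute missing values, fit a tree."""
    X_parts, y_parts = [], []
    for label in np.unique(y):
        X_c, y_c = X[y == label], y[y == label]
        if len(y_c) < min_class_size:  # sparse class: sample with replacement
            X_c, y_c = resample(X_c, y_c, replace=True,
                                n_samples=min_class_size, random_state=random_state)
        X_parts.append(X_c)
        y_parts.append(y_c)
    X_r, y_r = np.vstack(X_parts), np.concatenate(y_parts)
    X_r = SimpleImputer(strategy="median").fit_transform(X_r)  # many missing values
    return DecisionTreeClassifier(min_samples_leaf=10,
                                  random_state=random_state).fit(X_r, y_r)
```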

714 citations


Book
03 Sep 2001
TL;DR: Leading researchers from the fields of data mining, data visualization, and statistics present findings organized around topics introduced in two recent international knowledge discovery and data mining workshops as formal chapters that together comprise a complete, cohesive body of research.
Abstract: From the Publisher: Mainstream data mining techniques significantly limit the role of human reasoning and insight. Likewise, in data visualization, the role of computational analysis is relatively small. The power demonstrated individually by these approaches to knowledge discovery suggests that somehow uniting the two could lead to increased efficiency and more valuable results. But is this true? How might it be achieved? And what are the consequences for data-dependent enterprises? Information Visualization in Data Mining and Knowledge Discovery is the first book to ask and answer these thought-provoking questions. It is also the first book to explore the fertile ground of uniting data mining and data visualization principles in a new set of knowledge discovery techniques. Leading researchers from the fields of data mining, data visualization, and statistics present findings organized around topics introduced in two recent international knowledge discovery and data mining workshops. Collected and edited by three of the area's most influential figures, these chapters introduce the concepts and components of visualization, detail current efforts to include visualization and user interaction in data mining, and explore the potential for further synthesis of data mining algorithms and data visualization techniques. This incisive, groundbreaking research is sure to wield a strong influence in subsequent efforts in both academic and corporate settings. Features: Details advances made by leading researchers from the fields of data mining, data visualization, and statistics. Provides a useful introduction to the science of visualization, sketches the current role for visualization in data mining, and then takes a long look into its mostly untapped potential. Presents the findings of recent international KDD workshops as formal chapters that together comprise a complete, cohesive body of research. Offers compelling and practical information for professionals and researchers in database technology, data mining, knowledge discovery, artificial intelligence, machine learning, neural networks, statistics, pattern recognition, information retrieval, high-performance computing, and data visualization. Author Biography: Usama Fayyad is co-founder, president, and CEO of digiMine, a data warehousing and data mining ASP. Prior to digiMine, he founded and led Microsoft's Data Mining and Exploration Group, where he developed data mining prediction components for Microsoft Site Server and scalable algorithms for mining large databases. Georges G. Grinstein is a professor of computer science, director of the Institute for Visualization and Perception Research, and co-director of the Center for Bioinformatics and Computational Biology at the University of Massachusetts, Lowell. He is currently the chief technologist for AnVil Informatics, a data exploration company. Andreas Wierse is the managing director of VirCinity, a spin-off company of the Computing Centre of the University of Stuttgart. Previously, he worked at the Computer Centre, where he designed and implemented distributed data management for the COVISE visualization system and maintained a wide range of graphics workstations.

431 citations





Journal ArticleDOI
TL;DR: This work has developed a method that extends and transforms traditional author co-citation analysis by extracting structural patterns from the scientific literature and representing them in a 3D knowledge landscape.
Abstract: To make knowledge visualizations clear and easy to interpret, we have developed a method that extends and transforms traditional author co-citation analysis by extracting structural patterns from the scientific literature and representing them in a 3D knowledge landscape.

240 citations


BookDOI
01 Jan 2001
TL;DR: This volume serves as a comprehensive reference for graduate students, practitioners and researchers in KDD to report new developments and applications, to share hard-learned experiences in order to avoid similar pitfalls, and to shed light on the future development of instance selection.
Abstract: The ability to analyze and understand massive data sets lags far behind the ability to gather and store the data. To meet this challenge, knowledge discovery and data mining (KDD) is growing rapidly as an emerging field. However, no matter how powerful computers are now or will be in the future, KDD researchers and practitioners must consider how to manage ever-growing data which is, ironically, due to the extensive use of computers and ease of data collection with computers. Many different approaches have been used to address the data explosion issue, such as algorithm scale-up and data reduction. Instance, example, or tuple selection pertains to methods or algorithms that select or search for a representative portion of data that can fulfill a KDD task as if the whole data is used. Instance selection is directly related to data reduction and becomes increasingly important in many KDD applications due to the need for processing efficiency and/or storage efficiency. One of the major means of instance selection is sampling whereby a sample is selected for testing and analysis, and randomness is a key element in the process. Instance selection also covers methods that require search. Examples can be found in density estimation (finding the representative instances -- data points -- for a cluster); boundary hunting (finding the critical instances to form boundaries to differentiate data points of different classes); and data squashing (producing weighted new data with equivalent sufficient statistics). Other important issues related to instance selection extend to unwanted precision, focusing, concept drifts, noise/outlier removal, data smoothing, etc. Instance Selection and Construction for Data Mining brings researchers and practitioners together to report new developments and applications, to share hard-learned experiences in order to avoid similar pitfalls, and to shed light on the future development of instance selection. This volume serves as a comprehensive reference for graduate students, practitioners and researchers in KDD.
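As a concrete illustration of the sampling route to instance selection mentioned above, here is a minimal sketch (assuming a labelled NumPy dataset; not any specific method from the volume) that draws a class-stratified random subset so that downstream mining can run on the reduced data.

```python
# Minimal sketch of instance selection by stratified random sampling:
# keep a representative fraction of each class so downstream mining can
# operate on the reduced set in place of the full data set.
import numpy as np

def stratified_sample(X, y, fraction=0.1, rng=None):
    """Return a class-stratified random subset of (X, y)."""
    rng = np.random.default_rng(rng)
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        n_keep = max(1, int(round(fraction * idx.size)))
        keep.append(rng.choice(idx, size=n_keep, replace=False))
    keep = np.concatenate(keep)
    return X[keep], y[keep]
```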

228 citations


Book
01 Jan 2001
TL;DR: This monograph presents a data mining technique, heuristic measures of interestingness, and an interestingness framework, and evaluates them through experimental analyses.
Abstract: List of Figures. List of Tables. Preface. Acknowledgments. 1. Introduction. 2. Background and Related Work. 3. A Data Mining Technique. 4. Heuristic Measures of Interestingness. 5. An Interestingness Framework. 6. Experimental Analyses. 7. Conclusion. Appendices. Index.

221 citations


Journal ArticleDOI
TL;DR: A new extraction method is presented that captures nonmonotonic rules encoded in the network, and the method is proved to be sound.

217 citations


Journal ArticleDOI
01 Feb 2001
TL;DR: This correspondence introduces a general methodology for knowledge discovery in TSDB, based on signal processing techniques and the information-theoretic fuzzy approach to knowledge discovery, and demonstrates the approach on two types of time series: stock-market data and weather data.
Abstract: Adding the dimension of time to databases produces time series databases (TSDB) and introduces new aspects and difficulties to data mining and knowledge discovery. In this correspondence, we introduce a general methodology for knowledge discovery in TSDB. The process of knowledge discovery in TSDB includes cleaning and filtering of time series data, identifying the most important predicting attributes, and extracting a set of association rules that can be used to predict the time series behavior in the future. Our method is based on signal processing techniques and the information-theoretic fuzzy approach to knowledge discovery. The computational theory of perception (CTP) is used to reduce the set of extracted rules by fuzzification and aggregation. We demonstrate our approach on two types of time series: stock-market data and weather data.
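The information-theoretic fuzzy machinery itself is not reproduced here; the sketch below only illustrates the cleaning/filtering and discretization steps under simple assumptions (a univariate daily series, a moving-average filter, and a three-symbol alphabet of changes) over which association rules could then be mined.

```python
# Illustrative preprocessing for time-series rule mining (not the authors'
# information-theoretic fuzzy method): smooth the series with a moving
# average, then discretize day-to-day changes into symbols that simple
# association rules can range over, e.g. "down, down -> up".
import numpy as np

def moving_average(series, window=5):
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")

def symbolize_changes(series, flat_threshold=0.001):
    """Map relative day-to-day changes to 'up', 'down' or 'flat'."""
    rel = np.diff(series) / series[:-1]
    return ["up" if r > flat_threshold else "down" if r < -flat_threshold else "flat"
            for r in rel]

prices = np.array([100.0, 101.2, 100.8, 102.5, 103.0, 102.7, 104.1, 105.0])
print(symbolize_changes(moving_average(prices, window=3)))
```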

192 citations


Patent
03 Aug 2001
TL;DR: A system and method for searching documents in a data source and, more particularly, for analyzing and clustering documents for a search engine.
Abstract: A system and method for searching documents in a data source and, more particularly, for analyzing and clustering documents for a search engine. The system and method include analyzing and processing documents to secure the infrastructure and standards for optimal document processing. By incorporating Computational Intelligence (CI) and statistical methods, the document information is analyzed and clustered using novel techniques for knowledge extraction. A comprehensive dictionary is built based on the keywords identified by these techniques from the entire text of the document. The text is parsed for keywords, the number of their occurrences, and the context in which each word appears in the documents. The whole document is identified by the knowledge that is represented in its contents. Based on such knowledge extracted from all the documents, the documents are clustered into meaningful groups in a catalog tree. The results of document analysis and clustering information are stored in a database.
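The patented Computational Intelligence techniques are not spelled out in the abstract; the following is a hedged sketch of the general keyword-dictionary-plus-clustering idea using off-the-shelf TF-IDF and k-means from scikit-learn, with toy documents standing in for a real collection.

```python
# Hedged sketch of keyword-based document clustering (not the patented
# method): build a term dictionary with TF-IDF weights and group the
# documents with k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

documents = [
    "rule induction and decision trees for classification",
    "clustering documents with keyword dictionaries",
    "association rules mined from sales transactions",
    "search engine indexing and document retrieval",
]
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(documents)           # keyword dictionary + weights
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for doc, label in zip(documents, labels):
    print(label, doc)
```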

184 citations


Book ChapterDOI
11 Oct 2001
TL;DR: The penetration of data warehouses into the management and exploitation of spatial databases is a major trend as it is for non-spatial databases.
Abstract: Recent years have witnessed major changes in the Geographic Information System (GIS) market, from technological offerings to user requests. For example, spatial databases used to be implemented in GISs or in Computer-Assisted Design (CAD) systems coupled with a Relational Data Base Management System (RDBMS). Today, spatial databases are also implemented in spatial extensions of universal servers, in spatial engine software components, in GIS web servers, in analytical packages using so-called 'data cubes' and in spatial data warehouses. Such databases are structured according to either a relational, object-oriented, multi-dimensional or hybrid paradigm. In addition, these offerings are integrated as a piece of the overall technological framework of the organization and they are implemented according to very diverse architectures responding to differing users' contexts: centralized vs distributed, thin-clients vs thick-clients, Local Area Network (LAN) vs intranets, spatial data warehouses vs legacy systems, etc. As one may say, 'Gone are the days of a spatial database implemented solely on a stand-alone GIS' (Bédard 1999). In fact, this evolution of the GIS market follows the general trends of mainstream Information Technologies (IT). Among all these possibilities, the penetration of data warehouses into the management and exploitation of spatial databases is a major trend, as it is for non-spatial databases. According to Rawling and Kucera (1997), 'the term Data Warehouse has become the hottest industry buzzword of the decade just behind Internet and information highway'. More specifically, this penetration of data warehouses allows developers to build new solutions geared towards one major need which has never been solved efficiently so far: to provide a unified view of dispersed heterogeneous databases in order to efficiently feed the decision-support tools used for strategic decision making. In fact, the data warehouse emerged as the unifying solution to a series of individual circumstances related to providing the necessary basis for global knowledge discovery. First, large organizations often have several departmental or application-oriented independent databases which may overlap in content. Usually, such systems work properly for day-to-day operational-level decisions. However, when one needs to obtain aggregated or summarized information integrating data from these different ...

Proceedings ArticleDOI
03 Jan 2001
TL;DR: The paper conceptualizes five types of knowledge maps that can be used in managing organizational knowledge and proposes a five-step procedure to implement knowledge maps in a corporate intranet.
Abstract: Establishes the conceptual and empirical basis for an innovative instrument of corporate knowledge management: the knowledge map. It begins by briefly outlining the rationale for knowledge mapping, i.e. providing a common context to access expertise and experience in large companies. It then conceptualizes five types of knowledge maps that can be used in managing organizational knowledge. They are: knowledge sources, assets, structures, applications and development maps. In order to illustrate these five types of maps, a series of examples is presented (from a multimedia agency, a consulting group, a market research firm and a medium-sized services company), and the advantages and disadvantages of the knowledge mapping technique for knowledge management are discussed. The paper concludes with a series of quality criteria for knowledge maps and proposes a five-step procedure to implement knowledge maps in a corporate intranet.

01 Jan 2001
TL;DR: This paper describes the integration of an interactive visualization user interface with a knowledge management tool called Protege, a general-purpose tool that allows domain experts to build knowledge-based systems by creating and modifying reusable ontologies and problem-solving methods.
Abstract: This paper describes the integration of an interactive visualization user interface with a knowledge management tool called Protege. Protege is a general-purpose tool that allows domain experts to build knowledge-based systems by creating and modifying reusable ontologies and problem-solving methods, and by instantiating ontologies to construct knowledge bases. The SHriMP (Simple Hierarchical Multi-Perspective) visualization technique was designed to enhance how people browse, explore and interact with complex information spaces. Although SHriMP is information independent, its primary use to date has been for visualizing and documenting software programs. The paper describes how we have applied software visualization techniques to more general knowledge domains. It is hoped that the integrated environment (called Jambalaya) will result in an easier to use and more powerful environment to support ontology evolution and knowledge acquisition. An example scenario of how Jambalaya can be applied to knowledge acquisition is provided.

Journal ArticleDOI
TL;DR: This paper deals with learning first-order logic rules from data lacking an explicit classification predicate, and describes a heuristic measure of confirmation, trading off novelty and satisfaction of the rule.
Abstract: This paper deals with learning first-order logic rules from data lacking an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions as in supervised inductive logic programming. First-order logic offers the ability to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation, trading off novelty and satisfaction of the rule. The approach has been implemented in the Tertius system. The system performs an optimal best-first search, finding the k most confirmed hypotheses, and includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal either with individual-based representations by upgrading propositional representations to first-order, or with general logical rules. We describe a number of experiments demonstrating the feasibility and flexibility of our approach.
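Tertius' exact confirmation measure is not reproduced in the abstract; the snippet below is a hypothetical confirmation-style score for a propositional rule body -> head that trades off novelty (deviation of the body/head co-occurrence from independence) against satisfaction (few counter-instances), purely to illustrate the trade-off the paper describes.

```python
# Hypothetical confirmation-style score for a rule "body -> head", trading
# off novelty (how far body/head co-occurrence departs from independence)
# against satisfaction (how rarely the body holds while the head is false).
# An illustration only, not Tertius' exact measure.
def confirmation(n_total, n_body, n_head, n_body_and_head):
    p_body = n_body / n_total
    p_head = n_head / n_total
    p_both = n_body_and_head / n_total
    novelty = p_both - p_body * p_head          # > 0: more co-occurrence than chance
    counter = p_body - p_both                   # body true but head false
    satisfaction = 1.0 - (counter / p_body if p_body else 0.0)
    return novelty * satisfaction

# e.g. 1000 examples, body holds in 200, head in 300, both in 180
print(confirmation(1000, 200, 300, 180))
```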

Journal ArticleDOI
TL;DR: The Know‐Net solution is presented, which aims to innovatively fuse the process‐centred approach with the product‐centred approach by developing a knowledge asset‐centric design; it includes a theoretical framework, a corporate transformation and measurement method, and a software tool.
Abstract: Two main approaches to knowledge management (KM) have been followed by early adopters of the principle: the process‐centred approach, that mainly treats KM as a social communication process; and the product‐centred approach, that focuses on knowledge artefacts, their creation, storage and reuse in computer‐based corporate memories. This distinction is evident not only in KM implementations in companies, but also in supporting methodologies and tools. This paper presents the Know‐Net solution that aims to innovatively fuse the process‐centred approach with the product‐centred approach by developing a knowledge asset‐centric design. The Know‐Net solution includes a theoretical framework, a corporate transformation and measurement method and a software tool.

Book ChapterDOI
Luc Dehaspe, Hannu Toivonen
05 Oct 2001
TL;DR: Algorithms for relational association rule discovery that are well-suited for exploratory data mining are presented, which offer the flexibility required to experiment with examples more complex than feature vectors and patterns more complex than item sets.
Abstract: Within KDD, the discovery of frequent patterns has been studied in a variety of settings. In its simplest form, known from association rule mining, the task is to discover all frequent item sets, i.e., all combinations of items that are found in a sufficient number of examples. We present algorithms for relational association rule discovery that are well-suited for exploratory data mining. They offer the flexibility required to experiment with examples more complex than feature vectors and patterns more complex than item sets.
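For readers unfamiliar with the simplest setting mentioned above, here is a minimal level-wise (Apriori-style) search for frequent item sets; the relational generalization that the paper actually contributes is not shown, and the function name and threshold are illustrative.

```python
# Minimal Apriori-style level-wise search for frequent item sets (the
# simplest setting of frequent-pattern discovery; the relational
# generalization is not shown here).
from itertools import combinations

def frequent_itemsets(transactions, min_support=2):
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # candidate generation: join surviving k-sets into (k+1)-sets
        keys = list(survivors)
        level = list({a | b for a, b in combinations(keys, 2) if len(a | b) == len(a) + 1})
    return frequent

print(frequent_itemsets([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}], 2))
```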

Journal ArticleDOI
Charu C. Aggarwal, Philip S. Yu
TL;DR: The problem of online mining of association rules in a large database of sales transactions is discussed, with the use of nonredundant association rules helping significantly in the reduction of irrelevant noise in the data mining process.
Abstract: We discuss the problem of online mining of association rules in a large database of sales transactions. The online mining is performed by preprocessing the data effectively in order to make it suitable for repeated online queries. We store the preprocessed data in such a way that online processing may be done by applying a graph theoretic search algorithm whose complexity is proportional to the size of the output. The result is an online algorithm which is independent of the size of the transactional data and the size of the preprocessed data. The algorithm is almost instantaneous in the size of the output. The algorithm also supports techniques for quickly discovering association rules from large itemsets. The algorithm is capable of finding rules with specific items in the antecedent or consequent. These association rules are presented in a compact form, eliminating redundancy. The use of nonredundant association rules helps significantly in the reduction of irrelevant noise in the data mining process.
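A hedged sketch of the online-query idea follows, under the assumption that frequent item sets and their support counts have already been precomputed offline; answering a query then touches only this summary, not the raw transactions. This is not the paper's graph-theoretic algorithm, and the names are illustrative.

```python
# Hedged sketch of online rule queries over a precomputed summary: freq
# maps each frequent item set (frozenset) to its support count; a query
# derives rules with a required item in the antecedent above a minimum
# confidence without touching the transaction data.
def rules_with_item(freq, item, min_conf=0.6):
    """Yield (antecedent, consequent, confidence) triples containing `item`."""
    for itemset, support in freq.items():
        if item not in itemset or len(itemset) < 2:
            continue
        for cons in itemset:
            ante = itemset - {cons}
            if item not in ante:
                continue
            conf = support / freq[ante]
            if conf >= min_conf:
                yield ante, frozenset([cons]), conf

freq = {frozenset("a"): 4, frozenset("b"): 3, frozenset("c"): 3,
        frozenset("ab"): 3, frozenset("ac"): 2, frozenset("bc"): 2}
for ante, cons, conf in rules_with_item(freq, "a", 0.5):
    print(set(ante), "->", set(cons), round(conf, 2))
```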

Proceedings ArticleDOI
29 Nov 2001
TL;DR: It is shown that the e-commerce domain can provide all the right ingredients for successful data mining and an integrated architecture for supporting this integration is described, which can dramatically reduce the pre-processing, cleaning, and data understanding effort in knowledge discovery projects.
Abstract: We show that the e-commerce domain can provide all the right ingredients for successful data mining. We describe an integrated architecture for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the Web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., clickstreams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges.
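As a hypothetical illustration of collecting data at the application server layer rather than the Web server, the snippet below logs an event together with business metadata that a raw Web-server log would not contain; all names and fields are assumptions of this sketch, not the paper's architecture.

```python
# Hypothetical application-layer event logging: the application server
# records events with business metadata (session, product, price, query)
# in a line-oriented file ready for loading into a warehouse staging area.
import json, time

def log_event(stream, session_id, event_type, **metadata):
    record = {"ts": time.time(), "session": session_id,
              "event": event_type, **metadata}
    stream.write(json.dumps(record) + "\n")

with open("clickstream.jsonl", "a") as stream:
    log_event(stream, "sess-42", "add_to_cart", product_id="sku-123",
              price=19.99, search_query="hiking boots")
```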

Journal Article
TL;DR: This bibliography subsumes an earlier bibliography and shows that the value of investigating temporal, spatial and spatio-temporal data has been growing in both interest and applicability.
Abstract: Data mining and knowledge discovery have become important issues for research over the past decade. This has been caused not only by the growth in the size of datasets but also in the availability of otherwise unavailable datasets over the Internet and the increased value that organisations now place on the knowledge that can be gained from data analysis. It is therefore not surprising that the increased interest in temporal and spatial data has led also to an increased interest in mining such data. This bibliography subsumes an earlier bibliography and shows that the value of investigating temporal, spatial and spatio-temporal data has been growing in both interest and applicability.

Journal ArticleDOI
TL;DR: This study validated the predictive power of data mining algorithms by comparing the performance of logistic regression and two decision tree algorithms, CHAID (Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5), using the Korea Medical Insurance Corporation database.
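A comparison of this kind can be sketched as follows, with scikit-learn's CART-style tree standing in for CHAID/C5.0 (which scikit-learn does not provide) and synthetic data standing in for the insurance database; this illustrates the methodology, it does not reproduce the study.

```python
# Hedged sketch of comparing logistic regression with a decision tree by
# cross-validated accuracy on synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("decision tree", DecisionTreeClassifier(max_depth=5, random_state=0))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```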

Proceedings ArticleDOI
01 May 2001
TL;DR: The Epsilon Grid Order is proposed, a new algorithm for determining the similarity join of very large data sets, based on a particular sort order of the data points, obtained by laying an equi-distant grid with cell length ε over the data space and comparing the grid cells lexicographically.
Abstract: The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The similarity join combines two point sets of a multidimensional vector space such that the result contains all point pairs where the distance does not exceed a parameter ε. In this paper, we propose the Epsilon Grid Order, a new algorithm for determining the similarity join of very large data sets. Our solution is based on a particular sort order of the data points, which is obtained by laying an equi-distant grid with cell length ε over the data space and comparing the grid cells lexicographically. A typical problem of grid-based approaches such as MSJ or the ε-kdB-tree is that large portions of the data sets must be held simultaneously in main memory. Therefore, these approaches do not scale to large data sets. Our technique avoids this problem by an external sorting algorithm and a particular scheduling strategy during the join phase. In the experimental evaluation, a substantial improvement over competitive techniques is shown.
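The external sorting and scheduling that make the Epsilon Grid Order scale are not shown here; the following in-memory sketch only illustrates the underlying grid idea, namely that points need only be compared against points in the same or adjacent cells of side ε.

```python
# In-memory sketch of a grid-based similarity join: points are hashed into
# cells of side eps and only pairs from the same or adjacent cells are
# distance-checked. The external sorting and scheduling of the Epsilon
# Grid Order algorithm are not shown.
from collections import defaultdict
from itertools import product
import math

def similarity_join(points, eps):
    grid = defaultdict(list)
    for p in points:
        cell = tuple(int(math.floor(c / eps)) for c in p)
        grid[cell].append(p)
    dims = len(points[0])
    result = []
    for cell, bucket in grid.items():
        for offset in product((-1, 0, 1), repeat=dims):
            neighbor = tuple(c + o for c, o in zip(cell, offset))
            if neighbor < cell:                  # visit each cell pair only once
                continue
            other = bucket if neighbor == cell else grid.get(neighbor, [])
            for i, p in enumerate(bucket):
                start = i + 1 if neighbor == cell else 0
                for q in other[start:]:
                    if math.dist(p, q) <= eps:
                        result.append((p, q))
    return result

print(similarity_join([(0.1, 0.1), (0.15, 0.12), (0.9, 0.9)], eps=0.1))
```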

Journal ArticleDOI
TL;DR: Electronic commerce is emerging as the killer domain for data-mining technology, and there is support for such a bold statement.
Abstract: Electronic commerce is emerging as the killer domain for data-mining technology. Is there support for such a bold statement? Data-mining technologies have been around for decades, without moving significantly beyond the domain of computer scientists, statisticians, and hard-core business analysts. Why are electronic commerce systems any different from other data-mining applications?

15 Jun 2001
TL;DR: The concept of a knowledge-rich context, its major types and its components, and a methodology for developing extraction tools that is based on lexical, grammatical and paralinguistic patterns are defined.
Abstract: Knowledge-rich contexts express conceptual information for a term. Terminographers need such contexts to construct definitions, and to acquire domain knowledge. This paper summarizes what we have learned about extracting knowledge-rich contexts semi-automatically. First, we define the concept of a knowledge-rich context, its major types and its components. Second, we describe a methodology for developing extraction tools that is based on lexical, grammatical and paralinguistic patterns. Third, we outline the most problematic research issues that must be addressed before semi-automatic knowledge extraction can become a fully mature field.
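The authors' pattern inventory is not reproduced in the abstract; the sketch below only shows the flavour of simple lexical extraction patterns (hyperonymy, meronymy, function) as regular expressions, with the pattern set and example sentences being illustrative assumptions.

```python
# Illustrative lexical patterns for spotting knowledge-rich contexts for a
# term (a hedged sketch; the authors' inventory is richer and also uses
# grammatical and paralinguistic cues).
import re

PATTERN_TEMPLATES = {
    "hyperonymy": r"\b{term}\b\s+(is|are)\s+(a|an)?\s*(kind|type|form)s?\s+of\b",
    "meronymy":   r"\b{term}\b\s+(consists of|is composed of|comprises)\b",
    "function":   r"\b{term}\b\s+(is used (to|for)|serves to)\b",
}

def knowledge_rich_contexts(term, sentences):
    """Yield (relation, sentence) pairs whose sentence matches a pattern for the term."""
    compiled = {rel: re.compile(tpl.format(term=re.escape(term)), re.IGNORECASE)
                for rel, tpl in PATTERN_TEMPLATES.items()}
    for sentence in sentences:
        for rel, regex in compiled.items():
            if regex.search(sentence):
                yield rel, sentence

sentences = ["A lexeme is a kind of abstract lexical unit.",
             "The corpus consists of medical finding reports."]
print(list(knowledge_rich_contexts("lexeme", sentences)))
```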

Journal ArticleDOI
01 Jun 2001
TL;DR: A prototype Knowledge Management System (KMS) that supports linking of artifacts to processes, flexible interaction and hypermedia services, distribution annotation and authoring as well as providing visibility to artifacts as they change over time is discussed.
Abstract: The Internet has led to the widespread trade of digital information products. These products exhibit unusual properties such as high fixed costs and near-zero marginal costs. They need to be developed on compressed time frames by spatially and temporally distributed teams, have short lifecycles, and high perishability. This paper addresses the challenges that information product development (IPD) teams face. Drawing on the knowledge intensive nature of IPD tasks, we identify potential solutions to these problems that can be provided by a knowledge management system. We discuss a prototype Knowledge Management System (KMS) that supports linking of artifacts to processes, flexible interaction and hypermedia services, distribution annotation and authoring as well as providing visibility to artifacts as they change over time. Using a case from the publishing industry, we illustrate how contextualized decision paths/traces provide a rich base of formal and informal knowledge that supports IPD teams.

Book ChapterDOI
TL;DR: Based on results about knowledge representation within the theoretical framework of Formal Concept Analysis, relatively small bases for association rules from which all rules can be deduced are presented.
Abstract: Association rules are used to investigate large databases. The analyst is usually confronted with large lists of such rules and has to find the most relevant ones for his purpose. Based on results about knowledge representation within the theoretical framework of Formal Concept Analysis, we present relatively small bases for association rules from which all rules can be deduced. We also provide algorithms for their calculation.

Journal ArticleDOI
TL;DR: The paper stresses the need for the closer integration of three largely disparate technologies: geographic visualization, knowledge discovery in databases, and geocomputation.
Abstract: This paper details the research agenda of the International Cartographic Association Commission on Visualization: Working Group on Database-Visualization Links. The paper stresses the need for the closer integration of three largely disparate technologies: geographic visualization, knowledge discovery in databases, and geocomputation. The introduction explains the meaning behind these terms, the ethos behind their practice, and their connections within the broad realm of knowledge construction activities. The state of the art is then described for different approaches to knowledge construction, concentrating where possible on visual and geographically oriented methods. From these sections, a research agenda is synthesized in the form of three sets of research questions addressing: (1) visual approaches to data mining; (2) visual support for knowledge construction and geocomputation; and (3) databases and data models that must be satisfied to make visually-led knowledge construction a reality in the geogra...


Proceedings Article
01 Jan 2001
TL;DR: A Semantic Annotation Tool for extraction of knowledge structures from web pages through the use of simple user-defined knowledge extraction patterns and to provide support for ontology population by using the information extraction component.
Abstract: This paper describes a Semantic Annotation Tool for extraction of knowledge structures from web pages through the use of simple user-defined knowledge extraction patterns. The semantic annotation tool contains: an ontology-based mark-up component which allows the user to browse and to mark-up relevant pieces of information; a learning component (Crystal from the University of Massachusetts at Amherst) which learns rules from examples; and an information extraction component which extracts the objects and the relations between these objects. Our final aim is to provide support for ontology population by using the information extraction component. Our system uses as its domain of study “KMi Planet”, a Web-based news server that helps to communicate relevant information between members in our institute.

Proceedings ArticleDOI
01 Dec 2001
TL;DR: The strong demands MEDSYNDIKATE poses to the availability of expressive knowledge sources are accounted for by two alternative approaches to (semi)automatic ontology engineering.
Abstract: MEDSYNDIKATE is a natural language processor for automatically acquiring knowledge from medical finding reports. The content of these documents is transferred to formal representation structures which constitute a corresponding text knowledge base. The system architecture integrates requirements from the analysis of single sentences, as well as those of referentially linked sentences forming cohesive texts. The strong demands MEDSYNDIKATE poses to the availability of expressive knowledge sources are accounted for by two alternative approaches to (semi)automatic ontology engineering. We also present data for the knowledge extraction performance of MEDSYNDIKATE for three major syntactic patterns in medical documents.

Journal Article
TL;DR: A rule discovery process that is based on rough set theory is discussed, using a slope-collapse database as an example showing how rules can be discovered from a large, real-life database.
Abstract: The knowledge discovery from real-life databases is a multi-phase process consisting of numerous steps, including attribute selection, discretization of realvalued attributes, and rule induction. In the paper, we discuss a rule discovery process that is based on rough set theory. The core of the process is a soft hybrid induction system called the Generalized Distribution Table and Rough Set System (GDT-RS) for discovering classification rules from databases with uncertain and incomplete data. The system is based on a combination of Generalization Distribution Table (GDT) and the Rough Set methodologies. In the preprocessing, two modules, i.e. Rough Sets with Heuristics (RSH) and Rough Sets with Boolean Reasoning (RSBR), are used for attribute selection and discretization of real-valued attributes, respectively. We use a slope-collapse database as an example showing how rules can be discovered from a large, real-life database.