Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.

Papers

Book

Knowledge Discovery in Databases: PKDD 2004

Jean-François Boulicaut, Floriana Esposito¹, Fosca Giannotti, Dino Pedreschi

University of Bari¹

01 Sep 2004

TL;DR: It is shown how carefully crafted random matrices can achieve distance-preserving dimensionality reduction, accelerate spectral computations, and reduce the sample complexity of certain kernel methods.

Abstract: We show how carefully crafted random matrices can achieve distance-preserving dimensionality reduction, accelerate spectral computations, and reduce the sample complexity of certain kernel methods.

184 citations

Patent

System and method for analysis and clustering of documents for search engine

Zbigniew Michalewicz, Andrzej Jankowski

03 Aug 2001

TL;DR: A system and method for searching documents in a data source and, more particularly, for analyzing and clustering documents for a search engine.

Abstract: A system and method for searching documents in a data source and, more particularly, a system and method for analyzing and clustering documents for a search engine. The system and method includes analyzing and processing documents to secure the infrastructure and standards for optimal document processing. By incorporating Computational Intelligence (CI) and statistical methods, the document information is analyzed and clustered using novel techniques for knowledge extraction. A comprehensive dictionary is built based on the keywords identified by these techniques from the entire text of the document. The text is parsed for keywords or the number of its occurrences and the context in which the word appears in the documents. The whole document is identified by the knowledge that is represented in its contents. Based on such knowledge extracted from all the documents, the documents are clustered into meaningful groups in a catalog tree. The results of document analysis and clustering information are stored in a database.

184 citations

Patent

Method and apparatus for knowledge discovery in databases

John Duncan Bankier, Charles Allan Beck, Andrew Craig Brind, David John Brown, Kristy Irene Brown, John Dominic Burns, Peter Docherty, John Michael Gilchrist, Timothy Simon Jones, Gordon McIntyre, Alan Ryman, William Wallace

26 Aug 1999

TL;DR: A computer-based method and apparatus for knowledge discovery from databases, in which the user creates a project plan comprising a plurality of operational components adapted to cooperatively extract desired information from a database.

Abstract: A computer-based method and apparatus for knowledge discovery from databases. The disclosed method involves the user creation of a project plan comprising a plurality of operational components adapted to cooperatively extract desired information from a database. In one embodiment, the project plan is created within a graphical user interface and consists of objects representing the various functional components of the overall plan interconnected by links representing the flow of data from the data source to a data sink. Data visualization components may be inserted essentially anywhere in the project plan. One or more data links in the project plan may be designated as caching links which maintain copies of the data flowing across them, such that the cached data is available to other components in the project plan. In one embodiment, compression technology is applied to reduce the overall size of the database.

184 citations

Book Chapter

Fundamentals of spatial data warehousing for geographic knowledge discovery

Yvan Bédard¹, Tim Merrett², Jiawei Han²

Laval University¹, University of Illinois at Urbana–Champaign²

11 Oct 2001

TL;DR: The penetration of data warehouses into the management and exploitation of spatial databases is a major trend as it is for non-spatial databases.

Abstract: Recent years have witnessed major changes in the Geographic Information System (GIS) market, from technological offerings to user requests. For example, spatial databases used to be implemented in GISs or in Computer-Assisted Design (CAD) systems coupled with a Relational Data Base Management System (RDBMS). Today, spatial databases are also implemented in spatial extensions of universal servers, in spatial engine software components, in GIS web servers, in analytical packages using so-called 'data cubes' and in spatial data warehouses. Such databases are structured according to either a relational, object-oriented, multi-dimensional or hybrid paradigm. In addition, these offerings are integrated as a piece of the overall technological framework of the organization and they are implemented according to very diverse architectures responding to differing users' contexts: centralized vs distributed, thin-clients vs thick-clients, Local Area Network (LAN) vs intranets, spatial data warehouses vs legacy systems, etc. As one may say, 'Gone are the days of a spatial database implemented solely on a stand-alone GIS' (Bédard 1999). In fact, this evolution of the GIS market follows the general trends of mainstream Information Technologies (IT). Among all these possibilities, the penetration of data warehouses into the management and exploitation of spatial databases is a major trend as it is for non-spatial databases. According to Rawling and Kucera (1997), 'the term Data Warehouse has become the hottest industry buzzword of the decade just behind Internet and information highway'. More specifically, this penetration of data warehouses allows developers to build new solutions geared towards one major need which has never been solved efficiently so far: to provide a unified view of dispersed heterogeneous databases in order to efficiently feed the decision-support tools used for strategic decision making.
In fact, the data warehouse emerged as the unifying solution to a series of individual circumstances related to providing the necessary basis for global knowledge discovery. First, large organizations often have several departmental or application-oriented independent databases which may overlap in content. Usually, such systems work properly for day-to-day operational-level decisions. However, when one needs to obtain aggregated or summarized information integrating data from these different ...

183 citations

Book Chapter

Community mining from multi-relational networks

Deng Cai¹, Zheng Shao¹, Xiaofei He², Xifeng Yan¹, Jiawei Han¹

University of Illinois at Urbana–Champaign¹, University of Chicago²

03 Oct 2005

TL;DR: This paper systematically analyzes the problem of mining hidden communities on heterogeneous social networks and proposes a new method for learning an optimal linear combination of these relations which can best meet the user's expectation.

Abstract: Social network analysis has attracted much attention in recent years. Community mining is one of the major directions in social network analysis. Most of the existing methods on community mining assume that there is only one kind of relation in the network, and moreover, the mining results are independent of the users' needs or preferences. However, in reality, there exist multiple, heterogeneous social networks, each representing a particular kind of relationship, and each kind of relationship may play a distinct role in a particular task. In this paper, we systematically analyze the problem of mining hidden communities on heterogeneous social networks. Based on the observation that different relations have different importance with respect to a certain query, we propose a new method for learning an optimal linear combination of these relations which can best meet the user's expectation. With the obtained relation, better performance can be achieved for community mining.

183 citations

Metrics

20,644 papers

453,302 citations

No. of papers in the topic in previous years
Year	Papers
2023	120
2022	285
2021	506
2020	660
2019	740
2018	683