Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.

Papers

Book•DOI•

Instance Selection and Construction for Data Mining

Huan Liu, Hiroshi Motoda

01 Jan 2001

TL;DR: This volume brings researchers and practitioners together to report new developments and applications, to share hard-learned experiences in order to avoid similar pitfalls, and to shed light on the future development of instance selection. It serves as a comprehensive reference for graduate students, practitioners and researchers in KDD.

Abstract: The ability to analyze and understand massive data sets lags far behind the ability to gather and store the data. To meet this challenge, knowledge discovery and data mining (KDD) is growing rapidly as an emerging field. However, no matter how powerful computers are now or will be in the future, KDD researchers and practitioners must consider how to manage ever-growing data which is, ironically, due to the extensive use of computers and ease of data collection with computers. Many different approaches have been used to address the data explosion issue, such as algorithm scale-up and data reduction. Instance, example, or tuple selection pertains to methods or algorithms that select or search for a representative portion of data that can fulfill a KDD task as if the whole data is used. Instance selection is directly related to data reduction and becomes increasingly important in many KDD applications due to the need for processing efficiency and/or storage efficiency. One of the major means of instance selection is sampling whereby a sample is selected for testing and analysis, and randomness is a key element in the process. Instance selection also covers methods that require search. Examples can be found in density estimation (finding the representative instances -- data points -- for a cluster); boundary hunting (finding the critical instances to form boundaries to differentiate data points of different classes); and data squashing (producing weighted new data with equivalent sufficient statistics). Other important issues related to instance selection extend to unwanted precision, focusing, concept drifts, noise/outlier removal, data smoothing, etc. Instance Selection and Construction for Data Mining brings researchers and practitioners together to report new developments and applications, to share hard-learned experiences in order to avoid similar pitfalls, and to shed light on the future development of instance selection. 
This volume serves as a comprehensive reference for graduate students, practitioners and researchers in KDD.

228 citations

Patent•

Methods and apparatus for classifying terminology utilizing a knowledge catalog

Kelly Wical¹•Institutions (1)

Oracle Corporation¹

31 May 1995

TL;DR: This patent describes a knowledge catalog that includes a plurality of independent and parallel static ontologies to accurately represent a broad coverage of concepts that define knowledge. A knowledge classification system that includes the knowledge catalog is also disclosed.

Abstract: A knowledge catalog includes a plurality of independent and parallel static ontologies to accurately represent a broad coverage of concepts that define knowledge. The actual configuration, structure and orientation of a particular static ontology is dependent upon the subject matter or field of the ontology in that each ontology contains a different point of view. The static ontologies store all senses for each word and concept. A knowledge classification system that includes the knowledge catalog is also disclosed. A knowledge catalog processor accesses the knowledge catalog to classify input terminology based on the knowledge concepts in the knowledge catalog. Furthermore, the knowledge catalog processor processes the input terminology prior to attachment in the knowledge catalog. The knowledge catalog further includes a dynamic level that includes dynamic hierarchies. The dynamic level adds details for the knowledge catalog by including additional words and terminology, arranged in a hierarchy, to permit a detailed and in-depth coverage of specific concepts contained in a particular discourse. The static and dynamic ontologies are relational such that the linking of one or more ontologies, or portions thereof, results in a very detailed organization of knowledge concepts.

228 citations

Proceedings Article•DOI•

KORE: keyphrase overlap relatedness for entity disambiguation

Johannes Hoffart¹, Stephan Seufert¹, Dat Ba Nguyen¹, Martin Theobald¹, Gerhard Weikum¹•Institutions (1)

Max Planck Society¹

29 Oct 2012

TL;DR: A novel notion of semantic relatedness between two entities represented as sets of weighted (multi-word) keyphrases, with consideration of partially overlapping phrases is developed, which improves the quality of prior link-based models, and also eliminates the need for explicit interlinkage between entities.

Abstract: Measuring the semantic relatedness between two entities is the basis for numerous tasks in IR, NLP, and Web-based knowledge extraction. This paper focuses on disambiguating names in a Web or text document by jointly mapping all names onto semantically related entities registered in a knowledge base. To this end, we have developed a novel notion of semantic relatedness between two entities represented as sets of weighted (multi-word) keyphrases, with consideration of partially overlapping phrases. This measure improves the quality of prior link-based models, and also eliminates the need for (usually Wikipedia-centric) explicit interlinkage between entities. Thus, our method is more versatile and can cope with long-tail and newly emerging entities that have few or no links associated with them. For efficiency, we have developed approximation techniques based on min-hash sketches and locality-sensitive hashing. Our experiments on semantic relatedness and on named entity disambiguation demonstrate the superiority of our method compared to state-of-the-art baselines.
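The min-hash idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the KORE measure itself: it estimates plain, unweighted Jaccard overlap between two sets of keyphrases, whereas the paper describes weighted keyphrases with partial phrase overlap. The function names, the choice of BLAKE2b as the hash family, and the sample keyphrases are all assumptions made for illustration.

```python
import hashlib


def _seeded_hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=128):
    """Return a min-hash sketch: for each seed, the smallest hash over the set."""
    unique = set(items)
    if not unique:
        raise ValueError("cannot sketch an empty set")
    return [
        min(_seeded_hash(seed, item) for item in unique)
        for seed in range(num_hashes)
    ]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching sketch positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical keyphrase sets for two entities (illustrative only).
    entity_a = {"rock band", "lead guitarist", "studio album", "world tour"}
    entity_b = {"rock band", "studio album", "record label", "drummer"}
    sig_a = minhash_signature(entity_a)
    sig_b = minhash_signature(entity_b)
    print("exact:", exact_jaccard(entity_a, entity_b))
    print("estimate:", estimate_jaccard(sig_a, sig_b))
```

Because each sketch has a fixed length, comparing two entities costs the same regardless of how many keyphrases they carry. Fixed-length sketches can also be split into bands for locality-sensitive hashing, which is the usual way to avoid comparing every pair.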

224 citations

Journal Article•DOI•

Combining knowledge bases consisting of first‐order theories

Chitta Baral¹, Sarit Kraus¹, Jack Minker¹, V. S. Subrahmanian¹•Institutions (1)

University of Maryland, College Park¹

01 Feb 1992

TL;DR: This paper studies the problem of combining the knowledge bases of different experts when the expert systems are treated as first-order theories. It presents techniques for resolving inconsistencies in such knowledge bases and provides algorithms for implementing these techniques.

Abstract: Consider the construction of an expert system by encoding the knowledge of different experts. Suppose the knowledge provided by each expert is encoded into a knowledge base. Then the process of combining the knowledge of these different experts is an important and nontrivial problem. We study this problem here when the expert systems are considered to be first-order theories. We present techniques for resolving inconsistencies in such knowledge bases. We also provide algorithms for implementing these techniques.

224 citations

Journal Article•DOI•

Different roles and mutual dependencies of data, information, and knowledge—an AI perspective on their integration

Agnar Aamodt¹, Mads Nygård²•Institutions (2)

Norwegian University of Science and Technology¹, SINTEF²

01 Oct 1995

TL;DR: It is shown that a specific problem solving episode, or case, may be viewed as data, information, or knowledge, depending on its role in decision making and learning from experience, and a conceptual framework for integration is suggested by focusing on their different roles and frames of reference within a decision-making process.

Abstract: The unclear distinction between data, information, and knowledge has impaired their combination and utilization for the development of integrated systems. There is need for a unified definitional model of data, information, and knowledge based on their roles in computational and cognitive information processing. An attempt to clarify these basic notions is made, and a conceptual framework for integration is suggested by focusing on their different roles and frames of reference within a decision-making process. On this basis, ways of integrating the functionalities of databases, information systems and knowledge-based systems are discussed by taking a knowledge level perspective to the analysis and modeling of systems behaviour. Motivated by recent work in the area of case-based reasoning related to decision support systems, it is further shown that a specific problem solving episode, or case, may be viewed as data, information, or knowledge, depending on its role in decision making and learning from experience. An outline of a case-based system architecture is presented, and used to show that a focus on the retaining and reuse of past cases facilitates a gradual and evolutionary transition from an information system to a knowledge-based system.

223 citations

Network Information

Performance

Metrics

Papers: 20,644
Citations: 453,302

No. of papers in the topic in previous years
Year	Papers
2023	120
2022	285
2021	506
2020	660
2019	740
2018	683
