Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.

Papers published on a yearly basis

[Chart: papers per year; years covered: 1970, 1972, 1976 to 2023]

Papers

Book

Data Mining Methods for Knowledge Discovery

Krzysztof J. Cios¹, Witold Pedrycz², R.M. Swiniarski³

University of Auckland¹, University of Manitoba², San Diego State University³

31 Aug 1998

TL;DR: This book covers data mining and knowledge discovery methods, including rough sets, fuzzy sets, Bayesian methods, evolutionary computing, machine learning, neural networks, clustering, and preprocessing.

Abstract: Foreword. Preface. 1. Data Mining and Knowledge Discovery. 2. Rough Sets. 3. Fuzzy Sets. 4. Bayesian Methods. 5. Evolutionary Computing. 6. Machine Learning. 7. Neural Networks. 8. Clustering. 9. Preprocessing. Index.

552 citations

Journal Article

A database perspective on knowledge discovery

Tomasz Imielinski¹, Heikki Mannila²

Rutgers University¹, University of Helsinki²

01 Nov 1996 - Communications of the ACM

TL;DR: The concept of data mining as a querying process and the first steps toward efficient development of knowledge discovery applications are discussed.

Abstract: Database mining is not simply another buzzword for statistical data analysis or inductive learning. Database mining sets new challenges to database technology: new concepts and methods are needed for query languages, basic operations, and query processing strategies. The most important new component is the ad hoc nature of knowledge and data discovery (KDD) queries and the need for efficient query compilation into a multitude of existing and new data analysis methods. Hence, database mining builds upon the existing body of work in statistics and machine learning but provides completely new functionalities. The current generation of database systems is designed mainly to support business applications. The success of Structured Query Language (SQL) has capitalized on a small number of primitives sufficient to support a vast majority of such applications. Unfortunately, these primitives are not sufficient to capture the emerging family of new applications dealing with knowledge discovery. Most current KDD systems offer isolated discovery features using tree inducers, neural nets, and rule discovery algorithms. Such systems cannot be embedded into a large application and typically offer just one knowledge dis... The concept of data mining as a querying process and the first steps toward efficient development of knowledge discovery applications are discussed.

547 citations

Journal Article

Open information extraction from the web

Oren Etzioni¹, Michele Banko¹, Stephen Soderland¹, Daniel S. Weld¹

University of Washington¹

01 Dec 2008 - Communications of the ACM

TL;DR: In this paper, a self-supervised learner employs a parser and heuristics to determine criteria that will be used by an extraction classifier (or other ranking model) for evaluating the trustworthiness of candidate tuples that have been extracted from the corpus of text.

Abstract: To implement open information extraction, a new extraction paradigm has been developed in which a system makes a single data-driven pass over a corpus of text, extracting a large set of relational tuples without requiring any human input. Using training data, a Self-Supervised Learner employs a parser and heuristics to determine criteria that will be used by an extraction classifier (or other ranking model) for evaluating the trustworthiness of candidate tuples that have been extracted from the corpus of text, by applying heuristics to the corpus of text. The classifier retains tuples with a sufficiently high probability of being trustworthy. A redundancy-based assessor assigns a probability to each retained tuple to indicate a likelihood that the retained tuple is an actual instance of a relationship between a plurality of objects comprising the retained tuple. The retained tuples comprise an extraction graph that can be queried for information.

545 citations

Book

Relational Data Mining

Sašo Džeroski, Nada Lavrač

01 Jan 2011

TL;DR: This coherently written multi-author monograph provides a thorough introduction and systematic overview of the area and will become a valuable source of reference for R&D professionals active in relational data mining.

Abstract: As the first book devoted to relational data mining, this coherently written multi-author monograph provides a thorough introduction and systematic overview of the area. The first part introduces the reader to the basics and principles of classical knowledge discovery in databases and inductive logic programming; subsequent chapters by leading experts assess the techniques in relational data mining in a principled and comprehensive way; finally, three chapters deal with advanced applications in various fields and refer the reader to resources for relational data mining. This book will become a valuable source of reference for R&D professionals active in relational data mining. Students as well as IT professionals and ambitioned practitioners interested in learning about relational data mining will appreciate the book as a useful text and gentle introduction to this exciting new field.

530 citations

Journal Article

Dynamic self-organizing maps with controlled growth for knowledge discovery

Damminda Alahakoon¹, Saman K. Halgamuge², Bala Srinivasan³

Monash University, Clayton campus¹, University of Melbourne², Monash University³

01 May 2000 - IEEE Transactions on Neural Networks

TL;DR: The growing self-organizing map (GSOM) is presented in detail and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated.

Abstract: The growing self-organizing map (GSOM) algorithm is presented in detail and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated. The spread factor is independent of the dimensionality of the data and as such can be used as a controlling measure for generating maps with different dimensionality, which can then be compared and analyzed with better accuracy. The spread factor is also presented as a method of achieving hierarchical clustering of a data set with the GSOM. Such hierarchical clustering allows the data analyst to identify significant and interesting clusters at a higher level of the hierarchy, and continue with finer clustering of the interesting clusters only. Therefore, only a small map is created in the beginning with a low spread factor, which can be generated for even a very large data set. Further analysis is conducted on selected sections of the data and of smaller volume. Therefore, this method facilitates the analysis of even very large data sets.

529 citations

Performance

Metrics

Papers: 20,644
Citations: 453,302

No. of papers in the topic in previous years
Year	Papers
2023	120
2022	285
2021	506
2020	660
2019	740
2018	683
