scispace - formally typeset
Search or ask a question
Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over the lifetime, 20251 publications have been published within this topic receiving 413401 citations.


Papers
More filters
Book
15 Jun 2002
TL;DR: Part A: Data mining and knowledge discovery Part B: Fundamental Concepts Part C: The process of knowledge discovery in databases Part D: Discovery Systems Part E: Interdisciplinary links of KDD Part F: Business problems Part G: Industry sectors Part H: KDD in practice: case studies
Abstract: Part A: Data mining and knowledge discovery Part B: Fundamental Concepts Part C: The process of knowledge discovery in databases Part D: Discovery Systems Part E: Interdisciplinary links of KDD Part F: Business problems Part G: Industry sectors Part H: KDD in practice: case studies

502 citations

Journal ArticleDOI
01 Aug 2002
TL;DR: A new algorithm called TITANIC for computing (iceberg) concept lattices is presented, based on data mining techniques with a level-wise approach, and shows an important gain in efficiency, especially for weakly correlated data.
Abstract: We introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: Computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows an important gain in efficiency, especially for weakly correlated data.

494 citations

Proceedings Article
20 Aug 1995
TL;DR: This research combines the KDD and text categorization paradigms and suggests advances to the state of the art in both areas.
Abstract: The information age is characterized by a rapid growth in the amount of information available in electronic media. Traditional data handling methods are not adequate to cope with this information flood. Knowledge Discovery in Databases (KDD) is a new paradigm that focuses on computerized exploration of large amounts of data and on discovery of relevant and interesting patterns within them. While most work on KDD is concerned with structured databases, it is clear that this paradigm is required for handling the huge amount of information that is available only in unstructured textual form. To apply traditional KDD on texts it is necessary to impose some structure on the data that would be rich enough to allow for interesting KDD operations. On the other hand, we have to consider the severe limitations of current text processing technology and define rather simple structures that can be extracted from texts fairly automatically and in a reasonable cost. We propose using a text categorization paradigm to annotate text articles with meaningful concepts that are organized in hierarchical structure. We suggest that this relatively simple annotation is rich enough to provide the basis for a KDD framework, enabling data summarization, exploration of interesting patterns, and trend analysis. This research combines the KDD and text categorization paradigms and suggests advances to the state of the art in both areas.

493 citations

Journal ArticleDOI
TL;DR: In this article, a data mining model for higher education system in the university is presented, where the classification task is used to evaluate student's performance and as there are many approaches that are used for data classification, the decision tree method is used here.
Abstract: The main objective of higher education institutions is to provide quality education to its students. One way to achieve highest level of quality in higher education system is by discovering knowledge for prediction regarding enrolment of students in a particular course, alienation of traditional classroom teaching model, detection of unfair means used in online examination, detection of abnormal values in the result sheets of the students, prediction about students' performance and so on. The knowledge is hidden among the educational data set and it is extractable through data mining techniques. Present paper is designed to justify the capabilities of data mining techniques in context of higher education by offering a data mining model for higher education system in the university. In this research, the classification task is used to evaluate student's performance and as there are many approaches that are used for data classification, the decision tree method is used here. By this task we extract knowledge that describes students' performance in end semester examination. It helps earlier in identifying the dropouts and students who need special attention and allow the teacher to provide appropriate advising/counseling. Keywords-Educational Data Mining (EDM); Classification; Knowledge Discovery in Database (KDD); ID3 Algorithm.

492 citations

Journal ArticleDOI
TL;DR: The Artequakt project is considered, which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction and is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.
Abstract: To bring the Semantic Web to life and provide advanced knowledge services, we need efficient ways to access and extract knowledge from Web documents. Although Web page annotations could facilitate such knowledge gathering, annotations are rare and will probably never be rich or detailed enough to cover all the knowledge these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to classify domain knowledge. Other researchers have used ontologies to support knowledge extraction, but few have explored their full potential in this domain. The paper considers the Artequakt project which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction. The extraction tool searches online documents and extracts knowledge that matches the given classification structure. It provides this knowledge in a machine-readable format that will be automatically maintained in a knowledge base (KB). Knowledge extraction is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.

490 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
2023120
2022285
2021506
2020660
2019740
2018683