Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over the lifetime, 20251 publications have been published within this topic receiving 413401 citations.

Papers published on a yearly basis

Papers

Book Chapter•DOI•

Relationships at the Heart of Semantic Web: Modeling, Discovering, and Exploiting Complex Semantic Relationships

Amit P. Sheth¹, I. Budak Arpinar¹, Vipul Kashyap²•Institutions (2)

University of Georgia¹, National Institutes of Health²

01 Jan 2004

TL;DR: This paper discusses modeling, representation and computation or validation of three types of complex semantic relationships: using predefined multi-ontology relationships for query processing and virtual relationships based on a set of patterns and paths between entities of interest.

Abstract: The primary goal of today's search and browsing techniques is to find relevant documents. As the current web evolves into the next generation termed the Semantic Web, the emphasis will shift from finding documents to finding facts, actionable information, and insights. Improving the ability to extract facts, mainly in the form of entities, embedded within documents leads to the fundamental challenge of discovering relevant and interesting relationships amongst the entities that these documents describe. Relationships are fundamental to semantics: they associate meanings with words, terms and entities. They are a key to new insights. Knowledge discovery is also about discovery of previously unknown relationships. The Semantic Web seeks to associate annotations (i.e., metadata), primarily based on concepts (often representing entities) from one or more ontologies/vocabularies, with all Web-accessible resources such that programs can associate "meaning with data". Not only does this support the goal of automatic interpretation and processing (access, invoke, utilize, and analyze), it also enables improvements in scalability compared to approaches that are not semantics-based. Identification, discovery, validation and utilization of relationships (such as during query evaluation) will be critical computations on the Semantic Web. Based on our research over the last decade, this paper takes an empirical look at various types of simple and complex relationships, what is captured and how they are represented, and how they are identified, discovered or validated, and exploited. These relationships may be based only on what is contained in or directly derived from data (direct content based relationships), or may be based on information extraction, external and prior knowledge and user defined computations (content descriptive relationships).
We also present some recent techniques for discovering indirect (i.e., transitive) and virtual (i.e., user-defined) yet meaningful (i.e., contextually relevant) relationships based on a set of patterns and paths between entities of interest. In particular, we will discuss modeling, representation and computation or validation of three types of complex semantic relationships: (a) using predefined multi-ontology relationships for query processing and

281 citations

Patent•

Method and apparatus for the integration of information and knowledge

Ronald M. Swartz¹, Jeffrey L. Winkler¹, Evelyn A. Janos¹, Igor Markidan¹, Qun Dou¹ +1 more•Institutions (1)

Xerox¹

29 Jun 1998

TL;DR: This patent presents a method and apparatus for integrating the operation of various independent software applications that manage information within an enterprise, using an expandable architecture with built-in knowledge integration features that facilitate the monitoring of information flow into, out of, and between the integrated information management applications.

Abstract: The present invention is a method and apparatus for first integrating the operation of various independent software applications directed to the management of information within an enterprise. The system architecture is, however, an expandable architecture, with built-in knowledge integration features that facilitate the monitoring of information flow into, out of, and between the integrated information management applications so as to assimilate knowledge information and facilitate the control of such information. Also included are additional tools which, using the knowledge information, enable more efficient use of the knowledge within an enterprise, including the ability to develop a context for and visualization of such knowledge.

280 citations

Proceedings Article•

Making machine learning models interpretable

Alfredo Vellido Alcacena¹, Jose David Martin Guerrero, Paulo J. G. Lisboa²•Institutions (2)

Polytechnic University of Catalonia¹, Liverpool John Moores University²

01 Jan 2012

TL;DR: This paper is a brief introduction to the special session on interpretable models in machine learning, organized as part of the 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, with an overview of the context of wider research on interpretability of machine learning models.

Abstract: Data of different levels of complexity and of ever growing diversity of characteristics are the raw materials that machine learning practitioners try to model using their wide palette of methods and tools. The obtained models are meant to be a synthetic representation of the available, observed data that captures some of their intrinsic regularities or patterns. Therefore, the use of machine learning techniques for data analysis can be understood as a problem of pattern recognition or, more informally, of knowledge discovery and data mining. There exists a gap, though, between data modeling and knowledge extraction. Models, depending on the machine learning techniques employed, can be described in diverse ways but, in order to consider that some knowledge has been achieved from their description, we must take into account the human cognitive factor that any knowledge extraction process entails. These models as such can be rendered powerless unless they can be interpreted, and the process of human interpretation follows rules that go well beyond technical prowess. For this reason, interpretability is a paramount quality that machine learning methods should aim to achieve if they are to be applied in practice. This paper is a brief introduction to the special session on interpretable models in machine learning, organized as part of the 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. It includes a discussion on the several works accepted for the session, with an overview of the context of wider research on interpretability of machine learning models.

280 citations

Applications of Data Mining Techniques in Healthcare and Prediction of Heart Attacks

K. Srinivas, B. Kavihta, A. Govrdhan

01 Jan 2010

TL;DR: The potential use of classification-based data mining techniques such as rule-based, decision tree, Naive Bayes and artificial neural network classifiers on massive volumes of healthcare data is examined.

Abstract: The healthcare environment is generally perceived as being 'information rich' yet 'knowledge poor'. There is a wealth of data available within healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. Knowledge discovery and data mining have found numerous applications in business and scientific domains. Valuable knowledge can be discovered by applying data mining techniques to healthcare systems. In this study, we briefly examine the potential use of classification-based data mining techniques such as rule-based, decision tree, Naive Bayes and artificial neural network classifiers on massive volumes of healthcare data. The healthcare industry collects huge amounts of healthcare data which, unfortunately, are not "mined" to discover hidden information. For data preprocessing and effective decision making, the One Dependency Augmented Naive Bayes classifier (ODANB) and the naive credal classifier 2 (NCC2) are used. The latter is an extension of naive Bayes to imprecise probabilities that aims at delivering robust classifications even when dealing with small or incomplete data sets. Discovery of hidden patterns and relationships often goes unexploited. Using medical profiles such as age, sex, blood pressure and blood sugar, the approach can predict the likelihood of patients getting heart disease. It enables significant knowledge, e.g. patterns and relationships between medical factors related to heart disease, to be established.

279 citations
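
As a rough illustration of the Naive Bayes classification mentioned in the abstract above, the following is a minimal sketch of a categorical Naive Bayes classifier with add-one (Laplace) smoothing. It is not the authors' implementation and does not cover ODANB or NCC2. The function names (`train_naive_bayes`, `predict_proba`), the feature encoding, and the six toy records are all hypothetical and invented purely for illustration; they carry no clinical meaning.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    # values[feature_index] -> set of distinct values seen for that feature
    values = defaultdict(set)
    for record, label in zip(records, labels):
        for k, value in enumerate(record):
            feature_counts[label][k][value] += 1
            values[k].add(value)
    return class_counts, feature_counts, values


def predict_proba(model, record):
    """Return normalized class probabilities for one record."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for k, value in enumerate(record):
            # Add-one smoothing so unseen values never give probability zero.
            count = feature_counts[label][k][value]
            score += math.log((count + 1) / (n + len(values[k])))
        log_scores[label] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(unnorm.values())
    return {label: u / z for label, u in unnorm.items()}


# Hypothetical toy records: (age band, blood pressure, blood sugar).
toy_records = [
    ("old", "high", "high"),
    ("old", "high", "normal"),
    ("old", "normal", "high"),
    ("young", "normal", "normal"),
    ("young", "high", "normal"),
    ("young", "normal", "normal"),
]
toy_labels = ["disease", "disease", "disease", "healthy", "healthy", "healthy"]

model = train_naive_bayes(toy_records, toy_labels)
probabilities = predict_proba(model, ("old", "high", "high"))
```

The sketch treats every feature as categorical and conditionally independent given the class, which is the defining assumption of Naive Bayes; real medical profiles would need discretization or a different likelihood model for continuous measurements.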

Journal Article•DOI•

WaveCluster: a wavelet-based clustering approach for spatial data in very large databases

Gholamhosein Sheikholeslami¹, Surojit Chatterjee¹, Aidong Zhang¹•Institutions (1)

University at Buffalo¹

01 Feb 2000

TL;DR: WaveCluster, a novel clustering approach based on wavelet transforms, is proposed; it is efficient, insensitive to noise and input order, and can effectively identify arbitrarily shaped clusters at different degrees of detail.

Abstract: Many applications require the management of spatial data in a multidimensional feature space. Clustering large spatial databases is an important problem, which tries to find the densely populated regions in the feature space to be used in data mining, knowledge discovery, or efficient information retrieval. A good clustering approach should be efficient and detect clusters of arbitrary shape. It must be insensitive to the noise (outliers) and the order of input data. We propose WaveCluster, a novel clustering approach based on wavelet transforms, which satisfies all the above requirements. Using the multiresolution property of wavelet transforms, we can effectively identify arbitrarily shaped clusters at different degrees of detail. We also demonstrate that WaveCluster is highly efficient in terms of time complexity. Experimental results on very large datasets are presented, which show the efficiency and effectiveness of the proposed approach compared to the other recent clustering methods.

279 citations
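
The grid-and-wavelet idea described in the WaveCluster abstract above can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: it assumes 2-D points in the unit square, swaps in plain 2x2 block averaging (the Haar approximation band, up to a scale factor) for whatever wavelet the authors use, and picks the density threshold by hand. All function names and parameters here (`wavecluster_sketch`, `grid_size`, `levels`, `threshold`) are hypothetical.

```python
from collections import deque


def _cell(value, grid_size):
    """Map a coordinate in [0, 1) to a grid index, clamping the upper edge."""
    return min(int(value * grid_size), grid_size - 1)


def quantize(points, grid_size):
    """Count the points falling into each cell of a square grid."""
    grid = [[0.0] * grid_size for _ in range(grid_size)]
    for x, y in points:
        grid[_cell(x, grid_size)][_cell(y, grid_size)] += 1.0
    return grid


def haar_lowpass(grid):
    """One level of 2-D Haar smoothing: average each 2x2 block of cells."""
    n = len(grid) // 2
    return [
        [
            (grid[2 * i][2 * j] + grid[2 * i + 1][2 * j]
             + grid[2 * i][2 * j + 1] + grid[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def label_components(grid, threshold):
    """Label 8-connected groups of cells above threshold (0 means noise)."""
    n = len(grid)
    labels = [[0] * n for _ in range(n)]
    current = 0
    for i in range(n):
        for j in range(n):
            if grid[i][j] > threshold and labels[i][j] == 0:
                current += 1
                labels[i][j] = current
                queue = deque([(i, j)])
                while queue:  # breadth-first flood fill over dense cells
                    a, b = queue.popleft()
                    for da in (-1, 0, 1):
                        for db in (-1, 0, 1):
                            c, d = a + da, b + db
                            if (0 <= c < n and 0 <= d < n
                                    and grid[c][d] > threshold
                                    and labels[c][d] == 0):
                                labels[c][d] = current
                                queue.append((c, d))
    return labels, current


def wavecluster_sketch(points, grid_size=8, levels=1, threshold=0.5):
    """Return (label per point, number of clusters); label 0 marks noise.

    grid_size is assumed to be divisible by 2 ** levels.
    """
    grid = quantize(points, grid_size)
    for _ in range(levels):
        grid = haar_lowpass(grid)
    labels, count = label_components(grid, threshold)
    factor = 2 ** levels  # each coarse cell covers factor x factor fine cells
    assignment = [
        labels[_cell(x, grid_size) // factor][_cell(y, grid_size) // factor]
        for x, y in points
    ]
    return assignment, count
```

Raising `levels` smooths the grid further and merges nearby dense regions, which is one way to read the abstract's "different degrees of detail"; because clustering operates on grid cells rather than on individual points, the result does not depend on the order of the input.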

Performance

Metrics

Papers: 20,644
Citations: 453,302

No. of papers in the topic in previous years
Year	Papers
2023	120
2022	285
2021	506
2020	660
2019	740
2018	683
