
Knowledge extraction

About: Knowledge extraction is a research topic. Over the lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.


Papers
01 Jan 1999
TL;DR: An overview of the evolution of Protégé is given, examining the methodological assumptions underlying the original Protégé system and discussing the ways in which the methodology has changed over time.
Abstract: It has been 13 years since the first version of Protégé was run. The original tool was a small application, aimed mainly at building knowledge-acquisition tools for a few very specialized programs (it grew out of the ONCOCIN project and the subsequent attempts to build expert systems for protocol-based therapy planning). The most recent version, Protégé-2000, incorporates the Open Knowledge Base Connectivity (OKBC) knowledge model, is written to run across a wide variety of platforms, supports customized user-interface extensions, and has been used by over 300 individuals and research groups, most of whom are only peripherally interested in medical informatics. Researchers not directly involved in the project might well wonder how Protégé evolved, what are the reasons for the repeated reimplementations, and how to tell the various versions apart. In this paper, we give an overview of the evolution of Protégé, examining the methodological assumptions underlying the original Protégé system and discussing the ways in which the methodology has changed over time. We conclude with an overview of the latest version of Protégé, Protégé-2000.

1. MOTIVATION AND A TIMELINE
The Protégé applications (hereafter ‘Protégé’) are a set of tools that have been evolving for over a decade, from a simple program which helped construct specialized knowledge-bases to a set of general-purpose knowledge-base creation and maintenance tools. While Protégé began as a small application designed for a medical domain (protocol-based therapy planning), it has grown and evolved to become a much more general-purpose set of tools for building knowledge-based systems. The original goal of Protégé was to reduce the knowledge-acquisition bottleneck (Hayes-Roth et al., 1983) by minimizing the role of the knowledge engineer in constructing knowledge-bases. In order to do this, Musen (1988, 1989b) posited that knowledge acquisition proceeds in well-defined stages and that knowledge acquired in one stage could be used to generate and customize knowledge-acquisition tools for subsequent stages. In (Musen, 1988), Protégé was defined as an application that takes advantage of this structured information to simplify the knowledge-acquisition process. The original Protégé was described this way (Musen, 1988): Protégé is neither an expert system itself nor a program that builds expert systems directly. Instead, Protégé is a tool that helps users build other tools that are custom-tailored to assist with knowledge-acquisition for expert systems in specific application areas. The original Protégé demonstrated the viability of this approach, and of the use of task-specific knowledge to generate and customize knowledge-acquisition tools. But as with many first-

295 citations
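
To make the notion of a frame-style knowledge model (the kind standardized by OKBC and adopted in Protégé-2000) concrete, here is a minimal, hypothetical Python sketch: classes declare named slots, and instances fill those slots. The class, slot, and value names are invented for illustration; this is not Protégé's actual API.

```python
# Illustrative frame-based knowledge model in the spirit of OKBC/Protégé:
# classes with named slots, and instances whose slot values are checked
# against the class definition. All names here are hypothetical.

class Frame:
    """A class definition: a name plus the slots its instances may fill."""
    def __init__(self, name, slots):
        self.name = name
        self.slots = set(slots)

class Instance:
    """An individual that fills the slots declared by its frame."""
    def __init__(self, frame, **slot_values):
        unknown = set(slot_values) - frame.slots
        if unknown:
            raise ValueError(f"Unknown slots for {frame.name}: {unknown}")
        self.frame = frame
        self.slot_values = slot_values

# A protocol-based therapy plan, echoing Protégé's original medical domain.
protocol = Frame("TherapyProtocol", slots={"drug", "dose_mg", "interval_days"})
plan = Instance(protocol, drug="cisplatin", dose_mg=50, interval_days=21)
print(plan.frame.name, plan.slot_values)
```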

Proceedings Article
01 Jul 1998
TL;DR: This work shows how information extraction can be cast as a standard machine learning problem, argues for the suitability of relational learning in solving it, and describes the implementation of a general-purpose relational learner for information extraction, SRV.
Abstract: Because the World Wide Web consists primarily of text, information extraction is central to any effort that would use the Web as a resource for knowledge discovery. We show how information extraction can be cast as a standard machine learning problem, and argue for the suitability of relational learning in solving it. The implementation of a general-purpose relational learner for information extraction, SRV, is described. In contrast with earlier learning systems for information extraction, SRV makes no assumptions about document structure and the kinds of information available for use in learning extraction patterns. Instead, structural and other information is supplied as input in the form of an extensible token-oriented feature set. We demonstrate the effectiveness of this approach by adapting SRV for use in learning extraction rules for a domain consisting of university course and research project pages sampled from the Web. Making SRV Web-ready only involves adding several simple HTML-specific features to its basic feature set.

294 citations
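
To illustrate the idea of casting extraction as a standard learning problem over an extensible, token-oriented feature set, here is a rough Python sketch. It substitutes a decision tree for SRV's relational rule learner, and the features, labels, and toy page text are invented; it shows only the framing, not SRV itself.

```python
# Sketch: per-token classification over an extensible token-oriented
# feature set (toy data; not SRV's actual learner or features).
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

def token_features(tokens, i):
    """Token-oriented features; easy to extend with e.g. HTML-specific ones."""
    tok = tokens[i]
    return {
        "capitalized": tok[:1].isupper(),
        "numeric": tok.isdigit(),
        "length": len(tok),
        "prev_token": tokens[i - 1].lower() if i > 0 else "<start>",
    }

# Toy training page: label each token 1 if it is part of a course title.
tokens = "Course : Machine Learning Instructor : Jane Doe".split()
labels = [0, 0, 1, 1, 0, 0, 0, 0]

vec = DictVectorizer(sparse=False)
X = vec.fit_transform([token_features(tokens, i) for i in range(len(tokens))])
clf = DecisionTreeClassifier(random_state=0).fit(X, labels)

# Apply the learned extractor to an unseen page fragment.
test = "Course : Information Extraction".split()
X_test = vec.transform([token_features(test, i) for i in range(len(test))])
print(list(zip(test, clf.predict(X_test))))
```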

Book ChapterDOI
Bamshad Mobasher, Honghua Dai, Tao Luo, Yuqing Sun, Jiang Zhu
04 Sep 2000
TL;DR: This paper presents a framework for Web usage mining, distinguishing between the offline tasks of data preparation and mining, and the online process of customizing Web pages based on a user's active session, and describes effective techniques based on clustering to obtain a uniform representation for both site usage and site content profiles.
Abstract: Recent proposals have suggested Web usage mining as an enabling mechanism to overcome the problems associated with more traditional Web personalization techniques such as collaborative or content-based filtering. These problems include lack of scalability, reliance on subjective user ratings or static profiles, and the inability to capture a richer set of semantic relationships among objects (in content-based systems). Yet, usage-based personalization can be problematic when little usage data is available pertaining to some objects or when the site content changes regularly. For more effective personalization, both usage and content attributes of a site must be integrated into a Web mining framework and used by the recommendation engine in a uniform manner. In this paper we present such a framework, distinguishing between the offline tasks of data preparation and mining, and the online process of customizing Web pages based on a user's active session. We describe effective techniques based on clustering to obtain a uniform representation for both site usage and site content profiles, and we show how these profiles can be used to perform real-time personalization.

293 citations
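
A minimal sketch of the usage side of such a framework, under assumed data: offline, sessions are encoded as page-view vectors and clustered into usage profiles (k-means is used here as one plausible clustering choice); online, the active session is matched to the nearest profile and unvisited pages are ranked by profile weight. Page names, sessions, and parameters are illustrative only.

```python
# Offline: cluster page-view session vectors into usage profiles.
# Online: match the active session to a profile and recommend unseen pages.
import numpy as np
from sklearn.cluster import KMeans

pages = ["home", "courses", "ml", "db", "contact"]
# Rows are past sessions, columns are pages (1 if the page was viewed).
sessions = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
])

# Offline: each cluster centroid is a usage profile (page weights in [0, 1]).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(sessions)
profiles = km.cluster_centers_

# Online: match the active session to its profile, rank pages not yet seen.
active = np.array([1, 1, 0, 0, 0])
profile = profiles[km.predict(active.reshape(1, -1))[0]]
scores = {p: w for p, w, seen in zip(pages, profile, active) if not seen}
print(sorted(scores, key=scores.get, reverse=True))  # e.g. ['ml', ...]
```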

Journal ArticleDOI
TL;DR: It is suggested that data, information, and knowledge could serve as both the input and output of a visualization process, raising questions about their exact role in visualization.
Abstract: In visualization, we use the terms data, information and knowledge extensively, often in an interrelated context. In many cases, they indicate different levels of abstraction, understanding, or truthfulness. For example, "visualization is concerned with exploring data and information," "the primary objective in data visualization is to gain insight into an information space," and "information visualization" is for "data mining and knowledge discovery." In other cases, these three terms indicate data types, for instance, as adjectives in noun phrases, such as data visualization, information visualization, and knowledge visualization. These examples suggest that data, information, and knowledge could serve as both the input and output of a visualization process, raising questions about their exact role in visualization.

293 citations

Journal ArticleDOI

292 citations


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations (90% related)
Support vector machine: 73.6K papers, 1.7M citations (90% related)
Artificial neural network: 207K papers, 4.5M citations (87% related)
Fuzzy logic: 151.2K papers, 2.3M citations (86% related)
Feature extraction: 111.8K papers, 2.1M citations (86% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    120
2022    285
2021    506
2020    660
2019    740
2018    683