Author

Sanghee Kim

Bio: Sanghee Kim is an academic researcher from the University of Southampton. The author has contributed to research in topics: Ontology (information science) & Knowledge extraction. The author has an h-index of 10 and has co-authored 16 publications receiving 835 citations.

Papers
Journal ArticleDOI
TL;DR: The paper considers the Artequakt project, which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and to guide information extraction; extraction is further enhanced with a lexicon-based term expansion mechanism that provides extended ontology terminology.
Abstract: To bring the Semantic Web to life and provide advanced knowledge services, we need efficient ways to access and extract knowledge from Web documents. Although Web page annotations could facilitate such knowledge gathering, annotations are rare and will probably never be rich or detailed enough to cover all the knowledge these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to classify domain knowledge. Other researchers have used ontologies to support knowledge extraction, but few have explored their full potential in this domain. The paper considers the Artequakt project which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction. The extraction tool searches online documents and extracts knowledge that matches the given classification structure. It provides this knowledge in a machine-readable format that will be automatically maintained in a knowledge base (KB). Knowledge extraction is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.

490 citations
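One way to picture the lexicon-based term expansion described in the abstract above is to widen each ontology relation label with synonyms before matching it against sentences. The following is a minimal Python sketch under that reading; the ontology fragment, the synonym table, and the function names are illustrative assumptions, not part of the Artequakt implementation.

```python
# Minimal sketch: expand ontology relation labels with lexicon synonyms,
# then scan sentences for any expanded term (illustrative, not Artequakt's code).

ONTOLOGY_RELATIONS = {          # assumed fragment of an artist ontology
    "date_of_birth": ["born"],
    "place_of_death": ["died"],
}

LEXICON = {                     # assumed synonym lists standing in for a lexical resource
    "born": ["born", "birth"],
    "died": ["died", "death", "passed away"],
}

def expand_terms(seed_terms):
    """Return the seed terms plus their lexicon synonyms."""
    expanded = set()
    for term in seed_terms:
        expanded.update(LEXICON.get(term, [term]))
    return expanded

def match_relations(sentence):
    """Return ontology relations whose expanded terminology appears in the sentence."""
    lowered = sentence.lower()
    hits = []
    for relation, seeds in ONTOLOGY_RELATIONS.items():
        if any(term in lowered for term in expand_terms(seeds)):
            hits.append(relation)
    return hits

print(match_relations("Rembrandt's death in 1669 was recorded in Amsterdam."))
# ['place_of_death'] -- 'death' matched via expansion even though 'died' is absent
```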

01 Jan 2002
TL;DR: An overview of the Artequakt system architecture is presented here and the three key components of that architecture are explained in detail, namely knowledge extraction, information management and biography construction.
Abstract: The Artequakt project seeks to automatically generate narrative biographies of artists from knowledge that has been extracted from the Web and maintained in a knowledge base. An overview of the system architecture is presented here and the three key components of that architecture are explained in detail, namely knowledge extraction, information management and biography construction. Conclusions are drawn from the initial experiences of the project and future progress is detailed.

98 citations
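The three-component architecture described above (knowledge extraction, information management, biography construction) can be pictured as a simple pipeline. The sketch below is a hypothetical Python outline of that flow; the class and method names are placeholders, not the project's actual API.

```python
# Hypothetical outline of a three-stage pipeline: extract facts from documents,
# consolidate them in a knowledge base, then render a narrative biography.

class KnowledgeExtractor:
    def extract(self, documents):
        # Placeholder: in Artequakt this step uses NLP tools guided by the ontology.
        return [{"subject": "Rembrandt", "relation": "date_of_birth", "value": "1606"}]

class KnowledgeBase:
    def __init__(self):
        self.facts = []

    def store(self, facts):
        # Placeholder for consolidation and duplicate removal (information management).
        for fact in facts:
            if fact not in self.facts:
                self.facts.append(fact)

class BiographyWriter:
    def compose(self, kb, artist):
        # Placeholder: turn stored facts into narrative sentences.
        lines = [f"{artist}: {f['relation'].replace('_', ' ')} {f['value']}"
                 for f in kb.facts if f["subject"] == artist]
        return " ".join(lines)

kb = KnowledgeBase()
kb.store(KnowledgeExtractor().extract(["<web page text>"]))
print(BiographyWriter().compose(kb, "Rembrandt"))
```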

01 Jan 2000
TL;DR: A versatile multi-agent framework designed for Distributed Information Management tasks, SoFAR embraces the notion of proactivity as the opportunistic reuse of the services provided by other agents, and provides the means to enable agents to locate suitable service providers.
Abstract: In this paper we present SoFAR, a versatile multi-agent framework designed for Distributed Information Management tasks. SoFAR embraces the notion of proactivity as the opportunistic reuse of the services provided by other agents, and provides the means to enable agents to locate suitable service providers. The contribution of SoFAR is to combine some ideas from the distributed computing community with the performative-based communications used in other agent systems: communications in SoFAR are based on the startpoint/endpoint paradigm, which is the foundation of Nexus, the communication layer at the heart of the Computational Grid. We explain the rationale behind our design decisions, and describe the predefined set of agents which make up the core of the system. Two distributed information management applications have been written, a general query architecture and an open hypermedia application, and we recount their design and operations.

46 citations
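The abstract above emphasises locating suitable service providers so that agents can opportunistically reuse each other's services. A minimal registry-lookup sketch in Python is given below; the registry, the performative-style message shape, and the agent names are assumptions made for illustration and do not reflect SoFAR's actual startpoint/endpoint interfaces.

```python
# Illustrative sketch of agent service advertisement and lookup
# (assumed interfaces; not SoFAR's API).

class Registry:
    def __init__(self):
        self.providers = {}          # service name -> list of provider agents

    def advertise(self, service, agent):
        self.providers.setdefault(service, []).append(agent)

    def lookup(self, service):
        return self.providers.get(service, [])

class QueryAgent:
    def __init__(self, name):
        self.name = name

    def handle(self, performative, content):
        # A provider answers a 'query' performative with a result message.
        if performative == "query":
            return {"from": self.name, "answer": f"results for {content!r}"}

registry = Registry()
registry.advertise("document-search", QueryAgent("search-agent-1"))

# A client agent locates a suitable provider and reuses its service.
for provider in registry.lookup("document-search"):
    print(provider.handle("query", "open hypermedia"))
```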

01 Jan 2003
TL;DR: This paper provides an update on the Artequakt system which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology.
Abstract: A large amount of digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents from multiple sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology. The ontology represents the type and form of knowledge to extract. This knowledge is then used to generate tailored biographies. The information extraction process of Artequakt is detailed and evaluated in this paper.

46 citations
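The extraction step summarised above pairs ontology-defined relations with text drawn from multiple documents. The fragment below is a speculative Python illustration of that idea using regular expressions; the patterns and relation names are invented for the example and are not the patterns Artequakt uses.

```python
import re

# Speculative illustration: ontology relations paired with surface patterns.
PATTERNS = {
    "date_of_birth": re.compile(r"born (?:in|on) (\d{4})"),
    "place_of_birth": re.compile(r"born in ([A-Z][a-z]+)"),
}

def extract_facts(sentences):
    """Collect (relation, value) pairs from sentences gathered across documents."""
    facts = set()
    for sentence in sentences:
        for relation, pattern in PATTERNS.items():
            match = pattern.search(sentence)
            if match:
                facts.add((relation, match.group(1)))
    return facts

docs = ["Vermeer was born in 1632.", "He was born in Delft to a silk worker."]
print(extract_facts(docs))
# extracts ('date_of_birth', '1632') and ('place_of_birth', 'Delft')
```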

Book ChapterDOI
20 Oct 2003
TL;DR: The design and prototype implementation of a novel architecture for integrated concept, metadata and content based browsing and retrieval of museum information is described. The work is part of a European project involving several major galleries, and the aim is to provide more versatile access to digital collections of museum artefacts.
Abstract: This paper describes the design and prototype implementation of a novel architecture for integrated concept, metadata and content based browsing and retrieval of museum information. The work is part of a European project involving several major galleries and the aim is to provide more versatile access to digital collections of museum artefacts, including 2-D images, 3-D models and other multimedia representations. An ontology for the museum domain, based on the CIDOC Conceptual Reference Model, is being developed as a semantic layer with references to the digital collection as instance information. A graphical concept browser is an integral component in the user interface, allowing navigation through the semantic layer, display of thumbnails, or full representations of artefacts and textual information in appropriate viewers and the invocation of conventional content based searching or combined querying. Semantic Web technologies are used in system integration to describe how tools for analysis and visualisation can be applied to different data types and sources. This supports flexible and managed formulation, execution and interpretation of the results of distributed multimedia queries. Combined searches using concepts, content and metadata can be initiated from a single user interface.

45 citations
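The combined concept, metadata and content-based searching described above can be sketched as intersecting three filters over an annotated collection. The Python fragment below is a toy model of that idea; the record fields and the similarity score are invented stand-ins, not the project's data model or its CIDOC CRM based ontology.

```python
# Toy model: each artefact record carries ontology concepts, metadata fields,
# and a precomputed content-similarity score against the query image.

ARTEFACTS = [
    {"id": "vase-17", "concepts": {"Vessel", "Ceramic"}, "artist": "unknown", "content_sim": 0.82},
    {"id": "plate-03", "concepts": {"Vessel"}, "artist": "Wedgwood", "content_sim": 0.35},
]

def combined_search(concept=None, metadata=None, min_content_sim=0.0):
    """Intersect a concept filter, a metadata filter, and a content-similarity threshold."""
    results = []
    for record in ARTEFACTS:
        if concept and concept not in record["concepts"]:
            continue
        if metadata and any(record.get(k) != v for k, v in metadata.items()):
            continue
        if record["content_sim"] < min_content_sim:
            continue
        results.append(record["id"])
    return results

print(combined_search(concept="Vessel", min_content_sim=0.5))   # ['vase-17']
```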


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations
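The mail-filtering example in the abstract above, where a system learns which messages a user rejects, maps naturally onto a simple supervised classifier. Below is a minimal word-count (naive-Bayes-style) sketch in Python for illustration only; the training data, vocabulary size, and smoothing choice are arbitrary.

```python
from collections import Counter
import math

# Minimal naive-Bayes-style spam filter learned from examples of the user's decisions.
TRAIN = [("win money now", "spam"), ("cheap money offer", "spam"),
         ("meeting at noon", "ham"), ("project meeting notes", "ham")]

counts = {"spam": Counter(), "ham": Counter()}
totals = Counter()
for text, label in TRAIN:
    words = text.split()
    counts[label].update(words)
    totals[label] += len(words)

def score(text, label, vocab_size=50):
    """Log-probability of the text under the label's word distribution (add-one smoothing)."""
    return sum(math.log((counts[label][w] + 1) / (totals[label] + vocab_size))
               for w in text.split())

def classify(text):
    return max(("spam", "ham"), key=lambda label: score(text, label))

print(classify("cheap money"))      # spam
print(classify("meeting notes"))    # ham
```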

Proceedings Article
01 May 2004
TL;DR: It is proposed in this paper that one approach to ontology evaluation should be corpus or data driven, because a corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge.
Abstract: The evaluation of ontologies is vital for the growth of the Semantic Web. We consider a number of problems in evaluating a knowledge artifact like an ontology. We propose in this paper that one approach to ontology evaluation should be corpus or data driven. A corpus is the most accessible form of knowledge and its use allows a measure to be derived of the ‘fit’ between an ontology and a domain of knowledge. We consider a number of methods for measuring this ‘fit’ and propose a measure to evaluate structural fit, and a probabilistic approach to identifying the best ontology.

407 citations
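One simple way to read the corpus-driven 'fit' between an ontology and a domain, as proposed above, is as lexical overlap between terms salient in a domain corpus and the labels of the ontology. The Python sketch below computes such an overlap ratio purely for illustration; it is a naive stand-in, not the structural or probabilistic measures proposed in the paper.

```python
from collections import Counter

# Naive illustration: what fraction of the corpus's frequent terms
# does the ontology's vocabulary cover?

def frequent_terms(corpus, top_n=5):
    words = Counter(w.lower() for doc in corpus for w in doc.split())
    return {w for w, _ in words.most_common(top_n)}

def coverage(ontology_labels, corpus, top_n=5):
    """Share of the corpus's top-n terms that appear among the ontology's labels."""
    terms = frequent_terms(corpus, top_n)
    labels = {l.lower() for l in ontology_labels}
    return len(terms & labels) / len(terms) if terms else 0.0

corpus = ["the artist painted a portrait", "the portrait shows the artist"]
print(coverage({"Artist", "Portrait", "Gallery"}, corpus))
# 0.4 -> two of the five most frequent corpus terms are covered by ontology labels
```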

Journal ArticleDOI
01 Oct 2005
TL;DR: An experimental website is constructed to test the approach, and the results show that the news agent based on the fuzzy ontology can operate effectively for news summarization.
Abstract: In this paper, a fuzzy ontology and its application to news summarization are presented. The fuzzy ontology with fuzzy concepts is an extension of the domain ontology with crisp concepts. It is more suitable to describe the domain knowledge than domain ontology for solving the uncertainty reasoning problems. First, the domain ontology with various events of news is predefined by domain experts. The document preprocessing mechanism will generate the meaningful terms based on the news corpus and the Chinese news dictionary defined by the domain expert. Then, the meaningful terms will be classified according to the events of the news by the term classifier. The fuzzy inference mechanism will generate the membership degrees for each fuzzy concept of the fuzzy ontology. Every fuzzy concept has a set of membership degrees associated with various events of the domain ontology. In addition, a news agent based on the fuzzy ontology is also developed for news summarization. The news agent contains five modules, including a retrieval agent, a document preprocessing mechanism, a sentence path extractor, a sentence generator, and a sentence filter to perform news summarization. Furthermore, we construct an experimental website to test the proposed approach. The experimental results show that the news agent based on the fuzzy ontology can effectively operate for news summarization.

377 citations
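The fuzzy-concept idea in the abstract above, where each concept carries membership degrees across news events, can be pictured with a small table of degrees and a scoring step for sentences. The Python fragment below is only a toy reading of that mechanism; the concepts, events, degrees, and threshold are made up for the example.

```python
# Toy reading of a fuzzy ontology: each fuzzy concept holds membership degrees
# over news events; sentences are scored for an event by summing the degrees
# of the concepts they mention.

FUZZY_CONCEPTS = {                       # made-up membership degrees
    "earthquake": {"disaster": 0.9, "economy": 0.2},
    "stock":      {"disaster": 0.1, "economy": 0.8},
}

def event_score(sentence, event):
    """Sum the membership degrees, for the given event, of concepts mentioned in the sentence."""
    words = sentence.lower().split()
    return sum(degrees.get(event, 0.0)
               for concept, degrees in FUZZY_CONCEPTS.items() if concept in words)

def pick_summary_sentences(sentences, event, threshold=0.5):
    """Keep sentences whose score for the event clears a threshold (a crude sentence filter)."""
    return [s for s in sentences if event_score(s, event) >= threshold]

news = ["The earthquake damaged the port.", "The stock index fell sharply."]
print(pick_summary_sentences(news, "disaster"))   # ['The earthquake damaged the port.']
```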