What is the use of the ontology in the case of electronic publishing?

For instance, in the case of electronic publishing, the ontology is used to enrich news items, which are submitted either through email or through a web-based form.

How long did it take to develop the ontology?

The design of the ontology was based on the analysis of scholarly articles from a range of different fields, and took about two person weeks’ effort.

What are the main issues that have been highlighted in the development of Planet?

In addition to the need for better search and retrieval facilities, the experience of a day-to-day use of Planet over more than two years has highlighted a number of other issues.

What is the key to successful knowledge management?

A key to successful knowledge management is to integrate these different media to provide the appropriate services in the relevant scenarios.

What is the paradoxical use of ontologies?

It might appear paradoxical to propose the use of ontologies to support scholarly communities in managing their knowledge, since conflicting worldviews, evidence and frames of reference lie at the heart of research and debate.

Why do the authors usually include pointers to WebOnto in their application interfaces?

Given that i) WebOnto has been designed to be as easy to use as possible and ii) in many cases end users want to inspect an ontology directly (for instance, to gain a better understanding of the underlying organization), the authors usually include pointers to WebOnto in their application interfaces.

(Open Access) Ontology-driven document enrichment (2000) | Enrico Motta

Q: What have the authors contributed in "Ontology-driven document enrichment: principles, tools and applications" ?

In this paper the authors present an approach to document enrichment, which consists of developing and integrating formal knowledge models with archives of documents, to provide intelligent knowledge retrieval and (possibly) additional knowledge-intensive services, beyond what is currently available using 'standard' information retrieval and search facilities. Their approach is ontology-driven, in the sense that the construction of the knowledge model is carried out in a top-down fashion, by populating a given ontology, rather than in a bottom-up fashion, by annotating a particular document. In the paper the authors give an overview of the approach and they examine the various types of issues (e.g., modelling, organizational and user interface issues) which need to be tackled to effectively deploy their approach in the workplace. In addition the authors also discuss a number of technologies they have developed to support ontology-driven document enrichment and they illustrate their ideas in the domains of electronic news publishing, scholarly discourse and medical guidelines.

Q: Why do the authors prefer to use the term "Enrichment"?

Because their model construction process is ontology-driven, the authors prefer to use the term "enrichment" (Sumner et al., 1998), rather than "conversion" or "annotation", to refer to the process of associating a formal model to a document (or set of documents).

Q: What technologies have been developed to support the collaborative development of knowledge models?

These technologies include a knowledge modelling language, form-based interfaces for adding and retrieving knowledge from a model, and a web-based browser/editor, which supports the collaborative development of knowledge models over the World-Wide-Web.

Open Research Online

The Open University’s repository of research publications and other research outputs

Ontology-driven document enrichment: principles, tools and applications

Journal Item

How to cite:

Motta, Enrico; Buckingham Shum, Simon and Domingue, John (2000). Ontology-driven document enrichment: principles, tools and applications. International Journal of Human-Computer Studies, 52(6) pp. 1071–1109.

© 2000 Academic Press

Version: Accepted Manuscript

Link(s) to article on publisher’s website:

http://dx.doi.org/doi:10.1006/ijhc.2000.0384

oro.open.ac.uk

To appear in the International Journal of Human-Computer Studies

Ontology-Driven Document Enrichment: Principles, Tools and Applications

Enrico Motta, Simon Buckingham Shum and John Domingue

Knowledge Media Institute

The Open University

Walton Hall, MK7 6AA

Milton Keynes, UK

{e.motta, s.buckingham.shum, j.b.domingue}@open.ac.uk

Abstract. In this paper we present an approach to document enrichment, which consists of developing and integrating formal knowledge models with archives of documents, to provide intelligent knowledge retrieval and (possibly) additional knowledge-intensive services, beyond what is currently available using 'standard' information retrieval and search facilities. Our approach is ontology-driven, in the sense that the construction of the knowledge model is carried out in a top-down fashion, by populating a given ontology, rather than in a bottom-up fashion, by annotating a particular document. In the paper we give an overview of the approach and we examine the various types of issues (e.g., modelling, organizational and user interface issues) which need to be tackled to effectively deploy our approach in the workplace. In addition we also discuss a number of technologies we have developed to support ontology-driven document enrichment and we illustrate our ideas in the domains of electronic news publishing, scholarly discourse and medical guidelines.

1. INTRODUCTION

An important activity in knowledge management is "to convert text to knowledge" (O’Leary, 1998). This activity is central to knowledge management for two reasons: i) work practices and information flow in organizations tend to be document-centred and ii) documents themselves do not normally exhibit the amount of structure required to support semantically-aware search engines or other forms of intelligent services. For these reasons there has been much interest in technology to support the specification of structured information in textual documents, especially web pages. The web standardisation community has focused on the underlying representational infrastructure: XML (XML, 1999) has been proposed as the basic annotation formalism to support the specification of structured information in web pages, while RDF builds on the XML syntax to provide a standard declarative representation, which allows users to express semantic relationships between items on the Web. Approaches such as Ontobroker (Fensel et al., 1998) and Shoe (Heflin et al., 1998) provide formalisms and associated interpreters which make it possible to embed knowledge representation structures in web pages and use them to perform inferences.

In this paper we look at the wider issues concerning "the conversion of text to knowledge" and discuss a comprehensive approach to document enrichment (Sumner et al., 1998), which we are trying out in a number of projects here at the Knowledge Media Institute. The approach is characterized in terms of a set of activities, with associated informal guidelines. In the paper we also describe a number of technologies, which we have developed to support our approach to document-centred knowledge management. These technologies include a knowledge modelling language¹, form-based interfaces for adding and retrieving knowledge from a model, and a web-based browser/editor, which supports the collaborative development of knowledge models over the World-Wide-Web. Finally, we discuss the application of our approach to three domains: electronic news publishing (Domingue and Motta, 1999), scholarly discourse (Buckingham Shum et al., 1999) and medical guidelines (PatMan, 1998).

The paper is organized as follows: in the next section we give an overview of our approach, in terms of the underlying methodological assumptions and the associated process model. In section 3 we describe the technology we have developed to support the approach. In sections 4, 5 and 6 we discuss the application of the approach to the three aforementioned domains. Finally, in sections 7 and 8 we discuss related work and reiterate the main contributions of this paper.

2. ONTOLOGY-DRIVEN DOCUMENT ENRICHMENT

Our approach is ontology-driven, in the sense that the construction of the knowledge model is carried out in a top-down fashion, by populating a given ontology (Gruber, 1993), rather than in a bottom-up fashion, by annotating a particular document. Figure 1 underlines this point graphically, by emphasizing that the construction of a knowledge model is driven by a pre-existing ontology, a set of documents and other sources of knowledge, such as appropriate (human) experts. Following Gruber, we use the term “ontology” to indicate “a specification of a reusable conceptualization”. More simply, an ontology can be seen as providing a vocabulary for describing a range of models. For instance, an ontology for medical guidelines provides a generic set of concepts and relations (e.g., medical condition, diagnostic guideline, guideline user type), which can then be instantiated for particular guidelines to build guideline-specific models, in domains such as stroke management or prevention of pressure ulcer.
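The idea of an ontology as a shared vocabulary that is instantiated per guideline can be sketched in a few lines of Python. This is only an illustration and not the knowledge modelling language used by the authors: the class names mirror the three concepts named in the text, while the field names and the user-type values ("hospital physician", "nurse") are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


# Generic vocabulary (the "ontology"): concept names follow the text,
# field names are hypothetical.
@dataclass(frozen=True)
class MedicalCondition:
    name: str


@dataclass(frozen=True)
class GuidelineUserType:
    name: str


@dataclass
class DiagnosticGuideline:
    title: str
    condition: MedicalCondition
    intended_users: List[GuidelineUserType] = field(default_factory=list)


# Two guideline-specific models built by instantiating the same vocabulary.
# The user types below are illustrative values, not taken from the paper.
stroke = DiagnosticGuideline(
    title="Stroke management",
    condition=MedicalCondition("stroke"),
    intended_users=[GuidelineUserType("hospital physician")],
)
pressure_ulcer = DiagnosticGuideline(
    title="Prevention of pressure ulcer",
    condition=MedicalCondition("pressure ulcer"),
    intended_users=[GuidelineUserType("nurse")],
)
```

Both models share one set of concept definitions, which is what makes them comparable and lets the same tools operate on either.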

An ontology-driven approach to model construction affords several advantages. Instantiating an ontology is usually simpler and speedier than developing a model from scratch. In addition, because an ontology makes explicit the conceptualization underlying a particular model, it becomes easier to maintain, reuse and interoperate the model with other components. Finally, reasoning modules can be associated with an ontology and these are then applicable to all models built by instantiating the ontology in question. For instance, in the case of medical guidelines, one can envisage building ontology-specific guideline verification tools, which can then be used to verify individual guidelines developed by instantiating the same generic guideline ontology.

¹ Here we use the term “knowledge modelling” as a short form for “knowledge-level modelling”, an expression introduced by Allen Newell (1982) to describe models of knowledge-intensive behaviour which abstract from the way this behaviour is implemented and focus instead on the knowledge employed by an agent and the goals the agent is trying to achieve.
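A reasoning module attached to an ontology, such as the guideline verification tool envisaged above, could look roughly like the following Python sketch. It assumes, purely for illustration, that a guideline model is a dictionary and that the generic ontology requires the three slots listed in `REQUIRED_SLOTS`; neither the slot names nor the `missing_slots` function come from the paper.

```python
# Hypothetical: slots that a generic guideline ontology might require
# every guideline-specific model to fill.
REQUIRED_SLOTS = ("medical_condition", "diagnostic_guideline", "guideline_user_type")


def missing_slots(model: dict) -> list:
    """Return the required slots that the model omits or leaves empty."""
    return [slot for slot in REQUIRED_SLOTS if not model.get(slot)]


# Two models instantiating the same (assumed) ontology.
stroke_model = {
    "medical_condition": "stroke",
    "diagnostic_guideline": "stroke management",
    "guideline_user_type": "hospital physician",
}
incomplete_model = {"medical_condition": "pressure ulcer"}

print(missing_slots(stroke_model))      # []
print(missing_slots(incomplete_model))  # ['diagnostic_guideline', 'guideline_user_type']
```

Because the check is written against the ontology rather than against any one guideline, it applies unchanged to every model built from that ontology.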

Figure 1. Ontology-driven Document Enrichment

Because our model construction process is ontology-driven, we prefer to use the term "enrichment" (Sumner et al., 1998), rather than "conversion" or "annotation", to refer to the process of associating a formal model to a document (or set of documents). In general, a representation, whether formal, graphical or textual, can be enriched in several different ways, e.g., i) by providing information about the context in which it was created, ii) by linking it to related artefacts of the same nature, or iii) by linking it to related artefacts of a different nature. Although in our document-centred knowledge management work we provide multiple forms of document enrichment, such as associating discussion spaces to documents (Sumner and Buckingham Shum, 1998), in this paper we will primarily concentrate on the association of formal knowledge models to documents².

Thus, an important facet of an ontology-centred approach to document enrichment is that the formalised knowledge is not meant to be a translation of what is informally specified in the associated document. Hence the knowledge model typically plays a different role from the associated text. For instance, in the medical guideline scenario the knowledge model helps to verify that all the kinds of knowledge expected to be found in a document describing a medical guideline are indeed there. In the scholarly discourse scenario the knowledge model is meant to capture the meta-knowledge required to structure academic debates (e.g., theory X contradicts theory Y), which is often expressed only implicitly in publications (i.e., acquiring it typically requires some interpretation effort) and is not modelled at all in traditional libraries. In a nutshell, the emphasis in our approach is on identifying the added value (in terms of enabling semantic retrieval and document indexing capabilities, or other reasoning services), which can be provided by a formalised knowledge model. Our methodology comprises the following six steps.

1. Identify use scenario.

2. Characterize viewpoint for ontology.

3. Develop the ontology.

4. Perform ontology-driven model construction.

5. Customise query interface for semantic knowledge retrieval.

6. Develop additional reasoning services on top of knowledge model.

These steps are briefly described in the next sub-sections.
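To make steps 4 and 5 slightly more concrete, here is a minimal Python sketch built on the scholarly discourse example used earlier (theory X contradicts theory Y). A model is populated with relation instances and then queried by relation rather than by keyword. The `Claim` type, the `related` function and the "supports" relation are assumptions introduced for illustration and are not part of the authors' tools.

```python
from typing import List, NamedTuple


class Claim(NamedTuple):
    """One relation instance in the model: subject --relation--> obj."""
    subject: str
    relation: str
    obj: str


# Step 4 (sketch): populate the model by instantiating the relations.
# "contradicts" comes from the text; "supports" is an assumed extra relation.
model = [
    Claim("theory X", "contradicts", "theory Y"),
    Claim("theory Z", "supports", "theory Y"),
    Claim("theory W", "contradicts", "theory Y"),
]


def related(claims: List[Claim], relation: str, target: str) -> List[str]:
    """Step 5 (sketch): return subjects standing in `relation` to `target`."""
    return sorted(c.subject for c in claims if c.relation == relation and c.obj == target)


print(related(model, "contradicts", "theory Y"))  # ['theory W', 'theory X']
```

A keyword search for "theory Y" would return all three entries; the relation-based query returns only the contradicting theories, which is the kind of added value a formal model is meant to provide.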

2.1 Identify Use Scenario

At this stage the services to be delivered by the knowledge management system are defined. In particular, issues of feasibility and cost are investigated. Addressing the latter involves answering questions such as: “What is the added value provided by the knowledge model, considering the non-trivial costs associated with the development and instantiation of an ontology?”, “Is there the need for a ‘full-blown’ knowledge model and for going beyond the facilities provided by off-the-shelf search engines?”, “What additional reasoning services will be provided, beyond deductive knowledge retrieval?”. Addressing feasibility issues requires assessing (among other things) whether or not it is feasible to expect the target user community to perform document enrichment or whether specialized human editors will be needed. This latter solution introduces a significant bottleneck in the process and moreover assumes that introducing a central editor into the model development is actually feasible. This is definitely not the case in some of our application domains. For instance, in the scholarly discourse scenario our

² Having said so, the medical guideline scenario described in section 6 does integrate a formal model with a set of discussion spaces, to provide multiple forms of document enrichment.

References

A translation approach to portable ontology specifications

Ontologies: principles, methods and applications

Letizia: an agent that assists web browsing

Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project

Building large knowledge-based systems
