
Showing papers by "Carole Goble" published in 2002


Proceedings ArticleDOI
01 Dec 2002
TL;DR: A detailed investigation of the properties of these information content based measures is presented, and various properties of GO are examined which may have implications for its future design.
Abstract: Many bioinformatics resources hold data in the form of sequences. Often this sequence data is associated with a large amount of annotation. In many cases this data has been hard to model, and has been represented as scientific natural language, which is not readily amenable to computation. The development of the Gene Ontology provides us with a more accessible representation of some of this data. However, it is not clear how this data can best be searched or queried. Recently we have adapted information content based measures for use with the Gene Ontology (GO). In this paper we present a detailed investigation of the properties of these measures, and examine various properties of GO which may have implications for its future design.
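The abstract gives no formulae, but a minimal sketch of the kind of information content based measure described (in the style of Resnik similarity, over an invented GO-like fragment with made-up annotation counts) might look like this:

```python
import math

# A toy GO-like is_a DAG: term -> its direct parents. Terms and annotation
# counts are invented for illustration; they are not real GO data.
PARENTS = {
    "biological_process": set(),
    "metabolism": {"biological_process"},
    "carbohydrate metabolism": {"metabolism"},
    "glucose metabolism": {"carbohydrate metabolism"},
    "lipid metabolism": {"metabolism"},
    "signal transduction": {"biological_process"},
}

# Hypothetical counts of gene products annotated directly to each term.
DIRECT_ANNOTATIONS = {
    "biological_process": 0,
    "metabolism": 10,
    "carbohydrate metabolism": 6,
    "glucose metabolism": 3,
    "lipid metabolism": 5,
    "signal transduction": 6,
}

def ancestors(term):
    """All ancestors of a term, including the term itself."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen

def probability(term):
    """p(term): share of all annotations falling on the term or its descendants."""
    total = sum(DIRECT_ANNOTATIONS.values())
    covered = sum(DIRECT_ANNOTATIONS[t] for t in PARENTS if term in ancestors(t))
    return covered / total

def information_content(term):
    """IC(term) = -ln p(term): rarer terms carry more information."""
    return -math.log(probability(term))

def similarity(a, b):
    """Information content of the most informative common ancestor of a and b."""
    return max(information_content(c) for c in ancestors(a) & ancestors(b))

print(similarity("glucose metabolism", "lipid metabolism"))  # shared ancestor: metabolism
```

The similarity of two terms is the information content of their most informative common ancestor, so terms that share only a very general ancestor score close to zero.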

248 citations


Proceedings ArticleDOI
01 Dec 2002
TL;DR: The paper introduces DAML+OIL and demonstrates the activity within each stage of the methodology and the functionality gained.
Abstract: The Gene Ontology Next Generation Project (GONG) is developing a staged methodology to evolve the current representation of the Gene Ontology into DAML+OIL in order to take advantage of the richer formal expressiveness and the reasoning capabilities of the underlying description logic. Each stage provides a step level increase in formal explicit semantic content with a view to supporting validation, extension and multiple classification of the Gene Ontology. The paper introduces DAML+OIL and demonstrates the activity within each stage of the methodology and the functionality gained.
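The paper's own DAML+OIL encoding is not reproduced in the abstract, but the kind of gain in formal expressiveness at stake can be illustrated with a hypothetical description-logic definition (the term and property names below are invented, not axioms from GO or the paper):

```latex
% Illustrative only: a compositional, reasoner-checkable definition of the kind
% DAML+OIL makes possible, in contrast to GO's hand-asserted is-a links.
\mathit{glucose\_metabolism} \;\equiv\; \mathit{metabolism} \;\sqcap\; \exists\, \mathit{acts\_on}.\mathit{glucose}
```

Given definitions of this form, a description logic reasoner can infer additional parents automatically, which is what supports the validation and multiple classification mentioned above.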

144 citations


Journal ArticleDOI
01 Dec 2002
TL;DR: The Grid is an emerging platform to support on-demand "virtual organisations" for coordinated resource sharing and problem solving on a global scale and to realise its potential it also stands to benefit from Semantic Web technologies.
Abstract: The Grid is an emerging platform to support on-demand "virtual organisations" for coordinated resource sharing and problem solving on a global scale. The application thrust is large-scale scientific endeavour, and the scale and complexity of scientific data presents challenges for databases. The Grid is beginning to exploit technologies developed for Web Services and to realise its potential it also stands to benefit from Semantic Web technologies; conversely, the Grid and its scientific users provide application pull which will benefit the Semantic Web.

94 citations


Book ChapterDOI
30 Oct 2002
TL;DR: A number of candidate interpretations of annotation are identified, and the impact these interpretations may have on Semantic Web applications is discussed.
Abstract: Semantic metadata will play a significant role in the provision of the Semantic Web. Agents will need metadata that describes the content of resources in order to perform operations, such as retrieval, over those resources. In addition, if rich semantic metadata is supplied, those agents can then employ reasoning over the metadata, enhancing their processing power. Key to this approach is the provision of annotations, both through automatic and human means. The semantics of these annotations, however, in terms of the mechanisms through which they are interpreted and presented to the user, are sometimes unclear. In this paper, we identify a number of candidate interpretations of annotation, and discuss the impact these interpretations may have on Semantic Web applications.

66 citations


Journal ArticleDOI
01 Jun 2002
TL;DR: The early stages of building an ontology component of a bioinformatics resource querying application are described and the conceptualization is encoded using the ontology inference layer (OIL), a knowledge representation language that combines the modeling style of frame-based systems with the expressiveness and reasoning power of description logics (DLs).
Abstract: This paper describes the initial stages of building an ontology of bioinformatics and molecular biology. The conceptualization is encoded using the ontology inference layer (OIL), a knowledge representation language that combines the modeling style of frame-based systems with the expressiveness and reasoning power of description logics (DLs). This paper is the second of a pair in this special issue. The first described the core of the OIL language and the need to use ontologies to deliver semantic bioinformatics resources. In this paper, the early stages of building an ontology component of a bioinformatics resource querying application are described. This ontology (TaO) holds the information about molecular biology represented in bioinformatics resources and the bioinformatics tasks performed over these resources. It therefore represents the metadata of the resources the application can query. It also manages the terminologies used in constructing the query plans used to retrieve instances from those external resources. The methodology used in this task capitalizes upon three features of OIL: the conceptualization afforded by the frame-based view of OIL's syntax; the expressive power and reasoning of the logical formalism; and the ability both to encode handcrafted hierarchies of concepts and to define concepts in terms of their properties, which can then be used to establish a classification and infer relationships not encoded by the ontologist. This ability forms the basis of the methodology described here: for each portion of the TaO, a basic framework of concepts is asserted by the ontologist. Then the properties of these concepts are defined by the ontologist, and the logic's reasoning power is used to reclassify and infer further relationships. This cycle of elaboration and refinement is iterated on each portion of the ontology until a satisfactory ontology has been created.
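As a rough, self-contained illustration of the assert-define-reclassify cycle described above (this is not the TaO or any OIL tooling; the concept and property names are invented), a toy subsumption check might look like this:

```python
# Hand-asserted primitive hierarchy: concept -> its direct parents.
ASSERTED = {
    "Sequence": set(),
    "ProteinSequence": {"Sequence"},
    "NucleotideSequence": {"Sequence"},
}

# Defined concepts: an asserted primitive parent plus necessary-and-sufficient
# conditions expressed as (property, filler) pairs.
DEFINED = {
    "EnzymeSequence": ("ProteinSequence", {("has_function", "catalysis")}),
    "CatalyticSequence": ("Sequence", {("has_function", "catalysis")}),
}

def is_descendant(child, ancestor):
    """True if child equals ancestor or sits below it in the asserted hierarchy."""
    if child == ancestor:
        return True
    return any(is_descendant(p, ancestor) for p in ASSERTED.get(child, set()))

def subsumes(general, specific):
    """general subsumes specific if specific's asserted parent falls under
    general's parent and specific's definition includes all of general's conditions."""
    g_parent, g_conds = DEFINED[general]
    s_parent, s_conds = DEFINED[specific]
    return is_descendant(s_parent, g_parent) and g_conds <= s_conds

def classify():
    """Infer subsumptions between defined concepts that were never asserted."""
    return [(s, "is_a", g) for g in DEFINED for s in DEFINED
            if g != s and subsumes(g, s)]

print(classify())  # -> [('EnzymeSequence', 'is_a', 'CatalyticSequence')]
```

The inferred link between the two defined concepts was never stated by the ontologist, which is the point of the elaborate-and-reclassify loop.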

65 citations


Proceedings ArticleDOI
24 Jul 2002
TL;DR: This paper presents an approach to answering queries over an ontology modelled using a description logic that combines the use of the expressive ALCQI description logic with a global-as-view approach to relating the ontology to the sources.
Abstract: This paper presents an approach to answering queries over an ontology modelled using a description logic. The ontology acts as a global schema, providing a declarative description of the concepts of the domain, the instances of which are stored in (potentially many) object-wrapped sources. Queries are expressed using terms from the rich vocabulary of the ontology, and are translated into an equivalent calculus expression, which references only the objects available in the source databases. The query is then optimized on the basis of information from the ontology and the source databases. Distinctive features of the approach include: the use of the expressive ALCQI description logic, which supports both ontology definition and query expression; the adoption of a global-as-view approach to relating the ontology to the sources; and the use of the ontology to direct semantic optimization of queries phrased over specific sources. The approach is being developed in, and is illustrated using examples from, bioinformatics.
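As a toy illustration of the global-as-view idea sketched above (the actual system uses the ALCQI description logic over object-wrapped sources; the sources, concept names and mapping format here are invented):

```python
# Two invented "source databases" holding instance data.
SWISSPROT = [{"acc": "P12345", "species": "human"},
             {"acc": "Q99999", "species": "yeast"}]
ENZYMEDB  = [{"acc": "P12345", "ec": "1.1.1.1"}]

# Global-as-view mappings: each ontology concept is defined as a view
# (here, a function) over the underlying sources.
GAV_MAPPINGS = {
    "Protein": lambda: [row["acc"] for row in SWISSPROT],
    "Enzyme":  lambda: [row["acc"] for row in ENZYMEDB],
    # A concept the ontology defines as a conjunction unfolds into an
    # expression that touches both sources.
    "HumanEnzyme": lambda: [row["acc"] for row in SWISSPROT
                            if row["species"] == "human"
                            and row["acc"] in {e["acc"] for e in ENZYMEDB}],
}

def answer(concept):
    """Unfold an ontology-level query into source accesses and evaluate it."""
    return GAV_MAPPINGS[concept]()

print(answer("HumanEnzyme"))  # -> ['P12345']
```

In the paper's setting the unfolded expression is a calculus query that is then semantically optimised using the ontology; the sketch only shows the unfolding step.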

36 citations


Journal ArticleDOI
01 Jun 2002
TL;DR: The ontology inference layer (OIL), derived from the Semantic Web (a machine-understandable World-Wide Web), is described as a solution for semantic bioinformatics resources.
Abstract: The complex questions and analyses posed by biologists, as well as the diverse data resources they develop, require the fusion of evidence from different, independently developed, and heterogeneous resources. The web, as an enabler for interoperability, has been an excellent mechanism for data publication and transportation. Successful exchange and integration of information, however, depends on a shared language for communication (a terminology) and a shared understanding of what the data means (an ontology). Without this kind of understanding, semantic heterogeneity remains a problem for both humans and machines. One means of dealing with heterogeneity in bioinformatics resources is through terminology founded upon an ontology. Bioinformatics resources tend to be rich in human readable and understandable annotation, with each resource using its own terminology. These resources are machine readable, but not machine understandable. Ontologies have a role in increasing this machine understanding, reducing the semantic heterogeneity between resources and thus promoting the flexible and reliable interoperation of bioinformatics resources. This paper describes the ontology inference layer (OIL), derived from the semantic Web [a machine understandable World-Wide Web (WWW)], as a solution for semantic bioinformatics resources. The nature of the heterogeneity problems is presented, along with a description of how metadata from domain ontologies can be used to alleviate them. A companion paper in this issue gives an example of the development of a bio-ontology using OIL.

29 citations


Proceedings ArticleDOI
17 Dec 2002
TL;DR: A service-oriented knowledge engineering approach is introduced that provides knowledge-orientated support for distributed grid-based computing, demonstrating how knowledge has been captured and modelled and how ontologies have been developed and deployed.
Abstract: Computing increasingly addresses collaboration, sharing and interaction involving distributed resources. This has been fuelled in part by the emergence of Grid technologies and web services. Drawing on our expertise in the Geodise project, we argue that there is a growing requirement for knowledge engineering methods that provide a semantic foundation for such distributed computing. Such methods also support the sharing and coordinated use of knowledge itself. In this paper we introduce a service-oriented knowledge engineering approach that seeks to provide knowledge-orientated support for distributed grid-based computing. This approach has been implemented in a generic integrated architecture. The application context is the process of design search and optimisation in engineering, which demonstrates how knowledge has been captured and modelled, as well as illustrating how ontologies have been developed and deployed. The knowledge acquired has been made available and accessible through a portal that invokes a number of basic services.

21 citations


Journal ArticleDOI
01 Dec 2002
TL;DR: A panel entitled Scientific Data Integration was held on March 25, 2002, at the 8th Conference on Extending Database Technology (EDBT) in Prague, Czech Republic; it focused on the issues that need to be addressed to enable scientific data integration and discussed solutions.
Abstract: As scientific research becomes an increasingly larger portion of corporate expenditures, pressure is mounting to make the processes more efficient. Data acquisition, access, management, analysis and the sharing of all available resources will be at the core of the transformations needed to achieve the next level of efficiency in research organizations. However, current data management technology is geared toward business data, and several technical challenges remain to make it suitable for scientific data. A panel entitled Scientific Data Integration was held on March 25th, 2002, at the 8th Conference on Extending Database Technology (EDBT) in Prague, Czech Republic. The panel focused on the issues that need to be addressed to enable scientific data integration and discussed solutions. Omar Boucelma, Université de Provence, and Zoé Lacroix, Arizona State University, moderated the panel, which included Silvana Castano, Università di Milano, Carole Goble, University of Manchester, and Bertram Ludäscher, San Diego Supercomputer Center.

15 citations


02 Sep 2002
TL;DR: In the process of optimisation and design search, the modelling and analysis of engineering problems are exploited to yield improved designs, as discussed by the authors; the engineer explores various design parameters that he wishes to optimise, and a measure of the quality of a particular design (the objective function) is computed using an appropriate model.
Abstract: During the process of optimisation and design search, the modelling and analysis of engineering problems are exploited to yield improved designs. The engineer explores various design parameters that he wishes to optimise, and a measure of the quality of a particular design (the objective function) is computed using an appropriate model. A number of algorithms may be used to yield more information about the behaviour of a model, to minimise or maximise the objective function, and hence to improve the quality of the design. This process may include lengthy and repetitive calculations to obtain the value of the objective function with respect to the design variables.
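A minimal sketch of the loop just described, with an invented stand-in for the analysis model and a simple random search in place of the more sophisticated algorithms the authors have in mind:

```python
import random

def model(x1, x2):
    """Stand-in for an expensive analysis code returning the objective value
    (e.g. a quantity to be minimised) for a candidate design (x1, x2)."""
    return (x1 - 2.0) ** 2 + (x2 + 1.0) ** 2

def random_search(bounds, evaluations=200, seed=0):
    """Vary the design variables within bounds and keep the best design found."""
    rng = random.Random(seed)
    best_design, best_value = None, float("inf")
    for _ in range(evaluations):
        candidate = [rng.uniform(lo, hi) for lo, hi in bounds]
        value = model(*candidate)  # the repetitive, potentially lengthy calculation
        if value < best_value:
            best_design, best_value = candidate, value
    return best_design, best_value

design, objective = random_search(bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(design, objective)
```

In practice the model call is the expensive step, which is why such searches are natural candidates for the grid-based execution discussed in the Geodise entries above.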

15 citations


01 Jan 2002
TL;DR: My Grid, as discussed by the authors, is an e-Science Grid project that aims to help biologists and bioinformaticians to perform workflow-based in silico experiments, and to help them automate the management of such workflows through personalisation, notification of change and publication of experiments.
Abstract: My Grid is an e-Science Grid project that aims to help biologists and bioinformaticians to perform workflow-based in silico experiments, and to help them automate the management of such workflows through personalisation, notification of change and publication of experiments. In this paper, we describe the architecture of my Grid and how it will be used by the scientist. We then show how my Grid can benefit from agent technologies. We have identified three key uses of agent technologies in my Grid: user agents, able to customise and personalise data; agent communication languages, offering a generic and portable communication medium; and negotiation, allowing multiple distributed entities to reach service level agreements.

Proceedings ArticleDOI
17 Dec 2002
TL;DR: This position paper compares e-Science and e-Business using the discipline of bioinformatics and argues that the individual e-Scientist is now demanding more than the simple web interfaces prevalent in consumer e-commerce.
Abstract: We have models of commerce in a Web setting: business to business (B2B) and business to consumer (B2C). Now scientists commonly use Web based services to perform in-silico experiments. Thus we are prompted to ask the question "Are e-Scientists doing e-Business?". Do the infrastructure and models offered by e-Commerce support the activities e-Scientists need to perform? In this position paper we compare e-Science and e-Business using the discipline of bioinformatics. Such a comparison should inform the reuse of existing e-Business technologies in e-Science projects. We argue that the individual e-Scientist is now demanding more than the simple web interfaces prevalent in consumer e-Commerce. Individual e-Scientists need to interact in a manner more akin to the B2B model than the B2C style previously used. We examine how the infrastructure prevalent in the B2B arena of e-Commerce can be reused and extended to support the needs of today's e-Scientists. We illustrate this argument with reference to the myGrid e-Science middleware project.


02 Sep 2002
TL;DR: A service-oriented approach to providing knowledge support for distributed computing is introduced, and a generic knowledge service architecture is developed to realise this approach; the approach is applied to design search and optimisation (Geodise) to enhance the design process and to validate the architecture.
Abstract: Key objectives: introduce a service-oriented approach to providing knowledge support for distributed computing; develop a generic knowledge service architecture to realise such an approach; apply this approach to design search and optimisation (Geodise) to enhance the design process and to validate the architecture. Motivation for the work: as computing increasingly addresses collaboration, sharing and interaction involving distributed services, there is a growing demand for knowledge services that provide underlying semantic support for such distributed services and also support the sharing and coordinated use of knowledge itself.

Proceedings ArticleDOI
24 Jul 2002
TL;DR: A Grid Enabled Optimisation and Design Search system that offers grid-based access to a state-of-the-art collection of optimisation and design search tools, industrial strength analysis codes, and distributed computing and data resources.
Abstract: We are developing a Grid Enabled Optimisation and Design Search system (GEODISE). It offers grid-based access to a state-of-the-art collection of optimisation and design search tools, industrial strength analysis codes, and distributed computing and data resources.

Proceedings Article
01 Jan 2002
TL;DR: This paper presents an approach to answering queries over an ontology modelled using a description logic, where queries are expressed using terms from the rich vocabulary of the ontology, and translated into an equivalent calculus expression, which references only the objects available in the source databases.
Abstract: This paper presents an approach to answering queries over an ontology modelled using a description logic. The ontology acts as a global schema, providing a declarative description of the concepts of the domain, the instances of which are stored in (potentially many) object-wrapped sources. Queries are expressed using terms from the rich vocabulary of the ontology, and are translated into an equivalent calculus expression, which references only the objects available in the source databases. The query is then optimised on the basis of information from the ontology and the source databases.

Book ChapterDOI
04 Sep 2002
TL;DR: This work introduces the Semantic Web concept and gives a number of examples of how AI has already contributed to its development, primarily through knowledge representation languages, and explores the reasons why it is a challenging environment for AI.
Abstract: The Semantic Web is a vision to move the Web from a place where information is processed by humans to one where processing can be automated. Currently, AI seems to be making an impact on bringing the vision to reality. To add semantics to the web requires languages for representing knowledge. To infer relationships between resources or new facts requires web-scale automated reasoning. However, there is some skepticism in the web community that AI can be made "web appropriate" and work on a web scale. I will introduce the Semantic Web concept and give a number of examples of how AI has already contributed to its development, primarily through knowledge representation languages. I will explore the reasons why the Semantic Web is a challenging environment for AI. I will suggest that this could be a killer app for AI, but we must recognize that the web is a vast and untidy place, and only a combination of approaches will yield success.


Journal ArticleDOI
01 Mar 2002
TL;DR: BNCOD 2001 included scientific papers, invited talks, a panel and a poster session, and the audience of 60 attendees was chiefly drawn from the UK database community.
Abstract: The annual series of the British National Conference on Databases has been a forum for UK database practitioners and a focus for database research since 1981. In recent years, interest in this conference series has extended well beyond the UK. BNCOD 2001, the 18th conference in the series, was held at the CLRC Rutherford Appleton Laboratory (RAL) from 9th to 11th July 2001. RAL hosts national large-scale facilities for advanced scientific research. The Information Technology Department collaborates with the Laboratory's data centres that manage terabytes of data in remote sensing, high-energy physics and astronomy. BNCOD 2001 included scientific papers, invited talks, a panel and a poster session. The BNCOD Programme Committee, chaired by Professor Carole Goble of Manchester University, selected eleven papers for presentation at the meeting, about one third of those submitted. Contributors were drawn from the Netherlands, Germany, Sweden, Canada and the USA, as well as the UK. The audience of 60 attendees was chiefly drawn from the UK database community. The Proceedings are published by Springer-Verlag in the Lecture Notes in Computer Science series, and are available online at: http://link.springer.de/link/service/series/0558/tocs/t2097.htm.

Proceedings Article
01 Jan 2002
TL;DR: This paper gives an overview of current business discovery initiatives, examines the consequences of their lack of support for business-related search criteria, and presents and discusses a potential first step towards a solution.
Abstract: The emergence of the Internet has changed the way in which many aspects of business are conducted. One such aspect, that of resource discovery, is being changed by the development of a number of resource discovery systems, which facilitate the discovery of businesses (and their resources) anywhere in the world, through the medium of the Internet. However, these systems lack the power to discover potential trading partners based on business policy criteria, simply because they do not contain this sort of information. In this paper, we give an overview of current business discovery initiatives (focussing in particular on UDDI) and examine the consequences of their lack of support for business-related search criteria. We also present and discuss a potential first step towards a solution.