
Showing papers by "Carole Goble" published in 2006


Journal ArticleDOI
TL;DR: Taverna is an application that eases the use and integration of the growing number of molecular biology tools and databases available on the web, especially web services, to perform a range of different analyses, such as sequence analysis and genome annotation.
Abstract: Taverna is an application that eases the use and integration of the growing number of molecular biology tools and databases available on the web, especially web services. It allows bioinformaticians to construct workflows or pipelines of services to perform a range of different analyses, such as sequence analysis and genome annotation. These high-level workflows can integrate many different resources into a single analysis. Taverna is available freely under the terms of the GNU Lesser General Public License (LGPL) from http://taverna.sourceforge.net/.

1,033 citations


Journal ArticleDOI
TL;DR: The Taverna Workbench, developed by the myGrid project, supports the composition and execution of workflows for the life sciences community; this experience paper describes lessons learnt during its development.
Abstract: Life sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico experiments undertaken by these research groups can be represented as workflows involving the co-ordinated use of analysis programs and information repositories that may be globally distributed. With regards to Grid computing, the requirements relate to the sharing of analysis and information resources rather than sharing computational power. The myGrid project has developed the Taverna Workbench for the composition and execution of workflows for the life sciences community. This experience paper describes lessons learnt during the development of Taverna. A common theme is the importance of understanding how workflows fit into the scientists' experimental context. The lessons reflect an evolving understanding of life scientists' requirements on a workflow environment, which is relevant to other areas of data intensive and exploratory science.

729 citations


Journal IssueDOI
TL;DR: The Taverna Workbench, developed by the myGrid project, supports the composition and execution of workflows for the life sciences community; this experience paper describes lessons learnt during its development.
Abstract: Life sciences research is based on individuals, often with diverse skills, assembled into research groups. These groups use their specialist expertise to address scientific problems. The in silico experiments undertaken by these research groups can be represented as workflows involving the co-ordinated use of analysis programs and information repositories that may be globally distributed. With regards to Grid computing, the requirements relate to the sharing of analysis and information resources rather than sharing computational power. The myGrid project has developed the Taverna Workbench for the composition and execution of workflows for the life sciences community. This experience paper describes lessons learnt during the development of Taverna. A common theme is the importance of understanding how workflows fit into the scientists' experimental context. The lessons reflect an evolving understanding of life scientists' requirements on a workflow environment, which is relevant to other areas of data intensive and exploratory science. Copyright © 2005 John Wiley & Sons, Ltd.

115 citations


Journal ArticleDOI
TL;DR: A Reference Architecture is proposed that extends OGSA to support the explicit handling of semantics, and defines the associated knowledge services to support a spectrum of service capabilities.

104 citations


Proceedings ArticleDOI
18 Sep 2006
TL;DR: This paper presents one solution to workflow discovery: a mechanism for ranking workflow fragments based on graph sub-isomorphism matching. The tool evaluation finds that the average human ranking can largely be reproduced.
Abstract: Much has been written on the promise of Web service discovery and (semi-) automated composition. In this discussion, the value to practitioners of discovering and reusing existing service compositions, captured in workflows, is mostly ignored. This paper presents one solution to workflow discovery. Through a survey with 21 scientists and developers from the myGrid workflow environment, workflow discovery requirements are elicited. Through a user experiment with 13 scientists, an attempt is made to build a gold standard for workflow ranking. Through the design and implementation of a workflow discovery tool, a mechanism for ranking workflow fragments is provided based on graph sub-isomorphism matching. The tool evaluation, drawing on a corpus of 89 public workflows from bioinformatics and the results of the user experiment, finds that the average human ranking can largely be reproduced.

82 citations
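
As a rough illustration of the matching step named in the abstract above, the following is a minimal brute-force sketch that tests whether a small workflow fragment can be embedded in a larger workflow graph. It is not the authors' tool: the graph representation (a dictionary of node labels plus a set of directed edges), the function name `contains_fragment`, and the example service names are all assumptions made for illustration. It checks non-induced sub-isomorphism (extra edges in the workflow are allowed), and the ranking scheme itself is not described in the abstract, so it is not reproduced here.

```python
from itertools import permutations


def contains_fragment(workflow, fragment):
    """Return True if `fragment` can be embedded in `workflow`.

    Both arguments are (labels, edges) pairs (a hypothetical representation):
      labels: dict mapping node id -> service name
      edges:  set of (source node id, target node id) tuples
    Brute force: try every injective mapping of fragment nodes to
    workflow nodes, so this is only suitable for small graphs.
    """
    w_labels, w_edges = workflow
    f_labels, f_edges = fragment
    f_nodes = list(f_labels)

    for candidate in permutations(w_labels, len(f_nodes)):
        mapping = dict(zip(f_nodes, candidate))
        # Mapped nodes must carry the same service name.
        labels_match = all(
            f_labels[n] == w_labels[mapping[n]] for n in f_nodes
        )
        # Every fragment edge must also exist between the mapped nodes.
        edges_match = all(
            (mapping[a], mapping[b]) in w_edges for a, b in f_edges
        )
        if labels_match and edges_match:
            return True
    return False


# Hypothetical example: a three-step workflow and a two-step query fragment.
workflow = (
    {"a": "fetch_sequence", "b": "blast", "c": "render_report"},
    {("a", "b"), ("b", "c")},
)
fragment = (
    {"x": "fetch_sequence", "y": "blast"},
    {("x", "y")},
)
found = contains_fragment(workflow, fragment)
```

A practical tool would need a far more efficient matcher and some tolerance for partial matches; the exhaustive search here only conveys the idea.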


Book ChapterDOI
05 Nov 2006
TL;DR: It is shown how information can be inferred about the semantics of operation parameters based on their connections to other (annotated) operation parameters within tried-and-tested workflows.
Abstract: Semantic annotations of web services can facilitate the discovery of services, as well as their composition into workflows. At present, however, the practical utility of such annotations is limited by the small number of service annotations available for general use. Resources for manual annotation are scarce, and therefore some means is required by which services can be automatically (or semi-automatically) annotated. In this paper, we show how information can be inferred about the semantics of operation parameters based on their connections to other (annotated) operation parameters within tried-and-tested workflows. In an open-world context, we can infer only constraints on the semantics of parameters, but these so-called loose annotations are still of value in detecting errors within workflows, annotations and ontologies, as well as in simplifying the manual annotation task.

44 citations
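
The inference idea in the abstract above can be sketched as follows. This is a simplified reading, not the paper's algorithm: it assumes that a workflow link from an annotated output to an unannotated input means the input must accept at least that concept, and that a link from an unannotated output to an annotated input means the output must fit within that concept. The function name, the constraint labels `accepts` and `fits_within`, and the parameter and concept names are all hypothetical.

```python
from collections import defaultdict


def infer_loose_annotations(links, annotations):
    """Collect constraints on unannotated parameters from workflow links.

    links:       iterable of (output_param, input_param) pairs taken
                 from existing workflows
    annotations: dict mapping already-annotated parameters to a concept
    Returns a dict mapping each unannotated parameter to a set of
    (relation, concept) constraints; these are constraints only, not
    exact annotations.
    """
    constraints = defaultdict(set)
    for out_param, in_param in links:
        out_known = out_param in annotations
        in_known = in_param in annotations
        if out_known and not in_known:
            # The input is fed this concept, so it must accept it.
            constraints[in_param].add(("accepts", annotations[out_param]))
        elif in_known and not out_known:
            # The output feeds this input, so it must fit its concept.
            constraints[out_param].add(("fits_within", annotations[in_param]))
    return dict(constraints)


# Hypothetical parameters and concepts.
links = [
    ("blast.result", "parser.input"),
    ("fetch.sequence", "blast.query"),
]
annotations = {
    "blast.result": "BlastReport",
    "blast.query": "ProteinSequence",
}
loose = infer_loose_annotations(links, annotations)
```

Resolving these constraints against an ontology's class hierarchy, which is where conflicting constraints would reveal errors, is left out of the sketch.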


Book ChapterDOI
07 Oct 2006
TL;DR: The Semantic Web is an extension of the current Web in which information is given well-defined meaning to facilitate sharing and reuse, better enabling computers and people to work in cooperation.
Abstract: e-Science is scientific investigation performed through distributed global collaborations between scientists and their resources, and the computing infrastructure that enables this. Scientific progress increasingly depends on pooling know-how and results; making connections between ideas, people, and data; and finding and reusing knowledge and resources generated by others in perhaps unintended ways. It is about harvesting and harnessing the “collective intelligence” of the scientific community. The Semantic Web is an extension of the current Web in which information is given well-defined meaning to facilitate sharing and reuse, better enabling computers and people to work in cooperation. Applying the Semantic Web paradigm to e-Science has the potential to bring significant benefits to scientific discovery. We identify the benefits of lightweight and heavyweight approaches, based on our experiences in the Life Sciences.

22 citations


Book ChapterDOI
03 May 2006
TL;DR: The identity problem in bioinformatics data is described, and a protocol for managing identity co-references and allocating identity to gathered and computed data products is presented.
Abstract: myGrid is an e-Science project assisting life scientists to build workflows that gather data from distributed, autonomous, replicated and heterogeneous resources. The provenance logs of workflow executions are recorded as RDF graphs. The log of one workflow run is used to trace the history of its execution process. However, by aggregating provenance logs of many workflow runs, one may gather the provenance of a common data product shared in multiple derivation paths. A successful aggregation relies on accurate and universal identification of each data product. The nature of bioinformatics data and services, however, makes this difficult. We describe the identity problem in bioinformatics data, and present a protocol for managing identity co-references and allocating identity to gathered and computed data products. The ability to overcome this problem means that the provenance of workflows in bioinformatics and other domains can be exploited to enhance the practice of e-Science.

22 citations


Proceedings Article
23 May 2006
TL;DR: This year's WWW2006 has made a number of structural innovations over previous conferences including integrating both the developer track and tutorials with the main conference program instead of holding them on separate days.
Abstract: On behalf of the entire organizing committee, we welcome you to Edinburgh and WWW2006. This is the 15th conference in a series that was started in Geneva in 1994 and has subsequently been held in locations all over the world. WWW2006 is the first conference in the series to be held in the United Kingdom. This year's conference has made a number of structural innovations over previous conferences including integrating both the developer track and tutorials with the main conference program instead of holding them on separate days. There are two days of workshops compared with one day at previous conferences. The rich content provided over the five days of the conference includes the latest research laying the groundwork for the future of the Web, the latest technology for Web developers, and presentations and discussions on how the Web is affecting industry, education, government, and society in general. The conference includes a wide variety of technical presentations, panel discussions, tutorials, and keynote addresses. We have many very distinguished speakers including Professor Sir Tim Berners-Lee, inventor of the Web, and the event is opened by Jack McConnell (First Minister of Scotland).

21 citations


01 Feb 2006
TL;DR: The Grid middleware and the Grid applications they support thrive on the metadata that describes resources in all their forms, the VOs, the policies that drive them and so on, together with the knowledge to apply that metadata intelligently.
Abstract: The Grid aims to support secure, flexible and coordinated resource sharing through providing a middleware platform for advanced distributed computing. Consequently, the Grid's infrastructural machinery aims to allow collections of any kind of resources (computing, storage, data sets, digital libraries, scientific instruments, people, etc.) to easily form Virtual Organisations (VOs) that cross organisational boundaries in order to work together to solve a problem. A Grid depends on understanding the available resources, their capabilities, how to assemble them and how to best exploit them. Thus Grid middleware and the Grid applications they support thrive on the metadata that describes resources in all their forms, the VOs, the policies that drive them and so on, together with the knowledge to apply that metadata intelligently.

16 citations


22 May 2006
TL;DR: This second workshop is decidedly cross disciplinary in nature and brings together users, accessibility experts, graphic designers, and technologists from academia and industry to discuss how accessibility can be supported.
Abstract: After the launch of the Mobile Web Initiative at the World Wide Web Conference 2005 we are beginning to realise that, today, mobile Web access suffers from interoperability and usability problems that make the Web difficult to use. With the move to small screen size, low bandwidth, and different operating modalities, technology is in effect simulating the sensory and cognitive impairments experienced by disabled users within the wider population of mobile device users. In this our third workshop we ask the question "Is engineering, designing, and building for the mobile Web just a rehash of the same old Web accessibility problems?" These proceedings bring together a cross section of the Web design, accessibility and mobile Web communities. The papers included here report on developments, discuss the issues, and suggest cross-pollinated solutions. Conventional workshops on accessibility tend to be single disciplinary in nature. However, we are concerned that this focus on a single participant group prevents the cross-pollination of ideas, needs, and technologies from other related but separate fields. As with the first, this second workshop is decidedly cross disciplinary in nature and brings together users, accessibility experts, graphic designers, and technologists from academia and industry to discuss how accessibility can be supported. We also encourage the participation of users and other interested parties as an additional balance to the discussion. Our aim is to focus on accessibility by encouraging participation from many disciplines. Views often bridge academia, commerce, and industry and arguments encompass a range of beliefs across the design-accessibility spectrum.

01 Jan 2006
TL;DR: This paper reports on the discussions and recommendations of the workshop on the challenges of Scientific Workflows, sponsored by the National Science Foundation and held on May 1-2, 2006, to discuss requirements of future scientific applications and the challenges that they present to current workflow technologies.
Abstract: Workflows have recently emerged as a paradigm for representing and managing complex distributed scientific computations and therefore accelerate the pace of scientific progress. A recent workshop on the Challenges of Scientific Workflows, sponsored by the National Science Foundation and held on May 1-2, 2006, brought together domain scientists, computer scientists, and social scientists to discuss requirements of future scientific applications and the challenges that they present to current workflow technologies. This paper reports on the discussions and recommendations of the workshop; the full report can be found at http://www.isi.edu/nsf-workflows06. 1. Introduction: Significant scientific advances are increasingly achieved through complex sets of computations and data analyses. These computations may comprise thousands of steps, where each step may integrate diverse models and data sources developed by different groups. The applications and data may also be distributed in the execution environment. The assembly and management of such complex distributed computations present many challenges, and increasingly ambitious scientific inquiry is continuously pushing the limits of current technology.

Journal ArticleDOI
TL;DR: A generic framework for engineering and managing services’ Semantic Metadata (SMD) with the ultimate purpose of facilitating interoperability, automation, and knowledgeable reuse of services for problem solving is presented.
Abstract: Web/Grid services’ metadata and semantics are becoming increasingly important for service sharing and effective reuse. In this paper we present a generic framework for engineering and managing services’ Semantic Metadata (SMD) with the ultimate purpose of facilitating interoperability, automation, and knowledgeable reuse of services for problem solving. The framework addresses fundamental issues, approaches, and tools for the whole lifecycle of SMD management, in other words, those of acquiring, modeling, representing, publishing, and reusing services’ SMD. It adopts ontologies and the Semantic Web technologies as the enabling technologies by which services’ metadata are semantically enriched and made interoperable, understandable, and accessible on the Web/Grid for both humans and machines. In particular, mechanisms are proposed to make use of service SMD for service discovery and composition. The paper also describes a service SMD management system in the context of the UK e-Science project GEODISE. A suite of tools has been developed, which forms the core of the SMD management infrastructure. We demonstrate the added value of the use of SMD through the integration of SMD management with GEODISE application systems.

Proceedings ArticleDOI
04 Dec 2006
TL;DR: The paper describes the design of FAME (Flexible Access Middleware Extension) architecture aimed at providing multi-level user authentication service for Shibboleth, which is endorsed by the Joint Information Systems Committee (JISC) as the next generation authentication and authorisation infrastructure for the UK Higher Education community.
Abstract: The paper describes the design of FAME (Flexible Access Middleware Extension) architecture aimed at providing multi-level user authentication service for Shibboleth, which is endorsed by the Joint Information Systems Committee (JISC) as the next generation authentication and authorisation infrastructure for the UK Higher Education community. FAME derives authentication assurance level based upon the strength of the authentication token and protocol used by the user when authenticating and feeds it to the PERMIS authorisation decision engine via Shibboleth to enable more fine-grained access control. In this way, access to resources is controlled according to the strength of the authentication method so that more sensitive resources may require users to identify themselves using a higher level of authentication.
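
The access-control idea in the abstract above, where more sensitive resources demand a stronger authentication method, can be sketched in a few lines. This is only an illustration of the principle and does not use or reproduce any FAME, Shibboleth, or PERMIS interface. The level names, the `REQUIRED_LEVEL` policy table, the resource names, and the function `is_access_allowed` are all hypothetical.

```python
from enum import IntEnum


class AssuranceLevel(IntEnum):
    """Hypothetical assurance levels, ordered from weakest to strongest."""
    PASSWORD = 1
    ONE_TIME_TOKEN = 2
    SMART_CARD = 3


# Hypothetical policy: the minimum assurance level each resource requires.
REQUIRED_LEVEL = {
    "course-notes": AssuranceLevel.PASSWORD,
    "exam-marks": AssuranceLevel.SMART_CARD,
}


def is_access_allowed(resource: str, user_level: AssuranceLevel) -> bool:
    """Grant access only if the user's authentication was strong enough."""
    required = REQUIRED_LEVEL.get(resource)
    if required is None:
        # Deny by default when no policy exists for the resource.
        return False
    return user_level >= required
```

For example, under this assumed policy a user who logged in with a password could read `course-notes` but not `exam-marks`, while a smart-card login would satisfy both.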

Book ChapterDOI
21 Jun 2006
TL;DR: This paper discusses limitations of the Web's underlying hypertext infrastructure and developments in Semantic Web technologies, presents COHSE, a system that dynamically links Web pages, and concludes with remarks on future directions for semantics-based linking.
Abstract: Since Ted Nelson coined the term “Hypertext”, there has been extensive research on non-linear documents. With the enormous success of the Web, non-linear documents have become an important part of our daily life activities. However, the underlying hypertext infrastructure of the Web still lacks many features that Hypertext pioneers envisioned. With advances in the Semantic Web, we can address and improve some of these limitations. In this paper, we discuss some of these limitations, developments in Semantic Web technologies and present a system – COHSE – that dynamically links Web pages. We conclude with remarks on future directions for semantics-based linking.

Proceedings Article
01 Jun 2006
TL;DR: This work presents the implementation of an S-OGSA observant semantically-enabled Grid authorization scenario, which demonstrates the utility of explicit semantics for undertaking an essential activity in the Grid: resource access control.
Abstract: The Semantic Grid initiative aims to exploit knowledge in the Grid to increase the automation, interoperability and flexibility of Grid middleware and applications. To bring a principled approach to developing Semantic Grid Systems, and to outline their core capabilities and behaviors, we have devised a reference Semantic Grid Architecture called S-OGSA. We present the implementation of an S-OGSA observant semantically-enabled Grid authorization scenario, which demonstrates two aspects: 1) the roles of different middleware components, be they semantic or non-semantic, and 2) the utility of explicit semantics for undertaking an essential activity in the Grid: resource access control.

Journal ArticleDOI
TL;DR: The Third W4A International Cross-Disciplinary Workshop on Web Accessibility was held on Monday 22nd and Tuesday 23rd May 2006 as part of the Fifteenth International World Wide Web Conference (WWW2006) located at the Edinburgh International Conference Centre.
Abstract: The Third W4A International Cross-Disciplinary Workshop on Web Accessibility was held on Monday 22nd and Tuesday 23rd May 2006 as part of the Fifteenth International World Wide Web Conference (WWW2006) located at the Edinburgh International Conference Centre. We ran over 2 days, welcomed 73 attendees, and were the biggest workshop at the conference. We accepted 41.6% of all submissions; each paper was peer reviewed by three members of our programme committee. We published ISBN'ed proceedings as part of the ACM Digital Library, and eight of our authors have been invited to submit extended papers to the Springer Journal, Universal Access in the Information Society. Comments from our attendees, and our workshop evaluation questionnaires, suggested that they enjoyed the workshop and would be participating again next year. Our social programme also attracted 20 of our delegates. Overall we judge the workshop to be a great success.




Proceedings Article
01 Oct 2006

Book Chapter
01 Dec 2006
TL;DR: The Semantic Grid reference architecture, S-OGSA, includes semantic provisioning services that produce semantic annotations of Grid resources, and semantically aware Grid services that can exploit those annotations in various ways.
Abstract: The Semantic Grid reference architecture, S-OGSA, includes semantic provisioning services that are able to produce semantic annotations of Grid resources, and semantically aware Grid services that are able to exploit those annotations in various ways. In this paper we describe the dynamic aspects of S-OGSA by presenting the typical patterns of interaction among these services. A use case for a Grid meta-scheduling service is used to illustrate how the patterns are applied in practice.