
Showing papers on "Web modeling published in 2009"


Journal ArticleDOI
TL;DR: The authors describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked data community as it moves forward.
Abstract: The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions— the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
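
As a rough, hedged illustration of these publishing practices (not code from the article), the sketch below uses the Python rdflib library to mint an HTTP URI for an entity, attach an RDF description to it, and set an RDF link into another dataset; the namespace and the DBpedia target URI are purely illustrative.

```python
# A minimal sketch of the Linked Data publishing pattern using rdflib
# (illustrative URIs and properties; not code from the article).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, OWL, RDF

EX = Namespace("http://example.org/people/")   # hypothetical namespace

g = Graph()
g.bind("foaf", FOAF)
g.bind("owl", OWL)

alice = EX["alice"]                            # HTTP URI identifying a real-world entity
g.add((alice, RDF.type, FOAF.Person))
g.add((alice, FOAF.name, Literal("Alice Example")))

# RDF link into another dataset, turning isolated data into a Web of Data.
g.add((alice, OWL.sameAs, Namespace("http://dbpedia.org/resource/")["Alice"]))

print(g.serialize(format="turtle"))
```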

5,113 citations


Book
06 Aug 2009
TL;DR: This book concentrates on Semantic Web technologies standardized by the World Wide Web Consortium: RDF and SPARQL enable data exchange and querying, RDFS and OWL provide expressive ontology modeling, and RIF supports rule-based modeling.
Abstract: With more substantial funding from research organizations and industry, numerous large-scale applications, and recently developed technologies, the Semantic Web is quickly emerging as a well-recognized and important area of computer science. While Semantic Web technologies are still rapidly evolving, Foundations of Semantic Web Technologies focuses on the established foundations in this area that have become relatively stable over time. It thoroughly covers basic introductions and intuitions, technical details, and formal foundations. The book concentrates on Semantic Web technologies standardized by the World Wide Web Consortium: RDF and SPARQL enable data exchange and querying, RDFS and OWL provide expressive ontology modeling, and RIF supports rule-based modeling. The text also describes methods for specifying, querying, and reasoning with ontological information. In addition, it explores topics that are clearly beyond foundations, such as tools, applications, and engineering aspects. Written by highly respected researchers with a deep understanding of the material, this text centers on the formal specifications of the subject and supplies many pointers that are useful for employing Semantic Web technologies in practice. Updates, errata, slides for teaching, and links to further resources are available at http://semantic-web-book.org/
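
To make the RDF and SPARQL standards mentioned above concrete, here is a small, self-contained sketch (toy data of our own, not an example from the book) that loads a Turtle document with rdflib and runs a SPARQL query over it.

```python
# A small sketch of RDF data exchange and SPARQL querying with rdflib
# (toy data; only meant to illustrate the standards the book covers).
from rdflib import Graph

turtle_doc = """
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:alice a foaf:Person ; foaf:name "Alice" ; foaf:knows ex:bob .
ex:bob   a foaf:Person ; foaf:name "Bob" .
"""

g = Graph()
g.parse(data=turtle_doc, format="turtle")

query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
    ?person a foaf:Person ;
            foaf:name ?name .
}
"""
for row in g.query(query):
    print(row.name)
```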

720 citations


Journal ArticleDOI
TL;DR: As work in Web page classification is reviewed, the importance of these Web-specific features and algorithms are noted, state-of-the-art practices are described, and the underlying assumptions behind the use of information from neighboring pages are tracked.
Abstract: Classification of Web page content is essential to many tasks in Web information retrieval such as maintaining Web directories and focused crawling. The uncontrolled nature of Web content presents additional challenges to Web page classification as compared to traditional text classification, but the interconnected nature of hypertext also provides features that can assist the process. As we review work in Web page classification, we note the importance of these Web-specific features and algorithms, describe state-of-the-art practices, and track the underlying assumptions behind the use of information from neighboring pages.
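
The survey's point that hypertext features such as anchor text from neighboring pages can complement on-page text is easy to illustrate. The sketch below is our own toy example (miniature made-up dataset, scikit-learn baseline), not anything from the paper: it simply concatenates on-page text with inbound anchor text before training a standard text classifier.

```python
# Toy illustration: combine on-page text with anchor text from linking pages
# before classification (scikit-learn; made-up miniature dataset).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pages = [
    {"text": "latest football scores and league tables", "anchors": "sports news"},
    {"text": "tutorial on training neural networks",      "anchors": "machine learning course"},
    {"text": "match report and player interviews",        "anchors": "football coverage"},
    {"text": "introduction to gradient descent",          "anchors": "optimization lecture notes"},
]
labels = ["sports", "tech", "sports", "tech"]

# Neighboring-page evidence is appended to the page's own text.
docs = [p["text"] + " " + p["anchors"] for p in pages]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(docs, labels)
print(clf.predict(["deep learning seminar linked as course material"]))
```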

502 citations


Proceedings ArticleDOI
06 Jul 2009
TL;DR: WSRec includes a user-contribution mechanism for Web service QoS information collection and an effective and novel hybrid collaborative filtering algorithm for Web service QoS value prediction; the comprehensive experimental analysis shows that WSRec achieves better prediction accuracy than other approaches.
Abstract: As the abundance of Web services on the World Wide Web increases, designing effective approaches for Web service selection and recommendation has become more and more important. In this paper, we present WSRec, a Web service recommender system, to attack this crucial problem. WSRec includes a user-contribution mechanism for Web service QoS information collection and an effective and novel hybrid collaborative filtering algorithm for Web service QoS value prediction. WSRec is implemented in Java and deployed in a real-world environment. To study the prediction performance, a total of 21,197 public Web services were obtained from the Internet and a large-scale real-world experiment was conducted, in which more than 1.5 million test results were collected from 150 service users in different countries on 100 publicly available Web services located all over the world. The comprehensive experimental analysis shows that WSRec achieves better prediction accuracy than other approaches.
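
The hybrid collaborative filtering idea, blending a user-based and an item-based estimate of a missing QoS value, can be sketched as follows. This is a simplification with a toy similarity-free blend, not WSRec's actual algorithm; matrix values and the weighting parameter are invented.

```python
# Simplified sketch of hybrid collaborative filtering for QoS prediction:
# blend a user-based and an item-based estimate of a missing value.
# (Not the WSRec algorithm itself; values and weighting are toy choices.)
import numpy as np

# rows = users, columns = Web services, values = observed response time (s); nan = missing
qos = np.array([
    [0.4, 1.2, np.nan, 0.9],
    [0.5, 1.0, 2.1,    1.1],
    [0.3, np.nan, 1.9, 0.8],
])

def predict(qos, u, i, lam=0.5):
    col, row = qos[:, i], qos[u, :]
    user_based = np.nanmean(col)          # average of other users on this service
    item_based = np.nanmean(row)          # average of this user on other services
    return lam * user_based + (1 - lam) * item_based

print(round(predict(qos, u=0, i=2), 3))   # predicted response time for user 0, service 2
```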

436 citations


Proceedings ArticleDOI
20 Apr 2009
TL;DR: Triplify is implemented as a light-weight software component, which can be easily integrated into and deployed by the numerous, widely installed Web applications and is usable to publish very large datasets, such as 160GB of geo data from the OpenStreetMap project.
Abstract: In this paper we present Triplify - a simplistic but effective approach to publish Linked Data from relational databases. Triplify is based on mapping HTTP-URI requests onto relational database queries. Triplify transforms the resulting relations into RDF statements and publishes the data on the Web in various RDF serializations, in particular as Linked Data. The rationale for developing Triplify is that the largest part of information on the Web is already stored in structured form, often as data contained in relational databases, but usually published by Web applications only as HTML mixing structure, layout and content. In order to reveal the pure structured information behind the current Web, we have implemented Triplify as a light-weight software component, which can be easily integrated into and deployed by the numerous, widely installed Web applications. Our approach includes a method for publishing update logs to enable incremental crawling of linked data sources. Triplify is complemented by a library of configurations for common relational schemata and a REST-enabled data source registry. Triplify configurations containing mappings are provided for many popular Web applications, including osCommerce, WordPress, Drupal, Gallery, and phpBB. We will show that despite its light-weight architecture Triplify is usable to publish very large datasets, such as 160GB of geo data from the OpenStreetMap project.
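
A rough sketch of the core Triplify idea, mapping an HTTP-style request path onto a SQL query and serializing the resulting rows as RDF, is given below. The table, path mapping, and vocabulary URIs are invented for illustration; this is not Triplify's actual implementation or configuration syntax.

```python
# Rough sketch of the Triplify idea: map an HTTP-style request path to a SQL
# query and turn the resulting rows into RDF triples (Turtle-like output).
# This is not Triplify's actual implementation or configuration format.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
db.executemany("INSERT INTO products VALUES (?, ?, ?)",
               [(1, "Mug", 7.5), (2, "T-Shirt", 14.0)])

# Hypothetical mapping: request path -> SQL query
MAPPINGS = {"/triplify/products": "SELECT id, name, price FROM products"}
BASE = "http://shop.example.org"

def triplify(path):
    triples = []
    for row_id, name, price in db.execute(MAPPINGS[path]):
        s = f"<{BASE}{path}/{row_id}>"
        triples.append(f'{s} <http://example.org/vocab/name> "{name}" .')
        triples.append(f'{s} <http://example.org/vocab/price> "{price}" .')
    return "\n".join(triples)

print(triplify("/triplify/products"))
```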

321 citations


Book ChapterDOI
01 Jan 2009
TL;DR: An expanded Web Content Analysis (WebCA) paradigm is proposed, in which insights from paradigms such as discourse analysis and social network analysis are operationalized and implemented within a general content analytic framework.
Abstract: Are established methods of content analysis (CA) adequate to analyze web content, or should new methods be devised to address new technological developments? This article addresses this question by contrasting narrow and broad interpretations of the concept of web content analysis. The utility of a broad interpretation that subsumes the narrow one is then illustrated with reference to research on weblogs (blogs), a popular web format in which features of HTML documents and interactive computer-mediated communication converge. The article concludes by proposing an expanded Web Content Analysis (WebCA) paradigm in which insights from paradigms such as discourse analysis and social network analysis are operationalized and implemented within a general content analytic framework.

304 citations


Journal ArticleDOI
TL;DR: The paper concludes by stating that the Web has succeeded as a single global information space that has dramatically changed the way the authors use information, disrupted business models, and led to profound societal change.
Abstract: The paper discusses the semantic Web and Linked Data. The classic World Wide Web is built upon the idea of setting hyperlinks between Web documents. These hyperlinks are the basis for navigating and crawling the Web. Technologically, the core idea of Linked Data is to use HTTP URLs not only to identify Web documents, but also to identify arbitrary real world entities. Data about these entities is represented using the Resource Description Framework (RDF). Whenever a Web client resolves one of these URLs, the corresponding Web server provides an RDF/XML or RDFa description of the identified entity. These descriptions can contain links to entities described by other data sources. The Web of Linked Data can be seen as an additional layer that is tightly interwoven with the classic document Web. The author mentions the application of Linked Data in media, publications, life sciences, geographic data, user-generated content, and cross-domain data sources. The paper concludes by stating that the Web has succeeded as a single global information space that has dramatically changed the way we use information, disrupted business models, and led to profound societal change.
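
The dereferencing behaviour described here (a client resolves an entity URI and the server answers with RDF) can be sketched with an ordinary HTTP request that asks for RDF via content negotiation. This sketch requires network access, assumes the endpoint honours the Accept header, and uses a DBpedia URI purely as an example source.

```python
# Sketch of Linked Data dereferencing via HTTP content negotiation
# (requires network access; DBpedia is used purely as an example source).
import requests
from rdflib import Graph

entity = "http://dbpedia.org/resource/Berlin"
resp = requests.get(entity,
                    headers={"Accept": "text/turtle"},
                    allow_redirects=True,
                    timeout=30)

g = Graph()
g.parse(data=resp.text, format="turtle")
print(f"{len(g)} triples describing {entity}")
```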

293 citations


01 Jan 2009
TL;DR: Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input, is introduced.
Abstract: The World Wide Web contains a significant amount of information expressed using natural language. While unstructured text is often difficult for machines to understand, the field of Information Extraction (IE) offers a way to map textual content into a structured knowledge base. The ability to amass vast quantities of information from Web pages has the potential to increase the power with which a modern search engine can answer complex queries. IE has traditionally focused on acquiring knowledge about particular relationships within a small collection of domain-specific text. Typically, a target relation is provided to the system as input along with extraction patterns or examples that have been specified by hand. Shifting to a new relation requires a person to create new patterns or examples. This manual labor scales linearly with the number of relations of interest. The task of extracting information from the Web presents several challenges for existing IE systems. The Web is large and heterogeneous; the number of potentially interesting relations is massive and their identity often unknown. To enable large-scale knowledge acquisition from the Web, this thesis presents Open Information Extraction, a novel extraction paradigm that automatically discovers thousands of relations from unstructured text and readily scales to the size and diversity of the Web.

279 citations


Book
27 Apr 2009
TL;DR: This book argues that it can be useful for social scientists to measure aspects of the web and explains how this can be achieved on both a small and large scale.
Abstract: Webometrics is concerned with measuring aspects of the web: web sites, web pages, parts of web pages, words in web pages, hyperlinks, web search engine results. The importance of the web itself as a communication medium and for hosting an increasingly wide array of documents, from journal articles to holiday brochures, needs no introduction. Given this huge and easily accessible source of information, there are limitless possibilities for measuring or counting on a huge scale (e.g., the number of web sites, the number of web pages, the number of blogs) or on a smaller scale (e.g., the number of web sites in Ireland, the number of web pages in the CNN web site, the number of blogs mentioning Barack Obama before the 2008 presidential campaign). This book argues that it can be useful for social scientists to measure aspects of the web and explains how this can be achieved on both a small and large scale. The book is intended for social scientists with research topics that are wholly or partly online (e.g., social networks, news, political communication) and social scientists with offline research topics with an online reflection, even if this is not a core component (e.g., diaspora communities, consumer culture, linguistic change). The book is also intended for library and information science students in the belief that the knowledge and techniques described will be useful for them to guide and aid other social scientists in their research. In addition, the techniques and issues are all directly relevant to library and information science research problems. Table of Contents: Introduction / Web Impact Assessment / Link Analysis / Blog Searching / Automatic Search Engine Searches: LexiURL Searcher / Web Crawling: SocSciBot / Search Engines and Data Reliability / Tracking User Actions Online / Advanced Techniques / Summary and Future Directions

257 citations


Journal ArticleDOI
01 Aug 2009
TL;DR: Octopus is a system that combines search, extraction, data cleaning and integration, and enables users to create new data sets from those found on the Web by offering the user a set of best-effort operators that automate the most labor-intensive tasks.
Abstract: The Web contains a vast amount of structured information such as HTML tables, HTML lists and deep-web databases; there is enormous potential in combining and re-purposing this data in creative ways. However, integrating data from this relational web raises several challenges that are not addressed by current data integration systems or mash-up tools. First, the structured data is usually not published cleanly and must be extracted (say, from an HTML list) before it can be used. Second, due to the vastness of the corpus, a user can never know all of the potentially-relevant databases ahead of time (much less write a wrapper or mapping for each one); the source databases must be discovered during the integration process. Third, some of the important information regarding the data is only present in its enclosing web page and needs to be extracted appropriately. This paper describes Octopus, a system that combines search, extraction, data cleaning and integration, and enables users to create new data sets from those found on the Web. The key idea underlying Octopus is to offer the user a set of best-effort operators that automate the most labor-intensive tasks. For example, the Search operator takes a search-style keyword query and returns a set of relevance-ranked and similarity-clustered structured data sources on the Web; the Context operator helps the user specify the semantics of the sources by inferring attribute values that may not appear in the source itself, and the Extend operator helps the user find related sources that can be joined to add new attributes to a table. Octopus executes some of these operators automatically, but always allows the user to provide feedback and correct errors. We describe the algorithms underlying each of these operators and experiments that demonstrate their efficacy.
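
A toy rendering of the operator idea (our own sketch, not Octopus code): SEARCH ranks candidate "web tables" by keyword overlap with the page title and column names, and CONTEXT attaches a value that only appears on the enclosing page; EXTEND, which would join a related table on a shared column, is omitted for brevity.

```python
# Toy sketch of Octopus-style best-effort operators over small in-memory
# "web tables" (our own illustration, not the Octopus system).
tables = [
    {"page": "VLDB 2009 program", "columns": ["paper", "authors"],
     "rows": [("Octopus", "Cafarella et al.")]},
    {"page": "University rankings", "columns": ["university", "country"],
     "rows": [("MIT", "USA")]},
]

def search(keyword_query):
    """Rank tables by keyword overlap with page title and column names."""
    terms = set(keyword_query.lower().split())
    def score(t):
        text = (t["page"] + " " + " ".join(t["columns"])).lower()
        return sum(term in text for term in terms)
    return sorted(tables, key=score, reverse=True)

def context(table, attribute, value_from_page):
    """Add an attribute whose value only appears on the enclosing page."""
    table["columns"].append(attribute)
    table["rows"] = [row + (value_from_page,) for row in table["rows"]]
    return table

best = search("vldb paper authors")[0]
best = context(best, "year", "2009")   # '2009' recovered from the page, not the table
print(best["columns"], best["rows"])
```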

255 citations


Journal ArticleDOI
TL;DR: This work presents UDDI registry by example (Urbe), a novel approach for Web service retrieval based on the evaluation of similarity between Web service interfaces that is implemented in a prototype that extends a universal description, discovery and integration compliant Web service registry.
Abstract: In this work, we present UDDI registry by example (Urbe), a novel approach for Web service retrieval based on the evaluation of similarity between Web service interfaces. Our approach assumes that the Web service interfaces are defined with Web service description language (WSDL) and the algorithm combines the analysis of their structures and the analysis of the terms used inside them. The higher the similarity, the smaller the differences between the interfaces. As a consequence, Urbe is useful when we need to find a Web service suitable to replace an existing one that fails. Especially in autonomic systems, this situation is very common since we need to ensure the self-management, the self-configuration, the self-optimization, the self-healing, and the self-protection of the application that is based on the failed Web service. A semantic-oriented variant of the approach is also proposed, where we take advantage of annotations semantically enriching WSDL specifications. Semantic Annotation for WSDL (SAWSDL) is adopted as a language to annotate a WSDL description. The Urbe approach has been implemented in a prototype that extends a universal description, discovery and integration (UDDI) compliant Web service registry.
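
As a hedged stand-in for Urbe's combined structural and linguistic analysis, the sketch below scores two hypothetical service descriptions by overlap of operation names and parameter names; the weights and the Jaccard measure are illustrative simplifications, not the paper's algorithm.

```python
# Simplified sketch of interface similarity between two service descriptions:
# combine operation-name overlap with parameter-name overlap.  Urbe's actual
# algorithm (structural WSDL analysis plus term analysis) is richer than this.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def interface_similarity(svc_a, svc_b, w_ops=0.6, w_params=0.4):
    ops = jaccard(svc_a["operations"], svc_b["operations"])
    params = jaccard(svc_a["parameters"], svc_b["parameters"])
    return w_ops * ops + w_params * params

weather_v1 = {"operations": ["getForecast", "getCurrentTemp"],
              "parameters": ["city", "country", "date"]}
weather_v2 = {"operations": ["getForecast", "getHumidity"],
              "parameters": ["city", "date"]}

print(round(interface_similarity(weather_v1, weather_v2), 3))
```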

Book ChapterDOI
06 Nov 2009
TL;DR: This paper presents TripleRank, a novel approach for faceted authority ranking in the context of RDF knowledge bases that captures the additional latent semantics of Semantic Web data by means of statistical methods in order to produce richer descriptions of the available data.
Abstract: The Semantic Web fosters novel applications targeting a more efficient and satisfying exploitation of the data available on the web, e.g. faceted browsing of linked open data. Large amounts and high diversity of knowledge in the Semantic Web pose the challenging question of appropriate relevance ranking for producing fine-grained and rich descriptions of the available data, e.g. to guide the user along most promising knowledge aspects. Existing methods for graph-based authority ranking lack support for fine-grained latent coherence between resources and predicates (i.e. support for link semantics in the linked data model). In this paper, we present TripleRank, a novel approach for faceted authority ranking in the context of RDF knowledge bases. TripleRank captures the additional latent semantics of Semantic Web data by means of statistical methods in order to produce richer descriptions of the available data. We model the Semantic Web by a 3-dimensional tensor that enables the seamless representation of arbitrary semantic links. For the analysis of that model, we apply the PARAFAC decomposition, which can be seen as a multi-modal counterpart to Web authority ranking with HITS. The results are groupings of resources and predicates that characterize their authority and navigational (hub) properties with respect to identified topics. We have applied TripleRank to multiple data sets from the linked open data community and gathered encouraging feedback in a user evaluation where TripleRank results have been exploited in a faceted browsing scenario.
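
A full PARAFAC decomposition is beyond a short sketch, but the underlying data model (an subject-predicate-object adjacency tensor) and its relation to HITS can be hinted at with numpy. Running plain HITS on a single predicate slice, as below, is only a simplified analogue of TripleRank's multi-modal analysis; the triples are invented.

```python
# Sketch of the TripleRank data model: RDF triples as a 3-way adjacency tensor.
# For brevity we run plain HITS on one predicate slice; TripleRank instead
# applies a PARAFAC decomposition across all predicates at once.
import numpy as np

resources = ["A", "B", "C"]
predicates = ["cites", "authoredBy"]
triples = [("A", "cites", "B"), ("A", "cites", "C"), ("B", "cites", "C"),
           ("C", "authoredBy", "A")]

r_idx = {r: i for i, r in enumerate(resources)}
p_idx = {p: i for i, p in enumerate(predicates)}

T = np.zeros((len(resources), len(predicates), len(resources)))
for s, p, o in triples:
    T[r_idx[s], p_idx[p], r_idx[o]] = 1.0

A = T[:, p_idx["cites"], :]                # adjacency slice for one predicate
hubs = np.ones(len(resources))
auths = np.ones(len(resources))
for _ in range(50):                        # HITS power iteration
    auths = A.T @ hubs
    hubs = A @ auths
    auths /= np.linalg.norm(auths)
    hubs /= np.linalg.norm(hubs)

print(dict(zip(resources, np.round(auths, 3))))
```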

Journal ArticleDOI
TL;DR: It is hard to find a truly context-aware web service-based system that is interoperable and secure, and operates in multi-organizational environments.
Abstract: Purpose – This survey aims to study and analyze current techniques and methods for context-aware web service systems, to discuss future trends and propose further steps on making web service systems context-aware. Design/methodology/approach – The paper analyzes and compares existing context-aware web service-based systems based on techniques they support, such as context information modeling, context sensing, distribution, security and privacy, and adaptation techniques. Existing systems are also examined in terms of application domains, system type, mobility support, multi-organization support and level of web services implementation. Findings – Support for context-aware web service-based systems is increasing. It is hard to find a truly context-aware web service-based system that is interoperable and secure, and operates in multi-organizational environments. Various issues, such as distributed context management, context-aware service modeling and engineering, context reasoning and quality of context, se...

Journal ArticleDOI
TL;DR: The base of Web 3.0 applications resides in the resource description framework (RDF) for providing a means to link data from multiple Web sites or databases, and with the SPARQL query language, applications can use native graph-based RDF stores and extract RDF data from traditional databases.
Abstract: While Web 3.0 technologies are difficult to define precisely, the outline of emerging applications has become clear over the past year. We can thus essentially view Web 3.0 as semantic Web technologies integrated into, or powering, large-scale Web applications. The base of Web 3.0 applications resides in the resource description framework (RDF) for providing a means to link data from multiple Web sites or databases. With the SPARQL query language, a SQL-like standard for querying RDF data, applications can use native graph-based RDF stores and extract RDF data from traditional databases.

Book
27 Mar 2009
TL;DR: This book examines how this powerful new technology can unify and fully leverage the ever-growing data, information, and services that are available on the Internet.
Abstract: The next major advance in the Web, Web 3.0, will be built on semantic Web technologies, which will allow data to be shared and reused across application, enterprise, and community boundaries. Written by a team of highly experienced Web developers, this book examines how this powerful new technology can unify and fully leverage the ever-growing data, information, and services that are available on the Internet. Helpful examples demonstrate how to use the semantic Web to solve practical, real-world problems while you take a look at the set of design principles, collaborative working groups, and technologies that form the semantic Web. The companion Web site features full code, as well as a reference section, a FAQ section, a discussion forum, and a semantic blog.

Proceedings Article
10 Aug 2009
TL;DR: Gazelle is introduced, a secure web browser constructed as a multi-principal OS that exclusively manages resource protection and sharing across web site principals and exposes intricate design issues that no previous work has identified.
Abstract: Original web browsers were applications designed to view static web content. As web sites evolved into dynamic web applications that compose content from multiple web sites, browsers have become multi-principal operating environments with resources shared among mutually distrusting web site principals. Nevertheless, no existing browsers, including new architectures like IE 8, Google Chrome, and OP, have a multi-principal operating system construction that gives a browser-based OS the exclusive control to manage the protection of all system resources among web site principals. In this paper, we introduce Gazelle, a secure web browser constructed as a multi-principal OS. Gazelle's browser kernel is an operating system that exclusively manages resource protection and sharing across web site principals. This construction exposes intricate design issues that no previous work has identified, such as cross-protection-domain display and events protection. We elaborate on these issues and provide comprehensive solutions. Our prototype implementation and evaluation experience indicates that it is realistic to turn an existing browser into a multi-principal OS that yields significantly stronger security and robustness with acceptable performance.
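
The browser-kernel principle, that all resource access is mediated and isolated per web site principal, can be hinted at with a toy sketch. The principal definition (scheme, host, port) and the resource labels below are our own illustrative choices, not Gazelle's design.

```python
# Toy sketch of the browser-kernel idea: all resource access is mediated by a
# kernel that enforces isolation between web site principals (origins).
# This only illustrates the principle; it is not Gazelle's architecture.
from urllib.parse import urlparse

def principal_of(url):
    p = urlparse(url)
    return (p.scheme, p.hostname, p.port or {"http": 80, "https": 443}[p.scheme])

class BrowserKernel:
    def __init__(self):
        self.owned = {}                     # resource id -> owning principal

    def create(self, resource, owner_url):
        self.owned[resource] = principal_of(owner_url)

    def access(self, resource, requester_url):
        allowed = self.owned[resource] == principal_of(requester_url)
        print(("ALLOW" if allowed else "DENY"), requester_url, "->", resource)
        return allowed

kernel = BrowserKernel()
kernel.create("cookie:session", "https://bank.example")
kernel.access("cookie:session", "https://bank.example/account")   # same principal
kernel.access("cookie:session", "https://evil.example/phish")     # cross-principal
```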

Journal ArticleDOI
01 Sep 2009
TL;DR: This paper shows how to publish a BPEL process as a RESTful Web service, by exposing selected parts of its execution state using the REST interaction primitives and discusses how the proposed extensions affect the architecture of a process execution engine.
Abstract: Current Web service technology is evolving towards a simpler approach to define Web service APIs that challenges the assumptions made by existing languages for Web service composition. RESTful Web services introduce a new kind of abstraction, the resource, which does not fit well with the message-oriented paradigm of the Web service description language (WSDL). RESTful Web services are thus hard to compose using the Business Process Execution Language (WS-BPEL), due to its tight coupling to WSDL. The goal of the BPEL for REST extensions presented in this paper is twofold. First, we aim to enable the composition of both RESTful Web services and traditional Web services from within the same process-oriented service composition language. Second, we show how to publish a BPEL process as a RESTful Web service, by exposing selected parts of its execution state using the REST interaction primitives. We include a detailed example on how BPEL for REST can be applied to orchestrate a RESTful e-Commerce scenario and discuss how the proposed extensions affect the architecture of a process execution engine.
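
The idea of publishing selected parts of a process's execution state as a resource can be sketched with Python's standard http.server. The /process/<id>/state path and the in-memory state store are our own illustrative choices, not the paper's BPEL for REST extensions.

```python
# Minimal sketch of exposing process execution state as a RESTful resource
# (illustrative paths and state store; not the BPEL for REST extensions).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PROCESS_STATE = {"42": {"status": "running", "activity": "ChargeCreditCard"}}

class ProcessStateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parts = self.path.strip("/").split("/")     # e.g. process/42/state
        if len(parts) == 3 and parts[0] == "process" and parts[2] == "state":
            state = PROCESS_STATE.get(parts[1])
            if state is not None:
                body = json.dumps(state).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.end_headers()
                self.wfile.write(body)
                return
        self.send_response(404)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ProcessStateHandler).serve_forever()
```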

Journal ArticleDOI
TL;DR: An advanced architecture for a personalization system to facilitate Web mining is proposed and the meaning of several recommendations are described, starting from the rules discovered by the Web mining algorithms.
Abstract: Nowadays, the application of Web mining techniques in e-learning and Web-based adaptive educational systems is increasing exponentially. In this paper, we propose an advanced architecture for a personalization system to facilitate Web mining. A specific Web mining tool is developed and a recommender engine is integrated into the AHA! system in order to help the instructor to carry out the whole Web mining process. Our objective is to be able to recommend to a student the most appropriate links/Web pages within the AHA! system to visit next. Several experiments are carried out with real data provided by Eindhoven University of Technology students in order to test both the architecture proposed and the algorithms used. Finally, we have also described the meaning of several recommendations, starting from the rules discovered by the Web mining algorithms.
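
The recommendation step, turning rules discovered by the mining phase into "next link" suggestions, can be sketched as follows; the rule set, pages, and confidence values are invented for illustration and are not output of the AHA! tool.

```python
# Sketch of turning discovered association rules into next-link recommendations
# (rule set and confidences are invented; not output of the AHA! mining tool).
rules = [
    ({"intro.html", "exercises1.html"}, "solutions1.html", 0.82),
    ({"intro.html"},                    "chapter2.html",   0.64),
    ({"chapter2.html"},                 "exercises2.html", 0.71),
]

def recommend(visited, top_n=2):
    candidates = [(conf, page) for antecedent, page, conf in rules
                  if antecedent <= visited and page not in visited]
    return [page for conf, page in sorted(candidates, reverse=True)[:top_n]]

print(recommend({"intro.html", "exercises1.html"}))
```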

01 Jan 2009
TL;DR: Web 2.0 tools present a vast array of opportunities—for companies that know how to use them and what to do with them.
Abstract: Web 2.0 tools present a vast array of opportunities—for companies that know how to use them.

Proceedings ArticleDOI
01 Apr 2009
TL;DR: A multi-process browser architecture is presented that isolates web program instances from each other, improving fault tolerance, resource management, and performance.
Abstract: Many of today's web sites contain substantial amounts of client-side code, and consequently, they act more like programs than simple documents. This creates robustness and performance challenges for web browsers. To give users a robust and responsive platform, the browser must identify program boundaries and provide isolation between them. We provide three contributions in this paper. First, we present abstractions of web programs and program instances, and we show that these abstractions clarify how browser components interact and how appropriate program boundaries can be identified. Second, we identify backwards compatibility tradeoffs that constrain how web content can be divided into programs without disrupting existing web sites. Third, we present a multi-process browser architecture that isolates these web program instances from each other, improving fault tolerance, resource management, and performance. We discuss how this architecture is implemented in Google Chrome, and we provide a quantitative performance evaluation examining its benefits and costs.
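
A rough approximation of the program-boundary idea is grouping open pages by site, sketched below as scheme plus registered domain; a real browser also has to consult the public suffix list and handle the compatibility cases the paper discusses, so treat this only as an illustration.

```python
# Rough sketch of grouping browsing contexts into "web programs" by site,
# approximated here as scheme plus registered domain (real browsers must also
# consult the public suffix list and handle compatibility exceptions).
from collections import defaultdict
from urllib.parse import urlparse

def site_of(url):
    parts = urlparse(url)
    host_labels = parts.hostname.split(".")
    registered = ".".join(host_labels[-2:])      # naive: last two labels
    return f"{parts.scheme}://{registered}"

open_pages = [
    "https://mail.google.com/inbox",
    "https://docs.google.com/document/d/1",
    "https://example.org/news",
]

programs = defaultdict(list)
for url in open_pages:
    programs[site_of(url)].append(url)

for site, pages in programs.items():
    print(site, "->", pages)    # each group would get its own renderer process
```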

01 Jan 2009
TL;DR: In-depth analysis of Web Log Data of NASA website to find information about a web site, top errors, potential visitors of the site etc, which help system administrator and Web designer to improve their system by determining occurred systems errors, corrupted and broken links by using web using mining.
Abstract: Web usage mining is the application of data mining techniques to discover usage patterns from web data, in order to better serve the needs of web-based applications. The user access log files present very significant information about a web server. This paper is concerned with an in-depth analysis of the Web log data of the NASA website to find information about the web site, top errors, potential visitors of the site, etc., which helps system administrators and Web designers improve their system by identifying system errors and corrupted or broken links through web usage mining. The obtained results of the study will be used in the further development of the web site in order to increase its effectiveness.
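
A small sketch of this kind of log analysis over the Apache Common Log Format is shown below: top clients, status-code distribution, and the most frequent 404 targets. The sample lines are made up in the NASA-log style; the actual log files are not included here.

```python
# Small sketch of web usage analysis over Apache Common Log Format lines:
# top clients, status-code distribution, and most frequent 404 targets.
# (Sample lines are made up; the NASA log files are not included here.)
import re
from collections import Counter

LOG_RE = re.compile(r'(\S+) \S+ \S+ \[.*?\] "(\S+) (\S+) [^"]*" (\d{3}) \S+')

sample_log = """\
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/missions/sts-73/ HTTP/1.0" 200 4085
199.72.81.55 - - [01/Jul/1995:00:00:09 -0400] "GET /images/NASA-logo.gif HTTP/1.0" 404 0
"""

clients, statuses, missing = Counter(), Counter(), Counter()
for line in sample_log.splitlines():
    m = LOG_RE.match(line)
    if not m:
        continue
    host, method, url, status = m.groups()
    clients[host] += 1
    statuses[status] += 1
    if status == "404":
        missing[url] += 1

print("top clients:", clients.most_common(2))
print("status codes:", dict(statuses))
print("broken links:", missing.most_common(3))
```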

Proceedings ArticleDOI
25 May 2009
TL;DR: It is shown that Web protocols and technologies are good candidates to design the Internet of Things, and a new way to design embedded Web servers is proposed, using a dedicated TCP/IP stack and numerous cross-layer off-line pre-calculation.
Abstract: In this paper, we show that Web protocols and technologies are good candidates to design the Internet of Things. This approach allows anyone to access embedded devices through a Web application, via a standard Web browser. This Web of Things requires embedding Web servers in hardware-constrained devices. We first analyze the traffic embedded Web servers have to handle. Starting from this analysis, we propose a new way to design embedded Web servers, using a dedicated TCP/IP stack and numerous cross-layer off-line pre-calculations (where information is shared between IP, TCP, HTTP and the Web application). We finally present a prototype -- named Smews -- as a proof of concept of our proposals. It has been embedded in tiny devices (smart cards, sensors and other embedded devices), with a requirement of only 200 bytes of RAM and 7 kilobytes of code. We show that it is significantly faster than other state-of-the-art solutions. We made the Smews source code publicly available under an open-source license.
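
The off-line pre-calculation idea can be hinted at, very loosely, in plain Python: complete HTTP responses (status line, Content-Length, body) are built ahead of time so the runtime only copies bytes. Smews goes much further, precomputing across the TCP/IP layers in C on constrained hardware; this is only an analogy.

```python
# Hint of the off-line pre-calculation idea: complete HTTP responses are built
# ahead of time so the server loop only copies bytes at run time.  (Smews goes
# further and precomputes at the TCP/IP level in C; this is only an analogy.)
STATIC_FILES = {
    "/": b"<html><body>Sensor node 7</body></html>",
    "/temp": b'{"celsius": 21.5}',
}

def precompute_responses(files):
    responses = {}
    for path, body in files.items():
        header = (
            "HTTP/1.1 200 OK\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n"
        ).encode()
        responses[path] = header + body      # ready-to-send byte buffer
    return responses

RESPONSES = precompute_responses(STATIC_FILES)   # done once, "at build time"
print(RESPONSES["/temp"])
```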

Journal ArticleDOI
TL;DR: This article shows how linked data sets can be exploited to build rich Web applications with little effort.
Abstract: Semantic Web technologies have been around for a while. However, such technologies have had little impact on the development of real-world Web applications to date. With linked data, this situation has changed dramatically in the past few months. This article shows how linked data sets can be exploited to build rich Web applications with little effort.

Proceedings ArticleDOI
06 Mar 2009
TL;DR: A survey of page ranking algorithms and comparison of some important algorithms in context of performance has been carried out.
Abstract: Web mining is an active research area at present. Web mining is defined as the application of data mining techniques on the World Wide Web to find hidden information. This hidden information, i.e. knowledge, could be contained in the content of web pages, in the link structure of the WWW, or in web server logs. Based upon the type of knowledge, web mining is usually divided into three categories: web content mining, web structure mining and web usage mining. An application of web mining can be seen in the case of search engines. Most search engines rank their search results in response to users' queries to make their search navigation easier. In this paper, a survey of page ranking algorithms and a comparison of some important algorithms in the context of performance has been carried out.
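
Since the survey compares link-based ranking algorithms such as PageRank, a compact power-iteration version of PageRank on a tiny made-up link graph may help fix ideas; this is the standard textbook formulation, not code from the paper.

```python
# Compact PageRank by power iteration on a small link graph
# (standard textbook formulation, shown here only to fix ideas).
import numpy as np

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pages = sorted(links)
n = len(pages)
idx = {p: i for i, p in enumerate(pages)}

# Column-stochastic transition matrix M[j, i] = 1/outdegree(i) if i links to j.
M = np.zeros((n, n))
for src, outs in links.items():
    for dst in outs:
        M[idx[dst], idx[src]] = 1.0 / len(outs)

d = 0.85                                   # damping factor
rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = (1 - d) / n + d * M @ rank

print({p: round(rank[idx[p]], 3) for p in pages})
```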

Proceedings ArticleDOI
06 Jul 2009
TL;DR: The challenges of composing RESTful Web services are discussed and a formal model for describing individual Web services and automating the composition is proposed and demonstrated by applying it to a real-world RESTfulWeb service composition problem.
Abstract: Emerging as the popular choice for leading Internet companies to expose internal data and resources, RESTful Web services are attracting increasing attention in the industry. While automating WSDL/SOAP-based Web service composition has been extensively studied in the research community, automated RESTful Web service composition in the context of service-oriented architecture (SOA) is, to the best of our knowledge, less explored. As an early paper addressing this problem, this paper discusses the challenges of composing RESTful Web services and proposes a formal model for describing individual Web services and automating the composition. It demonstrates our approach by applying it to a real-world RESTful Web service composition problem. This paper represents our initial efforts towards the problem of automated RESTful Web service composition. We hope that it will draw interest from the research community on Web services and engage more researchers in this challenge.

Journal ArticleDOI
TL;DR: A matchmaking algorithm for the ranking of functionally equivalent services is proposed; it can be used to enhance Web services' self-healing properties in reaction to QoS-related service failures and can be exploited in process optimization for the online reconfiguration of candidate Web services' QoS SLAs.
Abstract: The extensive adoption of Web service-based applications in dynamic business scenarios, such as on-demand computing or highly reconfigurable virtual enterprises, advocates for methods and tools for the management of Web service nonfunctional aspects, such as Quality of Service (QoS). Concerning contracts on Web service QoS, the literature has mostly focused on the contract definition and on mechanisms for contract enactment, such as the monitoring of the satisfaction of negotiated QoS guarantees. In this context, this article proposes a framework for the automation of the Web service contract specification and establishment. An extensible model for defining both domain-dependent and domain-independent Web service QoS dimensions and a method for the automation of the contract establishment phase are proposed. We describe a matchmaking algorithm for the ranking of functionally equivalent services, which orders services on the basis of their ability to fulfill the service requestor requirements, while maintaining the price below a specified budget. We also provide an algorithm for the configuration of the negotiable part of the QoS Service-Level Agreement (SLA), which is used to configure the agreement with the top-ranked service identified in the matchmaking phase. Experimental results show that, in a utility theory perspective, the contract establishment phase leads to efficient outcomes. We envision two advanced application scenarios for the Web service contracting framework proposed in this article. First, it can be used to enhance Web services self-healing properties in reaction to QoS-related service failures; second, it can be exploited in process optimization for the online reconfiguration of candidate Web services QoS SLAs.
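
The matchmaking step, ranking functionally equivalent candidates by how well they meet the requestor's QoS requirements while keeping the price within budget, can be sketched as a weighted score over normalized QoS dimensions. The dimensions, weights, and values below are invented for illustration and do not reproduce the article's algorithm.

```python
# Illustrative sketch of QoS-based matchmaking: filter by budget, then rank
# functionally equivalent services by a weighted, normalized QoS score.
# (Dimensions, weights and values are invented; not the article's algorithm.)
candidates = [
    {"name": "PayFast",  "price": 10, "availability": 0.999,  "resp_ms": 120},
    {"name": "PayCheap", "price": 4,  "availability": 0.990,  "resp_ms": 400},
    {"name": "PayPlus",  "price": 25, "availability": 0.9999, "resp_ms": 80},
]
weights = {"availability": 0.6, "resp_ms": 0.4}   # resp_ms is "lower is better"
budget = 15

def score(svc, pool):
    def norm(dim, higher_is_better):
        vals = [c[dim] for c in pool]
        lo, hi = min(vals), max(vals)
        x = (svc[dim] - lo) / (hi - lo) if hi > lo else 1.0
        return x if higher_is_better else 1.0 - x
    return (weights["availability"] * norm("availability", True) +
            weights["resp_ms"] * norm("resp_ms", False))

affordable = [c for c in candidates if c["price"] <= budget]
ranked = sorted(affordable, key=lambda c: score(c, affordable), reverse=True)
print([c["name"] for c in ranked])
```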

Book ChapterDOI
14 Jan 2009
TL;DR: A novel taxonomy is proposed that captures the possible failures that can arise in Web service composition, and classifies the faults that might cause them, and covers physical, development and interaction faults that can cause a variety of observable failures in a system's normal operation.
Abstract: Web services are becoming progressively popular in the building of both inter- and intra-enterprise business processes. These processes are composed from existing Web services based on defined requirements. In collecting together the services for such a composition, developers can employ languages and standards for the Web that facilitate the automation of Web service discovery, execution, composition and interoperation. However, there is no guarantee that a composition of even very good services will always work. Mechanisms are being developed to monitor a composition and to detect and recover from faults automatically. A key factor in such self-healing is to know what faults to look for. If the nature of a fault is known, the system can suggest a suitable recovery mechanism sooner. This paper proposes a novel taxonomy that captures the possible failures that can arise in Web service composition, and classifies the faults that might cause them. The taxonomy covers physical, development and interaction faults that can cause a variety of observable failures in a system's normal operation. An important use of the taxonomy is identifying the faults that can be excluded when a failure occurs. Examples of using the taxonomy are presented.

Journal ArticleDOI
TL;DR: This paper proposes techniques to automatically gather, discover and integrate features related to a set of WSDL files and cluster them into naturally occurring service groups and demonstrates the great potential of the proposed techniques.
Abstract: The idea of a decentralised, self-organising service-oriented architecture seems to be more and more plausible than the traditional registry-based ones in view of the success of the web and the reluctance in taking up web service technologies. Automatically clustering Web Service Description Language (WSDL) files on the web into functionally similar homogeneous service groups can be seen as a bootstrapping step for creating a service search engine and, at the same time, reducing the search space for service discovery. This paper proposes techniques to automatically gather, discover and integrate features related to a set of WSDL files and cluster them into naturally occurring service groups. Despite the lack of a common platform for assessing the performance of web service cluster discovery, our initial experiments using real-world WSDL files demonstrated the great potential of the proposed techniques.
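
A common baseline for this kind of clustering, TF-IDF over terms extracted from service and operation names followed by k-means, is sketched below with scikit-learn on made-up term lists; the paper's feature gathering, discovery, and integration over real WSDL files is more elaborate than this.

```python
# Baseline illustration of clustering service descriptions into functional
# groups: TF-IDF over extracted terms plus k-means (scikit-learn).  The paper's
# feature gathering and integration over real WSDL files is more elaborate.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Terms one might extract from WSDL service/operation/message names (made up).
services = [
    "get weather forecast temperature city",
    "current weather conditions humidity wind",
    "convert currency exchange rate amount",
    "currency conversion euro dollar rate",
    "send sms message phone text",
    "sms delivery status phone number",
]

X = TfidfVectorizer().fit_transform(services)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

for label, terms in sorted(zip(km.labels_, services)):
    print(label, terms)
```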

Book ChapterDOI
10 Nov 2009
TL;DR: This paper proposes a unified component model and a universal, event-based composition model, both able to abstract from low-level implementation details and technology specifics, and provides universal composition as a service in form of an easy-to-use graphical development tool equipped with an execution environment for fast deployment and execution of composite Web applications.
Abstract: Information integration, application integration and component-based software development have been among the most important research areas for decades. The last years have been characterized by a particular focus on web services, the very recent years by the advent of web mashups, a new and user-centric form of integration on the Web. However, while service composition approaches lack support for user interfaces, web mashups still lack well-engineered development approaches and mature technological foundations. In this paper, we aim to overcome both these shortcomings and propose what we call a universal composition approach that naturally brings together data and application services with user interfaces. We propose a unified component model and a universal, event-based composition model, both able to abstract from low-level implementation details and technology specifics. Via the mashArt platform, we then provide universal composition as a service in form of an easy-to-use graphical development tool equipped with an execution environment for fast deployment and execution of composite Web applications.
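
To convey what an event-based composition model looks like, here is a toy event bus wiring a "UI component" to a "service component"; it is only a loose illustration and has nothing of mashArt's actual component model or platform.

```python
# Toy event-based composition: components communicate through a small event
# bus, mimicking (very loosely) how UI and service components can be wired
# together in a mashup.  This is not the mashArt component model.
from collections import defaultdict

class EventBus:
    def __init__(self):
        self.listeners = defaultdict(list)
    def subscribe(self, event, handler):
        self.listeners[event].append(handler)
    def publish(self, event, payload):
        for handler in self.listeners[event]:
            handler(payload)

bus = EventBus()

# "UI component": emits an event when the user selects a city.
def city_selector(city):
    bus.publish("city.selected", {"city": city})

# "Service component": reacts to the selection, then emits its own event.
def weather_service(payload):
    forecast = {"city": payload["city"], "forecast": "sunny"}   # stubbed call
    bus.publish("weather.updated", forecast)

bus.subscribe("city.selected", weather_service)
bus.subscribe("weather.updated", lambda f: print("render:", f))

city_selector("Trento")
```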

Journal ArticleDOI
TL;DR: This paper proposes a relation-based page rank algorithm to be used in conjunction with semantic Web search engines that simply relies on information that could be extracted from user queries and on annotated resources.
Abstract: With the tremendous growth of information available to end users through the Web, search engines come to play an ever more critical role. Nevertheless, because of their general-purpose approach, it is increasingly common that the result sets they return contain a burden of useless pages. The next-generation Web architecture, represented by the Semantic Web, provides a layered architecture that may allow this limitation to be overcome. Several search engines have been proposed, which allow increasing information retrieval accuracy by exploiting a key content of semantic Web resources, that is, relations. However, in order to rank results, most of the existing solutions need to work on the whole annotated knowledge base. In this paper, we propose a relation-based page rank algorithm to be used in conjunction with semantic Web search engines that simply relies on information that could be extracted from user queries and on annotated resources. Relevance is measured as the probability that a retrieved resource actually contains those relations whose existence was assumed by the user at the time of query definition.
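
As a simplified stand-in for the probabilistic relevance measure the paper defines, the sketch below scores each annotated resource by the fraction of query relations it actually contains; the pages, annotations, and query relations are invented for illustration.

```python
# Simplified sketch of relation-based ranking: score each annotated resource by
# how many of the relations assumed in the query it actually contains.  The
# paper defines a probabilistic measure; this fraction is only a stand-in.
annotated_pages = {
    "page1": {("hotel", "locatedIn", "Rome"), ("hotel", "hasRating", "4-star")},
    "page2": {("hotel", "locatedIn", "Rome")},
    "page3": {("museum", "locatedIn", "Rome")},
}

query_relations = {("hotel", "locatedIn", "Rome"), ("hotel", "hasRating", "4-star")}

def relevance(annotations, query):
    return len(annotations & query) / len(query)

ranking = sorted(annotated_pages,
                 key=lambda p: relevance(annotated_pages[p], query_relations),
                 reverse=True)
print([(p, relevance(annotated_pages[p], query_relations)) for p in ranking])
```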