
Showing papers on "Web standards" published in 2011


Book
02 Feb 2011
TL;DR: This Synthesis lecture provides readers with a detailed technical introduction to Linked Data, including coverage of relevant aspects of Web architecture, as the basis for application development, research or further study.
Abstract: The World Wide Web has enabled the creation of a global information space comprising linked documents. As the Web becomes ever more enmeshed with our daily lives, there is a growing desire for direct access to raw data not currently available on the Web or bound up in hypertext documents. Linked Data provides a publishing paradigm in which not only documents, but also data, can be a first class citizen of the Web, thereby enabling the extension of the Web with a global data space based on open standards - the Web of Data. In this Synthesis lecture we provide readers with a detailed technical introduction to Linked Data. We begin by outlining the basic principles of Linked Data, including coverage of relevant aspects of Web architecture. The remainder of the text is based around two main themes - the publication and consumption of Linked Data. Drawing on a practical Linked Data scenario, we provide guidance and best practices on: architectural approaches to publishing Linked Data; choosing URIs and vocabularies to identify and describe resources; deciding what data to return in a description of a resource on the Web; methods and frameworks for automated linking of data sets; and testing and debugging approaches for Linked Data deployments. We give an overview of existing Linked Data applications and then examine the architectures that are used to consume Linked Data from the Web, alongside existing tools and frameworks that enable these. Readers can expect to gain a rich technical understanding of Linked Data fundamentals, as the basis for application development, research or further study.

2,174 citations
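
The Linked Data principles this lecture covers (name things with HTTP URIs, make those URIs dereferenceable, return RDF descriptions, link out to other URIs) can be tried out in a few lines. A minimal sketch using Python's rdflib library; the DBpedia URI is just a convenient public example, and network access is required:

```python
# Minimal illustration of the Linked Data principles: dereference an
# HTTP URI, parse the RDF description it returns, and collect outgoing
# links to other resources. Requires: pip install rdflib
from rdflib import Graph, URIRef

uri = URIRef("http://dbpedia.org/resource/Berlin")  # example public URI

g = Graph()
g.parse(uri)  # HTTP content negotiation fetches an RDF serialization

# Print a few triples describing the resource...
for s, p, o in list(g.triples((uri, None, None)))[:5]:
    print(p, o)

# ...and count links pointing at other dereferenceable resources.
links = {o for _, _, o in g.triples((uri, None, None))
         if isinstance(o, URIRef)}
print(f"{len(links)} outgoing links to other URIs")
```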


Journal ArticleDOI
TL;DR: The practice of crowdsourcing is transforming the Web, giving rise to a new field of inquiry that aims to provide real-time information about events in a democratic manner.

1,165 citations


Journal ArticleDOI
TL;DR: This paper proposes a collaborative filtering approach for predicting QoS values of Web services and making Web service recommendation by taking advantages of past usage experiences of service users, and shows that the algorithm achieves better prediction accuracy than other approaches.
Abstract: With the increasing presence and adoption of Web services on the World Wide Web, Quality-of-Service (QoS) is becoming important for describing the nonfunctional characteristics of Web services. In this paper, we present a collaborative filtering approach for predicting QoS values of Web services and making Web service recommendations by taking advantage of the past usage experiences of service users. We first propose a user-collaborative mechanism for collecting past Web service QoS information from different service users. Then, based on the collected QoS data, a collaborative filtering approach is designed to predict Web service QoS values. Finally, a prototype called WSRec is implemented in Java and deployed to the Internet for conducting real-world experiments. To study the QoS value prediction accuracy of our approach, 1.5 million Web service invocation results are collected from 150 service users in 24 countries on 100 real-world Web services in 22 countries. The experimental results show that our algorithm achieves better prediction accuracy than other approaches. Our Web service QoS data set is publicly released for future research.

741 citations
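
The user-based collaborative filtering that WSRec builds on combines Pearson-correlation similarity with deviations from user averages. A minimal sketch of that general technique, assuming a small QoS matrix with NaN marking unobserved values; the matrix and function names are illustrative, not the WSRec implementation:

```python
# Sketch of user-based collaborative filtering for QoS prediction.
# Rows = users, columns = services; np.nan marks unobserved QoS values.
import numpy as np

R = np.array([[0.35, 0.90, np.nan],
              [0.30, 0.85, 1.20],
              [0.80, np.nan, 2.10]])  # e.g. response times in seconds

def pcc(u, v):
    """Pearson correlation over services both users have invoked."""
    mask = ~np.isnan(u) & ~np.isnan(v)
    if mask.sum() < 2:
        return 0.0
    a, b = u[mask] - u[mask].mean(), v[mask] - v[mask].mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def predict(R, user, item):
    """Predict R[user, item] from similar users' observations."""
    num = den = 0.0
    for other in range(R.shape[0]):
        if other == user or np.isnan(R[other, item]):
            continue
        w = pcc(R[user], R[other])
        if w <= 0:  # keep only positively correlated users
            continue
        num += w * (R[other, item] - np.nanmean(R[other]))
        den += w
    return np.nanmean(R[user]) + (num / den if den else 0.0)

print(predict(R, user=0, item=2))  # predicted QoS for an unseen service
```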


Journal ArticleDOI
TL;DR: The vision and architecture of a Semantic Web of Things is described: a service infrastructure that makes the deployment and use of semantic applications involving Internet-connected sensors almost as easy as building, searching, and reading a web page today.
Abstract: The developed world is awash with sensors. However, they are typically locked into unimodal closed systems. To unleash their full potential, access to sensors should be opened such that their data and services can be integrated with data and services available in other information systems, facilitating novel applications and services that are based on the state of the real world. We describe our vision and architecture of a Semantic Web of Things: a service infrastructure that makes the deployment and use of semantic applications involving Internet-connected sensors almost as easy as building, searching, and reading a web page today.

337 citations


Journal ArticleDOI
TL;DR: The architecture and some key enabling technologies of the Web of Things (WoT) are elaborated, and many systematic comparisons are made to provide insight into the evolution and future of WoT.
Abstract: In the vision of the Internet of Things (IoT), an increasing number of embedded devices of all sorts (e.g., sensors, mobile phones, cameras, smart meters, smart cars, traffic lights, smart home appliances, etc.) are now capable of communicating and sharing data over the Internet. Although the concept of using embedded systems to control devices, tools and appliances has been around for decades, every new generation's ever-increasing computation and communication capabilities pose new opportunities, but also new challenges. As IoT has become an active research area, different methods from various points of view have been explored to promote the development and popularity of IoT. One trend is viewing IoT as the Web of Things (WoT), where open Web standards are supported for information sharing and device interoperation. By integrating smart things into the existing Web, conventional web services are enriched with physical-world services. This WoT vision enables a new way of narrowing the barrier between the virtual and physical worlds. In this paper, we elaborate on the architecture and some key enabling technologies of WoT. Some pioneering open platforms and prototypes are also illustrated. The most recent research results are carefully summarized. Furthermore, many systematic comparisons are made to provide insight into the evolution and future of WoT. Finally, we point out some open challenging issues that shall be faced and tackled by the research community.

259 citations
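
The core WoT idea, exposing device state through open Web standards so any HTTP client can consume it, can be sketched with a tiny RESTful resource. A hypothetical example using Flask; the endpoint path and the fake sensor reading are illustrative:

```python
# Toy "Web of Things" resource: a sensor exposed as a RESTful endpoint
# returning JSON, so any HTTP client can integrate its readings.
# Requires: pip install flask
import random
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/sensors/temperature")
def temperature():
    # A real device would read hardware here; we fake a reading.
    return jsonify(unit="celsius", value=round(random.uniform(18, 25), 2))

if __name__ == "__main__":
    app.run(port=8080)  # GET http://localhost:8080/sensors/temperature
```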


Journal ArticleDOI
TL;DR: The current SWSE system is described, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component, to give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data.

236 citations


Proceedings ArticleDOI
Sören Auer
18 Apr 2011
TL;DR: An overview of the Linked Data life-cycle is presented, and some promising approaches are discussed with regard to extraction, storage and querying, authoring, linking, enrichment, quality analysis, evolution, as well as search and exploration of Linked Data.
Abstract: Over the past four years, the semantic web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges in the area of the Semantic Web vision: the exploitation of the Web as a platform for data and information integration. To translate this initial success into a world-scale reality, a number of research challenges need to be addressed. While many standards, methods and technologies developed within the Semantic Web activity are applicable to Linked Data, there are also a number of specific characteristics of Linked Data which have to be considered. In this talk we present an overview of the Linked Data life-cycle and discuss some promising approaches with regard to extraction, storage and querying, authoring, linking, enrichment, quality analysis, evolution, as well as search and exploration of Linked Data.

216 citations


Book
03 Jan 2011
TL;DR: Software developers in industry and students specializing in Web development or Semantic Web technologies will find in this book the most complete guide to this exciting field available today.
Abstract: The Semantic Web represents a vision for how to make the huge amount of information on the Web automatically processable by machines on a large scale. For this purpose, a whole suite of standards, technologies and related tools have been specified and developed over the last couple of years, and they have now become the foundation for numerous new applications. A Developer's Guide to the Semantic Web helps the reader to learn the core standards, key components, and underlying concepts. It provides in-depth coverage of both the what-is and how-to aspects of the Semantic Web. From Yu's presentation, the reader will obtain not only a solid understanding of the Semantic Web, but also learn how to combine all the pieces to build new applications on the Semantic Web. Software developers in industry and students specializing in Web development or Semantic Web technologies will find in this book the most complete guide to this exciting field available today. Based on the step-by-step presentation of real-world projects, where the technologies and standards are applied, they will acquire the knowledge needed to design and implement state-of-the-art applications.

204 citations


Proceedings ArticleDOI
04 Jul 2011
TL;DR: An effective Personalized Hybrid Collaborative Filtering (PHCF) technique is developed by integrating a personalized user-based algorithm and a personalized item-based algorithm, based on the similarity measurement model of Web services.
Abstract: Collaborative filtering is one of the most widely used Web service recommendation techniques. There have been several methods of Web service selection and recommendation based on collaborative filtering, but they have seldom considered the personalized influence of users and services. In this paper, we present an effective personalized collaborative filtering method for Web service recommendation. A key component of Web service recommendation techniques is the computation of similarity measurements between Web services. Different from the Pearson Correlation Coefficient (PCC) similarity measurement, we take into account the personalized influence of services when computing the similarity measurement between users and the personalized influence of services. Based on the similarity measurement model of Web services, we develop an effective Personalized Hybrid Collaborative Filtering (PHCF) technique by integrating a personalized user-based algorithm and a personalized item-based algorithm. We conduct a series of experiments based on the real Web service QoS dataset WSRec [11], which contains more than 1.5 million test results of 150 service users in different countries on 100 publicly available Web services located all over the world. Experimental results show that the method significantly improves the accuracy of Web service recommendation.

179 citations
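
The hybrid step itself, blending user-based and item-based predictions, reduces to a weighted sum. A toy sketch with an illustrative weight; the paper's PHCF additionally personalizes the similarity measurement, which is not shown here:

```python
# Sketch of the hybrid step in user/item-based CF: blend the two
# predictions with a confidence weight lam in [0, 1]. Illustrative only.
def hybrid_predict(user_based_pred, item_based_pred, lam=0.5):
    return lam * user_based_pred + (1 - lam) * item_based_pred

# e.g. predicted response times of 1.1s (user view) and 0.9s (item view)
print(hybrid_predict(1.1, 0.9, lam=0.7))  # -> 1.04
```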


01 Jan 2011
TL;DR: WebProtege as mentioned in this paper is a lightweight ontology editor and knowledge acquisition tool for the Web that integrates these features as part of the ontology development process itself, which can be accessed from any web browser and can be adapted to any level of user expertise.
Abstract: In this paper, we present WebProtege—a lightweight ontology editor and knowledge acquisition tool for the Web. With the wide adoption of Web 2.0 platforms and the gradual adoption of ontologies and Semantic Web technologies in the real world, we need ontology-development tools that are better suited for the novel ways of interacting, constructing and consuming knowledge. Users today take Web-based content creation and online collaboration for granted. WebProtege integrates these features as part of the ontology development process itself. We tried to lower the entry barrier to ontology development by providing a tool that is accessible from any Web browser, has extensive support for collaboration, and a highly customizable and pluggable user interface that can be adapted to any level of user expertise. The declarative user interface enabled us to create custom knowledge-acquisition forms tailored for domain experts. We built WebProtege using the existing Protege infrastructure, which supports collaboration on the back end side, and the Google Web Toolkit for the front end. The generic and extensible infrastructure allowed us to easily deploy WebProtege in production settings for several projects. We present the main features of WebProtege and its architecture and describe briefly some of its uses for real-world projects. WebProtege is free and open source. An online demo is available at http://webprotege.stanford.edu.

178 citations


Proceedings ArticleDOI
07 Jun 2011
TL;DR: This paper analyzes five years of real Web traffic from a globally-distributed proxy system, which captures the browsing behavior of over 70,000 daily users from 187 countries, and investigates the redundancy of this traffic, using both traditional object-level caching as well as content-based approaches.
Abstract: As the nature of Web traffic evolves over time, we must update our understanding of the underlying nature of today's Web, which is necessary to improve response time, understand caching effectiveness, and design intermediary systems such as firewalls, security analyzers, and reporting or management systems. In this paper, we analyze five years (2006-2010) of real Web traffic from a globally-distributed proxy system, which captures the browsing behavior of over 70,000 daily users from 187 countries. Using this data set, we examine major changes in Web traffic characteristics during this period, and also investigate the redundancy of this traffic, using both traditional object-level caching as well as content-based approaches.
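
Content-based redundancy analysis of the kind mentioned above typically hashes payload chunks and counts repeats across the trace. A simplified sketch using fixed-size chunks and illustrative payloads; production systems usually use content-defined chunking such as Rabin fingerprints:

```python
# Simplified content-based redundancy measurement: split payloads into
# fixed-size chunks, hash them, and count chunks seen before.
import hashlib

def redundancy(payloads, chunk_size=64):
    seen, total, dup = set(), 0, 0
    for body in payloads:
        for i in range(0, len(body), chunk_size):
            h = hashlib.sha1(body[i:i + chunk_size]).hexdigest()
            total += 1
            if h in seen:
                dup += 1
            else:
                seen.add(h)
    return dup / total if total else 0.0

pages = [b"<html>header</html>" * 40, b"<html>header</html>" * 35]
print(f"{redundancy(pages):.0%} of chunks were seen before")
```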

Reference BookDOI
24 Jun 2011
TL;DR: This handbook is the first dedicated reference work in this field, collecting contributions about both the technical foundations of the Semantic Web as well as their main usage in other scientific fields like life sciences, engineering, business, or education.
Abstract: After years of mostly theoretical research, Semantic Web Technologies are now reaching out into application areas like bioinformatics, eCommerce, eGovernment, or Social Webs. Applications like genomic ontologies, semantic web services, automated catalogue alignment, ontology matching, or blogs and social networks are constantly increasing, often driven or at least backed up by companies like Google, Amazon, YouTube, Facebook, LinkedIn and others. The need to leverage the potential of combining information in a meaningful way in order to be able to benefit from the Web will create further demand for and interest in Semantic Web research. This movement, based on the growing maturity of related research results, necessitates a reliable reference source from which beginners to the field can draw a first basic knowledge of the main underlying technologies as well as state-of-the-art application areas. This handbook, put together by three leading authorities in the field, and supported by an advisory board of highly reputed researchers, fulfils exactly this need. It is the first dedicated reference work in this field, collecting contributions about both the technical foundations of the Semantic Web as well as their main usage in other scientific fields like life sciences, engineering, business, or education.

Journal ArticleDOI
TL;DR: In this article, the most recent evaluation criteria and methods used in different e-business services are reviewed, and general criteria are proposed for evaluating the quality of any website regardless of the type of service it offers.

Proceedings ArticleDOI
02 Nov 2011
TL;DR: A detailed analysis of word-of-mouth exchange of URLs among Twitter users shows that Twitter yields propagation trees that are wider than they are deep, and indicates that users who are geographically close together are more likely to share the same URL.
Abstract: Traditionally, users have discovered information on the Web by browsing or searching. Recently, word-of-mouth has emerged as a popular way of discovering the Web, particularly on social networking sites like Facebook and Twitter. On these sites, users discover Web content by following URLs posted by their friends. Such word-of-mouth based content discovery has become a major driver of traffic to many Web sites today. To better understand this popular phenomenon, in this paper we present a detailed analysis of word-of-mouth exchange of URLs among Twitter users. Among our key findings, we show that Twitter yields propagation trees that are wider than they are deep. Our analysis on the geolocation of users indicates that users who are geographically close together are more likely to share the same URL.
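
Measuring whether propagation trees are wider than deep comes down to comparing the maximum number of nodes at any level with the maximum depth. A sketch on a toy tree; the edges below are illustrative:

```python
# Depth-vs-width measurement for a URL propagation tree: nodes are
# users, edges point from the user someone received the URL from.
from collections import defaultdict, deque

edges = [("root", "a"), ("root", "b"), ("root", "c"),
         ("a", "d"), ("a", "e"), ("b", "f")]

children = defaultdict(list)
for parent, child in edges:
    children[parent].append(child)

# Breadth-first walk recording each node's level.
level_counts = defaultdict(int)
queue = deque([("root", 0)])
while queue:
    node, level = queue.popleft()
    level_counts[level] += 1
    for c in children[node]:
        queue.append((c, level + 1))

depth = max(level_counts)           # longest root-to-leaf path
width = max(level_counts.values())  # most nodes at any single level
print(f"depth={depth}, width={width}")  # here: depth=2, width=3 -> wider than deep
```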



Journal ArticleDOI
TL;DR: This work examined Web 2.0 services that provide different levels of knowledge exploitation and developed a framework for classifying existing service models from a knowledge-creation perspective, terming the two types of service platforms Experience-Socialization and Intelligence-Proliferation.

Book
01 Jan 2011
TL;DR: The Semantic Web as mentioned in this paper is an extension of the current Web in which information would be given well-defined meaning, better enabling computers and people to work in cooperation; the current Web, basically comprised of documents presented by computers and read by people, would thus also include data and information that would automatically be handled by agents and utilities.
Abstract: In 2001, with a fully functional Web, Berners-Lee published an original article with the challenging proposal of a "Semantic Web" (Berners-Lee 2001). According to his proposal, the Semantic Web would be an extension of the current one, in which information would be given well-defined meaning, better enabling computers and people to work in cooperation. Thus the current Web, basically comprised of documents presented by computers and read by people, would also include data and information that would automatically be handled by agents and utilities. Back then, Berners-Lee (2001) argued that advancements in the Semantic Web would require the development of a language able to express data and data reasoning rules, in addition to enabling any knowledge representation system to be exported to the Web. Up to that time, two key technologies for the achievement of the Semantic Web had been developed: the eXtensible Markup Language (XML), and the specifications family Resource Description Framework (RDF), the latter intended for the description or modeling of information implemented in Web resources.
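
The RDF model the abstract refers to describes resources as subject-predicate-object triples, and XML is one of its serializations. A minimal sketch with rdflib; the document URI and properties are illustrative:

```python
# Build a tiny RDF description and serialize it as RDF/XML and Turtle.
# Requires: pip install rdflib
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/")
g = Graph()
g.bind("foaf", FOAF)

doc = EX["semantic-web-article"]
g.add((doc, FOAF.maker, Literal("Tim Berners-Lee")))
g.add((doc, FOAF.topic, EX["SemanticWeb"]))

print(g.serialize(format="xml"))     # RDF/XML serialization
print(g.serialize(format="turtle"))  # Turtle serialization
```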


Journal ArticleDOI
TL;DR: The SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows, thus facilitating the intersection of Web services and Semantic Web technologies.
Abstract: The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community. SADI - Semantic Automated Discovery and Integration - is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services "stack", SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers. SADI Services are fully compliant with, and utilize only foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behaviour we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner very similar to data housed in static triple-stores, thus facilitating the intersection of Web services and Semantic Web technologies.
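
The RDF-in/RDF-out service pattern that SADI builds on can be sketched generically: parse an RDF instance from the request, attach a computed property, and return the enriched RDF. A hypothetical sketch with Flask and rdflib; the endpoint and property URI are illustrative, and real SADI services follow further conventions (OWL input/output classes, service descriptions) not shown here:

```python
# Generic RDF-in/RDF-out web service sketch (not the SADI codebase).
# Requires: pip install flask rdflib
from flask import Flask, request, Response
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
app = Flask(__name__)

@app.route("/service", methods=["POST"])
def annotate():
    g = Graph()
    g.parse(data=request.data.decode("utf-8"), format="turtle")
    # Attach an (illustrative) computed property to every input subject.
    for s in set(g.subjects()):
        g.add((s, EX.sequenceLength, Literal(42)))
    return Response(g.serialize(format="turtle"), mimetype="text/turtle")

if __name__ == "__main__":
    app.run(port=8080)
```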

Proceedings ArticleDOI
07 Sep 2011
TL;DR: A feature-based comparison of the state-of-the-art RDB-to-RDF mapping languages is provided, and the classification proposes four categories of mapping languages: direct mapping, read-only general-purpose mapping, read-write general-purpose mapping, and special-purpose mapping.
Abstract: Mapping Relational Databases (RDB) to RDF is an active field of research. The majority of data on the current Web is stored in RDBs. Therefore, bridging the conceptual gap between the relational model and RDF is needed to make the data available on the Semantic Web. In addition, recent research has shown that Semantic Web technologies are useful beyond the Web, especially if data from different sources has to be exchanged or integrated. Many mapping languages and approaches have been explored, leading to the ongoing standardization effort of the World Wide Web Consortium (W3C) carried out in the RDB2RDF Working Group (WG). The goal and contribution of this paper is to provide a feature-based comparison of the state-of-the-art RDB-to-RDF mapping languages. It should act as a guide in selecting an RDB-to-RDF mapping language for a given application scenario and its requirements w.r.t. mapping features. Our comparison framework is based on use cases and requirements for mapping RDBs to RDF as identified by the RDB2RDF WG. We apply this comparison framework to the state-of-the-art RDB-to-RDF mapping languages and report the findings in this paper. As a result, our classification proposes four categories of mapping languages: direct mapping, read-only general-purpose mapping, read-write general-purpose mapping, and special-purpose mapping. We further provide recommendations for selecting a mapping language.
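
The first category, direct mapping, is easy to illustrate: each row becomes a resource and each column a predicate. A minimal sketch with sqlite3 and rdflib; the table and URI scheme are illustrative, not the W3C Direct Mapping specification:

```python
# Direct-mapping sketch: rows -> resources, columns -> predicates.
# Requires: pip install rdflib  (sqlite3 is in the standard library)
import sqlite3
from rdflib import Graph, Literal, Namespace

BASE = Namespace("http://example.org/db/")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Ada'), (2, 'Alan')")

g = Graph()
for row_id, name in conn.execute("SELECT id, name FROM person"):
    subject = BASE[f"person/{row_id}"]  # row -> resource URI
    g.add((subject, BASE["person#name"], Literal(name)))

print(g.serialize(format="turtle"))
```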

Journal ArticleDOI
TL;DR: Fusion Tables is described, a recently launched data-management service that lets users create and visualize structured data easily and emphasizes the ability to collaborate with other data owners.
Abstract: Google's Web Tables and Deep Web Crawler identify and deliver this otherwise inaccessible resource directly to end users.

Book ChapterDOI
26 Sep 2011
TL;DR: The obtained results showed that the number of web archiving initiatives grew significantly after 2003 and that they are concentrated in developed countries, while the assigned resources are scarce.
Abstract: Web archiving has been gaining interest and recognized importance for modern societies around the world. However, web archivists frequently find it difficult to demonstrate this fact, for instance, to funders. This study provides an updated and global overview of web archiving. The obtained results showed that the number of web archiving initiatives grew significantly after 2003 and that they are concentrated in developed countries. We statistically analyzed metrics such as the volume of archived data, archive file formats, and the number of people engaged. Web archives all together must process more data than any web search engine. Considering the complexity and large amounts of data involved in web archiving, the results showed that the assigned resources are scarce. A Wikipedia page was created to complement the presented work and to be collaboratively kept up-to-date by the community.

Journal ArticleDOI
TL;DR: In Web 2.0, there is a social dichotomy at work based upon and reflecting the underlying Von Neumann Architecture of computers, where users are encouraged to process digital ephemera by sharing content, making connections, ranking cultural artifacts, and producing digital content, a mode of computing I call ‘affective processing’.
Abstract: In Web 2.0, there is a social dichotomy at work based upon and reflecting the underlying Von Neumann Architecture of computers. In the hegemonic Web 2.0 business model, users are encouraged to process digital ephemera by sharing content, making connections, ranking cultural artifacts, and producing digital content, a mode of computing I call ‘affective processing.’ The Web 2.0 business model imagines users to be a potential superprocessor. In contrast, the memory possibilities of computers are typically commanded by Web 2.0 site owners. They seek to surveil every user action, store the resulting data, protect that data via intellectual property, and mine it for profit. Users are less likely to wield control over these archives. These archives are comprised of the products of affective processing; they are archives of affect, sites of decontextualized data which can be rearranged by the site owners to construct knowledge about Web 2.0 users.

Journal ArticleDOI
TL;DR: A novel and extensible model balancing the new dimension of semantic quality (as a functional quality metric) with a QoS metric is proposed, and it is demonstrated the utility of Genetic Algorithms to allow optimization within the context of a large number of services foreseen by the “Web of Services” vision.
Abstract: Ranking and optimization of web service compositions represent challenging areas of research with significant implications for the realization of the “Web of Services” vision. “Semantic web services” use formal semantic descriptions of web service functionality and interface to enable automated reasoning over web service compositions. To judge the quality of the overall composition, for example, we can start by calculating the semantic similarities between outputs and inputs of connected constituent services, and aggregate these values into a measure of semantic quality for the composition. This paper takes a specific interest in combining semantic and nonfunctional criteria such as quality of service (QoS) to evaluate quality in web services composition. It proposes a novel and extensible model balancing the new dimension of semantic quality (as a functional quality metric) with a QoS metric, and using them together as ranking and optimization criteria. It also demonstrates the utility of Genetic Algorithms to allow optimization within the context of a large number of services foreseen by the “Web of Services” vision. We test the performance of the overall approach using a set of simulation experiments, and discuss its advantages and weaknesses.
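
The kind of objective function such an optimization uses can be sketched as a weighted blend of the composition's aggregate semantic-match quality and its aggregate QoS. A toy fitness function with illustrative weights and scores; a genetic algorithm would rank candidate compositions by it:

```python
# Toy fitness function blending semantic quality and QoS; all names,
# weights and scores are illustrative, not the paper's model.
def fitness(composition, alpha=0.6):
    # Semantic quality: mean input/output match score over connections.
    sem = sum(c["match"] for c in composition) / len(composition)
    # QoS: mean normalized QoS score of the constituent services.
    qos = sum(c["qos"] for c in composition) / len(composition)
    return alpha * sem + (1 - alpha) * qos

candidate = [{"match": 0.9, "qos": 0.7}, {"match": 0.6, "qos": 0.8}]
print(fitness(candidate))  # a GA would rank candidate compositions by this
```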


Journal ArticleDOI
TL;DR: This paper formulates a framework that encompasses the validity, reliability, sensitivity, adequacy and complexity of metrics in the context of four scenarios where the metrics can be used, and demonstrates the framework's feasibility by applying it to seven existing automatic accessibility metrics and reporting the findings.

Proceedings ArticleDOI
08 Apr 2011
TL;DR: The past, current and future of web mining are described, and online resources for retrieving information on the web (web content mining) and for discovering user access patterns from web servers (web usage mining) are introduced.
Abstract: In this paper we present a study of how to extract useful information from the web, together with an overview and comparison of data mining approaches. The paper describes the past, current and future of web mining. We introduce online resources for retrieving information on the web, i.e. web content mining, and for discovering user access patterns from web servers, i.e. web usage mining, which addresses drawbacks of conventional data mining. Furthermore, we also describe web mining through cloud computing, i.e. cloud mining, which can be seen as the future of web mining.

Book ChapterDOI
05 Dec 2011
TL;DR: This paper proposes a novel approach called WTCluster, in which both WSDL documents and tags are utilized for web service clustering, and presents and evaluates two tag recommendation strategies to improve the performance of WTCluster.
Abstract: Clustering web services would greatly boost the ability of web service search engines to retrieve relevant ones. An important restriction of traditional studies on web service clustering is that researchers focused on utilizing web services' WSDL (Web Service Description Language) documents only. The singleness of this data source limits the accuracy of clustering. Recently, web service search engines such as Seekda! have allowed users to manually annotate web services using so-called tags, which describe the function of the web service or provide additional contextual and semantic information. In this paper, we propose a novel approach called WTCluster, in which both WSDL documents and tags are utilized for web service clustering. Furthermore, we present and evaluate two tag recommendation strategies to improve the performance of WTCluster. Comprehensive experiments based on a dataset consisting of 15,968 real web services demonstrate the effectiveness of WTCluster and the tag recommendation strategies.
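
One plausible way to combine WSDL terms and tags for clustering is to merge them into a single text representation and cluster the TF-IDF vectors. A sketch with scikit-learn on a toy corpus; this illustrates the general idea, not the WTCluster algorithm itself:

```python
# Cluster services by combining WSDL-derived terms with user tags in
# one text representation. Requires: pip install scikit-learn
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

services = [
    ("getWeather forecast temperature city", "weather climate"),
    ("currentConditions humidity wind city", "weather"),
    ("convertCurrency exchange rate amount", "finance money"),
    ("getStockQuote ticker price market", "finance stocks"),
]

# Concatenate WSDL terms and tags; tags could also be weighted separately.
docs = [f"{wsdl} {tags}" for wsdl, tags in services]

X = TfidfVectorizer().fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # services 0,1 vs 2,3 should land in different clusters
```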

Proceedings Article
01 Jan 2011
TL;DR: This paper provides a framework for information quality assessment of Semantic Web data called SWIQA, built solely with Semantic Web technologies, which employs data quality rule templates to express quality requirements that are automatically used to identify deficient data and calculate quality scores.
Abstract: The internet is currently evolving from the "Web of Documents" into the "Web of Data", where data is available at web scale in the so-called Semantic Web, either (1) to retrieve information or (2) for data reuse, e.g. within applications for a higher degree of automation. At present, there is already a lot of data available on the Semantic Web, but unfortunately we do not know much about its quality due to missing techniques and methodologies for information quality assessment. In this paper, we provide a framework for information quality assessment of Semantic Web data called SWIQA, built solely using Semantic Web technologies. Unlike survey-based techniques for information quality assessment, SWIQA employs data quality rule templates to express quality requirements, which are automatically used to identify deficient data and calculate quality scores. Hence, using our approach minimizes manual effort while providing transparency about the quality of Semantic Web data. SWIQA may, therefore, be used by data consumers to find high-quality data sources or by data owners to keep track of the quality of their own data.
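
A data-quality rule in the spirit of SWIQA's templates can be expressed as a SPARQL query that flags resources violating a requirement. A sketch with rdflib; the rule (every person must have a foaf:name) and the scoring are illustrative, not SWIQA's own syntax:

```python
# Data-quality rule as a SPARQL query: flag persons missing foaf:name,
# then derive a simple quality score. Requires: pip install rdflib
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.alice, RDF.type, FOAF.Person))
g.add((EX.alice, FOAF.name, Literal("Alice")))
g.add((EX.bob, RDF.type, FOAF.Person))  # bob violates the rule

rule = """
    SELECT ?s WHERE {
        ?s a foaf:Person .
        FILTER NOT EXISTS { ?s foaf:name ?name }
    }
"""
violations = list(g.query(rule, initNs={"foaf": FOAF}))
persons = set(g.subjects(RDF.type, FOAF.Person))
score = 1 - len(violations) / len(persons)
print(violations, score)  # bob is flagged; score = 0.5
```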