
Showing papers on "Semantic Web" published in 2013


Journal ArticleDOI
TL;DR: This article develops a proposal of an RDF representation that modularly partitions and efficiently represents three components of RDF datasets: Header information, a Dictionary, and the actual Triples structure (thus called HDT).
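
To make the idea concrete, here is a minimal sketch (in Python, not the actual HDT binary format) of the Dictionary/Triples split: every term is replaced by an integer ID, so the triples component stores only compact ID tuples while the header records basic metadata.

```python
# Illustrative sketch of the HDT decomposition: a Header with metadata,
# a Dictionary mapping terms to integer IDs, and an integer-encoded
# Triples structure. This is a conceptual model, not the HDT binary format.

def build_hdt_like(triples):
    """Split a triple list into a term dictionary and integer-encoded triples."""
    dictionary = {}              # term -> integer ID
    def term_id(term):
        if term not in dictionary:
            dictionary[term] = len(dictionary) + 1
        return dictionary[term]
    encoded = [(term_id(s), term_id(p), term_id(o)) for s, p, o in triples]
    header = {"triples": len(encoded), "distinct_terms": len(dictionary)}
    return header, dictionary, encoded

triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob",   "foaf:knows", "ex:alice"),
]
header, dictionary, encoded = build_hdt_like(triples)
print(header)   # {'triples': 2, 'distinct_terms': 3}
print(encoded)  # [(1, 2, 3), (3, 2, 1)]
```

Because repeated terms are stored once in the dictionary, the triples component shrinks to fixed-size integer tuples, which is the intuition behind HDT's compactness.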

318 citations


Proceedings ArticleDOI
22 Jun 2013
TL;DR: This work studies the problem of querying graph databases and, in particular, the expressiveness and complexity of evaluation for several general-purpose query languages, such as regular path queries and their extensions with conjunctions and inverses.
Abstract: Graph databases have gained renewed interest in recent years, due to their applications in areas such as the Semantic Web and Social Network Analysis. We study the problem of querying graph databases and, in particular, the expressiveness and complexity of evaluation for several general-purpose query languages, such as regular path queries and their extensions with conjunctions and inverses. We distinguish between two semantics for these languages. The first one, based on simple paths, easily leads to intractability, while the second one, based on arbitrary paths, allows tractable evaluation for an expressive family of languages. We also study two recent extensions of these languages that have been motivated by modern applications of graph databases. The first one allows paths to be treated as first-class citizens, while the second one permits queries that combine the topology of the graph with its underlying data.
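
The tractable arbitrary-path semantics mentioned above can be sketched with a plain BFS; this toy evaluator (hypothetical code, not from the paper) answers the regular path query `knows+` over a labeled edge list.

```python
# Sketch of a regular path query under arbitrary-path semantics: find all
# nodes reachable from a start node via one or more 'knows' edges (knows+).
# BFS over labeled edges keeps evaluation tractable, unlike the simple-path
# semantics, where checking for a simple (repetition-free) path is NP-hard.
from collections import deque

def eval_rpq_plus(edges, start, label):
    """Evaluate the RPQ `label+` from `start` over a (subject, label, object) list."""
    adj = {}
    for s, l, o in edges:
        if l == label:
            adj.setdefault(s, []).append(o)
    seen, queue = set(), deque(adj.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(adj.get(node, []))
    return seen

edges = [("a", "knows", "b"), ("b", "knows", "c"), ("c", "knows", "a")]
print(sorted(eval_rpq_plus(edges, "a", "knows")))  # ['a', 'b', 'c']
```

Note that the cycle a -> b -> c -> a poses no problem here: the `seen` set bounds the traversal, which is exactly why arbitrary-path evaluation stays polynomial.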

192 citations


Book ChapterDOI
21 Oct 2013
TL;DR: It is shown that if optimal string similarity metrics are chosen, those alone can produce alignments that are competitive with the state of the art in ontology alignment systems.
Abstract: Ontology alignment is an important part of enabling the semantic web to reach its full potential. The vast majority of ontology alignment systems use one or more string similarity metrics, but often the choice of which metrics to use is not given much attention. In this work we evaluate a wide range of such metrics, along with string pre-processing strategies such as removing stop words and considering synonyms, on different types of ontologies. We also present a set of guidelines on when to use which metric. We furthermore show that if optimal string similarity metrics are chosen, those alone can produce alignments that are competitive with the state of the art in ontology alignment systems. Finally, we examine the improvements possible to an existing ontology alignment system using an automated string metric selection strategy based upon the characteristics of the ontologies to be aligned.
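
As an illustration of the kind of metric pipeline evaluated here, the following sketch applies simple preprocessing (lowercasing, stop-word removal) before a token-level Jaccard similarity; the stop-word list and labels are illustrative, not the paper's configuration.

```python
# Hedged sketch: a simple string similarity pipeline for ontology label
# matching. Labels are preprocessed (lowercased, underscores split,
# stop words removed) and compared with token-level Jaccard similarity.

STOP_WORDS = {"of", "the", "a", "an"}

def preprocess(label):
    """Tokenize an ontology label and drop stop words."""
    return [t for t in label.lower().replace("_", " ").split() if t not in STOP_WORDS]

def jaccard(label_a, label_b):
    """Jaccard similarity over preprocessed token sets, in [0, 1]."""
    a, b = set(preprocess(label_a)), set(preprocess(label_b))
    return len(a & b) / len(a | b) if a | b else 0.0

print(jaccard("Author_of_Paper", "Paper Author"))  # 1.0
```

With the stop word "of" removed, the two labels reduce to the same token set, so the metric treats them as an exact match; which metric and which preprocessing steps pay off is exactly what the paper evaluates per ontology type.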

183 citations


Journal ArticleDOI
TL;DR: The main goal of the challenge was to get insight into the strengths, capabilities, and current shortcomings of question answering systems as interfaces to query linked data sources, as well as benchmarking how these interaction paradigms can deal with the fact that the amount of RDF data available on the web is very large and heterogeneous with respect to the vocabularies and schemas used.

174 citations


Journal ArticleDOI
TL;DR: While there is no generally agreed understanding of what exactly is (or more importantly, what is not) Big Data, an increasing number of V’s has been used to characterize different dimensions and challenges of Big Data: volume, velocity, variety, value, and veracity.
Abstract: Around 2006, the inception of Linked Data [2] led to a realignment of the Semantic Web vision and the realization that data is not merely a way to evaluate our theoretical considerations, but a key research enabler in its own right that inspires novel theoretical and foundational research questions. Since then, Linked Data has been growing rapidly and is altering research, governments, and industry. Simply put, Linked Data takes the World Wide Web's ideas of global identifiers and links and applies them to (raw) data, not just documents. Moreover, and as regularly highlighted by Tim Berners-Lee, Anybody can say Anything about Any topic (AAA) [1], which leads to a multi-thematic, multi-perspective, and multi-medial global data graph. More recently, Big Data has made its appearance in the shared mindset of researchers, practitioners, and funding agencies, driven by the awareness that concerted efforts are needed to address 21st century data collection, analysis, management, ownership, and privacy issues. While there is no generally agreed understanding of what exactly is (or, more importantly, what is not) Big Data, an increasing number of V's has been used to characterize different dimensions and challenges of Big Data: volume, velocity, variety, value, and veracity. Interestingly, different (scientific) disciplines highlight certain dimensions and neglect others. For instance, supercomputing seems to be mostly interested in the volume dimension, while researchers working on sensor webs and the internet of things seem to push on the velocity front. The social sciences and humanities, in contrast, are more interested in value and veracity. As argued before [13,17], the variety dimension seems to be the most intriguing one for the Semantic Web and the one where we can contribute.

173 citations


Proceedings ArticleDOI
12 Oct 2013
TL;DR: SPrank as mentioned in this paper is a hybrid recommendation algorithm able to compute top-N item recommendations from implicit feedback by exploiting the information available in the so-called Web of Data. It leverages DBpedia, a well-known knowledge base in the LOD cloud, to extract semantic path-based features and to eventually compute recommendations using a learning-to-rank algorithm.
Abstract: The advent of the Linked Open Data (LOD) initiative gave birth to a variety of open knowledge bases freely accessible on the Web. They provide a valuable source of information that can improve conventional recommender systems, if properly exploited. In this paper we present SPrank, a novel hybrid recommendation algorithm able to compute top-N item recommendations from implicit feedback by exploiting the information available in the so-called Web of Data. We leverage DBpedia, a well-known knowledge base in the LOD cloud, to extract semantic path-based features and to eventually compute recommendations using a learning-to-rank algorithm. Experiments with datasets in two different domains show that the proposed approach outperforms several state-of-the-art top-N recommendation algorithms for implicit feedback in terms of prediction accuracy, in situations affected by different degrees of data sparsity.
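
The semantic path-based features SPrank builds on can be illustrated as follows; this is a hypothetical sketch (the function names and the toy graph are invented, not SPrank's code), counting the property paths that connect two items in a small DBpedia-like graph.

```python
# Illustrative sketch: each feature is a sequence of properties (a "semantic
# path") connecting two items in a knowledge graph; the feature value is how
# often that path occurs. Such counts can then feed a learning-to-rank model.
from collections import Counter

def path_features(edges, item_a, item_b, max_len=2):
    """Count property paths of length <= max_len connecting item_a to item_b."""
    adj = {}
    for s, p, o in edges:
        adj.setdefault(s, []).append((p, o))
    features = Counter()
    frontier = [(item_a, ())]
    for _ in range(max_len):
        next_frontier = []
        for node, path in frontier:
            for prop, target in adj.get(node, []):
                new_path = path + (prop,)
                if target == item_b:
                    features[new_path] += 1
                next_frontier.append((target, new_path))
        frontier = next_frontier
    return features

edges = [
    ("film:A", "dbo:director", "dbr:Nolan"),
    ("dbr:Nolan", "dbo:directed", "film:B"),
    ("film:A", "dbo:genre", "dbr:SciFi"),
    ("dbr:SciFi", "dbo:genreOf", "film:B"),
]
print(dict(path_features(edges, "film:A", "film:B")))
# {('dbo:director', 'dbo:directed'): 1, ('dbo:genre', 'dbo:genreOf'): 1}
```

Intuitively, a user who liked film:A gets film:B recommended because the two are linked through shared directors and genres, and the learned ranker weighs each path type by how predictive it is.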

161 citations


Book ChapterDOI
Aldo Gangemi1
26 May 2013
TL;DR: A landscape analysis is described of several tools, either conceived specifically for KE on the Semantic Web, or adaptable to it, or even acting as aggregators of data extracted by other tools.
Abstract: In recent years, basic NLP tasks (NER, WSD, relation extraction, etc.) have been configured for Semantic Web tasks including ontology learning, linked data population, entity resolution, NL querying to linked data, etc. Some assessment of the state of the art of existing Knowledge Extraction (KE) tools when applied to the Semantic Web is therefore desirable. In this paper we describe a landscape analysis of several tools, either conceived specifically for KE on the Semantic Web, or adaptable to it, or even acting as aggregators of data extracted by other tools. Our aim is to assess the currently available capabilities against a rich palette of ontology design constructs, focusing specifically on the actual semantic reusability of KE output.

144 citations


Journal ArticleDOI
TL;DR: A brief historical summary of AGROVOC is provided and its specification as a Linked Dataset, containing links as well as backlinks and references to many other Linked Datasets in the LOD cloud, is detailed.
Abstract: Born in the early 1980's as a multilingual agricultural thesaurus, AGROVOC has steadily evolved over the last fifteen years, moving to an electronic version around the year 2000, and embracing the Semantic Web shortly thereafter. Today AGROVOC is a SKOS-XL concept scheme published as Linked Open Data, containing links as well as backlinks and references to many other Linked Datasets in the LOD cloud. In this paper we provide a brief historical summary of AGROVOC and detail its specification as a Linked Dataset.

142 citations


Book ChapterDOI
26 May 2013
TL;DR: A significant update is described that increases the overall quality of RDFized datasets generated from open scripts, powered by an API to generate registry-validated IRIs, with dataset provenance and metrics, SPARQL endpoints, and downloadable RDF and database files.
Abstract: Bio2RDF currently provides the largest network of Linked Data for the Life Sciences. Here, we describe a significant update to increase the overall quality of RDFized datasets generated from open scripts powered by an API to generate registry-validated IRIs, dataset provenance and metrics, SPARQL endpoints, downloadable RDF and database files. We demonstrate federated SPARQL queries within and across the Bio2RDF network, including semantic integration using the Semanticscience Integrated Ontology (SIO). This work forms a strong foundation for increased coverage and continuous integration of data in the life sciences.

142 citations


Book
01 Sep 2013
TL;DR: Following the recent publication of the PROV standard for provenance on the Web, which the two authors actively helped shape in the Provenance Working Group at the World Wide Web Consortium, this Synthesis lecture is a hands-on introduction to PROV aimed at Web and linked data professionals.
Abstract: The World Wide Web is now deeply intertwined with our lives, and has become a catalyst for a data deluge, making vast amounts of data available online, at the click of a button. With Web 2.0, users are no longer passive consumers, but active publishers and curators of data. Hence, from science to food manufacturing, from data journalism to personal well-being, from social media to art, there is a strong interest in provenance, a description of what influenced an artifact, a data set, a document, a blog, or any resource on the Web and beyond. Provenance is a crucial piece of information that can help a consumer make a judgment as to whether something can be trusted. Provenance is no longer seen as a curiosity in art circles, but is regarded as pragmatically, ethically, and methodologically crucial for our day-to-day data manipulation and curation activities on the Web. Following the recent publication of the PROV standard for provenance on the Web, which the two authors actively helped shape in the Provenance Working Group at the World Wide Web Consortium, this Synthesis lecture is a hands-on introduction to PROV aimed at Web and linked data professionals. By means of recipes, illustrations, a website at www.provbook.org, and tools, it guides practitioners through a variety of issues related to provenance: how to generate provenance, publish it on the Web, make it discoverable, and how to utilize it. Equipped with this knowledge, practitioners will be in a position to develop novel applications that can bring openness, trust, and accountability.
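
The flavor of PROV statements the book introduces can be sketched as follows; the emitted strings approximate PROV-N notation, and the identifiers are illustrative.

```python
# Minimal sketch of recording provenance in the spirit of PROV: an entity
# (the artifact), the activity that generated it, and the agent responsible.
# The strings approximate PROV-N notation; "-" stands for an omitted
# optional argument (generation time, plan).

def prov_record(entity, activity, agent):
    """Return PROV-N-style statements linking entity, activity, and agent."""
    return [
        f"entity({entity})",
        f"activity({activity})",
        f"agent({agent})",
        f"wasGeneratedBy({entity}, {activity}, -)",
        f"wasAssociatedWith({activity}, {agent}, -)",
    ]

for line in prov_record("ex:chart1", "ex:compileChart", "ex:derek"):
    print(line)
```

A consumer inspecting ex:chart1 can follow these statements back to the activity and agent that produced it, which is the trust judgment the abstract describes.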

137 citations


Proceedings ArticleDOI
03 Jun 2013
TL;DR: CASSARAM is presented, a context-aware sensor search, selection, and ranking model for the Internet of Things, to address the research challenge of selecting sensors when large numbers of sensors with overlapping and sometimes redundant functionality are available.
Abstract: As we move towards the Internet of Things (IoT), the number of sensors deployed around the world is growing at a rapid pace. Market research has shown significant growth of sensor deployments over the past decade and has predicted a substantial acceleration of the growth rate in the future. It is also evident that an increasing number of IoT middleware solutions is being developed in both research and commercial environments. However, sensor search and selection remain a critical requirement and a challenge. In this paper, we present CASSARAM, a context-aware sensor search, selection, and ranking model for the Internet of Things that addresses the challenge of selecting sensors when large numbers of sensors with overlapping and sometimes redundant functionality are available. CASSARAM searches and selects sensors based on user priorities, and considers a broad range of sensor characteristics, such as reliability, accuracy, and battery life, to name a few. Our approach utilises both semantic querying and quantitative reasoning techniques. A weighted Euclidean distance comparison in multidimensional space, with weights derived from user priorities, is used to index and rank sensors. Our objectives are to highlight the importance of sensor search in the IoT paradigm, to identify important characteristics of both sensors and data acquisition processes that help to select sensors, and to understand how semantic and statistical reasoning can be combined to address this problem in an efficient manner. We developed a tool called CASSARA to evaluate the proposed model in terms of resource consumption and response time.
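
The user-priority weighted Euclidean ranking can be sketched as follows (illustrative values and property names, not the paper's code): sensors are points in a normalized property space, the user supplies priority weights, and the sensors closest to the ideal point rank first.

```python
# Sketch of weighted Euclidean ranking: each sensor is a point in a
# normalized property space (e.g. reliability, accuracy, battery life),
# the user supplies priority weights, and sensors closest to the ideal
# point (all properties at 1.0) rank first.
import math

def rank_sensors(sensors, weights, ideal=1.0):
    """Return (name, props) pairs ordered by weighted Euclidean distance to the ideal."""
    def distance(props):
        return math.sqrt(sum(w * (ideal - v) ** 2 for w, v in zip(weights, props)))
    return sorted(sensors, key=lambda name_props: distance(name_props[1]))

sensors = [
    ("s1", [0.9, 0.6, 0.8]),   # reliability, accuracy, battery life in [0, 1]
    ("s2", [0.7, 0.9, 0.8]),
]
weights = [0.6, 0.3, 0.1]      # this user cares most about reliability
print([name for name, _ in rank_sensors(sensors, weights)])  # ['s1', 's2']
```

Because the reliability weight dominates, s1 (more reliable but less accurate) outranks s2; flipping the weights towards accuracy would reverse the order, which is the user-priority effect CASSARAM exploits.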

Journal ArticleDOI
TL;DR: Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future.

Book ChapterDOI
30 Jul 2013
TL;DR: This article presents an overview of the Linked Data lifecycle and discusses individual approaches as well as the state of the art with regard to extraction, authoring, linking, enrichment, and quality of Linked Data.
Abstract: With Linked Data, a very pragmatic approach towards achieving the vision of the Semantic Web has gained traction in recent years. The term Linked Data refers to a set of best practices for publishing and interlinking structured data on the Web. While many standards, methods, and technologies developed by the Semantic Web community are applicable to Linked Data, there are also a number of specific characteristics of Linked Data which have to be considered. In this article we introduce the main concepts of Linked Data. We present an overview of the Linked Data lifecycle and discuss individual approaches as well as the state of the art with regard to extraction, authoring, linking, enrichment, and quality of Linked Data. We conclude the chapter with a discussion of issues, limitations, and further research and development challenges of Linked Data. This article is an updated version of a similar lecture given at the Reasoning Web Summer School 2011.

Journal ArticleDOI
TL;DR: SMARTMUSEUM, a mobile ubiquitous recommender system for the Web of Data, and its application to the information needs of tourists in context-aware on-site access to cultural heritage, indicate that semantic content representation and retrieval can significantly improve the performance of mobile recommender systems in knowledge-rich domains.

Proceedings Article
01 Jan 2013
TL;DR: Hydra, a small vocabulary to describe Web APIs that aims to simplify the development of truly RESTful services by leveraging the power of Linked Data, is developed.
Abstract: Dealing with the ever-increasing amount of data becomes increasingly challenging. To alleviate the information overload put on people, systems are progressively being connected directly to each other. They exchange, analyze, and manipulate humongous amounts of data without any human interaction. Most current solutions, however, do not exploit the whole potential of the architecture of the World Wide Web and completely ignore the possibilities offered by Semantic Web technologies. Based on the experiences gained by implementing and analyzing various RESTful APIs, and drawing from the longer history of Semantic Web research, we developed Hydra, a small vocabulary to describe Web APIs. It aims to simplify the development of truly RESTful services by leveraging the power of Linked Data. By breaking the descriptions down into small independent fragments, a new breed of interoperable Web APIs using decentralized, reusable, and composable contracts can be realized.

Journal ArticleDOI
TL;DR: This article provides a comprehensive and comparative overview of approaches to modeling argumentation for the Social Semantic Web from theoretical foundational models to Social Web tools for argumentation, following the path to a global World Wide Argument Web.
Abstract: Argumentation represents the study of views and opinions that humans express with the goal of reaching a conclusion through logical reasoning. Since the 1950s, several models have been proposed to capture the essence of informal argumentation in different settings. With the emergence of the Web, and then the Semantic Web, this modeling shifted towards ontologies, while from the development perspective, we witnessed an important increase in Web 2.0 human-centered collaborative deliberation tools. Through a review of more than 150 scholarly papers, this article provides a comprehensive and comparative overview of approaches to modeling argumentation for the Social Semantic Web. We start from theoretical foundational models and investigate how they have influenced Social Web tools. We also look into Semantic Web argumentation models. Finally, we end with Social Web tools for argumentation, including online applications combining Web 2.0 and Semantic Web technologies, following the path to a global World Wide Argument Web.

Proceedings ArticleDOI
12 Oct 2013
TL;DR: This paper uses a content-based system enriched with Linked Data to overcome the data sparsity, a problem induced by the transiency of events, and incorporates a collaborative filtering to involve the social aspect, an influential feature in decision making.
Abstract: An ever-increasing number of social services offer thousands of diverse events per day. Users tend to be overwhelmed by the massive amount of information available, especially given the limited browsing options of many event web services. To alleviate this information overload, a recommender system becomes a vital component for helping users select relevant events. However, such a system faces a number of challenges owing to the inherently complex nature of an event. In this paper, we propose a novel hybrid approach built on top of the Semantic Web. On the one hand, we use a content-based system enriched with Linked Data to overcome data sparsity, a problem induced by the transiency of events. On the other hand, we incorporate collaborative filtering to involve the social aspect, an influential feature in decision making. This hybrid system is enhanced by the integration of a user diversity model designed to detect user propensity towards specific topics. We show how the hybridization of CB+CF systems and the integration of interest diversity features are important to improve predictions. Experimental results demonstrate the effectiveness of our approach using precision and recall measures.
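
The CB+CF hybridization can be illustrated with a simple weighted combination; this is a common hybridization scheme sketched under assumptions (the mixing weight, the candidate events, and the normalized scores are all illustrative, not the paper's exact model).

```python
# Illustrative weighted hybridization of a content-based (CB) score and a
# collaborative-filtering (CF) score. alpha controls the mix; all scores
# are assumed normalized to [0, 1]. For transient events with few ratings,
# a higher alpha lets the Linked-Data-enriched CB side dominate.

def hybrid_score(cb_score, cf_score, alpha=0.7):
    """Linear blend of content-based and collaborative scores."""
    return alpha * cb_score + (1 - alpha) * cf_score

# Rank candidate events for one user from per-event (cb, cf) scores.
candidates = {"concert": (0.9, 0.2), "meetup": (0.5, 0.8), "expo": (0.3, 0.4)}
ranked = sorted(candidates, key=lambda e: hybrid_score(*candidates[e]), reverse=True)
print(ranked)  # ['concert', 'meetup', 'expo']
```

With alpha = 0.7 the content match dominates, so the concert wins despite its weak collaborative signal; lowering alpha shifts the ranking towards the socially popular meetup.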

Journal ArticleDOI
TL;DR: This paper provides a structured assessment and classification of existing challenges and approaches, serving as potential guideline for researchers and practitioners in the field of TEL.
Abstract: Purpose - Research in the area of technology-enhanced learning (TEL) throughout the last decade has largely focused on sharing and reusing educational resources and data. This effort has led to a fragmented landscape of competing metadata schemas, or interface mechanisms. More recently, semantic technologies were taken into account to improve interoperability. The linked data approach has emerged as the de facto standard for sharing data on the web. To this end, it is obvious that the application of linked data principles offers a large potential to solve interoperability issues in the field of TEL. This paper aims to address this issue. Design/methodology/approach - In this paper, approaches are surveyed that are aimed towards a vision of linked education, i.e. education which exploits educational web data. It particularly considers the exploitation of the wealth of already existing TEL data on the web by allowing its exposure as linked data and by taking into account automated enrichment and interlinking techniques to provide rich and well-interlinked data for the educational domain. Findings - So far web-scale integration of educational resources is not facilitated, mainly due to the lack of take-up of shared principles, datasets and schemas. However, linked data principles increasingly are recognized by the TEL community. The paper provides a structured assessment and classification of existing challenges and approaches, serving as potential guideline for researchers and practitioners in the field. Originality/value - Being one of the first comprehensive surveys on the topic of linked data for education, the paper has the potential to become a widely recognized reference publication in the area.

BookDOI
TL;DR: This work has adapted, extended, and integrated several open source applications and frameworks that handle major portions of functionality for these platforms, including an object-type repository, collaboration tools, an ability to identify and manage all key entities in the platform, and an integrated portal to manage diverse content and applications.
Abstract: As collaborative, or network, science spreads into more science, engineering, and medical fields, both the participants and their funders have expressed a very strong desire for highly functional data and information capabilities that are a) easy to use, b) integrated in a variety of ways, c) leverage prior investments and keep pace with rapid technical change, and d) not expensive or time-consuming to build or maintain. In response, and based on our accumulated experience over the last decade and a maturing of several key semantic web approaches, we have adapted, extended, and integrated several open source applications and frameworks that handle major portions of functionality for these platforms. At minimum, these functions include: an object-type repository, collaboration tools, an ability to identify and manage all key entities in the platform, and an integrated portal to manage diverse content and applications, with varied access levels and privacy options. At the same time, there is increasing attention to how researchers present and explain results based on interpretation of increasingly diverse and heterogeneous data and information sources. With the renewed emphasis on good data practices, informatics practitioners have responded to this challenge with maturing informatics-based approaches. These approaches include, but are not limited to: use case development; information modeling and architectures; elaborating vocabularies; mediating interfaces to data and related services on the Web; and traceable provenance. The current era of data-intensive research presents numerous challenges to both individuals and research teams. In environmental science especially, sub-fields that were data-poor are becoming data-rich (in volume, type, and mode), while some that were largely model/simulation driven are now dramatically shifting to data-driven or at least to data-model assimilation approaches.
These paradigm shifts make it very hard for researchers used to one mode to shift to another, let alone produce products of their work that are usable or understandable by non-specialists. However, it is exactly at these frontiers where much of the exciting environmental science needs to be performed and appreciated.

BookDOI
01 Jan 2013
TL;DR: This paper presents a unified framework for aligning taxonomies, the most widely used kind of ontologies, and for debugging taxonomies and their alignments, where ontology alignment is treated as a special kind of debugging.
Abstract: With the increased use of ontologies in semantically-enabled applications, the issues of debugging and aligning ontologies have become increasingly important. The quality of the results of such applications is directly dependent on the quality of the ontologies and mappings between the ontologies they employ. A key step towards achieving high quality ontologies and mappings is discovering and resolving modeling defects, e.g., wrong or missing relations and mappings. In this paper we present a unified framework for aligning taxonomies, the most used kind of ontologies, and debugging taxonomies and their alignments, where ontology alignment is treated as a special kind of debugging. Our framework supports the detection and repairing of missing and wrong is-a structure in taxonomies, as well as the detection and repairing of missing (alignment) and wrong mappings between ontologies. Further, we implemented a system based on this framework and demonstrate its benefits through experiments with ontologies from the Ontology Alignment Evaluation Initiative.

Journal ArticleDOI
TL;DR: A critical analysis and comparison of several ontology engineering methodologies showed that there is no completely mature methodology, and this research may act as a preliminary guide to coming up with a state-of-the-art ontology engineering methodology, bridging the existing gaps and shortfalls.
Abstract: It is now widely accepted that ontologies play a critical role in achieving the goal of a machine-understandable web, also known as the semantic web. In order to develop ontologies, several methodologies have been proposed during the last two decades. Despite the fact that quite a number of ontology engineering methodologies have been proposed, the field still lacks widely accepted and mature methodologies. Most methodologies lack sufficient details of the techniques and activities employed in them. However, some methodologies, including METHONTOLOGY, do provide sufficient details. This article discusses and reports a critical analysis and comparison of these methodologies. The analysis is performed based on a criterion derived from related literature, trends, and needs which evolved over the years. The results of the analysis showed that there is no completely mature methodology. Therefore, this research may act as a preliminary guide to coming up with a state-of-the-art ontology engineering methodology, bridging the existing gaps and shortfalls.

Journal ArticleDOI
TL;DR: This updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferenceable, linked data.
Abstract: Making data available as Linked Data using the Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis. This paper describes recent developments in an ongoing project converting data from the ChEMBL database into RDF triples. Relative to earlier versions, this updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferenceable, linked data. To demonstrate these new features, we present novel use cases showing further integration with other web resources, including Bio2RDF, Chem2Bio2RDF, and ChemSpider, and showing the use of standard ontologies for querying. We have illustrated the advantages of using open standards and ontologies to link the ChEMBL database to other databases. Using those links and the knowledge encoded in standards and ontologies, the ChEMBL-RDF resource creates a foundation for integrated semantic web cheminformatics applications, such as the presented decision support.

Proceedings ArticleDOI
08 Apr 2013
TL;DR: A distributed SPARQL engine is proposed that combines a graph partitioning technique with workload-aware replication of triples across partitions, enabling efficient query execution even for complex queries from the workload.
Abstract: With the increasing popularity of the Semantic Web, more and more data becomes available in RDF, with SPARQL as a query language. Data sets, however, can become too big to be managed and queried on a single server in a scalable way. Existing distributed RDF stores approach this problem using data partitioning, aiming at limiting the communication between servers and exploiting parallelism. This paper proposes a distributed SPARQL engine that combines a graph partitioning technique with workload-aware replication of triples across partitions, enabling efficient query execution even for complex queries from the workload. Furthermore, it discusses query optimization techniques for producing efficient execution plans for ad-hoc queries not contained in the workload.
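
The combination of partitioning with workload-aware replication can be sketched as follows; this is a simplified illustration (subject-hash placement plus replication of workload-hot triples), not the paper's actual algorithm.

```python
# Simplified sketch: hash-partition triples by subject across servers, then
# replicate triples that a query workload touches often, so frequent joins
# can run on a single server without cross-server communication.

def partition_with_replication(triples, workload_counts, n_servers, hot_threshold=2):
    """Place each triple by subject hash; replicate workload-hot triples everywhere."""
    partitions = [set() for _ in range(n_servers)]
    for t in triples:
        partitions[hash(t[0]) % n_servers].add(t)        # primary placement
    hot = {t for t, c in workload_counts.items() if c >= hot_threshold}
    for p in partitions:                                 # replicate hot triples
        p.update(hot)
    return partitions

triples = [("s1", "p", "o1"), ("s2", "p", "o2"), ("s3", "p", "o3")]
workload = {("s2", "p", "o2"): 5}                        # queried often
parts = partition_with_replication(triples, workload, n_servers=2)
print(all(("s2", "p", "o2") in p for p in parts))        # True
```

The trade-off is storage for communication: every server carries a copy of the hot triples, so workload queries touching them never need to cross partition boundaries.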

Book ChapterDOI
21 Oct 2013
TL;DR: This work presents implementation details of a joint inference module that uses knowledge from the linked open data (LOD) cloud to jointly infer the semantics of column headers, table cell values and relations between columns.
Abstract: We describe work on automatically inferring the intended meaning of tables and representing it as RDF linked data, making it available for improving search, interoperability and integration. We present implementation details of a joint inference module that uses knowledge from the linked open data (LOD) cloud to jointly infer the semantics of column headers, table cell values (e.g., strings and numbers) and relations between columns. We also implement a novel Semantic Message Passing algorithm which uses LOD knowledge to improve existing message passing schemes. We evaluate our implemented techniques on tables from the Web and Wikipedia.
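
One ingredient of such joint inference, assigning a semantic type to a column by aggregating evidence from its cell values, can be sketched as follows; the candidate lookup table below is a toy stand-in for queries against LOD knowledge, not the paper's actual knowledge base.

```python
# Simplified sketch of one inference step: pick a column's semantic type by
# voting over the candidate types of its cell values. In the full system,
# message passing would further reconcile this choice with the header and
# with relations to other columns.
from collections import Counter

CANDIDATES = {            # toy "knowledge base": cell value -> possible types
    "Paris":  ["City", "Person"],
    "Berlin": ["City"],
    "Rome":   ["City", "Film"],
}

def infer_column_type(cells):
    """Majority vote over candidate types of the column's cell values."""
    votes = Counter(t for cell in cells for t in CANDIDATES.get(cell, []))
    return votes.most_common(1)[0][0] if votes else None

print(infer_column_type(["Paris", "Berlin", "Rome"]))  # City
```

Even though "Paris" and "Rome" are individually ambiguous, the column as a whole votes for City, which is the kind of mutual disambiguation the joint inference module performs at scale.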

Book
27 Aug 2013
TL;DR: Foundations of Fuzzy Logic and Semantic Web Languages provides a rigorous and succinct account of the mathematical methods and tools used for representing and reasoning with fuzzy information within Semantic Web languages.
Abstract: Managing vagueness/fuzziness is starting to play an important role in Semantic Web research, with a large number of research efforts underway. Foundations of Fuzzy Logic and Semantic Web Languages provides a rigorous and succinct account of the mathematical methods and tools used for representing and reasoning with fuzzy information within Semantic Web languages. The book focuses on the three main streams of Semantic Web languages: the triple languages RDF and RDFS; the conceptual languages OWL and OWL 2, and their profiles OWL EL, OWL QL, and OWL RL; and rule-based languages, such as SWRL and RIF. Written by a prominent researcher in this area, the book is the first to combine coverage of fuzzy logic and Semantic Web languages. The first part of the book covers all the theoretical and logical aspects of classical (two-valued) Semantic Web languages. The second part explains how to generalize these languages to cope with fuzzy set theory and fuzzy logic. With an extensive bibliography, this book provides in-depth insight into fuzzy Semantic Web languages for readers who are not experts in fuzzy set theory and fuzzy logic. It also helps researchers of non-Semantic Web languages get a better understanding of the theoretical fundamentals of Semantic Web languages.
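
The fuzzy generalization rests on interpreting conjunction by a t-norm over truth degrees in [0, 1]; here is a minimal sketch of the Goedel (minimum) and product t-norms used by common fuzzy logics, with an illustrative example.

```python
# Sketch of fuzzy degrees in Semantic-Web-style reasoning: an axiom holds to
# a degree in [0, 1], and conjunction is interpreted by a t-norm. Shown are
# the Goedel (minimum) and product t-norms.

def t_norm_goedel(a, b):
    """Goedel (minimum) t-norm."""
    return min(a, b)

def t_norm_product(a, b):
    """Product t-norm."""
    return a * b

# Degree to which "x is a YoungResearcher" holds if x is Young to degree 0.8
# and a Researcher to degree 0.9 (illustrative concept names):
print(t_norm_goedel(0.8, 0.9))            # 0.8
print(round(t_norm_product(0.8, 0.9), 2))  # 0.72
```

The choice of t-norm changes the logic's behavior (e.g. the Goedel conjunction is idempotent, the product one is not), which is why the book treats the fuzzy semantics per logic family.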

Journal ArticleDOI
TL;DR: An efficient evaluation of all existing semantic similarity methods based on structure, information content, and feature approaches is given to help researchers and practitioners select the measure that best fits their requirements.
Abstract: In recent years, semantic similarity measures have attracted great interest in the Semantic Web and Natural Language Processing (NLP). Several similarity measures have been developed, given the existence of structured knowledge representations offered by ontologies and corpora which enable semantic interpretation of terms. Semantic similarity measures compute the similarity between concepts/terms included in knowledge sources in order to perform estimations. This paper discusses the existing semantic similarity methods based on structure, information content, and feature approaches. Additionally, we present a critical evaluation of several categories of semantic similarity approaches based on two standard benchmarks. The aim of this paper is to give an efficient evaluation of all these measures, helping researchers and practitioners select the measure that best fits their requirements.
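
A small example of a structure-based measure of the kind surveyed here is the Wu-Palmer similarity over an is-a taxonomy; the toy taxonomy below is illustrative.

```python
# Sketch of a structure-based semantic similarity measure in the Wu-Palmer
# style over a tiny is-a taxonomy:
#     sim(a, b) = 2 * depth(lcs) / (depth(a) + depth(b))
# where lcs is the lowest common subsumer and the root has depth 1.

def ancestors(taxonomy, node):
    """Return the chain node, parent, ..., root (taxonomy maps child -> parent)."""
    chain = [node]
    while node in taxonomy:
        node = taxonomy[node]
        chain.append(node)
    return chain

def wu_palmer(taxonomy, a, b):
    anc_a, anc_b = ancestors(taxonomy, a), ancestors(taxonomy, b)
    lcs = next(n for n in anc_a if n in anc_b)   # lowest common subsumer
    depth = lambda n: len(ancestors(taxonomy, n))
    return 2 * depth(lcs) / (depth(a) + depth(b))

taxonomy = {"cat": "mammal", "dog": "mammal", "mammal": "animal"}  # child -> parent
print(round(wu_palmer(taxonomy, "cat", "dog"), 3))  # 0.667
```

Here "cat" and "dog" share "mammal" at depth 2 while each sits at depth 3, giving 2*2/(3+3); information-content measures such as Resnik's would instead weight the subsumer by corpus frequency.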

Journal ArticleDOI
TL;DR: The approach involves three web services that cooperate to achieve production goals using the domain web services and maintains a semantic model of the current state of the system, which is automatically updated based on event notifications sent by the domain services.
Abstract: This paper presents an approach to using semantic web services in managing production processes. In particular, the devices in the production systems considered expose web service interfaces through which they can be controlled, while semantic web service descriptions formulated in the Web Ontology Language for Services (OWL-S) make it possible to determine the conditions and effects of invoking the web services. The approach involves three web services that cooperate to achieve production goals using the domain web services. In particular, one of the three services maintains a semantic model of the current state of the system, while another uses the model to compose the domain web services so that they jointly achieve the desired goals. The semantic model of the system is automatically updated based on event notifications sent by the domain services.
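The pattern described above can be sketched in a few lines. This is not the paper's OWL-S implementation; device names, events, and the precondition/effect encoding are invented for illustration. A model service holds the current system state, event notifications from domain services update it, and a composing service checks a domain service's precondition against the model before invoking it:

```python
# Semantic model of the current system state (illustrative keys/values).
state = {"conveyor1.running": False, "pallet.at": "station_A"}

def on_event(event):
    """Event notification from a domain web service updates the model."""
    state.update(event)

# OWL-S-style description of one domain service, reduced to a dict:
# conditions that must hold before invocation, and effects afterwards.
move_pallet = {
    "precondition": {"conveyor1.running": True, "pallet.at": "station_A"},
    "effect": {"pallet.at": "station_B"},
}

def can_invoke(service):
    """Check the service's precondition against the semantic model."""
    return all(state.get(k) == v for k, v in service["precondition"].items())

on_event({"conveyor1.running": True})    # notification: conveyor has started
if can_invoke(move_pallet):
    state.update(move_pallet["effect"])  # simulate invoking the service

print(state["pallet.at"])  # station_B
```

The point of the semantic layer is that `precondition` and `effect` are declared in the service description rather than hard-coded in the composer, so new domain services can be composed without changing the orchestration logic.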

Journal ArticleDOI
TL;DR: It is shown that MathML and OpenMath, the standard XML-based exchange languages for mathematical knowledge, can be fully integrated with RDF representations in order to contribute existing mathematical knowledge to the Web of Data.
Abstract: Mathematics is a ubiquitous foundation of science, technology, and engineering. Specific areas of mathematics, such as numeric and symbolic computation or logics, enjoy considerable software support. Working mathematicians have recently started to adopt Web 2.0 environments, such as blogs and wikis, but these systems lack machine support for knowledge organization and reuse, and they are disconnected from tools such as computer algebra systems or interactive proof assistants. We argue that such scenarios will benefit from Semantic Web technology. Conversely, mathematics is still underrepresented on the Web of [Linked] Data. There are mathematics-related Linked Data, for example statistical government data or scientific publication databases, but their mathematical semantics has not yet been modeled. We argue that the services for the Web of Data will benefit from a deeper representation of mathematical knowledge. Mathematical knowledge comprises structures given in a logical language (formulae, statements such as axioms, and theories), a mixture of rigorous natural language and symbolic notation in documents, application-specific metadata, and discussions about conceptualizations, formalizations, proofs, and counter-examples. Our review of vocabularies for representing these structures covers ontologies for mathematical problems, proofs, interlinked scientific publications, scientific discourse, as well as mathematical metadata vocabularies and domain knowledge from pure and applied mathematics. Many fields of mathematics have not yet been implemented as proper Semantic Web ontologies; however, we show that MathML and OpenMath, the standard XML-based exchange languages for mathematical knowledge, can be fully integrated with RDF representations in order to contribute existing mathematical knowledge to the Web of Data. We conclude with a roadmap for getting the mathematical Web of Data started: what datasets to publish, how to interlink them, and how to take advantage of these new connections.
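The kind of integration envisioned can be sketched in Turtle. All URIs under `example.org` and the properties `ex:hasFormula`, `ex:appliesSymbol`, and `ex:contentMathML` are hypothetical, chosen only to show the shape of the linkage; the OpenMath content dictionary URI is the real `arith1` dictionary, and the MathML payload is elided:

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix ex:  <http://example.org/math/> .

# A theorem published as a Linked Data resource, with ordinary metadata
# (Dublin Core) alongside its mathematical content.
ex:theorem42
    dct:title "Sum of the first n odd numbers"@en ;
    ex:hasFormula ex:formula42 .

# The formula resource links to the OpenMath symbol it applies and
# embeds its Content MathML markup as an XML literal (elided here).
ex:formula42
    ex:appliesSymbol <http://www.openmath.org/cd/arith1#sum> ;
    ex:contentMathML "...Content MathML elided..."^^rdf:XMLLiteral .
```

Once a formula is a first-class RDF resource like this, it can be interlinked with publication databases or statistical datasets on the Web of Data, which is exactly the gap the abstract points out.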

Journal ArticleDOI
TL;DR: This paper presents a Linked Data model and a RESTful proxy for OGC's Sensor Observation Service to improve integration and inter-linkage of observation data for the Digital Earth.
Abstract: The vision of a Digital Earth calls for more dynamic information systems, new sources of information, and stronger capabilities for their integration. Sensor networks have been identified as a major information source for the Digital Earth, while Semantic Web technologies have been proposed to facilitate integration. So far, sensor data are stored and published using the Observations & Measurements standard of the Open Geospatial Consortium (OGC) as the data model. With the advent of Volunteered Geographic Information and the Semantic Sensor Web, work on an ontological model gained importance within Sensor Web Enablement (SWE). In contrast to data models, an ontological approach abstracts from implementation details by focusing on modeling the physical world from the perspective of a particular domain. Ontologies restrict the interpretation of vocabularies toward their intended meaning. The ongoing paradigm shift to Linked Sensor Data complements this attempt. Two questions have to be addressed: (1) ...
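The translation step such a proxy performs can be sketched as follows. This is not the paper's implementation: the URIs are invented, and the property names are illustrative stand-ins loosely inspired by SSN/O&M vocabulary, not the normative terms. The idea is that one O&M-style observation from an OGC Sensor Observation Service becomes a set of RDF-style triples served at a stable, dereferenceable URI:

```python
# Illustrative conversion of one observation into subject-predicate-object
# triples; in a real proxy these would be serialized as RDF (e.g. Turtle).

def observation_to_triples(obs_id, sensor_uri, property_uri, value, unit, time):
    s = f"http://example.org/observations/{obs_id}"  # stable Linked Data URI
    return [
        (s, "rdf:type", "ssn:Observation"),
        (s, "ssn:observedBy", sensor_uri),
        (s, "ssn:observedProperty", property_uri),
        (s, "ssn:hasValue", f"{value} {unit}"),
        (s, "ssn:observationResultTime", time),
    ]

triples = observation_to_triples(
    "obs-1",
    "http://example.org/sensors/temp-7",
    "http://example.org/properties/AirTemperature",
    21.5, "Cel", "2013-05-01T12:00:00Z",
)
for s, p, o in triples:
    print(s, p, o)
```

Because every observation, sensor, and observed property gets its own URI, third parties can interlink them with other Digital Earth datasets, which is the integration benefit the abstract argues for.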

Book ChapterDOI
21 Oct 2013
TL;DR: This study is based on a large public Web crawl dating from early 2012 and consisting of 3 billion HTML pages which originate from over 40 million websites, and reveals the deployment of the different markup standards, the main topical areas of the published data as well as the different vocabularies that are used within each topical area to represent data.
Abstract: More and more websites embed structured data describing for instance products, reviews, blog posts, people, organizations, events, and cooking recipes into their HTML pages using markup standards such as Microformats, Microdata and RDFa. This development has accelerated in the last two years as major Web companies, such as Google, Facebook, Yahoo!, and Microsoft, have started to use the embedded data within their applications. In this paper, we analyze the adoption of RDFa, Microdata, and Microformats across the Web. Our study is based on a large public Web crawl dating from early 2012 and consisting of 3 billion HTML pages which originate from over 40 million websites. The analysis reveals the deployment of the different markup standards, the main topical areas of the published data as well as the different vocabularies that are used within each topical area to represent data. What distinguishes our work from earlier studies, published by the large Web companies, is that the analyzed crawl as well as the extracted data are publicly available. This allows our findings to be verified and to be used as starting points for further domain-specific investigations as well as for focused information extraction endeavors.
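The extraction step underlying such a study can be illustrated with the standard library alone. The sketch below counts Microdata `itemtype` declarations in an HTML snippet; crawl-scale analyses use dedicated extraction frameworks rather than a hand-rolled parser, and the sample HTML here is invented:

```python
from html.parser import HTMLParser
from collections import Counter

class ItemtypeCounter(HTMLParser):
    """Count Microdata itemtype URIs on elements carrying itemscope."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # valueless attributes (itemscope) map to None
        if "itemscope" in a and a.get("itemtype"):
            self.counts[a["itemtype"]] += 1

html = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Example Widget</span>
</div>
<div itemscope itemtype="http://schema.org/Review"></div>
"""

parser = ItemtypeCounter()
parser.feed(html)
print(parser.counts.most_common())
```

Aggregating such counts per website, rather than per page, is what lets a study like this report adoption across the 40 million websites in the crawl without a few large template-driven sites dominating the statistics.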