
Showing papers on "Semantic Web" published in 2014


Proceedings ArticleDOI
06 Mar 2014
TL;DR: A novel architecture model for IoT with the help of Semantic Fusion Model (SFM) is presented, which introduces the use of Smart Semantic framework to encapsulate the processed information from sensor networks.
Abstract: The Internet-of-Things (IoT) is the convergence of the Internet with RFID, sensors and smart objects. IoT can be defined as “things belonging to the Internet” that supply and access all real-world information. Billions of devices are expected to be connected into the system, which will require huge distribution of networks as well as processes for transforming raw data into meaningful inferences. IoT is one of the biggest promises of technology today, but it still lacks a unifying mechanism that can be perceived through the lenses of the Internet, things and the semantic vision. This paper presents a novel architecture model for IoT based on a Semantic Fusion Model (SFM). The architecture introduces a Smart Semantic framework to encapsulate the processed information from sensor networks. The smart embedded system carries semantic logic and semantic-value-based information that make it an intelligent system. The paper also discusses Internet-oriented applications, services, visual aspects and challenges for the Internet of Things using RFID, 6LoWPAN and sensor networks.

463 citations


Journal ArticleDOI
TL;DR: A structured and comprehensive overview of the literature in the field of Web Data Extraction is provided, grouping applications into two main classes: applications at the Enterprise level and at the Social Web level, the latter making it possible to gather large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users.
Abstract: Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed for one domain in other domains.

364 citations


Book ChapterDOI
19 Oct 2014
TL;DR: New RDF exports that connect Wikidata to the Linked Data Web are introduced, along with several partial exports that provide more selective or simplified views on the data.
Abstract: Wikidata is the central data management platform of Wikipedia. By the efforts of thousands of volunteers, the project has produced a large, open knowledge base with many interesting applications. The data is highly interlinked and connected to many other datasets, but it is also very rich, complex, and not available in RDF. To address this issue, we introduce new RDF exports that connect Wikidata to the Linked Data Web. We explain the data model of Wikidata and discuss its encoding in RDF. Moreover, we introduce several partial exports that provide more selective or simplified views on the data. This includes a class hierarchy and several other types of ontological axioms that we extract from the site. All datasets we discuss here are freely available online and updated regularly.

287 citations
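The Wikidata data model described above attaches qualifiers to statements, which is why a plain subject-predicate-object triple is not enough and statements are reified via intermediate nodes. As a rough illustration only: the entity and property IDs below are real Wikidata-style identifiers, but the namespace prefixes and statement-node naming are simplified assumptions, not the exact scheme of the paper's exports.

```python
# Sketch of reifying a Wikidata statement into RDF-style triples.
# The prefixes (wd:, p:, ps:, pq:, wds:) and the statement-node
# naming are illustrative assumptions, not the paper's exact encoding.

def reify_statement(subject, prop, value, qualifiers=None, stmt_id="S1"):
    """Turn one (subject, property, value) claim plus qualifiers
    into a list of triples, using an intermediate statement node."""
    stmt = f"wds:{subject}-{stmt_id}"          # statement node
    triples = [
        (f"wd:{subject}", f"p:{prop}", stmt),  # entity -> statement
        (stmt, f"ps:{prop}", value),           # statement -> main value
    ]
    for qprop, qval in (qualifiers or {}).items():
        triples.append((stmt, f"pq:{qprop}", qval))  # qualifier triples
    return triples

# Douglas Adams (Q42), educated at (P69) St John's College (Q691283),
# qualified with an end date (P582).
triples = reify_statement("Q42", "P69", "wd:Q691283",
                          qualifiers={"P582": '"1974"^^xsd:gYear'})
for t in triples:
    print(t)
```

The qualifier lands on the statement node rather than on the entity, which is what lets a flat triple store represent "educated at X until 1974" without losing the association.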


Proceedings ArticleDOI
07 Apr 2014
TL;DR: This paper proposes an effective approach for fragmenting RDF data sets based on a query log and allocating the fragments to hosts in a cluster of machines and produces efficient query execution plans for ad-hoc SPARQL queries.
Abstract: The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications relying on efficient query processing. Confronted with such a trend, existing centralized state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for fast RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log and allocating the fragments to hosts in a cluster of machines. Furthermore, Partout's query optimizer produces efficient query execution plans for ad-hoc SPARQL queries.

162 citations
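The core idea of the Partout entry above, fragmenting RDF by a query workload and allocating fragments to hosts, can be sketched in a greatly simplified form. The version below fragments by predicate and greedily balances fragment sizes across hosts, favouring frequently queried predicates; it is a toy stand-in, not Partout's actual algorithm.

```python
from collections import Counter

def fragment_by_log(triples, query_log, num_hosts):
    """Assign predicate-based fragments to hosts, placing the most
    frequently queried predicates first and balancing load by fragment
    size (a greedy simplification of workload-aware allocation)."""
    # Count how often each predicate appears in the query log.
    freq = Counter(p for q in query_log for p in q)
    # One fragment per predicate.
    fragments = {}
    for s, p, o in triples:
        fragments.setdefault(p, []).append((s, p, o))
    # Greedily assign fragments (hottest first) to the least-loaded host.
    hosts = [[] for _ in range(num_hosts)]
    load = [0] * num_hosts
    for pred in sorted(fragments, key=lambda p: -freq[p]):
        h = load.index(min(load))
        hosts[h].append(pred)
        load[h] += len(fragments[pred])
    return hosts

triples = [("a", "type", "b"), ("a", "name", "x"),
           ("c", "type", "d"), ("c", "knows", "a")]
log = [["type", "name"], ["type"]]
print(fragment_by_log(triples, log, 2))
```

A real system would additionally co-locate fragments that are joined together in logged queries, which is the part that makes workload-aware partitioning pay off for SPARQL.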


Journal ArticleDOI
TL;DR: This work presents a novel semantic level interoperability architecture for pervasive computing and IoTs that conforms to the common IoT-A architecture reference model (ARM), and maps the central components of the architecture to the IoT-ARM.
Abstract: Pervasive computing and Internet of Things (IoT) paradigms have created a huge potential for new business. To fully realize this potential, there is a need for a common way to abstract the heterogeneity of devices so that their functionality can be represented as a virtual computing platform. To this end, we present a novel semantic-level interoperability architecture for pervasive computing and the IoT. There are two main principles in the proposed architecture. First, information and capabilities of devices are represented with Semantic Web knowledge representation technologies, and interaction with devices and the physical world is achieved by accessing and modifying their virtual representations. Second, the global IoT is divided into numerous local smart spaces, each managed by a semantic information broker (SIB) that provides a means to monitor and update the virtual representation of the physical world. An integral part of the architecture is a resolution infrastructure that provides a means to resolve the network address of a SIB, either using a physical object identifier as a pointer to information or by searching for SIBs matching a specification represented in SPARQL. We present several reference implementations and applications that we have developed to evaluate the architecture in practice. The evaluation also includes performance studies that, together with the applications, demonstrate the suitability of the architecture to real-life IoT scenarios. In addition, to validate that the proposed architecture conforms to the common IoT-A architecture reference model (ARM), we map the central components of the architecture to the IoT-ARM.

148 citations


Journal ArticleDOI
TL;DR: This paper defines five key research questions in this new application area, examined through a survey of state-of-the-art approaches to mining semantics from social media streams; user, network, and behaviour modelling; and intelligent, semantic-based information access.
Abstract: Using semantic technologies for mining and intelligent information access to social media is a challenging, emerging research area. Traditional search methods are no longer able to address the more complex information seeking behaviour in media streams, which has evolved towards sense making, learning, investigation, and social search. Unlike carefully authored news text and longer web content, social media streams pose a number of new challenges, due to their large-scale, short, noisy, context-dependent, and dynamic nature. This paper defines five key research questions in this new application area, examined through a survey of state-of-the-art approaches to mining semantics from social media streams; user, network, and behaviour modelling; and intelligent, semantic-based information access. The survey includes key methods not just from the Semantic Web research field, but also from the related areas of natural language processing and user modelling. In conclusion, key outstanding challenges are discussed and new directions for research are proposed.

144 citations


Journal ArticleDOI
TL;DR: The requirements coming from different application scenarios are identified, the problems they pose are isolated, and a research agenda is drawn to guide future research and development of stream reasoning.

144 citations


Journal ArticleDOI
TL;DR: This work analyses the Semantic Web of Things (SWoT), presenting its different levels to offer an IoT convergence, and analyses the trends for capillary networks and for cellular networks with standards such as IPSO, ZigBee, OMA, and the oneM2M initiative.
Abstract: The Internet of Things (IoT) is being applied in stovepipe solutions, since it presents a semantic description limited to a specific domain. The IoT needs to be pushed towards a more open, interoperable and collaborative IoT. The first step has been the Web of Things (WoT). The WoT evolves the IoT with a common stack based on web services. But even when homogeneous access is reached through web protocols, a common understanding is not yet acquired. For this purpose, the Semantic Web of Things (SWoT) is proposed for the integration of the Semantic Web with the WoT. This work analyses the SWoT, presenting its different levels to offer an IoT convergence. Specifically, we analyse the trends for capillary networks and for cellular networks with standards such as IPSO, ZigBee, OMA, and the oneM2M initiative. This work also analyses the impact of semantic annotations/metadata on the performance of the resources.

141 citations


15 Sep 2014
TL;DR: The QALD-4 open challenge on question answering over linked data (QALD) provides up-to-date, demanding benchmarks that establish a standard against which question answering systems over structured data can be evaluated and compared.
Abstract: With the increasing amount of semantic data available on the web there is a strong need for systems that allow common web users to access this body of knowledge. Especially question answering systems have received wide attention, as they allow users to express arbitrarily complex information needs in an easy and intuitive fashion (for an overview see [4]). The key challenge lies in translating the users' information needs into a form such that they can be evaluated using standard Semantic Web query processing and inferencing techniques. Over the past years, a range of approaches have been developed to address this challenge, showing significant advances towards answering natural language questions with respect to large, heterogeneous sets of structured data. However, only a few systems yet address the fact that the structured data available nowadays is distributed among a large collection of interconnected datasets, and that answers to questions can often only be provided if information from several sources is combined. In addition, a lot of information is still available only in textual form, both on the web and in the form of labels and abstracts in linked data sources. Therefore approaches are needed that can not only deal with the specific character of structured data but also with finding information in several sources, processing both structured and unstructured information, and combining such gathered information into one answer. The main objective of the open challenge on question answering over linked data (QALD) is to provide up-to-date, demanding benchmarks that establish a standard against which question answering systems over structured data can be evaluated and compared. QALD-4 is the fourth instalment of the QALD open challenge, comprising three tasks: multilingual question answering, biomedical question answering over interlinked data, and hybrid question answering.

135 citations


Journal ArticleDOI
TL;DR: Sentilo implements an approach based on the neo-Davidsonian assumption that events and situations are the primary entities for contextualizing opinions, which makes it able to distinguish holders, main topics, and sub-topics of an opinion.
Abstract: Sentilo is a model and a tool to detect holders and topics of opinion sentences. Sentilo implements an approach based on the neo-Davidsonian assumption that events and situations are the primary entities for contextualizing opinions, which makes it able to distinguish holders, main topics, and sub-topics of an opinion. It uses a heuristic graph mining approach that relies on FRED, a machine reader for the Semantic Web that leverages Natural Language Processing (NLP) and Knowledge Representation (KR) components jointly with cognitively-inspired frames. The evaluation results are excellent for holder detection (F1: 95%), very good for subtopic detection (F1: 78%), and good for topic detection (F1: 68%).

128 citations


Journal ArticleDOI
TL;DR: A novel method to efficiently provide better Web-page recommendation through semantic-enhancement by integrating the domain and Web usage knowledge of a website is proposed.
Abstract: Web-page recommendation plays an important role in intelligent Web systems. Useful knowledge discovery from Web usage data and satisfactory knowledge representation for effective Web-page recommendations are crucial and challenging. This paper proposes a novel method to efficiently provide better Web-page recommendation through semantic-enhancement by integrating the domain and Web usage knowledge of a website. Two new models are proposed to represent the domain knowledge. The first model uses an ontology to represent the domain knowledge. The second model uses one automatically generated semantic network to represent domain terms, Web-pages, and the relations between them. Another new model, the conceptual prediction model, is proposed to automatically generate a semantic network of the semantic Web usage knowledge, which is the integration of domain knowledge and Web usage knowledge. A number of effective queries have been developed to query about these knowledge bases. Based on these queries, a set of recommendation strategies have been proposed to generate Web-page candidates. The recommendation results have been compared with the results obtained from an advanced existing Web Usage Mining (WUM) method. The experimental results demonstrate that the proposed method produces significantly higher performance than the WUM method.

Journal ArticleDOI
TL;DR: This survey paper studies the joint work of historians and computer scientists in the use of Semantic Web methods and technologies in historical research, and describes open challenges and possible lines of research pushing further a still young, but promising, historical Semantic Web.
Abstract: During the 1990s, historians and computer scientists together created a research agenda around the life cycle of historical information. It comprised the tasks of creation, design, enrichment, editing, retrieval, analysis and presentation of historical information with the help of information technology. They also identified a number of problems and challenges in this field, some of them closely related to semantics and meaning. In this survey paper we study the joint work of historians and computer scientists in the use of Semantic Web methods and technologies in historical research. We analyse to what extent these contributions help in solving the open problems in the agenda of historians, and we describe open challenges and possible lines of research pushing further a still young, but promising, historical Semantic Web.

Book ChapterDOI
19 Oct 2014
TL;DR: This paper presents a series of publicly accessible Microdata, RDFa, Microformats datasets that have been extracted from three large web corpora dating from 2010, 2012 and 2013, which consist of almost 30 billion RDF quads.
Abstract: In order to support web applications to understand the content of HTML pages, an increasing number of websites have started to annotate structured data within their pages using markup formats such as Microdata, RDFa, and Microformats. The annotations are used by Google, Yahoo!, Yandex, Bing and Facebook to enrich search results and to display entity descriptions within their applications. In this paper, we present a series of publicly accessible Microdata, RDFa, and Microformats datasets that we have extracted from three large web corpora dating from 2010, 2012 and 2013. Altogether, the datasets consist of almost 30 billion RDF quads. The most recent of the datasets contains, amongst other data, over 211 million product descriptions, 54 million reviews and 125 million postal addresses originating from thousands of websites. The availability of the datasets lays the foundation for further research on integrating and cleansing the data as well as for exploring its utility within different application contexts. As the dataset series covers four years, it can also be used to analyze the evolution of the adoption of the markup formats.
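To make the Microdata annotations in the entry above concrete, the sketch below pulls `itemprop` values out of annotated HTML using only the standard library. It is deliberately naive (no nested `itemscope` handling, no RDFa or Microformats), whereas the extraction pipelines behind these corpora handle all three formats at web scale.

```python
from html.parser import HTMLParser

class MicrodataScraper(HTMLParser):
    """Collect (itemprop, text) pairs from Microdata-annotated HTML.
    A naive sketch: it ignores nested itemscopes and non-text values,
    which real extraction pipelines must handle."""
    def __init__(self):
        super().__init__()
        self._prop = None
        self.properties = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            self._prop = attrs["itemprop"]

    def handle_data(self, data):
        if self._prop and data.strip():
            self.properties.append((self._prop, data.strip()))
            self._prop = None

html_doc = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">ACME Anvil</span>
  <span itemprop="price">19.99</span>
</div>
"""
scraper = MicrodataScraper()
scraper.feed(html_doc)
print(scraper.properties)  # [('name', 'ACME Anvil'), ('price', '19.99')]
```

Each extracted pair would then be turned into an RDF quad (subject, predicate, object, plus the source page as graph name), which is how the corpora arrive at quads rather than triples.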

Proceedings ArticleDOI
01 Mar 2014
TL;DR: This paper proposes a concept-based approach that maps each column of a web table to the best concept, in a well-developed knowledge base, that represents it and develops a hybrid machine-crowdsourcing framework that leverages human intelligence to discern the concepts for “difficult” columns.
Abstract: The Web is teeming with rich structured information in the form of HTML tables, which provides us with the opportunity to build a knowledge repository by integrating these tables. An essential problem of web data integration is to discover semantic correspondences between web table columns, and schema matching is a popular means to determine the semantic correspondences. However, conventional schema matching techniques are not always effective for web table matching due to the incompleteness in web tables. In this paper, we propose a two-pronged approach for web table matching that effectively addresses the above difficulties. First, we propose a concept-based approach that maps each column of a web table to the best concept, in a well-developed knowledge base, that represents it. This approach overcomes the problem that sometimes values of two web table columns may be disjoint, even though the columns are related, due to incompleteness in the column values. Second, we develop a hybrid machine-crowdsourcing framework that leverages human intelligence to discern the concepts for “difficult” columns. Our overall framework assigns the most “beneficial” column-to-concept matching tasks to the crowd under a given budget and utilizes the crowdsourcing result to help our algorithm infer the best matches for the rest of the columns. We validate the effectiveness of our framework through an extensive experimental study over two real-world web table data sets. The results show that our two-pronged approach outperforms existing schema matching techniques at only a low cost for crowdsourcing.
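The column-to-concept mapping described in the entry above can be sketched with a simple value-overlap score: pick the concept whose known instances cover the largest share of a column's values. This is a toy stand-in; the paper's matcher is far richer and falls back to crowdsourcing for ambiguous columns. The knowledge-base contents below are invented.

```python
def best_concept(column_values, knowledge_base):
    """Map a web-table column to the concept whose known instances
    overlap most with the column's values (a toy stand-in for
    knowledge-base matching; the concept data here is invented)."""
    col = {v.lower() for v in column_values}
    scores = {c: len(col & {i.lower() for i in inst}) / len(col)
              for c, inst in knowledge_base.items()}
    concept = max(scores, key=scores.get)
    return concept, scores[concept]

kb = {
    "Country": ["germany", "france", "japan", "brazil"],
    "City": ["berlin", "paris", "tokyo"],
}
print(best_concept(["Germany", "France", "Atlantis"], kb))
```

Note how the unknown value "Atlantis" lowers the score but does not break the match: overlap-based scoring tolerates exactly the incompleteness the paper highlights, where two related columns need not share all values.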

Journal ArticleDOI
TL;DR: This paper presents a building automation system adopting the SOA paradigm with devices implemented using the Devices Profile for Web Services (DPWS), in which context information is collected, processed, and sent to a composition engine to coordinate appropriate devices/services based on the context, composition plan, and predefined policy rules.
Abstract: Service-oriented architecture (SOA) is realized by independent, standardized, and self-describing units known as services. This architecture has been widely used and verified for automatic, dynamic, and self-configuring distributed systems such as those in building automation. This paper presents a building automation system adopting the SOA paradigm, with devices implemented using the Devices Profile for Web Services (DPWS), in which context information is collected, processed, and sent to a composition engine to coordinate appropriate devices/services based on the context, composition plan, and predefined policy rules. A six-phased composition process is proposed to carry out the task. In addition, two other components are designed to support the composition process: a building ontology as a schema for representing semantic data, and a composition plan description language to describe context-based composite services in the form of composition plans. A prototype consisting of a DPWSim simulator and SamBAS is developed to illustrate and test the proposed idea. Comparison analysis and experimental results indicate the feasibility and scalability of the system.

Journal ArticleDOI
TL;DR: A semantically-enhanced platform that will assist in the process of discovering the cloud services that best match user needs and outperform state-of-the-art solutions in similarly broad domains is presented.
Abstract: Cloud computing is a technological paradigm that permits computing services to be offered over the Internet. This new service model is closely related to previous well-known distributed computing initiatives such as Web services and grid computing. In the current socio-economic climate, the affordability of cloud computing has made it one of the most popular recent innovations. This has led to the availability of more and more cloud services, as a consequence of which it is becoming increasingly difficult for service consumers to find and access those cloud services that fulfil their requirements. In this paper, we present a semantically-enhanced platform that will assist in the process of discovering the cloud services that best match user needs. This fully-fledged system encompasses two basic functions: the creation of a repository with the semantic description of cloud services and the search for services that accomplish the required expectations. The cloud service's semantic repository is generated by means of an automatic tool that first annotates the cloud service descriptions with semantic content and then creates a semantic vector for each service. The comprehensive evaluation of the tool in the ICT domain has led to very promising results that outperform state-of-the-art solutions in similarly broad domains.
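The semantic-vector matching in the entry above can be illustrated with cosine similarity between a user's request and each service's vector of weighted terms. The service names and weights below are invented toy data; the paper derives its vectors automatically from annotated service descriptions.

```python
import math

def cosine(u, v):
    """Cosine similarity over sparse term-weight vectors (dicts)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def rank_services(query_vec, service_vecs):
    """Rank services by similarity to the user's expressed need."""
    return sorted(service_vecs,
                  key=lambda s: cosine(query_vec, service_vecs[s]),
                  reverse=True)

# Invented semantic vectors for two hypothetical cloud services.
services = {
    "StorageCloud": {"storage": 0.9, "backup": 0.7},
    "ComputeCloud": {"cpu": 0.8, "vm": 0.9},
}
query = {"backup": 1.0, "storage": 0.5}
print(rank_services(query, services))  # StorageCloud ranked first
```

Representing each service as a term-weight vector is what turns "find services that fulfil my requirements" into a ranking problem rather than exact keyword matching.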

Journal ArticleDOI
TL;DR: The results of a structured literature survey of Semantic Web technologies in DSS are presented, together with the results of interviews with DSS practitioners, to provide an overview of current research as well as open research areas, trends and new directions.
Abstract: The Semantic Web shares many goals with Decision Support Systems (DSS), e.g., being able to precisely interpret information in order to deliver relevant, reliable and accurate information to a user when and where it is needed. DSS have, in addition, more specific goals, since the information need is targeted towards making a particular decision, e.g., making a plan or reacting to a certain situation. When surveying the DSS literature, we discover applications ranging from Business Intelligence, via general purpose social networking and collaboration support, Information Retrieval and Knowledge Management, to situation awareness, emergency management, and simulation systems. The unifying element is primarily the purpose of the systems, and their focus on information management and provision, rather than the specific technologies they employ to reach these goals. Semantic Web technologies have been used in DSS during the past decade to solve a number of different tasks, such as information integration and sharing, web service annotation and discovery, and knowledge representation and reasoning. In this survey article, we present the results of a structured literature survey of Semantic Web technologies in DSS, together with the results of interviews with DSS researchers and developers both in industry and in research organizations outside the university. The literature survey has been conducted using a structured method, where papers are selected from the publisher databases of some of the most prominent conferences and journals in both fields (Semantic Web and DSS), based on sets of relevant keywords representing the intersection of the two fields. Our main contribution is to analyze the landscape of semantic technologies in DSS, and to provide an overview of current research as well as open research areas, trends and new directions. An added value is the conclusions drawn from interviews with DSS practitioners, which give an additional perspective on the potential of Semantic Web technologies in this field, including scenarios for DSS and requirements for Semantic Web technologies that may attempt to support those scenarios.

Book ChapterDOI
19 Oct 2014
TL;DR: This work combines four different state-of-the-art approaches by using 15 different algorithms for ensemble learning and evaluates their performance on five different datasets. The results suggest that ensemble learning can reduce the error rate of state-of-the-art named entity recognition systems by 40%, leading to over 95% F-score in the best run.
Abstract: A considerable portion of the information on the Web is still only available in unstructured form. Implementing the vision of the Semantic Web thus requires transforming this unstructured data into structured data. One key step during this process is the recognition of named entities. Previous works suggest that ensemble learning can be used to improve the performance of named entity recognition tools. However, no comparison of the performance of existing supervised machine learning approaches on this task has been presented so far. We address this research gap by presenting a thorough evaluation of named entity recognition based on ensemble learning. To this end, we combine four different state-of-the-art approaches by using 15 different algorithms for ensemble learning and evaluate their performance on five different datasets. Our results suggest that ensemble learning can reduce the error rate of state-of-the-art named entity recognition systems by 40%, thereby leading to an F-score of over 95% in our best run.
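The simplest ensemble over NER tools is per-token majority voting, sketched below; the paper evaluates 15 far richer learned combiners, of which this is only the most basic baseline. The tool outputs are invented.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-token label sequences from several NER tools by
    majority vote (the simplest ensemble baseline; the paper trains
    much richer combiners over tool outputs)."""
    combined = []
    for labels in zip(*predictions):
        label, _ = Counter(labels).most_common(1)[0]
        combined.append(label)
    return combined

# Three hypothetical tools labelling the tokens "Berlin is in Germany".
tool_a = ["LOC", "O", "O", "LOC"]
tool_b = ["ORG", "O", "O", "LOC"]
tool_c = ["LOC", "O", "O", "O"]
print(majority_vote([tool_a, tool_b, tool_c]))  # ['LOC', 'O', 'O', 'LOC']
```

Even this crude vote already corrects the single-tool errors in the example, which is the intuition behind the error-rate reductions the paper reports from learned ensembles.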

Book ChapterDOI
19 Oct 2014
TL;DR: The Optique platform is introduced as a suitable OBDA solution for Siemens and the preliminary installation and evaluation of the platform in Siemens is described.
Abstract: We present a description and analysis of the data access challenge at Siemens Energy. We advocate Ontology Based Data Access (OBDA) as a suitable Semantic Web driven technology to address the challenge. We derive requirements for applying OBDA in Siemens, review existing OBDA systems and discuss their limitations with respect to the Siemens requirements. We then introduce the Optique platform as a suitable OBDA solution for Siemens. Finally, we describe our preliminary installation and evaluation of the platform in Siemens.

Proceedings ArticleDOI
06 Mar 2014
TL;DR: This work proposes a semantic-based approach to automatically combine, enrich and reason about M2M data to provide promising cross-domain M 2M applications.
Abstract: The Internet of Things, and more specifically the Machine-to-Machine (M2M) standard, enables machines and devices such as sensors to communicate with each other without human intervention. M2M devices provide a great deal of M2M data, mainly used for specific M2M applications such as weather forecasting, healthcare or building automation. Existing applications are domain-specific and use their own descriptions of devices and measurements. A major challenge is to combine M2M data provided by these heterogeneous domains and by different projects. Understanding the meaning of M2M data, and later reasoning about it, is a difficult task. We propose a semantic-based approach to automatically combine, enrich and reason about M2M data to provide promising cross-domain M2M applications. A proof-of-concept to validate our approach is published online (http://sensormeasurement.appspot.com/).

01 May 2014
TL;DR: It is shown that R2RML, the W3C recommendation for describing RDB to RDF mappings, may not apply to all needs in the wide scope of RDB-to-RDF translation applications, leaving space for future extensions.
Abstract: Relational databases scattered over the web are generally opaque to regular web crawling tools. To address this concern, many RDB-to-RDF approaches have been proposed over the last years. In this paper, we propose a detailed review of seventeen RDB-to-RDF initiatives, considering end-to-end projects that delivered operational tools. The different tools are classified along three major axes: mapping description language, mapping implementation and data retrieval method. We analyse the motivations, commonalities and differences between existing approaches. The expressiveness of existing mapping languages is not always sufficient to produce semantically rich data and make it usable, interoperable and linkable. We therefore briefly present various strategies investigated in the literature to produce additional knowledge. Finally, we show that R2RML, the W3C recommendation for describing RDB to RDF mappings, may not apply to all needs in the wide scope of RDB to RDF translation applications, leaving space for future extensions.
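The RDB-to-RDF translation surveyed above can be illustrated with a minimal direct-mapping-style sketch: the primary key becomes the subject IRI and each remaining column becomes a predicate. R2RML expresses such mappings declaratively in Turtle; the imperative function, base IRI and table data below are only illustrative assumptions.

```python
def row_to_triples(table, pk_col, row, base="http://example.org/"):
    """Translate one relational row into RDF-style triples in the
    spirit of a direct mapping: the primary key value becomes the
    subject IRI and every other column becomes a predicate.
    (Simplified sketch; R2RML describes such mappings declaratively
    and supports custom IRI templates, datatypes and joins.)"""
    subject = f"<{base}{table}/{row[pk_col]}>"
    return [(subject, f"<{base}{table}#{col}>", f'"{val}"')
            for col, val in row.items() if col != pk_col]

row = {"id": 7, "name": "Alice", "city": "Paris"}
for t in row_to_triples("person", "id", row):
    print(t)
```

The survey's point about expressiveness is visible even here: a fixed scheme like this cannot reuse existing vocabularies or link values to other datasets, which is exactly what richer mapping languages exist to express.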

Proceedings Article
01 May 2014
TL;DR: This article presents three novel, manually curated and annotated corpora (N3) based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of the authors' datasets.
Abstract: Extracting Linked Data following the Semantic Web principle from unstructured sources has become a key challenge for scientific research. Named Entity Recognition and Disambiguation are two basic operations in this extraction process. One step towards the realization of the Semantic Web vision and the development of highly accurate tools is the availability of data for validating the quality of processes for Named Entity Recognition and Disambiguation as well as for algorithm tuning. This article presents three novel, manually curated and annotated corpora (N3). All of them are based on a free license and stored in the NLP Interchange Format to leverage the Linked Data character of our datasets.

Journal ArticleDOI
TL;DR: This work addresses the lack of a formal representation of the relevant knowledge domain for neurodegenerative diseases such as Alzheimer's disease.
Abstract: Background Biomedical ontologies offer the capability to structure and represent domain-specific knowledge semantically. Disease-specific ontologies can facilitate knowledge exchange across multiple disciplines, and ontology-driven mining approaches can generate great value for modeling disease mechanisms. However, in the case of neurodegenerative diseases such as Alzheimer's disease, there is a lack of formal representation of the relevant knowledge domain. Methods The Alzheimer's disease ontology (ADO) is constructed in accordance with the ontology building life cycle. The Protégé OWL editor was used as a tool for building ADO in Web Ontology Language (OWL) format. Results ADO was developed with the purpose of containing information relevant to four main biological views (preclinical, clinical, etiological, and molecular/cellular mechanisms) and was enriched by adding synonyms and references. Validation of the lexicalized ontology by means of named entity recognition-based methods showed satisfactory performance (F-score = 72%). In addition to structural and functional evaluation, a clinical expert in the field performed a manual evaluation and curation of ADO. Through integration of ADO into an information retrieval environment, we show that the ontology supports semantic search in scientific text. The usefulness of ADO is authenticated by dedicated use case scenarios. Conclusions The development of ADO as an open resource is a first attempt to organize information related to Alzheimer's disease in a formalized, structured manner. We demonstrate that ADO is able to capture both established and scattered knowledge existing in scientific text.

Proceedings Article
01 May 2014
TL;DR: The publication of BabelNet 2.0, a wide-coverage multilingual encyclopedic dictionary and ontology, is presented as Linked Data, an interlinked multilingual (lexical) resource which can not only be accessed on the LOD, but also be used to enrich existing datasets with linguistic information, or to support the process of mapping datasets across languages.
Abstract: Recent years have witnessed a surge in the amount of semantic information published on the Web. Indeed, the Web of Data, a subset of the Semantic Web, has been increasing steadily in both volume and variety, transforming the Web into a ‘global database’ in which resources are linked across sites. Linguistic fields -- in a broad sense -- have not been left behind, and we observe a similar trend with the growth of linguistic data collections on the so-called ‘Linguistic Linked Open Data (LLOD) cloud’. While both Semantic Web and Natural Language Processing communities can obviously take advantage of this growing and distributed linguistic knowledge base, they are today faced with a new challenge, i.e., that of facilitating multilingual access to the Web of data. In this paper we present the publication of BabelNet 2.0, a wide-coverage multilingual encyclopedic dictionary and ontology, as Linked Data. The conversion made use of lemon, a lexicon model for ontologies particularly well-suited for this enterprise. The result is an interlinked multilingual (lexical) resource which can not only be accessed on the LOD, but also be used to enrich existing datasets with linguistic information, or to support the process of mapping datasets across languages.
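The lemon model mentioned in the abstract separates a lexical entry (a word and its forms) from the ontology concept it denotes, linking the two through a sense. A sketch of such an entry as plain triples; the URIs are illustrative, not BabelNet's actual identifiers:

```python
# Sketch of a lemon-style lexical entry linking a lemma to an ontology
# concept via a sense. URIs and the lexicon layout are illustrative
# assumptions, not BabelNet's published identifiers.

LEMON = "http://lemon-model.net/lemon#"

def lexical_entry(entry_uri, written_rep, lang, concept_uri):
    """Return triples for one lexical entry pointing at a concept."""
    form_uri = entry_uri + "/canonicalForm"
    sense_uri = entry_uri + "/sense"
    return [
        (entry_uri, LEMON + "canonicalForm", form_uri),
        (form_uri, LEMON + "writtenRep", f'"{written_rep}"@{lang}'),
        (entry_uri, LEMON + "sense", sense_uri),
        (sense_uri, LEMON + "reference", concept_uri),
    ]

triples = lexical_entry("http://example.org/lexicon/web",
                        "Web", "en",
                        "http://example.org/ontology/WorldWideWeb")
for s, p, o in triples:
    print(s, p, o)
```

Because the written representation carries a language tag while the concept URI is language-neutral, entries in many languages can share one concept, which is what makes a lemon-based resource suitable for mapping datasets across languages.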

01 Jan 2014
TL;DR: This paper shows how sensor data and observations are annotated using an ontology network based on the SSN ontology, providing a standardized queryable representation that makes it easier to share, discover, integrate and interpret the data.
Abstract: We present XGSN, an open-source system that relies on semantic representations of sensor metadata and observations, to guide the process of annotating and publishing sensor data on the Web. XGSN is able to handle the data acquisition process of a wide number of devices and protocols, and is designed as a highly extensible platform, leveraging the existing capabilities of the Global Sensor Networks (GSN) middleware. Going beyond traditional sensor management systems, XGSN is capable of enriching virtual sensor descriptions with semantically annotated content using standard vocabularies. In the proposed approach, sensor data and observations are annotated using an ontology network based on the SSN ontology, providing a standardized queryable representation that makes it easier to share, discover, integrate and interpret the data. XGSN manages the annotation process for the incoming sensor observations, producing RDF streams that are sent to the cloud-enabled Linked Sensor Middleware, which can internally store the data or perform continuous query processing. The distributed nature of XGSN allows deploying different remote instances that can interchange observation data, so that virtual sensors can be aggregated and consume data from other remote virtual sensors. In this paper we show how this approach has been implemented in XGSN, and incorporated into the wider OpenIoT platform, providing a highly flexible and scalable system for managing the life-cycle of sensor data, from acquisition to publishing, in the context of the semantic Web of Things.
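The core annotation step the abstract describes (turning a raw sensor reading into SSN-based RDF for a stream) can be sketched as follows. Property names are simplified for illustration, and the sensor and property URIs are assumptions, not XGSN's actual configuration:

```python
# Sketch of annotating one incoming sensor reading with SSN-style triples
# before streaming it out, in the spirit of XGSN. The URIs are invented
# and the SSN property names are simplified for illustration.

from datetime import datetime, timezone

SSN = "http://purl.oclc.org/NET/ssnx/ssn#"

def annotate_observation(sensor_uri, property_uri, value, unit):
    """Wrap a raw reading as triples describing one observation."""
    stamp = int(datetime.now(timezone.utc).timestamp())
    obs_uri = f"{sensor_uri}/obs/{stamp}"
    return [
        (obs_uri, "rdf:type", SSN + "Observation"),
        (obs_uri, SSN + "observedBy", sensor_uri),
        (obs_uri, SSN + "observedProperty", property_uri),
        (obs_uri, SSN + "observationResult", f'"{value} {unit}"'),
    ]

stream = annotate_observation("http://example.org/sensor/t1",
                              "http://example.org/property/AirTemperature",
                              21.5, "Cel")
for triple in stream:
    print(triple)
```

Once readings are expressed as observations with explicit sensor and property URIs, a downstream middleware can run continuous queries over the stream without knowing anything about the device or protocol that produced the values.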

Proceedings ArticleDOI
01 Oct 2014
TL;DR: The nature of big data and the role of semantic web and data analysis for generating “smart data” which offer actionable information that supports better decision for personalized medicine are discussed.
Abstract: In healthcare, big data tools and technologies have the potential to create significant value by improving outcomes while lowering costs for each individual patient. Diagnostic images, genetic test results and biometric information are increasingly generated and stored in electronic health records, presenting us with challenges in data that is by nature high in volume, variety and velocity, thereby necessitating novel ways to store, manage and process big data. This presents an urgent need to develop new, scalable and expandable big data infrastructure and analytical methods that can enable healthcare providers to access knowledge for the individual patient, yielding better decisions and outcomes. In this paper, we briefly discuss the nature of big data and the role of semantic web and data analysis for generating “smart data” which offer actionable information that supports better decisions for personalized medicine. In our view, the biggest challenge is to create a system that makes big data robust and smart for healthcare providers and patients, leading to more effective clinical decision-making, improved health outcomes and, ultimately, better management of healthcare costs. We highlight some of the challenges in using big data and propose the need for a semantic data-driven environment to address them. We illustrate our vision with practical use cases, and discuss a path for empowering personalized medicine using big data and semantic web technology.

Journal ArticleDOI
TL;DR: The goal of this article is to survey several of the most outstanding methodologies, methods and techniques that have emerged in recent years, and to present the most popular development environments, which can be utilized to carry out, or facilitate, specific activities within the methodologies.
Abstract: Building ontologies in a collaborative and increasingly community-driven fashion has become a central paradigm of modern ontology engineering. This understanding of ontologies and ontology engineering processes is the result of intensive theoretical and empirical research within the Semantic Web community, supported by technology developments such as Web 2.0. Over 6 years after the publication of the first methodology for collaborative ontology engineering, it is generally acknowledged that, in order to be useful, but also economically feasible, ontologies should be developed and maintained in a community-driven manner, with the help of fully-fledged environments providing dedicated support for collaboration and user participation. Wikis and similar communication and collaboration platforms, which enable ontology stakeholders to exchange ideas and discuss modeling decisions, are probably the most important technological components of such environments. In addition, process-driven methodologies assist the ontology engineering team throughout the ontology life cycle, and provide empirically grounded best practices and guidelines for optimizing ontology development results in real-world projects. The goal of this article is to analyze the state of the art in the field of collaborative ontology engineering. We survey several of the most outstanding methodologies, methods and techniques that have emerged in recent years, and present the most popular development environments, which can be utilized to carry out, or facilitate, specific activities within the methodologies. A discussion of the open issues identified concludes the survey and provides a roadmap for future research and development in this lively and promising field.

Journal ArticleDOI
TL;DR: STAR-CITY demonstrates how the severity of road traffic congestion can be smoothly analyzed, diagnosed, explored and predicted using semantic web technologies.


Book ChapterDOI
24 Nov 2014
TL;DR: The new version VOWL 2 is presented, and it is described how the initial definitions were used to systematically redefine the visual notation; a qualitative user study confirmed that not only the general ideas of VOWL but also most of the enhancements for VOWL 2 can be well understood by casual ontology users.
Abstract: Ontologies become increasingly important as a means to structure and organize information. This requires methods and tools that enable not only ontology experts but also other user groups to work with ontologies and related data. We have developed VOWL, a comprehensive and well-specified visual language for the user-oriented representation of ontologies, and conducted a comparative study on an initial version of VOWL. Based upon results from that study, as well as an extensive review of other ontology visualizations, we have reworked many parts of VOWL. In this paper, we present the new version VOWL 2 and describe how the initial definitions were used to systematically redefine the visual notation. Besides the novelties of the visual language, which is based on a well-defined set of graphical primitives and an abstract color scheme, we briefly describe two implementations of VOWL 2. To gather some insight into the user experience with the new version of VOWL, we have conducted a qualitative user study. We report on the study and its results, which confirmed that not only the general ideas of VOWL but also most of our enhancements for VOWL 2 can be well understood by casual ontology users.