
Showing papers on "XML" published in 2017


Journal ArticleDOI
TL;DR: The Spoken British National Corpus 2014 is introduced, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016.
Abstract: This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.

159 citations


Journal ArticleDOI
01 Dec 2017
TL;DR: The goal of GasLib is to provide a set of publicly available gas network instances that can be used by researchers in the field of gas transport to save time and compare different models and algorithms on the same specified test sets.
Abstract: The development of mathematical simulation and optimization models and algorithms for solving gas transport problems is an active field of research. In order to test and compare these models and algorithms, gas network instances together with demand data are needed. The goal of GasLib is to provide a set of publicly available gas network instances that can be used by researchers in the field of gas transport. The advantages are that researchers save time by using these instances and that different models and algorithms can be compared on the same specified test sets. The library instances are encoded in an XML (extensible markup language) format. In this paper, we explain this format and present the instances that are available in the library.
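
As a hedged illustration of working with such instances, the sketch below parses a small gas network encoded in XML using Python's standard library. The element and attribute names (network, node, pipe, pressureMin, etc.) are invented for illustration and are not the actual GasLib schema.

```python
# Sketch: reading a hypothetical gas network instance encoded in XML.
# Element/attribute names are illustrative assumptions, not the GasLib schema.
import xml.etree.ElementTree as ET

SAMPLE = """<network>
  <node id="N1" pressureMin="40.0" pressureMax="70.0"/>
  <node id="N2" pressureMin="40.0" pressureMax="70.0"/>
  <pipe id="P1" from="N1" to="N2" length="12.5" diameter="0.9"/>
</network>"""

def load_network(xml_text):
    root = ET.fromstring(xml_text)
    nodes = {n.get("id"): {k: float(v) for k, v in n.attrib.items() if k != "id"}
             for n in root.findall("node")}
    pipes = [dict(p.attrib) for p in root.findall("pipe")]
    return nodes, pipes

nodes, pipes = load_network(SAMPLE)
print(len(nodes), "nodes,", len(pipes), "pipes")
```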

113 citations


Proceedings ArticleDOI
01 Sep 2017
TL;DR: PURE (PUblic REquirements dataset), a dataset of 79 publicly available natural language requirements documents collected from the Web, is presented, and its language is compared with generic English texts, showing the peculiarities of the requirements jargon.
Abstract: This paper presents PURE (PUblic REquirements dataset), a dataset of 79 publicly available natural language requirements documents collected from the Web. The dataset includes 34,268 sentences and can be used for natural language processing tasks that are typical in requirements engineering, such as model synthesis, abstraction identification and document structure assessment. It can be further annotated to work as a benchmark for other tasks, such as ambiguity detection, requirements categorisation and identification of equivalent requirements. In the paper, we present the dataset and we compare its language with generic English texts, showing the peculiarities of the requirements jargon, made of a restricted vocabulary of domain-specific acronyms and words, and long sentences. We also present the common XML format to which we have manually ported a subset of the documents, with the goal of facilitating replication of NLP experiments.

83 citations


Journal ArticleDOI
TL;DR: The analysis of results indicates that ifcJSON4 schema developed in this paper is a valid JSON schema that can guide the creation of valid ifc JSON documents to be used for web-based data transfer and to improve interoperability of Cloud-based BIM applications.
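
As a minimal, hypothetical sketch of the underlying idea rather than the actual ifcJSON4 schema, a JSON document describing an IFC-like entity can be checked against a JSON Schema in Python, assuming the third-party jsonschema package is installed.

```python
# Sketch: validating a JSON document against a toy schema, analogous to
# checking ifcJSON documents against an ifcJSON schema. The schema and the
# sample entity are illustrative assumptions, not ifcJSON4 itself.
from jsonschema import validate  # pip install jsonschema

toy_schema = {
    "type": "object",
    "properties": {
        "type": {"type": "string"},
        "globalId": {"type": "string"},
        "name": {"type": "string"},
    },
    "required": ["type", "globalId"],
}

entity = {"type": "IfcWall", "globalId": "2O2Fr$t4X7Zf8NOew3FLOH", "name": "Wall-001"}
validate(instance=entity, schema=toy_schema)  # raises ValidationError if invalid
print("entity conforms to the toy schema")
```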

77 citations


Proceedings ArticleDOI
01 Nov 2017
TL;DR: Transkribus is a comprehensive platform for the computer-aided transcription, recognition and retrieval of digitized historical documents through an open-source desktop application that incorporates means to segment document images, to add a transcription and to tag entities within.
Abstract: Transkribus is a comprehensive platform for the computer-aided transcription, recognition and retrieval of digitized historical documents. The main user interface is provided via an open-source desktop application that incorporates means to segment document images, to add a transcription and to tag entities within. The desktop application is able to connect to the platform's backend, which implements a document management system as well as several tools for document image analysis, such as layout analysis or automatic/handwritten text recognition (ATR/HTR). Access to documents uploaded to the platform may be granted to other users in order to collaborate on the transcription and to share results.

76 citations


Posted Content
TL;DR: In this article, a deep embedding method that combines non-linear embedding with graph priors-based label space modeling is proposed for extreme multi-label (XML) classification, where the label space can be as large as millions.
Abstract: Extreme multi-label learning (XML) or classification has been a practical and important problem since the boom of big data. The main challenge lies in the exponential label space which involves $2^L$ possible label sets, especially when the label dimension $L$ is huge, e.g., in millions for Wikipedia labels. This paper is motivated to better explore the label space by originally establishing an explicit label graph. Meanwhile, deep learning has been widely studied and used in various classification problems, including multi-label classification; however, it has not been properly introduced to XML, where the label space can be as large as millions. In this paper, we propose a practical deep embedding method for extreme multi-label classification, which harvests the ideas of non-linear embedding and graph priors-based label space modeling simultaneously. Extensive experiments on public datasets for XML show that our method performs competitively against state-of-the-art results.

42 citations


Journal ArticleDOI
TL;DR: The proposed Internet of things–based integrated information system is demonstrated to improve the effectiveness of monitoring processes and decision making in construction informatics applications and highlights the crucial importance of a systematic approach toward integrated information systems for effective information collection and structural health monitoring.
Abstract: The intelligent security monitoring of buildings and their surroundings has become increasingly crucial as the number of high-rise buildings increases. Building structural health monitoring and early warning technology are key components of building safety, the implementation of which remains challenging, and the Internet of things approach provides a new technical measure for addressing this challenge. This article presents a novel integrated information system that combines Internet of things, building information management, early warning system, and cloud services. Specifically, the system involves an intelligent data box with enhanced connectivity and exchangeability for accessing and integrating the data obtained from distributed heterogeneous sensing devices. An extensible markup language (XML)–based uniform data parsing model is proposed to abstract the various message formats of heterogeneous devices to ensure data integration. The proposed Internet of things–based integrated information system is demonstrated to improve the effectiveness of monitoring processes and decision making in construction informatics applications.
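
The paper's parsing model itself is not reproduced here; as a hedged sketch of the general idea, heterogeneous device messages could be abstracted into a single uniform XML record along these lines (tag names and message layouts are assumptions).

```python
# Sketch: abstracting heterogeneous sensor messages (JSON and CSV here) into a
# uniform XML record. Tag names and message layouts are illustrative only.
import json
import xml.etree.ElementTree as ET

def to_uniform_xml(device_id, kind, payload):
    rec = ET.Element("reading", {"device": device_id})
    if kind == "json":
        data = json.loads(payload)
    elif kind == "csv":                      # e.g. "timestamp,metric,value"
        ts, metric, value = payload.split(",")
        data = {"timestamp": ts, metric: value}
    else:
        raise ValueError("unknown message format: " + kind)
    for key, value in data.items():
        ET.SubElement(rec, "field", {"name": key}).text = str(value)
    return ET.tostring(rec, encoding="unicode")

print(to_uniform_xml("accel-01", "json", '{"timestamp": 1700000000, "x": 0.02}'))
print(to_uniform_xml("strain-07", "csv", "1700000000,strain,0.0013"))
```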

32 citations


Proceedings ArticleDOI
30 Oct 2017
TL;DR: The GTR algorithm is presented, an effective and efficient technique to reduce arbitrary test inputs that can be represented as a tree, such as program code, PDF files, and XML documents, and automatically specializes the tree transformations applied by the algorithm based on examples of input trees.
Abstract: Reducing the test input given to a program while preserving some property of interest is important, e.g., to localize faults or to reduce test suites. The well-known delta debugging algorithm and its derivatives automate this task by repeatedly reducing a given input. Unfortunately, these approaches are limited to blindly removing parts of the input and cannot reduce the input by restructuring it. This paper presents the Generalized Tree Reduction (GTR) algorithm, an effective and efficient technique to reduce arbitrary test inputs that can be represented as a tree, such as program code, PDF files, and XML documents. The algorithm combines tree transformations with delta debugging and a greedy backtracking algorithm. To reduce the size of the considered search space, the approach automatically specializes the tree transformations applied by the algorithm based on examples of input trees. We evaluate GTR by reducing Python files that cause interpreter crashes, JavaScript files that cause browser inconsistencies, PDF documents with malicious content, and XML files used to test an XML validator. The GTR algorithm reduces the trees of these files to 45.3%, 3.6%, 44.2%, and 1.3% of the original size, respectively, outperforming both delta debugging and another state-of-the-art algorithm.
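
GTR itself combines tree transformations, delta debugging and greedy backtracking; the toy sketch below illustrates only the simpler underlying idea of greedily removing subtrees while a property of interest still holds, and is not the published algorithm.

```python
# Sketch: greedily drop subtrees while an "interesting" predicate still holds.
# A toy illustration of tree-based input reduction, not the GTR algorithm.

def reduce_tree(tree, interesting):
    """tree is a nested list, e.g. ['root', ['a', ['b']], ['c']]."""
    changed = True
    while changed:
        changed = False
        # Try to drop each child subtree (index 0 holds the node label).
        for i in range(len(tree) - 1, 0, -1):
            candidate = tree[:i] + tree[i + 1:]
            if interesting(candidate):
                tree = candidate
                changed = True
        # Recurse into the remaining children, keeping the context fixed.
        for i in range(1, len(tree)):
            tree[i] = reduce_tree(
                tree[i],
                lambda sub, i=i: interesting(tree[:i] + [sub] + tree[i + 1:]))
    return tree

# Property of interest: the tree still contains a node labelled "bug".
crashes = lambda t: "bug" in repr(t)
original = ["root", ["f", ["x"], ["bug"]], ["g", ["y"]], ["h"]]
print(reduce_tree(original, crashes))   # -> ['root', ['f', ['bug']]]
```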

31 citations


Posted Content
TL;DR: This paper presents a novel programming-by-example approach, and its implementation in a tool called Mitra, for automatically migrating tree-structured documents to relational tables, and shows that Mitra can automate the desired transformation for all datasets.
Abstract: While many applications export data in hierarchical formats like XML and JSON, it is often necessary to convert such hierarchical documents to a relational representation. This paper presents a novel programming-by-example approach, and its implementation in a tool called Mitra, for automatically migrating tree-structured documents to relational tables. We have evaluated the proposed technique using two sets of experiments. In the first experiment, we used Mitra to automate 98 data transformation tasks collected from StackOverflow. Our method can generate the desired program for 94% of these benchmarks with an average synthesis time of 3.8 seconds. In the second experiment, we used Mitra to generate programs that can convert real-world XML and JSON datasets to full-fledged relational databases. Our evaluation shows that Mitra can automate the desired transformation for all datasets.
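
Mitra synthesizes such transformations from input-output examples; the sketch below shows only the kind of tree-to-table flattening a synthesized program might perform (the document layout and field names are assumptions).

```python
# Sketch: flattening a hierarchical (JSON) document into relational rows.
# This illustrates the target of a tree-to-table transformation; it is not
# the Mitra synthesis algorithm, and the field names are assumptions.
import json

doc = json.loads("""
{"orders": [
  {"id": 1, "customer": "Ada",  "items": [{"sku": "A1", "qty": 2},
                                           {"sku": "B4", "qty": 1}]},
  {"id": 2, "customer": "Alan", "items": [{"sku": "A1", "qty": 5}]}
]}
""")

rows = [(order["id"], order["customer"], item["sku"], item["qty"])
        for order in doc["orders"]
        for item in order["items"]]

for row in rows:
    print(row)   # (order_id, customer, sku, qty) tuples ready for a SQL INSERT
```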

28 citations


Journal ArticleDOI
TL;DR: This work evaluates the proposed framework that, taking as input a small set of parallel documents, gathers domain-specific bilingual terms and injects them into an SMT system to enhance translation quality, and compares two terminology injection methods that can be easily used at run-time without altering the normal activity of an SMT system.
Abstract: This work focuses on the extraction and integration of automatically aligned bilingual terminology into a Statistical Machine Translation (SMT) system in a Computer Aided Translation scenario. We evaluate the proposed framework that, taking as input a small set of parallel documents, gathers domain-specific bilingual terms and injects them into an SMT system to enhance translation quality. Therefore, we investigate several strategies to extract and align terminology across languages and to integrate it in an SMT system. We compare two terminology injection methods that can be easily used at run-time without altering the normal activity of an SMT system: XML markup and cache-based model. We test the cache-based model on two different domains (information technology and medical) in English, Italian and German, showing significant improvements ranging from 2.23 to 6.78 BLEU points over a baseline SMT system and from 0.05 to 3.03 compared to the widely-used XML markup approach.

20 citations


Proceedings ArticleDOI
01 Sep 2017
TL;DR: In the experiments, hybrid reinsertion has proven the most accurate method to handle markup, while alignment masking and alignment reinsertion should be regarded as viable alternatives.
Abstract: We present work on handling XML markup in Statistical Machine Translation (SMT). The methods we propose can be used to effectively preserve markup (for instance inline formatting or structure) and to place markup correctly in a machine-translated segment. We evaluate our approaches with parallel data that naturally contains markup or where markup was inserted to create synthetic examples. In our experiments, hybrid reinsertion has proven the most accurate method to handle markup, while alignment masking and alignment reinsertion should be regarded as viable alternatives. We provide implementations of all the methods described and they are freely available as an open-source framework.

Proceedings ArticleDOI
01 Jan 2017
TL;DR: This work is particularly interested in establishing what methods and tools exist to create OWL ontologies from implicitly expressed semantics, and focuses on popular data formats i.e. XML, JSON, RDF, Relational Databases and NoSQL Databases.
Abstract: From the general SOA architectural pattern, through distributed computing based on Grids and Clouds, to the Internet of Things, the idea of collaboration between software entities, independent from their vendors and technologies, attracts much attention. This brings about a question: how to achieve interoperability among multiple (existing and upcoming) platforms/systems/applications. The context for the presented research is provided by the INTER-IoT project, which deals with different aspects of interoperability in the Internet of Things (IoT). It aims at the design and implementation of an open framework and associated methodology to provide interoperability among heterogeneous IoT platforms, across a software stack (devices, network, middleware, application services, data and semantics). We focus on the data and semantics layer, specifically on the role of ontologies and semantic data processing as means of achieving interoperability. However, since the vision of the Semantic Web remains mostly unfulfilled, semantics remains implicitly "hidden" in data and in exchanged messages. Therefore, we are particularly interested in establishing what methods and tools exist to create OWL ontologies from implicitly expressed semantics. We focus on popular data formats, i.e. XML, JSON, RDF, Relational Databases and NoSQL Databases.
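
As one hedged illustration of lifting implicitly expressed semantics into an ontology-friendly form (not a method prescribed by the paper), XML elements can be mapped to RDF triples with the third-party rdflib package; the namespace, element names and mapping rule below are assumptions.

```python
# Sketch: lifting a small XML fragment into RDF triples with rdflib.
# The namespace, element names and the mapping rule are illustrative
# assumptions, not a method prescribed by the paper.
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, Namespace, RDF  # pip install rdflib

EX = Namespace("http://example.org/iot#")
xml_fragment = '<sensor id="s1"><type>temperature</type><unit>C</unit></sensor>'

elem = ET.fromstring(xml_fragment)
g = Graph()
subject = EX[elem.get("id")]
g.add((subject, RDF.type, EX.Sensor))
for child in elem:                      # each child element becomes a property
    g.add((subject, EX[child.tag], Literal(child.text)))

print(g.serialize(format="turtle"))
```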

Journal ArticleDOI
TL;DR: An up-to-date review of current XML-based industry-neutral and domain-specific standardization initiatives aiming at achieving seamless interoperability among communication systems is presented, discussing their commonalities and differences, and highlighting directions for further research and development work.

Proceedings ArticleDOI
01 Aug 2017
TL;DR: Multilayer graphs, namely graphs whose labeled edges belong to a number of predetermined classes, have recently been introduced in social network analysis in order to represent the different interaction options between netizens; the potential of applying this new type of graph to an ontological context, essentially creating an ontological tensor, is outlined and its complexity is assessed.
Abstract: Ontology has been an active research field connecting philosophy, logic, history, mathematics, and computer science, to name a few. Within an ontological context defined over a domain, the entities as well as their associated relationships can be represented by the vertices and the edges of a tree. From the latter, new knowledge can then be inferred through a number of techniques including Horn logic from reasoners and RDF triplets. With the advent of the Semantic Web and sophisticated associated software tools, including graph databases such as Neo4j, Sparksee, and TitanDB or XML parsers such as Xerces, graph mining is done efficiently on the semantic level instead of the combinatorial or algebraic ones. Multilayer graphs, namely graphs whose labeled edges belong to a number of predetermined classes, have recently been introduced in social network analysis in order to represent the different interaction options between netizens. In this work, the potential of applying this new type of graph to an ontological context, essentially creating an ontological tensor, is outlined and its complexity is assessed. A human-readable dataset about Apple in the late 1970s and early 1980s, manually constructed from the 2011 officially authorized biography of Steve Jobs and the 1999 film Pirates of Silicon Valley, serves as a concrete example, complete with Neo4j queries.

Journal ArticleDOI
01 Jun 2017
TL;DR: An extensive evaluation of the proposed citation system is conducted, showing that it represents a suitable solution that can be easily employed in real‐world environments and that reduces human intervention on data to a minimum.
Abstract: The practice of citation is foundational for the propagation of knowledge along with scientific development and it is one of the core aspects on which scholarship and scientific publishing rely. Within the broad context of data citation, we focus on the automatic construction of citations problem for hierarchically structured data. We present the "learning to cite" framework, which enables the automatic construction of human- and machine-readable citations with different levels of coarseness. The main goal is to reduce the human intervention on data to a minimum and to provide a citation system general enough to work on heterogeneous and complex XML data sets. We describe how this framework can be realized by a system for creating citations to single nodes within an XML data set and, as a use case, show how it can be applied in the context of digital archives. We conduct an extensive evaluation of the proposed citation system by analyzing its effectiveness from the correctness and completeness viewpoints, showing that it represents a suitable solution that can be easily employed in real-world environments and that reduces human intervention on data to a minimum.

Book ChapterDOI
15 Mar 2017
TL;DR: The basic syntax of the DTD is described and it is compared to its two main rivals: W3C XML Schema and RELAX NG.
Abstract: Document Type Definitions (DTDs) are schemas that describe the structure and, to a limited extent, the content of Extensible Markup Language (XML) and Standard Generalized Markup Language (SGML) documents. At its inception, the XML standard inherited the DTD from SGML as its only schema language. Many alternative schema languages have subsequently been developed for XML. But the DTD is still alive and actively used to define narrative-based document types. This entry describes the basic syntax of the DTD and compares it to its two main rivals: W3C XML Schema and RELAX NG.
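
A minimal example of the entry's subject, using an invented document type: a small DTD is declared and an XML document is validated against it with the third-party lxml package.

```python
# Sketch: declaring a tiny DTD and validating an XML document against it.
# The "article" document type here is invented for illustration.
from io import StringIO
from lxml import etree   # pip install lxml

dtd = etree.DTD(StringIO("""
<!ELEMENT article (title, para+)>
<!ELEMENT title   (#PCDATA)>
<!ELEMENT para    (#PCDATA)>
<!ATTLIST article lang CDATA #IMPLIED>
"""))

doc = etree.fromstring("<article lang='en'><title>DTDs</title><para>Alive.</para></article>")
print(dtd.validate(doc))   # True when the document conforms to the DTD
```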

Posted Content
TL;DR: In this paper, the authors present a complex data warehousing methodology that exploits XML as a pivot language, which includes the integration of complex data in an ODS, under the form of XML documents; their dimensional modeling and storage in an XML data warehouse; and their analysis with combined OLAP and data mining techniques.
Abstract: The data warehousing and OLAP technologies are now moving onto handling complex data that mostly originate from the Web. However, integrating such data into a decision-support process requires their representation in a form processable by OLAP and/or data mining techniques. We present in this paper a complex data warehousing methodology that exploits XML as a pivot language. Our approach includes the integration of complex data into an ODS, in the form of XML documents; their dimensional modeling and storage in an XML data warehouse; and their analysis with combined OLAP and data mining techniques. We also address the crucial issue of performance in XML warehouses.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This paper proposes a framework that considers both semantic and non-semantic data sources and processes them for data interoperability in a Web Objects (WoO) enabled IoT environment; WoO is a service platform that provides a resourceful infrastructure to deploy IoT services in the World Wide Web environment.
Abstract: Data interoperability is a prerequisite for cross-community and cross-application sharing of information and knowledge. Heterogeneous data from multiple sources, including semantic and non-semantic data sources (e.g. SNS data, web data, relational data, RDF, XML, CSV, etc.), have an important effect on IoT service provisioning. The data are not always of the same type or format, so they need to be processed and transformed into a machine-readable semantic format so that systems can understand each other to create and offer services. This paper proposes a framework that considers both semantic and non-semantic data sources and processes them for data interoperability in a Web Objects (WoO) enabled IoT environment. WoO is a service platform that provides a resourceful infrastructure to deploy IoT services in the World Wide Web environment. For data interoperability, heterogeneous data are transformed following a semantic data schema that has been defined using a standard schema. The heterogeneous data are processed and stored in a knowledge base in RDF/OWL format.

Proceedings ArticleDOI
01 Nov 2017
TL;DR: This paper proposes the concept of “Positional Token” to overcome the attack on XML signatures and demonstrates the same.
Abstract: The XML signature standard defined by IETF/W3C references or identifies signed elements by their unique identities, specified by "id" attribute values in the given XML document. Hence, signed XML elements can be shifted from one location to another in an XML document, and this still has no effect on the ability to verify their signature. This flexibility paves the way for an attacker to tweak the original XML message without being noticed by the receiver. In this paper we propose the concept of a "Positional Token" to overcome this attack on XML signatures and demonstrate the same.
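
As a minimal illustration of the weakness the paper targets, not of its Positional Token defence: an id-based reference resolves the signed element wherever it sits in the document, so relocating it goes unnoticed. The document layout below is simplified and invented.

```python
# Sketch: an id-based reference resolves to the same element regardless of
# where it sits in the document, which is the flexibility the paper's
# "Positional Token" aims to remove. Simplified, non-normative XML.
import xml.etree.ElementTree as ET

original = """<order><signedItem id="pay-1">amount=100</signedItem>
              <note>unsigned</note></order>"""
relocated = """<order><note>unsigned</note>
               <wrapper><signedItem id="pay-1">amount=100</signedItem></wrapper></order>"""

def resolve_by_id(xml_text, ref_id):
    root = ET.fromstring(xml_text)
    return next(e for e in root.iter() if e.get("id") == ref_id)

for doc in (original, relocated):
    elem = resolve_by_id(doc, "pay-1")
    print(elem.tag, elem.text)   # the same element is found in both layouts
```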

Journal ArticleDOI
01 Jan 2017-Database
TL;DR: The AnnoSys system accesses collection data from either conventional web resources or the Biological Collection Access Service (BioCASe) and accepts XML-based data standards like ABCD or DarwinCore and proposes best practice procedures for digital annotations of complex records.
Abstract: Biological research collections holding billions of specimens world-wide provide the most important baseline information for systematic biodiversity research. Increasingly, specimen data records become available in virtual herbaria and data portals. The traditional (physical) annotation procedure fails here, so that an important pathway of research documentation and data quality control is broken. In order to create an online annotation system, we analysed, modeled and adapted traditional specimen annotation workflows. The AnnoSys system accesses collection data from either conventional web resources or the Biological Collection Access Service (BioCASe) and accepts XML-based data standards like ABCD or DarwinCore. It comprises a searchable annotation data repository, a user interface, and a subscription based message system. We describe the main components of AnnoSys and its current and planned interoperability with biodiversity data portals and networks. Details are given on the underlying architectural model, which implements the W3C OpenAnnotation model and allows the adaptation of AnnoSys to different problem domains. Advantages and disadvantages of different digital annotation and feedback approaches are discussed. For the biodiversity domain, AnnoSys proposes best practice procedures for digital annotations of complex records. Database url https://annosys.bgbm.fu-berlin.de/AnnoSys/AnnoSys.

Journal ArticleDOI
TL;DR: Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications).
Abstract: The objective of this research is to compare the relational and non-relational (NoSQL) database systems approaches in order to store, recover, query and persist standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes have been created in order to evaluate and compare the response times (algorithmic complexity) of six queries of growing complexity, which have been performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems show almost linear algorithmic complexity in query execution. However, they show very different linear slopes, the former being much steeper than the two latter. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. Visualization and editing of EHR extracts are also document-based tasks more appropriate to NoSQL database systems. However, the appropriate database solution much depends on each particular situation and specific problem.

Patent
01 Feb 2017
TL;DR: In this article, the authors present a method and a system for presenting a webpage with a native user interface assembly, relating to the technical field of computers.
Abstract: The embodiment of the invention provides a method and a system for presenting a webpage by a native user interface assembly, and relates to the technical field of computers. The method comprises the following steps: converting an HTML (Hypertext Markup Language) document specific to the webpage into an XML (Extensible Markup Language) document on a server side; receiving an operation instruction for accessing the webpage, sending a webpage request specific to the webpage to the server according to the operation instruction, and receiving the XML document returned by the server, so that the native user interface assembly renders and presents the XML document. The performance and user experience advantages of native applications are utilized: the HTML document specific to the webpage is converted into the XML document, and the XML document is rendered and presented by the native user interface assembly, so that the performance and user experience problems caused by presentation of the HTML document by a webpage window assembly are solved, a presentation effect the same as that of the webpage window assembly is ensured, and better performance and user experience can be realized via presentation by the native user interface assembly.

Proceedings ArticleDOI
01 Jul 2017
TL;DR: The paper shows that there are better alternatives than XML & JAXB and gives guidance in choosing the most appropriate serialization format and library depending on the context, especially in the context of the Internet of Things.
Abstract: Communication between DERs (distributed energy resources) and System Operators is required to provide Demand Response and solve some of the problems caused by the intermittency of much Renewable Energy. An important part of efficient communication is serialization, which is important to ensure a high probability of delivery within a given timeframe, especially in the context of the Internet of Things, using low-bandwidth data connections and constrained devices. The paper shows that there are better alternatives than XML & JAXB and gives guidance in choosing the most appropriate serialization format and library depending on the context.

Journal ArticleDOI
TL;DR: The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices, and has the goal of offering federated instances that can be customized to the sites/research performed.
Abstract: An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction, connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code is hosted on the GitHub platform, with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web, going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances that can be customized to the sites/research performed. Data are stored using JSON, extending upon previous approaches using XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and to send data to a JavaScript-based web client.

Journal ArticleDOI
TL;DR: Two database approaches, namely Extensible Markup Language (XML) and JavaScript Object Notation (JSON), were investigated to evaluate their suitability for handling thousands of records of publication data; the results showed JSON is the best choice for query retrieval speed and CPU usage.
Abstract: Big data is the latest industry buzzword to describe large volumes of structured and unstructured data that can be difficult to process and analyze. Most organizations are looking for the best approach to manage and analyze large volumes of data, especially when making decisions. XML is chosen by many organizations because of its powerful approach to retrieval and storage processes. However, with the XML approach, the execution time for retrieving large volumes of data is still considerably inefficient due to several factors. In this contribution, two database approaches, namely Extensible Markup Language (XML) and JavaScript Object Notation (JSON), were investigated to evaluate their suitability for handling thousands of records of publication data. The results showed JSON is the best choice for query retrieval speed and CPU usage. These are essential to cope with the characteristics of publication data. XML and JSON technologies are relatively new in comparison to relational databases. Indeed, JSON technology demonstrates great potential to become a key database technology for handling huge volumes of data that grow annually.
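
The paper's measurements stand on their own; the hedged sketch below merely shows one way such a retrieval comparison could be set up with Python's standard library (record fields and dataset size are assumptions).

```python
# Sketch: timing retrieval of the same publication records stored as XML and
# as JSON. Record fields and the dataset size are illustrative assumptions.
import json, time
import xml.etree.ElementTree as ET

records = [{"id": i, "title": f"Paper {i}", "year": 2000 + i % 18} for i in range(50_000)]
json_text = json.dumps({"pubs": records})
xml_text = "<pubs>" + "".join(
    f'<pub id="{r["id"]}" year="{r["year"]}">{r["title"]}</pub>' for r in records) + "</pubs>"

t0 = time.perf_counter()
hits_json = [r for r in json.loads(json_text)["pubs"] if r["year"] == 2017]
t1 = time.perf_counter()
hits_xml = [e for e in ET.fromstring(xml_text) if e.get("year") == "2017"]
t2 = time.perf_counter()

print(f"JSON: {len(hits_json)} hits in {t1 - t0:.3f}s; XML: {len(hits_xml)} hits in {t2 - t1:.3f}s")
```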

Journal ArticleDOI
Yiqin Zou, Li Quan
TL;DR: A new service-oriented grid-based method to set up and build the agricultural IoT is proposed, and the Web Service Resource Framework (WSRF)-based Agricultural Internet of Things (AIoT) and its encapsulation method are described.
Abstract: The traditional three-layer Internet of things (IoT) model, which includes the physical perception layer, information transferring layer and service application layer, cannot fully express the complexity and diversity of the agricultural engineering area. It is hard to categorize, organize and manage agricultural things with these three layers. Based on the above requirements, we propose a new service-oriented grid-based method to set up and build the agricultural IoT. Considering the heterogeneous, limited, transparent and layered attributes of agricultural things, we propose an abstract model for all agricultural resources. This model is service-oriented and expressed with the Open Grid Services Architecture (OGSA). Information and data of agricultural things are described and encapsulated using XML in this model. Every agricultural engineering application provides a service by enabling one application node in this service-oriented grid. The Web Service Resource Framework (WSRF)-based Agricultural Internet of Things (AIoT) and the encapsulation method for resource management in this model are also described in this paper.

Journal ArticleDOI
TL;DR: It is demonstrated how update, insert, and delete operations affect the consistency of fuzzy spatiotemporal data in XML documents, and algorithms for fixing these inconsistencies are proposed.

Proceedings ArticleDOI
01 Nov 2017
TL;DR: UML sequence diagrams are chosen as the source for extracting software functions, and to automate the measurement process, the XML structure of the sequence diagram is analyzed to fit existing functional and structural equations.
Abstract: This research aims at automated software size measurement from both the functional and the structural view. Software size is used to estimate schedule, effort, cost and other resources in the software development process. Therefore, the best way to measure software size is to derive software attributes from requirements artifacts to obtain an early estimate. The UML sequence diagram was chosen as the source for extracting software functions, since this diagram provides a high level of functional granularity. Functional size is measured using the COSMIC method, while structural size is calculated based on the control structure of the sequence diagram. To automate the measurement process, the XML structure of the sequence diagram is analyzed to fit the existing functional and structural equations. A well-known rice cooker case study is used to illustrate the proposed method. In addition, a simple support tool is provided.

Journal ArticleDOI
TL;DR: By leveraging semantic web technologies, this platform is able to place computational chemistry data onto web portals as a component of a Giant Global Graph (GGG) such that computer agents, as well as individual chemists, can access the data.
Abstract: This paper presents a formal data publishing platform for computational chemistry using semantic web technologies. This platform encapsulates computational chemistry data from a variety of packages in an Extensible Markup Language (XML) file called CSX (Common Standard for eXchange). On the basis of a Gainesville Core (GC) ontology for computational chemistry, a CSX XML file is converted into the JavaScript Object Notation for Linked Data (JSON-LD) format using an XML Stylesheet Language Transformation (XSLT) file. Ultimately the JSON-LD file is converted to subject–predicate–object triples in a Turtle (TTL) file and published on the web portal. By leveraging semantic web technologies, we are able to place computational chemistry data onto web portals as a component of a Giant Global Graph (GGG) such that computer agents, as well as individual chemists, can access the data.
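
As a hedged, minimal illustration of the XML-to-JSON-LD step (the actual CSX schema and XSLT stylesheet are not reproduced here), an XSLT transformation can be applied with the third-party lxml package; the tiny stylesheet and input are invented.

```python
# Sketch: applying an XSLT stylesheet to an XML document with lxml, in the
# spirit of the CSX -> JSON-LD conversion. Stylesheet and input are invented.
from lxml import etree   # pip install lxml

xslt = etree.XML("""
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/molecule">
    {"@context": "http://example.org/gc", "name": "<xsl:value-of select="@name"/>",
     "energy": <xsl:value-of select="energy"/>}
  </xsl:template>
</xsl:stylesheet>
""")

doc = etree.XML('<molecule name="water"><energy>-76.026</energy></molecule>')
transform = etree.XSLT(xslt)
print(str(transform(doc)))   # a small JSON-LD-like text document
```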

Journal ArticleDOI
TL;DR: A model is proposed that expands the present XML Encryption standard for data with string and numeric types implemented in the sensors, efficiently and discreetly filters matched streaming data and performs summation in the fog nodes, and decrypts the filtered and aggregated data in the subscribers without revealing private data.
Abstract: The Internet of Things provides visions of innovative services and domain-specific applications. With the development of Internet of Things services, various structural data need to be transferred ...