
Showing papers on "XML" published in 2018


Journal ArticleDOI
TL;DR: This document provides the specification for Release 2 of Version 2 of SBML Level 3 Core, which defines the data structures prescribed by SBML as well as their encoding in XML, the eXtensible Markup Language.
Abstract: Computational models can help researchers to interpret data, understand biological functions, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that different software systems can exchange. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Version 2 of SBML Level 3 Core. The specification defines the data structures prescribed by SBML, their encoding in XML (the eXtensible Markup Language), validation rules that determine the validity of an SBML document, and examples of models in SBML form. The design of Version 2 differs from Version 1 principally in allowing new MathML constructs, making more child elements optional, and adding identifiers to all SBML elements instead of only selected elements. Other materials and software are available from the SBML project website at http://sbml.org/.
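For orientation, here is a minimal, hand-written sketch (with invented model content: one species in one compartment) of the kind of XML encoding the specification defines, parsed with Python's standard library:

# A minimal, illustrative SBML Level 3 Version 2 document (invented model content).
import xml.etree.ElementTree as ET

SBML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level3/version2/core" level="3" version="2">
  <model id="example_model" name="Minimal example">
    <listOfCompartments>
      <compartment id="cell" spatialDimensions="3" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glucose" compartment="cell" initialAmount="10"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
  </model>
</sbml>"""

ns = {"sbml": "http://www.sbml.org/sbml/level3/version2/core"}
root = ET.fromstring(SBML_DOC)
for species in root.findall(".//sbml:species", ns):
    print(species.get("id"), "in compartment", species.get("compartment"))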

195 citations


01 Jan 2018
TL;DR: As part of a large-scale effort to improve the efficiency of research in information retrieval and digital libraries, the INEX initiative organises an international, coordinated effort to promote evaluation procedures for content-based XML retrieval, giving participants uniform scoring procedures and a forum in which to compare results.
Abstract: The widespread use of XML prompted the development of appropriate searching and browsing methods for XML documents. This explosion of XML retrieval tools requires the development of appropriate testbeds and evaluation methods. As part of a large-scale effort to improve the efficiency of research in information retrieval and digital libraries, the INEX initiative organises an international, coordinated effort to promote evaluation procedures for content-based XML retrieval. The project provides an opportunity for participants to evaluate their retrieval methods using uniform scoring procedures and a forum for participating organisations to compare their results.

136 citations


Dataset
01 Jan 2018
TL;DR: The Electronic Code of Federal Regulations (e-CFR) is the codification of the general and permanent rules published in the Federal Register by the executive departments and agencies of the federal government; it is divided into 50 titles that represent broad areas subject to federal regulation.
Abstract: The Electronic Code of Federal Regulations (e-CFR) is the codification of the general and permanent rules published in the Federal Register by the executive departments and agencies of the federal government. It is divided into 50 titles that represent broad areas subject to federal regulation. The e-CFR is updated daily. The e-CFR and its accompanying XML data are not yet an official format of the Code of Federal Regulations; only the PDF and text versions of the annual Code of Federal Regulations have legal status as parts of the official online format. The XML-structured files are derived from SGML-tagged data and printing codes, which may produce anomalies in display. In addition, the XML data does not yet include image files. Users who require a higher level of assurance may wish to consult the official version of the Code of Federal Regulations or the daily Federal Register on fdsys.gov.

123 citations


Proceedings ArticleDOI
05 Jun 2018
TL;DR: This paper proposes a practical deep embedding method for extreme multi-label classification, which harvests the ideas of non-linear embedding and graph priors-based label space modeling simultaneously.
Abstract: Extreme multi-label learning (XML) or classification has been a practical and important problem since the boom of big data. The main challenge lies in the exponential label space, which involves 2^L possible label sets when the label dimension L is huge, e.g., in millions for Wikipedia labels. This paper is motivated to better explore the label space by originally establishing an explicit label graph. Meanwhile, deep learning has been widely studied and used in various classification problems, including multi-label classification; however, it has not been properly introduced to XML, where the label space can be as large as in millions. In this paper, we propose a practical deep embedding method for extreme multi-label classification, which harvests the ideas of non-linear embedding and graph priors-based label space modeling simultaneously. Extensive experiments on public datasets for XML show that our method performs competitively against state-of-the-art results.
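As a rough, invented illustration of graph priors over a label space (a toy stand-in, not the paper's deep model): build a label co-occurrence graph, embed labels via the spectral decomposition of its normalized Laplacian.

# Toy sketch of graph-prior label embedding for multi-label data: build a label
# co-occurrence graph, then embed labels with the spectral decomposition of its
# normalized Laplacian. All data below is invented.
import numpy as np

Y = np.array([[1, 1, 0, 0],        # instance-by-label indicator matrix (toy data)
              [1, 0, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)

A = Y.T @ Y                        # label co-occurrence counts
np.fill_diagonal(A, 0.0)           # no self-loops
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
L = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian

eigvals, eigvecs = np.linalg.eigh(L)
label_embedding = eigvecs[:, 1:3]  # 2-D spectral embedding (skip trivial eigenvector)
print(label_embedding)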

79 citations


Journal ArticleDOI
TL;DR: A cloud-based EHR model that performs attribute-based access control using the eXtensible Access Control Markup Language (XACML); focused on security, the model performs partial encryption and uses electronic signatures when a patient's document is sent to a document requester.
Abstract: Cloud-based electronic health record (EHR) systems enable medical documents to be exchanged between medical institutions; this is expected to contribute to improvements in various medical services in the future. However, as the system architecture becomes more complicated, cloud-based EHR systems may introduce additional security threats when compared with the existing singular systems. Thus, patients may experience exposure of private data that they do not wish to disclose. In order to protect the privacy of patients, many approaches have been proposed to provide access control to patient documents when providing health services. However, most current systems do not support fine-grained access control or take into account additional security factors such as encryption and digital signatures. In this paper, we propose a cloud-based EHR model that performs attribute-based access control using the eXtensible Access Control Markup Language (XACML). Our EHR model, focused on security, performs partial encryption and uses electronic signatures when a patient's document is sent to a document requester. We use XML encryption and XML digital signature technology. Our proposed model works efficiently by sending only the necessary information to the requesters who are authorized to treat the patient in question.
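The idea of partial encryption can be sketched as follows; note this uses symmetric Fernet encryption from the Python cryptography package purely as a stand-in for W3C XML Encryption, and the element names are invented.

# Illustrative partial encryption of one XML element (invented element names).
# The paper uses W3C XML Encryption and XML Digital Signature; Fernet stands in
# here only for the encryption step.
import xml.etree.ElementTree as ET
from cryptography.fernet import Fernet

record = ET.fromstring(
    "<ehr><patient>Jane Doe</patient><diagnosis>confidential text</diagnosis></ehr>"
)

key = Fernet.generate_key()
f = Fernet(key)

# Encrypt only the sensitive element, leaving the rest of the document readable.
diagnosis = record.find("diagnosis")
diagnosis.text = f.encrypt(diagnosis.text.encode()).decode()

print(ET.tostring(record, encoding="unicode"))  # ciphertext only in <diagnosis>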

61 citations


Posted Content
TL;DR: This work proposes a grammar-aware coverage-based greybox fuzzing approach to fuzz programs that process structured inputs, implemented as an extension to AFL named Superion, and evaluates the effectiveness of Superion on large-scale programs.
Abstract: In recent years, coverage-based greybox fuzzing has proven itself to be one of the most effective techniques for finding security bugs in practice. Particularly, American Fuzzy Lop (AFL for short) is deemed to be a great success in fuzzing relatively simple test inputs. Unfortunately, when it meets structured test inputs such as XML and JavaScript, the grammar-blind trimming and mutation strategies in AFL hinder its effectiveness and efficiency. To this end, we propose a grammar-aware coverage-based greybox fuzzing approach to fuzz programs that process structured inputs. Given the grammar (which is often publicly available) of test inputs, we introduce a grammar-aware trimming strategy to trim test inputs at the tree level using the abstract syntax trees (ASTs) of parsed test inputs. Further, we introduce two grammar-aware mutation strategies (i.e., enhanced dictionary-based mutation and tree-based mutation). Specifically, tree-based mutation works by replacing subtrees using the ASTs of parsed test inputs. Equipped with grammar-awareness, our approach can carry the fuzzing exploration into both width and depth. We implemented our approach as an extension to AFL, named Superion, and evaluated the effectiveness of Superion on real-life large-scale programs (an XML engine, libplist, and three JavaScript engines, WebKit, JerryScript and ChakraCore). Our results have demonstrated that Superion can improve the code coverage (i.e., 16.7% and 8.8% in line and function coverage) and bug-finding capability (i.e., 31 new bugs, among which we discovered 21 new vulnerabilities, with 16 CVEs assigned and 3.2K USD in bug bounty rewards received) over AFL and jsfunfuzz. We also demonstrated the effectiveness of our grammar-aware trimming and mutation.
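Tree-based mutation can be illustrated with a toy sketch (invented seeds; Superion itself mutates grammar-derived ASTs, while this uses XML element trees for brevity):

# Simplified sketch of tree-based mutation on two XML seeds: splice a random
# subtree of a donor input into the target input.
import copy, random
import xml.etree.ElementTree as ET

seed_a = ET.fromstring("<a><b>1</b><c><d>2</d></c></a>")
seed_b = ET.fromstring("<x><y>3</y><z>4</z></x>")

def mutate(target, donor):
    # Map each element to its parent so a subtree can be replaced in place.
    parents = {child: parent for parent in target.iter() for child in parent}
    if not parents:
        return target
    victim = random.choice(list(parents))          # subtree to replace
    graft = copy.deepcopy(random.choice(list(donor.iter())))
    parent = parents[victim]
    parent[list(parent).index(victim)] = graft     # splice donor subtree in
    return target

print(ET.tostring(mutate(seed_a, seed_b), encoding="unicode"))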

49 citations


Journal ArticleDOI
TL;DR: The Molecular Interaction workgroup of the HUPO-PSI has extended the existing, well-used XML interchange format for molecular interaction data to meet new use cases and enable the capture of new data types, following extensive community consultation.
Abstract: Systems biologists study interaction data to understand the behaviour of whole cell systems, and their environment, at a molecular level. In order to effectively achieve this goal, it is critical that researchers have high quality interaction datasets available to them, in a standard data format, and also a suite of tools with which to analyse such data and form experimentally testable hypotheses from them. The PSI-MI XML standard interchange format was initially published in 2004, and expanded in 2007 to enable the download and interchange of molecular interaction data. PSI-MI XML2.5 was designed to describe experimental data and to date has fulfilled this basic requirement. However, new use cases have arisen that the format cannot properly accommodate. These include data abstracted from more than one publication such as allosteric/cooperative interactions and protein complexes, dynamic interactions and the need to link kinetic and affinity data to specific mutational changes. The Molecular Interaction workgroup of the HUPO-PSI has extended the existing, well-used XML interchange format for molecular interaction data to meet new use cases and enable the capture of new data types, following extensive community consultation. PSI-MI XML3.0 expands the capabilities of the format beyond simple experimental data, with a concomitant update of the tool suite which serves this format. The format has been implemented by key data producers such as the International Molecular Exchange (IMEx) Consortium of protein interaction databases and the Complex Portal. PSI-MI XML3.0 has been developed by the data producers, data users, tool developers and database providers who constitute the PSI-MI workgroup. This group now actively supports PSI-MI XML2.5 as the main interchange format for experimental data, PSI-MI XML3.0 which additionally handles more complex data types, and the simpler, tab-delimited MITAB2.5, 2.6 and 2.7 for rapid parsing and download.

45 citations


Journal ArticleDOI
TL;DR: An independent RESTful web service in a layered approach to detect NoSQL injection attacks in web applications, named DNIARS, which compares the patterns generated from the NoSQL statement structure in the static code state and the dynamic state and responds to the web application with the possibility of a NoSQL injection attack.
Abstract: Despite the extensive research on using web services for security purposes, finding a comprehensive solution for NoSQL injection attacks remains a big challenge. This paper presents an independent RESTful web service in a layered approach to detect NoSQL injection attacks in web applications. The proposed method is named DNIARS. DNIARS compares the patterns generated from the NoSQL statement structure in the static code state and the dynamic state. Accordingly, DNIARS can respond to the web application with the possibility of a NoSQL injection attack. DNIARS was implemented in plain PHP and can be considered an independent framework able to respond to different request formats such as JSON and XML. To evaluate its performance, DNIARS was tested using the most common testing tools for RESTful web services. According to the results, DNIARS can work in real environments, with an error rate that did not exceed 1%.
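The static-versus-dynamic pattern comparison can be sketched as follows (function and variable names invented; DNIARS itself is a PHP web service, this is only the core idea in Python):

# Sketch of the pattern-comparison idea behind detecting NoSQL injection:
# reduce a MongoDB-style query to its structure (keys and operators only),
# then flag a request whose runtime structure differs from the static one.

def structure(query):
    """Recursively replace values with a placeholder, keeping keys/operators."""
    if isinstance(query, dict):
        return {k: structure(v) for k, v in query.items()}
    if isinstance(query, list):
        return [structure(v) for v in query]
    return "?"

static_pattern = structure({"username": "alice", "password": "secret"})

# An injected request smuggles in a $ne operator, changing the structure.
runtime_query = {"username": "alice", "password": {"$ne": ""}}

if structure(runtime_query) != static_pattern:
    print("possible NoSQL injection:", structure(runtime_query))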

45 citations


01 Jan 2018
TL;DR: In this paper, a method for planarizing metal plugs for device interconnections is described, where a semiconductor structure with at least one device thereon is provided, and a dielectric layer is formed over the device and the semiconductor structures.
Abstract: A method for planarizing metal plugs for device interconnections. The process begins by providing a semiconductor structure with at least one device thereon. A dielectric layer is formed over the device and the semiconductor structure. A first barrier metal layer is formed on the dielectric layer, and a sacrificial oxide layer is formed on the first barrier metal layer. The sacrificial oxide layer, the first barrier metal layer, and the dielectric layer are patterned to form contact openings. A second barrier metal layer is formed over the semiconductor structure, and a metal contact layer is formed on the second barrier metal layer. The metal contact layer and the second barrier metal layer are planarized using a first chemical mechanical polishing process and the sacrificial oxide layer is removed. The metal contact layer and the first barrier metal layer are planarized using a second chemical mechanical polishing process.

42 citations


Journal ArticleDOI
TL;DR: The NanoMine polymer nanocomposite schema is presented as an XML-based data schema designed for nanocomposite materials data representation and distribution, and its relationship to a higher-level polymer data core consistent with other centralized materials data efforts is discussed.
Abstract: Polymer nanocomposites consist of a polymer matrix and fillers with at least one dimension below 100 nanometers (nm) [L. Schadler et al., Jom 59(3), 53–60 (2007)]. A key challenge in constructing an effective data resource for polymer nanocomposites is building a consistent, coherent, and clear data representation of all relevant parameters and their interrelationships. The data resource must address (1) data representation for representing, saving, and accessing the data (e.g., a data schema used in a data resource such as a database management system), (2) data contribution and uploading (e.g., an MS Excel template file that users can use to input data), (3) concept and knowledge modeling in a computationally accessible form (e.g., generation of a knowledge graph and ontology), and (4) ultimately data analytics and mining for new materials discovery. This paper addresses the first three issues, paving the way for rich, nuanced data analysis. We present the NanoMine polymer nanocomposite schema as an XML-based data schema designed for nanocomposite materials data representation and distribution and discuss its relationship to a higher level polymer data core consistent with other centralized materials data efforts. We also demonstrate aspects of data entry in an accessible manner consistent with the XML schema and discuss our mapping and augmentation approach to provide a more comprehensive representation in the form of an ontology and an ontology-enabled knowledge graph framework for nanopolymer systems. The schema and ontology and their easy accessibility and compatibility with parallel material standards provide a platform for data storage and search, customized visualization, and machine learning tools for material discovery and design.
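To make the idea of an XML materials record concrete, here is a hypothetical nanocomposite sample in the spirit of such a schema; the element names are invented for illustration and are not the actual NanoMine schema.

# Hypothetical nanocomposite sample record (element names invented,
# NOT the actual NanoMine schema), parsed with the standard library.
import xml.etree.ElementTree as ET

SAMPLE = """<PolymerNanocomposite>
  <Matrix><ChemicalName>epoxy</ChemicalName></Matrix>
  <Filler>
    <ChemicalName>silica</ChemicalName>
    <ParticleDiameter unit="nm">15</ParticleDiameter>
    <VolumeFraction>0.02</VolumeFraction>
  </Filler>
  <Property name="glass transition temperature" unit="Celsius">105</Property>
</PolymerNanocomposite>"""

root = ET.fromstring(SAMPLE)
filler = root.find("Filler")
print(filler.find("ChemicalName").text,
      filler.find("ParticleDiameter").text,
      filler.find("ParticleDiameter").get("unit"))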

38 citations


Book ChapterDOI
31 Aug 2018
TL;DR: This paper presents a benchmark, called UniBench, with the goal of facilitating a holistic and rigorous evaluation of MMDBs, consisting of a mixed data model, a synthetic multi-model data generator, and a set of core workloads, aiming to cover essential aspects of multi-model data management.
Abstract: Unlike traditional database management systems, which are organized around a single data model, a multi-model database (MMDB) utilizes a single, integrated back-end to support multiple data models, such as document, graph, relational, and key-value. As more and more platforms are proposed to deal with multi-model data, it becomes crucial to establish a benchmark for evaluating the performance and usability of MMDBs. Previous benchmarks, however, are inadequate for such a scenario because they lack comprehensive consideration of multiple data models. In this paper, we present a benchmark, called UniBench, with the goal of facilitating a holistic and rigorous evaluation of MMDBs. UniBench consists of a mixed data model, a synthetic multi-model data generator, and a set of core workloads. Specifically, the data model simulates an emerging application: Social Commerce, a Web-based application combining E-commerce and social media. The data generator provides diverse data formats including JSON, XML, key-value, tabular, and graph. The workloads are comprised of a set of multi-model queries and transactions, aiming to cover essential aspects of multi-model data management. We implemented all workloads on ArangoDB and OrientDB to illustrate the feasibility of our proposed benchmarking system and report the lessons learned through the evaluation of these two multi-model databases. The source code and data of this benchmark can be downloaded at http://udbms.cs.helsinki.fi/bench/.
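To illustrate what "multi-model" means in practice, here is one invented social-commerce order rendered in three of the data models the generator covers (not the benchmark's actual generator output):

# One invented order in three data models: JSON document, XML, and key-value.
import json
import xml.etree.ElementTree as ET

order = {"order_id": "o-1001", "customer": "u-42", "items": ["p-7", "p-9"]}

print(json.dumps(order))                      # document model

xml_order = ET.Element("order", id=order["order_id"], customer=order["customer"])
for item in order["items"]:
    ET.SubElement(xml_order, "item").text = item
print(ET.tostring(xml_order, encoding="unicode"))  # XML model

kv = {f"order:{order['order_id']}": json.dumps(order)}  # key-value model
print(kv)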

Journal ArticleDOI
TL;DR: In this article, a file format (psml) is proposed to map the concepts of the norm-conserving pseudopotential domain in a flexible form and support the inclusion of provenance information and other important metadata.

ReportDOI
18 May 2018
TL;DR: This specification defines a format for representing simple sensor measurements and device parameters in the Sensor Measurement Lists (SenML).
Abstract: This specification defines a format for representing simple sensor measurements and device parameters in the Sensor Measurement Lists (SenML). Representations are defined in JavaScript Object Notation (JSON), Concise Binary Object Representation (CBOR), Extensible Markup Language (XML), and Efficient XML Interchange (EXI), which share the common SenML data model. A simple sensor, such as a temperature sensor, could use one of these media types in protocols such as HTTP or CoAP to transport the measurements of the sensor or to be configured.
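A small SenML pack in its JSON representation looks like the following (invented sensor values; field names follow the SenML data model: bn = base name, n = name, u = unit, v = value, t = time):

# Parse a small SenML pack (JSON representation, invented measurements).
import json

pack = '''[
  {"bn": "urn:dev:ow:10e2073a01080063:", "n": "temperature", "u": "Cel", "v": 23.1},
  {"n": "humidity", "u": "%RH", "v": 41.5, "t": 0}
]'''

for record in json.loads(pack):
    print(record.get("n"), "=", record.get("v"), record.get("u"))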

Journal ArticleDOI
01 Sep 2018
TL;DR: The presented research work demonstrates the implementation and visualization of a set of KPIs defined in the ISO 22400 standard (Automation systems and integration - Key performance indicators for manufacturing operations management).
Abstract: The employment of tools and techniques for monitoring and supervising the performance of industrial systems has become essential for enterprises that seek to be more competitive in today's market. The main reason is the need to validate the tasks executed by systems, such as industrial machines, that are involved in production processes. The early detection of malfunctions and/or improvable system values permits anticipating critical issues that may delay or even halt production. Advances in Information and Communication Technology (ICT)-based technologies allow the collection of data at system runtime. In fact, the data is not only collected but also formatted and integrated in computer nodes. Then, the formatted data can be further processed and analyzed. This article focuses on the utilization of standard Key Performance Indicators (KPIs), which are a set of parameters that permit the evaluation of the performance of systems. More precisely, the presented research work demonstrates the implementation and visualization of a set of KPIs defined in the ISO 22400 standard (Automation systems and integration - Key performance indicators for manufacturing operations management). The approach is validated within a discrete manufacturing web-based interface that is currently used for monitoring and controlling an assembly line at runtime. The selected ISO 22400 KPIs are described within an ontology, where the description follows the data models included in the KPI Markup Language (KPIML), an XML implementation developed by the Manufacturing Enterprise Solutions Association (MESA) international organization.
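As a rough illustration, three widely cited ISO 22400 KPIs and their OEE composition can be computed from shop-floor values like these (all numbers invented; formulas as commonly stated for the standard, so treat them as an assumption):

# Availability, effectiveness, quality rate, and OEE from invented values
# (units: minutes and part counts).
actual_production_time = 400.0     # APT, minutes
planned_busy_time = 480.0          # PBT, minutes
planned_runtime_per_item = 0.5     # minutes per part
produced_quantity = 700            # PQ, parts
good_quantity = 672                # GQ, parts

availability = actual_production_time / planned_busy_time
effectiveness = planned_runtime_per_item * produced_quantity / actual_production_time
quality_rate = good_quantity / produced_quantity
oee = availability * effectiveness * quality_rate

print(f"Availability={availability:.3f} Effectiveness={effectiveness:.3f} "
      f"Quality={quality_rate:.3f} OEE={oee:.3f}")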

Journal ArticleDOI
TL;DR: A black-box fuzzing approach to detect XQuery injection and parameter tampering vulnerabilities in web applications driven by native XML databases is proposed; a prototype, XiParam, is developed and tested on vulnerable applications built with a native XML database, BaseX, as the backend.
Abstract: As web applications become the most popular way to deliver essential services to customers, they also become attractive targets for attackers. The attackers craft injection attacks in database-driven applications through the user-input fields intended for interacting with the applications. Even though precautionary measures such as user-input sanitization are employed at the client side of the application, attackers can disable JavaScript at the client side and still inject attacks through HTTP parameters. The injected parameters result in attacks due to improper server-side validation of user input. The injected parameters may either contain malicious SQL/XML commands, leading to SQL/XPath/XQuery injection, or be invalid input intended to violate the expected behavior of the web application. The former is known as an injection attack, while the latter is called a parameter tampering attack. While SQL injection has been intensively examined by the research community, limited work has been done so far on identifying XML injection and parameter tampering vulnerabilities. Database-driven web applications today rely on XML databases, as XML has gained rapid acceptance because it favors integration of data with other applications and handles diverse information. Hence, this work proposes a black-box fuzzing approach to detect XQuery injection and parameter tampering vulnerabilities in web applications driven by native XML databases. A prototype, XiParam, is developed and tested on vulnerable applications developed with a native XML database, BaseX, as the backend. The experimental evaluation clearly demonstrates that the prototype is effective at detecting both XQuery injection and parameter tampering vulnerabilities.
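A black-box probe of this kind can be sketched as follows; the endpoint URL, parameter names, and payloads are all invented for illustration, and this is the general approach rather than the XiParam prototype:

# Black-box sketch: send XQuery-injection style payloads to a form parameter
# and flag responses that deviate from a clean baseline.
import requests

TARGET = "http://localhost:8080/login"          # hypothetical application
payloads = ["' or '1'='1", "\"]/*", "admin' (: comment :)"]

baseline = requests.post(TARGET, data={"user": "alice", "pin": "0000"})

for p in payloads:
    r = requests.post(TARGET, data={"user": p, "pin": "0000"})
    # A status flip or a large response-size change hints at injectable input.
    if r.status_code != baseline.status_code or \
       abs(len(r.text) - len(baseline.text)) > 100:
        print("suspicious behaviour for payload:", p)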

Proceedings ArticleDOI
TL;DR: Coming is presented, a tool that takes as input a Git repository and mines instances of code change patterns present in each commit, presenting a) the frequency of code changes and b) the instances found in each commit.
Abstract: Software repositories such as Git have become a relevant source of information for software engineering researchers. For instance, the detection of commits that fulfill a given criterion (e.g., bug-fixing commits) is one of the most frequent tasks done to understand software evolution. However, to our knowledge, there are no open-source tools that, given a Git repository, return all the instances of a given change pattern. In this paper we present Coming, a tool that takes as input a Git repository and mines instances of change patterns in each commit. For that, Coming computes fine-grained changes between two consecutive revisions, analyzes those changes to detect whether they correspond to an instance of a change pattern (specified by the user in XML), and finally, after analyzing all the commits, presents a) the frequency of code changes and b) the instances found in each commit. We evaluate Coming on a set of 28 pairs of revisions from Defects4J, finding instances of change patterns that involve if conditions in 26 of them.
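A toy version of the idea follows: a user-written XML pattern spec matched against the textual diff of two revisions. The spec format is invented, and Coming itself works on fine-grained AST changes rather than text diffs.

# Invented XML change-pattern spec, matched naively against a unified diff.
import difflib
import xml.etree.ElementTree as ET

pattern = ET.fromstring('<pattern name="if-change"><contains>if (</contains></pattern>')
needle = pattern.find("contains").text

rev_before = "int f(int x) {\n  return x;\n}\n"
rev_after  = "int f(int x) {\n  if (x > 0) return x;\n  return 0;\n}\n"

diff = difflib.unified_diff(rev_before.splitlines(), rev_after.splitlines())
added = [line[1:] for line in diff if line.startswith("+") and not line.startswith("+++")]

if any(needle in line for line in added):
    print(f"pattern '{pattern.get('name')}' matched in this commit")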

Journal ArticleDOI
Yilong Yang1, Quan Zu1, Peng Liu1, Defang Ouyang1, Xiaoshan Li1 
TL;DR: The proposed toolkit combines software engineering technologies such as Java EE, RESTful web services, and JSON Web Tokens to allow exchanging medical data in an unidentifiable XML and JSON format, as well as restricting users to the need-to-know principle.
Abstract: This paper takes up the problem of medical resource sharing through a MicroService architecture without compromising patient privacy. To achieve this goal, we suggest refactoring legacy EHR systems into autonomous MicroServices communicating through unified techniques such as RESTful web services. This lets us handle both internal and external clinical data queries directly and far more efficiently. The novelty of the proposed approach lies in avoiding the data de-identification process often used as a means of preserving patient privacy. The implemented toolkit combines software engineering technologies such as Java EE, RESTful web services, and JSON Web Tokens to allow exchanging medical data in an unidentifiable XML and JSON format, as well as restricting users to the need-to-know principle. Our technique also inhibits retrospective processing of data, such as attacks by an adversary on a medical dataset using advanced computational methods to reveal Protected Health Information (PHI). The approach is validated on an endoscopic reporting application based on the openEHR and MST standards. From the usability perspective, the approach can be used to query datasets by clinical researchers and governmental or non-governmental organizations in monitoring health care and medical record services to improve quality of care and treatment.
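The JSON Web Token step can be sketched with the PyJWT package (claim names, scope string, and the secret are invented; the paper's stack is Java EE, so this is only the token mechanics):

# Issue a token scoped to the need-to-know principle, then verify it
# before answering a clinical query.
import datetime
import jwt  # PyJWT

SECRET = "replace-with-a-real-key"

token = jwt.encode(
    {
        "sub": "clinician-42",
        "scope": "endoscopy:read",            # need-to-know: read-only, one domain
        "exp": datetime.datetime.utcnow() + datetime.timedelta(minutes=15),
    },
    SECRET,
    algorithm="HS256",
)

claims = jwt.decode(token, SECRET, algorithms=["HS256"])  # also checks expiry
assert "endoscopy:read" in claims["scope"], "insufficient scope for this query"
print("authorized:", claims["sub"])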

Journal ArticleDOI
TL;DR: Through the semantic dependency between data and a bottom-up integration process, a global view of the inverted XML structure is realized, and a data access control model for individual users is proposed.
Abstract: In the era of big data, the conflict between data mining and data privacy protection is increasing day by day. Traditional information security focuses on protecting the security of attribute values without semantic association. The data privacy of big data is mainly reflected in the effective use of data without exposing the user's sensitive information. Considering the semantic association, reasonable security access for privacy protection is required. Semi-structured and self-descriptive XML (eXtensible Markup Language) has become a common form of data organization for database management in big data environments. Based on the semantic integration nature of XML data, this paper proposes a data access control model for individual users. Through the semantic dependency between data and a bottom-up integration process, a global view of the inverted XML structure is realized. Experimental results show that the model effectively protects privacy and has high access efficiency.

Proceedings ArticleDOI
Wei Huang1
26 Mar 2018
TL;DR: This paper provides an alternate approach to comprehending the core components of IEC 61850: the semantic hierarchical object data model and two communication services, Client-Server and Publish-Subscribe; IEC 61850 configuration is demonstrated as preparing the data list following the data model and setting parameters for the communication services.
Abstract: The myth says IEC 61850 is too complicated to use, with a steep learning curve. The traditional IEC 61850 learning path is to go through the IEC 61850 standard and then start IEC 61850 configuration. Users spend days going through the massive content of the standard, which includes 10 parts, 27 documents, and more than 4,000 pages. The standard adopts modern communication and computer science technologies such as the 7-layer communication model, XML (Extensible Markup Language), Unified Modeling Language (UML), Object-Oriented Design (OOD) and others. Many potential users are discouraged by the overwhelming information during IEC 61850 training. One of the IEC 61850 standard's goals is to shorten system engineering time, and learning the know-how presents a challenge with such massive content. Yet it is possible to get a fundamental understanding of the standard and be ready for IEC 61850 configuration in 30 minutes. Two basic questions from a protection relay engineer's point of view are "Where is the data?" and "How to get the data?" This paper provides an alternate approach to comprehending the core components of IEC 61850: the semantic hierarchical object data model and two communication services, Client-Server and Publish-Subscribe. IEC 61850 configuration is demonstrated as preparing the data list following the data model and setting parameters for the communication services. This approach has been well received and has provided satisfactory results in recent IEC 61850 configuration training sessions.
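The "Where is the data?" question can be made concrete with a sketch of the hierarchical object model: an object reference has the form LogicalDevice/LogicalNode.DataObject.DataAttribute. The names XCBR (circuit breaker), Pos (position) and stVal (status value) follow common IEC 61850 usage; the device name "Relay1" is invented.

# Sketch of an IEC 61850 object reference built from the data model hierarchy.
from dataclasses import dataclass

@dataclass
class DataPoint:
    logical_device: str   # e.g. a relay's logical device
    logical_node: str     # e.g. XCBR1 - circuit breaker
    data_object: str      # e.g. Pos - switch position
    data_attribute: str   # e.g. stVal - status value

    def reference(self) -> str:
        return (f"{self.logical_device}/{self.logical_node}"
                f".{self.data_object}.{self.data_attribute}")

breaker_position = DataPoint("Relay1", "XCBR1", "Pos", "stVal")
print(breaker_position.reference())   # Relay1/XCBR1.Pos.stVal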

Book ChapterDOI
01 Jan 2018
TL;DR: An architectural approach to building a system that will create the virtual rooms from the XML repository, enabling visitors to look up individual life stories and also cross-reference information among them, is described.
Abstract: The Museum of the Person (Museu da Pessoa, MP) is a virtual museum with the purpose of exhibiting life stories of common people. Its assets are composed of several interviews involving people whose stories we want to perpetuate. So the museum holds a heterogeneous collection of XML (eXtensible Markup Language) documents that constitute the working repository. The main idea is to automatically extract the information included in the repository in order to build the virtual museum's exhibition rooms. The goal of this paper is to describe an architectural approach to building a system that will create the virtual rooms from the XML repository, enabling visitors to look up individual life stories and also cross-reference information among them. We adopted CIDOC-CRM (CIDOC Conceptual Reference Model), the standard for museum ontologies, refined with the FOAF (Friend of a Friend) and DBpedia ontologies, to represent OntoMP. That ontology is intended to allow conceptual navigation over the available information. The approach discussed here is based on a TripleStore and uses SPARQL (SPARQL Protocol and RDF Query Language) to extract the information. Aiming at the extraction of meaningful information, we built a text filter that converts the interviews into an RDF triples file that reflects the assets described by the ontology.
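The triple-store querying step can be sketched with rdflib; the two triples below are invented stand-ins using FOAF, not the museum's actual OntoMP data.

# Load invented RDF triples and extract them with a SPARQL query (rdflib).
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<urn:mp:person1> foaf:name "Maria" .
<urn:mp:person2> foaf:name "Joao" .
""", format="turtle")

results = g.query("""
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name WHERE { ?person foaf:name ?name }
""")
for person, name in results:
    print(person, name)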

Patent
08 Nov 2018
TL;DR: In this article, the authors propose an approach for multimodal cognitive communications, collaboration, consultation and instruction between and among heterogeneous networked teams of persons, machines, devices, neural networks, robots and algorithms during various stages of medical disease management, including detection, diagnosis, prognosis, treatment, measurement, monitoring and reporting.
Abstract: The invention enables multimodal cognitive communications, collaboration, consultation and instruction between and among heterogeneous networked teams of persons, machines, devices, neural networks, robots and algorithms during various stages of medical disease management, including detection, diagnosis, prognosis, treatment, measurement, monitoring and reporting. The invention enables both synchronous and asynchronous multiparty collaboration with multichannel, multiplexed streaming imagery data, including interactive curation, multisensory annotation and metadata tagging, as well as multi-formatted encapsulation, saving and sharing of collaborated imagery data as packetized augmented intelligence. The invention acquires both live stream and archived medical modality imagery from network-connected medical devices, cameras, signals and sensors, as well as multiomic data [phenotypic, genomic, metabolomic, pathomic, radiomic, radiopathomic and radiogenomic] maps and clinical data sets from structured reports and clinical documents, including biometric maps and movies, hapmaps, heat maps and data stream visualizations. The invention also acquires both medical and non-medical streaming imagery data from image data repositories, documents and structured reports, workstations and mobile devices, as well as from wearable computing, signals and sensors. The invention enables networked teams to interactively communicate, concurrently collaborate and bi-directionally exchange multichannel multiplexed imagery data streams, singly or together, in real time or asynchronously, generally by curating, annotating and tagging imagery information objects. The invention encapsulates and saves collaborated imagery data, together with multisensory annotations and metadata tags, in standard file formats as packetized augmented intelligence. The invention enables recursive cognitive enrichment of clinical cognitive vismemes, and saves packetized imagery information objects, multisensory annotations and metadata tags in native file formats [PDF, MPEG, JPEG, XML, XMPP, QR, TIFF, RDF, RDF/XML, SVG and DAE] as well as in formats compliant with standards for digital communications in medicine [DICOM]. The invention enables live stream multicasting of multimodal cognitive instruction and collaborative knowledge exchange with multisensory [visual, auditory, haptic] annotation of streaming imagery data, as well as secure, encrypted transmission of streaming augmented intelligence across file sharing data networks for informatics-enabled learning, specialist skills acquisition and accelerated knowledge exchange.

02 Mar 2018
TL;DR: Media types for representing simple sensor measurements and device parameters in the Sensor Measurement Lists (SenML) are defined in JavaScript Object Notation, Concise Binary Object Representation, eXtensible Markup Language (XML), and Efficient XML Interchange (EXI), which share the common SenML data model.
Abstract: This specification defines media types for representing simple sensor measurements and device parameters in the Sensor Measurement Lists (SenML). Representations are defined in JavaScript Object Notation (JSON), Concise Binary Object Representation (CBOR), eXtensible Markup Language (XML), and Efficient XML Interchange (EXI), which share the common SenML data model. A simple sensor, such as a temperature sensor, could use one of these media types in protocols such as HTTP or CoAP to transport the measurements of the sensor or to be configured.

01 Jan 2018
TL;DR: An ongoing development of a synchronized XML-MediaWiki dictionary is presented as a solution to the problems of XML dictionaries in the context of small Uralic languages, together with a description of how the dictionary knowledge in the MediaWiki-based dictionary can be enhanced by an additional Semantic MediaWiki layer for more effective searches in the data.
Abstract: We present our ongoing development of a synchronized XML-MediaWiki dictionary to solve the problems of XML dictionaries in the context of small Uralic languages. XML is good at representing structured data, but it does not fare well in a situation where multiple users are editing the dictionary simultaneously. Furthermore, XML is overly complicated for non-technical users due to its strict syntax, which has to be kept valid at all times. Our system solves these problems by making synchronized editing of the same dictionary data possible both in a MediaWiki environment and in XML files in an easy fashion. In addition, we describe how the dictionary knowledge in the MediaWiki-based dictionary can be enhanced by an additional Semantic MediaWiki layer for more effective searches in the data. Finally, API access to the lexical information in the dictionary and to the morphological tools, in the form of an open-source Python library, is presented.

Journal ArticleDOI
TL;DR: FAIMS Mobile promotes synthetic research and improves transparency and reproducibility through the production of comprehensive datasets that can be mapped to vocabularies or ontologies as they are created.

Journal ArticleDOI
TL;DR: The main goal of this research note is to educate business researchers on how to automatically scrape financial data from the World Wide Web using the R programming language.
Abstract: The main goal of this research note is to educate business researchers on how to automatically scrape financial data from the World Wide Web using the R programming language. This paper is...

Journal ArticleDOI
TL;DR: Phoenix is presented, a similarity-based approach for comparing revisions of XML documents that does not rely on explicit IDs and is by far the most efficient approach to match elements across revisions of the same XML document.
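As a rough, invented illustration of the idea (not Phoenix's actual algorithm), similarity-based matching of elements without explicit IDs can be sketched like this:

# Toy similarity matching of elements across two revisions of an XML document:
# pair each old element with the most textually similar new element.
import difflib
import xml.etree.ElementTree as ET

old = ET.fromstring("<doc><p>hello world</p><p>goodbye</p></doc>")
new = ET.fromstring("<doc><p>goodbye!</p><p>hello brave world</p></doc>")

def signature(el):
    return el.tag + "|" + (el.text or "")

for o in old:
    best = max(new, key=lambda n: difflib.SequenceMatcher(
        None, signature(o), signature(n)).ratio())
    print(f"matched {signature(o)!r} -> {signature(best)!r}")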

Journal ArticleDOI
TL;DR: An attempt is made to fill the gap in customized representation and retrieval with a focus on the educational domain, and a partitioning- and parallelism-focused architecture is proposed to archive the information for sharing, backup and collaboration.
Abstract: At present, several formats exist through which data is distributed among online stakeholders. An example of this is XML, which, like other such formats, is helpful for traditional inquiry methods and for forming the foundation of query languages such as SPARQL and SQL. Rich representation of information demands broader support from such languages, where every piece of data from any resource can serve the original search queries. Such models are useful for XML-based retrieval, since several cooperative XML search engines have already been developed. These search engines perform semantic investigation of XML files with data enclosed by the important fields. Therefore, XML files are used to store and index data intended for efficient retrieval. In this research, an attempt is made to fill this gap of customized representation and retrieval with a focus on the educational domain. An institute's repository of books, e-books, journals, articles and research theses has been used to retrieve results. A system has been proposed and developed to store the contents of the institute's data bank as objects in a digital library. A structured method has been proposed to organize all the data, and a system has been developed which extracts meaningful information from the data bank. The information repository is established, and the entire data is represented in terms of a unit called a Digital Object in the Digital Library. Each unit is described by recording some quantitative data about it, referred to as 'metadata'. The search focuses on extracting meaningful information from the repository by applying filtration strategies to get relevant information best matched with the query terms. At the end, a partitioning- and parallelism-focused architecture to archive the information for sharing, backup and collaboration is also proposed. A comparison of the proposed scheme with state-of-the-art schemes is provided in terms of computational complexity and recall measurement.

Journal ArticleDOI
TL;DR: A new fuzzy XML tree model is proposed, and an effective algorithm based on tree edit distance is presented to identify the structural and semantic similarities between fuzzy documents represented in the proposed fuzzy XML tree model.

Journal ArticleDOI
TL;DR: This paper aims to develop an efficient distributed XML query processing method using MapReduce, which simultaneously processes several queries on large volumes of XML data.

Journal ArticleDOI
TL;DR: This paper presents an approach for integrating several geoprocessing components in the TaMIS water dam monitoring system developed with the Wupperverband, a regional waterbody authority in Germany.
Abstract: Novel sensor technologies are rapidly emerging. They enable monitoring and modelling of our environment at a level of detail that was not possible a few years ago. However, while the raw data produced by these sensors are useful for a first overview, they usually need to be post-processed and integrated with other data or models in different applications. In this paper, we present an approach for integrating several geoprocessing components in the TaMIS water dam monitoring system developed with the Wupperverband, a regional waterbody authority in Germany. The approach relies upon the OGC Web Processing Service and is tightly coupled with Sensor Observation Service instances running at the Wupperverband. Besides implementing the standardized XML-based interface, lightweight REST APIs have been developed to ease the integration with thin Web clients and other Web-based components. Using this standards-based approach, new processing facilities can be easily integrated and coupled with different ...
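Querying an OGC Web Processing Service starts with the standard key-value-pair GetCapabilities request, whose response is XML; the endpoint URL below is hypothetical.

# Fetch and parse a WPS 1.0.0 GetCapabilities response (hypothetical endpoint).
import requests
import xml.etree.ElementTree as ET

WPS_URL = "http://example.org/wps"   # hypothetical WPS endpoint

resp = requests.get(WPS_URL, params={
    "service": "WPS",
    "request": "GetCapabilities",
    "version": "1.0.0",
})

root = ET.fromstring(resp.content)
# List ows:Title elements anywhere in the response (service and process titles).
for title in root.iter("{http://www.opengis.net/ows/1.1}Title"):
    print(title.text)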