Showing papers on "Serialization published in 2017"

PDF

Open Access

Journal Article•DOI•

xarray: N-D labeled arrays and datasets in Python

[...]

Stephan Hoyer¹, Joseph Hamman²•Institutions (2)

05 Apr 2017-Journal of open research software

TL;DR: This approach combines an application programing interface (API) inspired by pandas with the Common Data Model for self-described scientific data to provide a toolkit and data structures for N-dimensional labeled arrays.

...read moreread less

Abstract: xarray is an open source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays. Our approach combines an application programing interface (API) inspired by pandas with the Common Data Model for self-described scientific data. Key features of the xarray package include label-based indexing and arithmetic, interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib), out-of-core computation on datasets that don’t fit into memory, a wide range of serialization and input/output (I/O) options, and advanced multi-dimensional data manipulation tools such as group-by and resampling. xarray, as a data model and analytics toolkit, has been widely adopted in the geoscience community but is also used more broadly for multi-dimensional data analysis in physics, machine learning and finance.

...read moreread less

620 citations

CBOR Object Signing and Encryption (COSE)

[...]

Jim Schaad

01 Jul 2017

TL;DR: This document defines the CBOR Object Signing and Encryption (COSE) protocol, which describes how to create and process signatures, message authentication codes, and encryption using CBOR for serialization.

...read moreread less

Abstract: Concise Binary Object Representation (CBOR) is a data format designed for small code size and small message size. There is a need for the ability to have basic security services defined for this data format. This document defines the CBOR Object Signing and Encryption (COSE) protocol. This specification describes how to create and process signatures, message authentication codes, and encryption using CBOR for serialization. This specification additionally describes how to represent cryptographic keys using CBOR.

...read moreread less

83 citations

Journal Article•DOI•

JavaScript Object Notation (JSON) data serialization for IFC schema in web-based BIM data exchange

[...]

Kereshmeh Afsari¹, Charles Eastman¹, Daniel Castro-Lacouture¹•Institutions (1)

Georgia Institute of Technology¹

01 May 2017-Automation in Construction

TL;DR: The analysis of results indicates that ifcJSON4 schema developed in this paper is a valid JSON schema that can guide the creation of valid ifc JSON documents to be used for web-based data transfer and to improve interoperability of Cloud-based BIM applications.

...read moreread less

77 citations

Journal Article•DOI•

A Methodology for Spark Parameter Tuning

[...]

Anastasios Gounaris¹, Jordi Torres²•Institutions (2)

Aristotle University of Thessaloniki¹, Polytechnic University of Catalonia²

19 May 2017-Big Data Research

TL;DR: An alternative systematic methodology for parameter tuning is proposed, which can be easily applied onto any computing infrastructure and is shown to yield comparable if not better results than the initial one when applied to MN3; observed speedups in the validating test case studies start from 20%.

...read moreread less

56 citations

Journal Article•DOI•

An IFC schema extension and binary serialization format to efficiently integrate point cloud data into building models

[...]

Thomas Krijnen¹, Jakob Beetz¹•Institutions (1)

Eindhoven University of Technology¹

01 Aug 2017-Advanced Engineering Informatics

TL;DR: The authors advocate the use of binary storage for sizeable point cloud scans, but also show how especially the grid discretization can result into usable points cloud segments embedded into text-based IFC models.

...read moreread less

36 citations

Journal Article•DOI•

Don't hold my data hostage: a case for client protocol redesign

[...]

Mark Raasveldt¹, Hannes Mühleisen¹•Institutions (1)

Centrum Wiskunde & Informatica¹

01 Jun 2017

TL;DR: This paper explores and analyse the result set serialization design space, presents experimental results from a large chunk of the database market, and proposes a columnar serialization method that improves transmission performance by an order of magnitude.

...read moreread less

Abstract: Transferring a large amount of data from a database to a client program is a surprisingly expensive operation. The time this requires can easily dominate the query execution time for large result sets. This represents a significant hurdle for external data analysis, for example when using statistical software. In this paper, we explore and analyse the result set serialization design space. We present experimental results from a large chunk of the database market and show the inefficiencies of current approaches. We then propose a columnar serialization method that improves transmission performance by an order of magnitude.

...read moreread less

28 citations

Journal Article•DOI•

Performing computation offloading on multiple platforms

[...]

Paulo A. L. Rego, Philipp B. Costa, Emanuel F. Coutinho¹, Lincoln S. Rocha, Fernando Trinta, José Neuman de Souza - Show less +2 more•Institutions (1)

Federal University of Ceará¹

01 Jun 2017-Computer Communications

TL;DR: This paper presents MpOS (Multiplatform Offloading System), a framework that supports a method-based offloading technique for applications of different mobile platforms (Android and Windows Phone) and shows that the type of serialization used by the framework directly impacts on the offloading performance.

...read moreread less

25 citations

Journal Article•DOI•

FAD.js: fast JSON data access using JIT-based speculative optimizations

[...]

Daniele Bonetta¹, Matthias Brantner¹•Institutions (1)

Oracle Corporation¹

01 Aug 2017

TL;DR: Experiments show that applications using Fad.js achieve speedups up to 2.7x for encoding and 9.9x for decoding JSON data when compared to state-of-the art JSON processing libraries.

...read moreread less

Abstract: JSON is one of the most popular data encoding formats, with wide adoption in Databases and BigData frameworks as well as native support in popular programming languages such as JavaScript/Node.js, Python, and R.Nevertheless, JSON data processing can easily become a performance bottleneck in data-intensive applications because of parse and serialization overhead. In this paper, we introduce Fad.js, a runtime system for efficient processing of JSON objects in data-intensive applications. Fad.js is based on (1) speculative just-in-time (JIT) compilation and (2) selective access to data. Experiments show that applications using Fad.js achieve speedups up to 2.7x for encoding and 9.9x for decoding JSON data when compared to state-of-the art JSON processing libraries.

...read moreread less

22 citations

Journal Article•DOI•

Staggered transmissions: Twitter and the return of serialized literature

[...]

Tore Rye Andersen¹•Institutions (1)

Aarhus University¹

24 Jan 2017-Convergence

TL;DR: The article discusses how the micro-serialization of Twitter fiction both differs from and draws on the pre-digital tradition of serial fiction, and focuses on two interrelated aspects of serialization, temporality and interaction.

...read moreread less

Abstract: The final part of the recent anthology Serialization in Popular Culture (2014) is called ‘Digital serialization’ and is devoted to ‘the influence of digital technologies on serial form’. The chapte...

...read moreread less

15 citations

Proceedings Article•DOI•

Smart grid serialization comparison: Comparision of serialization for distributed control in the context of the Internet of Things

[...]

Bo Søborg Petersen¹, Henrik W. Bindner¹, Shi You¹, Bjarne Poulsen¹•Institutions (1)

Technical University of Denmark¹

01 Jul 2017

TL;DR: The paper shows that there are better alternatives than XML & JAXB and gives guidance in choosing the most appropriate serialization format and library depending on the context, especially in the context of the Internet of Things.

...read moreread less

Abstract: Communication between DERs and System Operators is required to provide Demand Response and solve some of the problems caused by the intermittency of much Renewable Energy. An important part of efficient communication is serialization, which is important to ensure a high probability of delivery within a given timeframe, especially in the context of the Internet of Things, using low-bandwidth data connections and constrained devices. The paper shows that there are better alternatives than XML & JAXB and gives guidance in choosing the most appropriate serialization format and library depending on the context.

...read moreread less

14 citations

Journal Article•DOI•

A Semantic-aware Framework for Service Definition and Discovery in the Internet of Things Using CoAP

[...]

Farzad Khodadadi¹, Richard O. Sinnott¹•Institutions (1)

University of Melbourne¹

01 Jan 2017-Procedia Computer Science

TL;DR: In this paper, a semantic annotation of services is supported through ontologies defined for API definition languages such as Swagger and RAML, which enables intelligent discovery of services in IoT environments.

...read moreread less

Proceedings Article•DOI•

Understanding and improving disk-based intermediate data caching in Spark

[...]

Kaihui Zhang¹, Yusuke Tanimura², Hidemoto Nakada², Hirotaka Ogawa²•Institutions (2)

University of Tsukuba¹, National Institute of Advanced Industrial Science and Technology²

01 Dec 2017

TL;DR: This study aims to clarify efficient/inefficient use of disks in intermediate data caching in Spark and also to improve the usability of disks for end users.

...read moreread less

Abstract: Apache Spark is a parallel data processing framework that executes fast for iterative calculations and interactive processing, by caching intermediate data in memory with a lineage-based data recovery from faults. The Spark system can also manage data sets larger than memory capacity by placing some cache or all of them on disks on processing nodes. However, the disadvantage is potential performance degradation due to disk I/O and/or serialization. This study aims to clarify efficient/inefficient use of disks in intermediate data caching in Spark and also to improve the usability of disks for end users. In order to achieve the purpose, influence of disk use in data caching was firstly investigated in various aspects, such as caching options, data abstractions and storage devices. The results indicate that serialization cost is dominant rather than disk I/O in most cases. Secondly, a method of combined use of memory and disk was further evaluated under a high memory pressure. Then the method was improved to avoid an excessive re-caching problem, which achieved at most 20–30% reduction of total execution time under a high memory pressure and did not degrade the performance under a low memory pressure, in our experiment with 4 machine learning benchmarks. Finally, this paper summarizes important factors and potential improvements for efficiently using disks in data caching in Spark.

...read moreread less

Patent•

Centralized databases storing digital fingerprints of objects for collaborative authentication

[...]

Justin Lynn Withrow

27 Jun 2017

TL;DR: In this paper, the authors proposed a collaborative authentication platform and processes, combining digital fingerprint databases with associated incentive databases, to provide unparalleled reliability and enriched metadata to supply chain tracking, detecting counterfeit objects, and other applications.

...read moreread less

Abstract: Databases storing digital fingerprints of physical objects enable enhanced security and collaborative authentication. Digital fingerprints enable reliable identification of an object without the need for attaching or associating physical tags, labels or other identifying materials with the object; and serialization for identification also is obviated. By combining digital fingerprinting and data collaboration in one process, parties to the data collaboration can gain a level of certainty that data attributed to an object by different parties or at different times is attributed to only that object and not erroneously attributed to an incorrect or counterfeit object. Collaborative authentication platforms and processes, combining digital fingerprint databases with associated incentive databases, contribute enhanced information to the authentication databases, and provide unparalleled reliability and enriched metadata to supply chain tracking, detecting counterfeit objects, and other applications.

...read moreread less

Book Chapter•DOI•

Hollywood Remaking as Second-Order Serialization

[...]

Frank Kelleter, Kathleen Loock

05 May 2017

Proceedings Article•DOI•

Revisiting phased transactional memory

[...]

Joao P. L. de Carvalho¹, Guido Araujo¹, Alexandro Baldassin²•Institutions (2)

State University of Campinas¹, Sao Paulo State University²

14 Jun 2017

TL;DR: It is claimed that PhTM is a competitive alternative to HyTM and proposed PhTM*, the first implementation of PhTM on modern HTM-ready processors, which relies in avoiding unnecessary transitions to software mode by taking into account the categories of hardware aborts and adding a new serialization mode.

...read moreread less

Abstract: In recent years, Hybrid TM (HyTM) has been proposed as a transactional memory approach that leverages on the advantages of both hardware (HTM) and software (STM) execution modes. HyTM assumes that concurrent transactions can have very different phases and thus should run under different execution modes. Although HyTM has shown to improve performance, the overall solution can be complicated to manage, both in terms of correctness and performance. On the other hand, Phased Transactional Memory (PhTM) considers that concurrent transactions have similar phases, and thus all transactions could run under the same mode. As a result, PhTM does not require coordination between transactions on distinct modes making its implementation simpler and more flexible. In this paper we claim that PhTM is a competitive alternative to HyTM and propose PhTM*, the first implementation of PhTM on modern HTM-ready processors. PhTM* novelty relies in avoiding unnecessary transitions to software mode by: (i) taking into account the categories of hardware aborts; (ii) adding a new serialization mode. Experimental results with Haswell's TSX reveal that, for the STAMP benchmark suite, PhTM* performs on average 11% better than PhTM, a previous phase-based TM, and 15% better than HyTM-NOrec, a state-of-the-art HyTM. In addition, PhTM* showed to be even more effective running on a Power8 machine by performing over 25% and 36% better than PhTM and HyTM-NOrec, respectively.

...read moreread less

Book Chapter•DOI•

Juma: An Editor that Uses a Block Metaphor to Facilitate the Creation and Editing of R2RML Mappings

[...]

Ademar Crotti Junior¹, Christophe Debruyne¹, Declan O'Sullivan¹•Institutions (1)

Trinity College, Dublin¹

28 May 2017

TL;DR: A visual representation based on a block metaphor for creating and editing such mappings that is fully compliant with the R2RML specification is described and preliminary findings from users using the tool indicate that the visual representation was helpful in the creation of R1RML mappings with good usability results.

...read moreread less

Abstract: R2RML is the W3C standard mapping language used to define customized mappings from relational databases into RDF. One issue that hampers its adoption is the effort needed in the creation of such mappings, as they are stored as RDF documents. To address this problem, several tools that represent mappings as graphs have been proposed in the literature. In this paper, we describe a visual representation based on a block metaphor for creating and editing such mappings that is fully compliant with the R2RML specification. Preliminary findings from users using the tool indicate that the visual representation was helpful in the creation of R2RML mappings with good usability results. In future work, we intend to conduct more experiments focusing on different types of users and to abstract the visual representation from the R2RML mapping language so that it supports the serialization of other uplift mapping languages.

...read moreread less

Proceedings Article•DOI•

Fusing effectful comprehensions

[...]

Olli Saarikivi¹, Margus Veanes², Todd Mytkowicz², Madan Musuvathi²•Institutions (2)

Helsinki Institute for Information Technology¹, Microsoft²

14 Jun 2017

TL;DR: This paper builds on the underlying theory of symbolic transducers to fuse pipelines of effectful comprehensions into a single representation, from which efficient code can be generated and can significantly reduce the complexity of the fused pipelines.

...read moreread less

Abstract: List comprehensions provide a powerful abstraction mechanism for expressing computations over ordered collections of data declaratively without having to use explicit iteration constructs. This paper puts forth effectful comprehensions as an elegant way to describe list comprehensions that incorporate loop-carried state. This is motivated by operations such as compression/decompression and serialization/deserialization that are common in log/data processing pipelines and require loop-carried state when processing an input stream of data. We build on the underlying theory of symbolic transducers to fuse pipelines of effectful comprehensions into a single representation, from which efficient code can be generated. Using background theory reasoning with an SMT solver, our fusion and subsequent reachability based branch elimination algorithms can significantly reduce the complexity of the fused pipelines. Our implementation shows significant speedups over reasonable hand-written code (3.4×, on average) and traditionally fused version of the pipeline (2.6×, on average) for a variety of examples, including scenarios for extracting fields with regular expressions, processing XML with XPath, and running queries over encoded data.

...read moreread less

Journal Article•DOI•

jsCoq: Towards Hybrid Theorem Proving Interfaces

[...]

Emilio Jesús Gallego Arias¹, Benoît Pin¹, Pierre Jouvelot¹•Institutions (1)

PSL Research University¹

25 Jan 2017-arXiv: Programming Languages

TL;DR: jsCoq as discussed by the authors is a new platform and user environment for the Coq interactive proof assistant, which targets the HTML5-ECMAScript 2015 specification, and it is typically run inside a standards-compliant browser, without the need of external servers or services.

...read moreread less

Abstract: We describe jsCcoq, a new platform and user environment for the Coq interactive proof assistant. The jsCoq system targets the HTML5-ECMAScript 2015 specification, and it is typically run inside a standards-compliant browser, without the need of external servers or services. Targeting educational use, jsCoq allows the user to start interaction with proof scripts right away, thanks to its self-contained nature. Indeed, a full Coq environment is packed along the proof scripts, easing distribution and installation. Starting to use jsCoq is as easy as clicking on a link. The current release ships more than 10 popular Coq libraries, and supports popular books such as Software Foundations or Certified Programming with Dependent Types. The new target platform has opened up new interaction and display possibilities. It has also fostered the development of some new Coq-related technology. In particular, we have implemented a new serialization-based protocol for interaction with the proof assistant, as well as a new package format for library distribution.

...read moreread less

Patent•

CMOS image data training system and image data serialization-deserialization simulation detection method

[...]

Yu Da, Liu Jinguo, Kong Dezhu, Ma Qingjun, Zhu Han, Wang Wenhua, Ning Yonghui - Show less +3 more

08 Dec 2017

TL;DR: In this article, a CMOS image data training system and an image data serialization-deserialization simulation detection method, relates to the CMOS data serialisation-deerialization simulations detection method and solves the problem that the data serializatio-de-erialization is difficult to carry out due to the fact that an uncertain phase relationship exists among transmission channels employed by a CIMOS image sensor each time when electrification is carried out.

...read moreread less

Abstract: The invention discloses a CMOS image data training system and an image data serialization-deserialization simulation detection method, relates to the CMOS image data serialization-deserialization simulation detection method and solves the problem that the data serialization-deserialization is difficult to carry out due to the fact that an uncertain phase relationship exists among transmission channels employed by a CMOS image sensor each time when electrification is carried out. The CMOS image data training system comprises the CMOS image sensor and a data processor. The interior of the data processor is composed of iodelay, an iserdes, data asynchronous FIFO, control asynchronous FIFO, a gearbox, a ram based shifter and a controller. The controller is taken as the core of the CMOS image data training system and is used for controlling various components to work coordinatively. Under the control of the controller, the CMOS image sensor outputs serial graph data and finally the serial graph data is converted into parallel image data with bit width of p through the iodelay, the iserdes, the data asynchronous FIFO, the control asynchronous FIFO, the gearbox 1:2 and the ram based shifter. According to the serialization-deserialization detection method based on simulation provided by the invention, different excitations are generated for different data training phases, thereby realizing different training strategies.

...read moreread less

Patent•

Data serialization method and device

[...]

Zheng Junyuan, Sun Hongliang

31 May 2017

TL;DR: In this article, the authors proposed a data serialization method and device, which comprises the steps that original data is acquired; for each field name in a preset data template, field content corresponding the field name is acquired from the original data, wherein an association relationship among the field names, data types and field serial numbers is pre-stored in the preset data templates, and the data types are data types of the field content.

...read moreread less

Abstract: The invention provides a data serialization method and device. The method comprises the steps that original data is acquired; for each field name in a preset data template, field content corresponding the field name is acquired from the original data, wherein an association relationship among the field names, data types and field serial numbers is pre-stored in the preset data template, and the data types are data types of the field content; the field content corresponding to the field names and the field serial numbers associated to the field names are serialized according to the serialization modes corresponding to the data types associated to the field names, and byte sequences corresponding to fields to which the field names belong are obtained; according to the from small to big order of the field serial numbers, the obtained byte sequences corresponding to the fields to which the field names belong are assembled into one byte sequence to serve as serialized data corresponding to the original data. According to the data serialization method and device, the data occupation space can be effectively decreased, the data transmission and disk read-write efficiency can be improved.

...read moreread less

Proceedings Article•DOI•

Taming the Length Field in Binary Data: Calc-Regular Languages

[...]

Stefan Lucks¹, Norina Marie Grosch¹, Joshua Konig•Institutions (1)

Bauhaus University, Weimar¹

25 May 2017

TL;DR: The class of "calc-regular languages" is proposed, a minimalistic extension of regular languages with the additional property of handling length-fields, which disproves the conjecture that parsing those languages is difficult and not efficient.

...read moreread less

Abstract: When binary data are sent over a byte stream, the binary format sender and receiver are using is a "data serialization language", either explicitely specified, or implied by the implementations. Security is at risk when sender and receiver disagree on details of this language. If, e.g., the receiver fails to reject invalid messages, an adversary may assemble such invalid messages to compromise the receiver's security. Many data serialization languages are length-prefix languages. When sending/storing some F of flexible size, F is encoded at the binary level as a pair (|F|, F), with |F| representing the length of F (typically in bytes). This paper's main contributions and results are as follows. (1) Length-prefix langages are not context-free. This might seem to justify the conjecture that parsing those languages is difficult and not efficient. (2) The class of "calc-regular languages" is proposed, a minimalistic extension of regular languages with the additional property of handling length-fields. Calc-regular languages can be specified via "calc-regular expressions", a natural extension of regular expressions. (3) Calc-regular languages are almost as easy to parse as regular languages, using finite-state machines with additional accumulators. This disproves the conjecture from (1).

...read moreread less

Media Types for Hypertext Sensor Markup

[...]

Michael Koster

13 Mar 2017

TL;DR: A simple, reusable data model is described with a representation format, using a well known set of keywords to expose hypermedia controls, which inform clients how to perform state transfer operations on resources.

...read moreread less

Abstract: The scale and scope of the worldwide web has been in part driven by the availability of HTML as a common serialization, data model, and interaction model for structured resources on the web. By contrast, the general use of appropriate hypermedia techniques for machine interfaces has been limited by the lack of a common format for serialization and exchange of structured machine resources and sensor/actuator data which includes or embeds standardized hypermedia controls. The IRTF Thing to Thing Research Group [T2TRG] has a charter to investigate the use of REST design style [REST]for machine interactions. The W3C Web of Things Interest Group [W3C-WoT] are investigating abstract hypermedia controls and interaction models for machines. Machine optimized content formats exist for web links [RFC5988] [RFC6690] and for data items [I-D.ietf-core-senml]. Structured data which contains both links and items is known as the collection pattern. This draft describes media types for representation of machine resources structured as collections. A simple, reusable data model is described with a representation format, using a well known set of keywords to expose hypermedia controls, which inform clients how to perform state transfer operations on resources. The underlying assumptions regarding transfer layer processing are specified in this document. The HSML media type described in this document is compatible with SenML and CoRE Link- format by reusing the keyword identifiers and element structures from these content formats. Representations of HSML document content may be obtained in CoRE Link-Format and SenML content formats.

...read moreread less

Journal Article•DOI•

A new experiment-independent mechanism to persistify and serve the detector geometry of ATLAS

[...]

Riccardo-Maria Bianchi¹, J. Boudreau¹, Ilija Vukotic²•Institutions (2)

University of Pittsburgh¹, University of Chicago²

20 Nov 2017

TL;DR: A new mechanism to persistify is proposed, taking an object which lives in memory only and storing it on disk as a persistent object and serve the geometry of HEP experiments, and a new generation of applications could be developed, which can use the actual detector geometry while being platform-independent and experiment-independent.

...read moreread less

Abstract: The complex geometry of the whole detector of the ATLAS experiment at LHC is currently stored only in custom online databases, from which it is built on-the-fly on request. Accessing the online geometry guarantees accessing the latest version of the detector description, but requires the setup of the full ATLAS software framework "Athena", which provides the online services and the tools to retrieve the data from the database. This operation is cumbersome and slows down the applications that need to access the geometry. Moreover, all applications that need to access the detector geometry need to be built and run on the same platform as the ATLAS framework, preventing the usage of the actual detector geometry in stand-alone applications. Here we propose a new mechanism to persistify (in software development in general, and in HEP computing in particular, persistifying means taking an object which lives in memory only - for example because it was built on-the-fly while processing the experimental data, - serializing it and storing it on disk as a persistent object) and serve the geometry of HEP experiments. The new mechanism is composed by a new file format and the modules to make use of it. The new file format allows to store the whole detector description locally in a file, and it is especially optimized to describe large complex detectors with the minimum file size, making use of shared instances and storing compressed representations of geometry transformations. Then, the detector description can be read back in, to fully restore the in-memory geometry tree. Moreover, a dedicated REST API is being designed and developed to serve the geometry in standard exchange formats like JSON, to let users and applications download specific partial geometry information. With this new geometry persistification a new generation of applications could be developed, which can use the actual detector geometry while being platform-independent and experiment-independent.

...read moreread less

Proceedings Article•DOI•

Smart grid communication comparison: Distributed control middleware and serialization comparison for the Internet of Things

[...]

Bo Søborg Petersen¹, Henrik W. Bindner¹, Bjarne Poulsen¹, Shi You¹•Institutions (1)

Technical University of Denmark¹

01 Sep 2017

TL;DR: There are better alternatives to using Web Services and XMPP as middleware and that there is better alternatives than using XML for serialization and ZeroMQ, YAMI4, and ICE are the middleware that performs the best, and ProtoBuf (ProtoStuff), and ProtoStuff are the serialization that performsThe best.

...read moreread less

Abstract: To solve the problems caused by intermittent renewable energy production, communication between Distributed Energy Resources (DERs) and system operators is necessary. The communication middleware and serialization used for communication are essential to ensure delivery of the messages within the required timeframe, to provide the necessary ancillary services to the power grid. This paper shows that there are better alternatives to using Web Services and XMPP as middleware and that there are better alternatives than using XML for serialization. The paper also gives guidance at choosing the best communication middleware and serialization format/library, aided by the authors' earlier work, which investigates the performance and characteristics of communication middleware and serialization independently. Given the performance criteria of the paper, ZeroMQ, YAMI4, and ICE are the middleware that performs the best, and ProtoBuf (ProtoStuff), and ProtoStuff are the serialization that performs the best.

...read moreread less

Patent•

Serialization method and apparatus, deserialization method and apparatus, serialization and deserialization system, and electronic device

[...]

Dong Shiming

19 Apr 2017

TL;DR: In this article, a data object serialization method for data stream deserialization is described, which comprises the steps of obtaining a metadata description file of a to-be-serialized data object, obtaining corresponding attribute operation classes according to attribute operation class names provided by attribute description in the metadata description, and reading attribute values in the to be-serialised data object according to reading methods provided by the corresponding attributed operation classes.

...read moreread less

Abstract: The invention discloses a data object serialization method and apparatus, a data stream deserialization method and apparatus, an electronic device, and a serialization and deserialization system. The data object serialization method comprises the steps of obtaining a metadata description file of a to-be-serialized data object; obtaining corresponding attribute operation classes according to attribute operation class names provided by attribute description in the metadata description file; reading attribute values in the to-be-serialized data object according to reading methods provided by the corresponding attribute operation classes; and writing the read attribute values in a result data stream of the data object according to a sequence of attribute description serial numbers. By adopting the method, the problem in compatibility between systems after change of object attributes is solved; the metadata of the object and the values of the object are split; the storage space and the transmission flow are saved during storage and transmission; and the processing overhead of the systems is reduced during deserialization.

...read moreread less

Proceedings Article•DOI•

An experimental comparison of complex object implementations for big data systems

[...]

Sourav Sikdar¹, Kia Teymourian¹, Chris Jermaine¹•Institutions (1)

Rice University¹

24 Sep 2017

TL;DR: The question is asked is: How significant is the performance hit associated with choosing a particular physical implementation?

...read moreread less

Abstract: Many cloud-based data management and analytics systems support complex objects. Dataflow platforms such as Spark and Flink allow programmers to manipulate sets consisting of objects from a host programming language (often Java). Document databases such as MongoDB make use of hierarchical interchange formats---most popularly JSON---which embody a data model where individual records can themselves contain sets of records. Systems such as Dremel and AsterixDB allow complex nesting of data structures. Clearly, no system designer would expect a system that stores JSON objects as text to perform at the same level as a system based upon a custom-built physical data model. The question we ask is: How significant is the performance hit associated with choosing a particular physical implementation? Is the choice going to result in a negligible performance cost, or one that is debilitating? Unfortunately, there does not exist a scientific study of the effect of physical complex model implementation on system performance in the literature. Hence it is difficult for a system designer to fully understand performance implications of such choices. This paper is an attempt to remedy that.

...read moreread less

Journal Article•DOI•

Zeroing in on Performance 2.0: From Serialization to Performative Enactments

[...]

Antti Lindfors

01 Oct 2017-Folklore-electronic Journal of Folklore

TL;DR: In this paper, the notions of performativity and performance in digital environments from the combined perspective of linguistic anthropology and folkloristics are explored, and an intermediary heuristic term of "performative enactments" is introduced.

...read moreread less

Abstract: This article explores the notions of performativity and performance in digital environments from the combined perspective of linguistic anthropology and folkloristics. In order to bring these diverging conceptual, methodological, and disciplinary traditions into mutual contact, an intermediary heuristic term of “performative enactments” is introduced. Performative enactments are elaborated as events of communicative sign behavior that foreground and make use of the principle of performativity, although not performances proper in the sense of manifesting a specific “mode of communication” (Bauman 1984). Two different cases of digital communication are analyzed, the first manifesting an instance of everyday SMS messaging between two friends, the second concerning the so-called Per-Looks media event that took place in Finland in October 2012. Both cases are approached as materially durable performative enactments with methodological attention laid on poetic patterning understood as a textually diffuse form of performativity.

...read moreread less

D2Refine: A Platform for Clinical Research Study Data Element Harmonization and Standardization.

[...]

Deepak K. Sharma¹, Harold R. Solbrig¹, Eric Prud'hommeaux², Kathleen Lee³, Jyotishman Pathak³, Guoqian Jiang¹ - Show less +2 more•Institutions (3)

Mayo Clinic¹, Massachusetts Institute of Technology², Cornell University³

26 Jul 2017

TL;DR: It is demonstrated that D2Refine is a useful and promising platform that would help address the emergent needs for clinical research study data element harmonization and standardization.

...read moreread less

Abstract: In this paper, we present a platform known as D2Refine for facilitating clinical research study data element harmonization and standardization D2Refine is developed on top of OpenRefine (formerly Google Refine) and leverages simple interface and extensible architecture of OpenRefine D2Refine empowers the tabular representation of clinical research study data element definitions by allowing it to be easily organized and standardized using reconciliation services D2Refine builds on valuable built-in data transformation features of OpenRefine to bring source data sets to a finer state quickly We implemented the reconciliation services and search capabilities based on the standard Common Terminology Services 2 (CTS2) and the serialization of clinical research study data element definitions into standard representation using clinical information modeling technology for semantic interoperability We demonstrate that D2Refine is a useful and promising platform that would help address the emergent needs for clinical research study data element harmonization and standardization

...read moreread less

Patent•

Serialization and deserialization methods and apparatuses, computer device and storage medium

[...]

Xiao Hui

12 Dec 2017

TL;DR: In this paper, a serialization and deserialization method for data serialization is described, in which the serialization instruction comprises object type information, according to which the object type is extracted from the data, and according to the data type, a serialisation object corresponding to the type information is created.

...read moreread less

Abstract: The invention discloses serialization and deserialization methods and apparatuses, a computer device and a storage medium. The data serialization method in one embodiment comprises the steps of receiving a serialization instruction, wherein the serialization instruction comprises object type information; according to the object type information, calling a serialization and deserialization interface function, and creating a serialization object corresponding to the object type information; adding a serialization object instance of the serialization object; and performing data serialization by the serialization object instance, and outputting a serialization file. According to the scheme, the serialization and deserialization processing processes are simply and conveniently realized; the code quantity is reduced; and the error probability is reduced.

...read moreread less

Proceedings Article•DOI•

Supporting the Specification and Serialization of Planned Architectures in Architecture-Driven Modernization Context

[...]

André de S. Landi¹, Fernando Chagas¹, Bruno Marinho Santos¹, Renato S. Costa¹, Rafael Serapilha Durelli, Ricardo Terra, Valter Vieira de Camargo¹ - Show less +3 more•Institutions (1)

Federal University of São Carlos¹

01 Jul 2017

TL;DR: DCL-KDM is an efficient alternative to to generate instances of the Structure metamodel as a PA and to serialize it, and a strategy for the serialization of PAs as a Structure metAModel instance without modifying it is proposed.

...read moreread less

Abstract: Architecture-Driven Modernization (ADM) intends to standardize software reengineering by relying on a family of standard metamodels. Knowledge-Discovery Metamodel (KDM) is the main ADM ISO metamodel aiming at representing all aspects of existing legacy systems. One of the internal KDM metamodels is called Structure, responsible for representing architectural abstractions (Layers, Components and Subsystems) and their relationships. Planned Architecture (PA) is an artifact that involves not only the architectural abstractions of the system but also the access rules that must exist between them and be maintained over time. Although PAs are frequently used in Architecture-Conformance Checking processes, up to this moment, there is no contribution showing how to specify and serialize PAs in ADM-based modernization projects. Therefore, in this paper we present an approach that i) involves a DSL (Domain-Specific Language) for the specification of PAs using the Structure metamodel concepts, and ii) proposes a strategy for the serialization of PAs as a Structure metamodel instance without modifying it. We have conducted a comparison between DCL-KDM and other techniques for specifying and generating PAs. The results showed that DCL-KDM is an efficient alternative to to generate instances of the Structure metamodel as a PA and to serialize it.

...read moreread less