
Showing papers on "Data access" published in 2015


Journal ArticleDOI
TL;DR: The Materials Application Programming Interface is described: a simple, flexible, and efficient interface to programmatically query and interact with the Materials Project database, based on the REpresentational State Transfer (REST) pattern for the web.

333 citations
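The REST pattern described here maps naturally onto a few lines of HTTP client code. The sketch below is a hypothetical example in that style: the base URL, endpoint path, and X-API-KEY header follow the Materials Project's legacy API conventions but are assumptions here, not details taken from the paper.

```python
# Hypothetical sketch of querying a REST materials database in the
# style described above. The endpoint path and header name follow the
# legacy Materials Project API convention but are assumptions here.
import requests

API_KEY = "your-api-key"  # issued per user (assumption)
BASE = "https://www.materialsproject.org/rest/v2"

def get_material_property(material_id: str, prop: str) -> dict:
    """GET a single property for one material via the REST interface."""
    url = f"{BASE}/materials/{material_id}/vasp/{prop}"
    resp = requests.get(url, headers={"X-API-KEY": API_KEY}, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(get_material_property("mp-1234", "energy"))
```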



Journal ArticleDOI
TL;DR: This paper proposes a practical solution for privacy-preserving medical record sharing in cloud computing, where statistical analysis and cryptography are innovatively combined to provide multiple paradigms of balance between medical data utilization and privacy protection.

246 citations


Journal ArticleDOI
TL;DR: The concept of process-structure-property (PSP) linkages is introduced, and it is illustrated how the determination of PSPs is one of the main objectives of materials data science.
Abstract: The field of materials science and engineering is on the cusp of a digital data revolution. After reviewing the nature of data science and Big Data, we discuss the features of materials data that distinguish them from data in other fields. We introduce the concept of process-structure-property (PSP) linkages and illustrate how the determination of PSPs is one of the main objectives of materials data science. Then we review a selection of materials databases, as well as important aspects of materials data management, such as storage hardware, archiving strategies, and data access strategies. We introduce the emerging field of materials data analytics, which focuses on data-driven approaches to extract and curate materials knowledge from available data sets. The critical need for materials e-collaboration platforms is highlighted, and we conclude the article with a number of suggestions regarding the near-term future of the materials data science field.

199 citations


Journal ArticleDOI
TL;DR: This work is the first to develop a secure k-NN classifier over encrypted data under the semi-honest model, and it empirically analyzes the efficiency of the proposed protocol using a real-world dataset under different parameter settings.
Abstract: Data mining has wide applications in many areas such as banking, medicine, scientific research, and government. Classification is one of the commonly used tasks in data mining applications. For the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form, existing privacy-preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud. The proposed protocol protects the confidentiality of data, protects the privacy of the user's input query, and hides the data access patterns. To the best of our knowledge, our work is the first to develop a secure k-NN classifier over encrypted data under the semi-honest model. Also, we empirically analyze the efficiency of our proposed protocol using a real-world dataset under different parameter settings.

187 citations
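The protocol itself operates on ciphertexts, but the logic it must reproduce is ordinary k-NN classification. A plaintext reference sketch, with invented toy data, clarifies what the cloud must compute obliviously: pairwise distances, top-k selection, and a majority vote.

```python
# Plaintext reference for the computation the secure protocol performs
# over encrypted records. In the paper these steps run on ciphertexts
# at the cloud; this sketch only clarifies the underlying k-NN logic.
from collections import Counter

def knn_classify(records, labels, query, k=3):
    """records: numeric feature vectors; labels: class per record."""
    dists = [(sum((a - b) ** 2 for a, b in zip(rec, query)), lbl)
             for rec, lbl in zip(records, labels)]
    top_k = sorted(dists)[:k]          # k nearest neighbours
    votes = Counter(lbl for _, lbl in top_k)
    return votes.most_common(1)[0][0]  # majority class

# Toy data (invented): two classes in 2-D feature space.
print(knn_classify([[0, 0], [1, 1], [5, 5]], ["a", "a", "b"], [0.5, 0.5]))
```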


Proceedings ArticleDOI
04 Nov 2015
TL;DR: An overview of Fog computing is provided, relating it to general concepts in Cloud-based systems, followed by a general architecture to support virtual machine migration in this emerging paradigm - discussing both the benefits and challenges associated with such migration.
Abstract: Handoff mechanisms allow mobile users to move across multiple wireless access points while maintaining their voice and/or data sessions. A traditional handoff process is concerned with smoothly transferring a mobile device session from its current access point (or cell) to a target access point (or cell). These handoff characteristics are sufficient for voice calls and background data transfers; however, many mobile applications nowadays rely heavily on data and processing capabilities from the cloud. Such applications, especially those that require greater interactivity, often demand not only a smooth session transfer, but also the maintenance of quality-of-service requirements that impact a user's experience. In this context, the Fog computing paradigm arises to overcome delays encountered when applications need low latency to access data or offload processing to the cloud. Fog computing introduces a distributed cloud layer, composed of cloudlets (i.e., "small clouds" with lower computational capacity), between the user and the cloud. Cloudlets allow low-latency access to data or processing capabilities, which can be accomplished by offering a VM to the user. An overview of Fog computing is first provided, relating it to general concepts in Cloud-based systems, followed by a general architecture to support virtual machine migration in this emerging paradigm -- discussing both the benefits and challenges associated with such migration.

154 citations


Journal ArticleDOI
TL;DR: This paper highlights data-related security challenges in cloud-based environments and provides a roadmap for overcoming them.

150 citations


Journal ArticleDOI
01 Jan 2015
TL;DR: This paper practically demonstrates how Internet of Things (IoT) integration with data access networks, Geographic Information Systems (GIS), combinatorial optimization, and electronic engineering can contribute to improving cities' management systems.
Abstract: Cities around the world are on the run to become smarter. Some of them have seen an opportunity in deploying dedicated municipal access networks to support all types of city management and maintenance services requiring a data connection. This paper practically demonstrates how Internet of Things (IoT) integration with data access networks, Geographic Information Systems (GIS), combinatorial optimization, and electronic engineering can contribute to improving cities' management systems. We present a waste collection solution based on providing intelligence to trashcans, using an IoT prototype embedded with sensors, which can read, collect, and transmit trash volume data over the Internet. This data, put into a spatio-temporal context and processed by graph-theory optimization algorithms, can be used to dynamically and efficiently manage waste collection strategies. Experiments are carried out to investigate the benefits of such a system in comparison to traditional sectorial waste collection approaches, also including economic factors. A realistic scenario is set up using Open Data from the city of Copenhagen, highlighting the opportunities created by this type of initiative for third parties to contribute and develop Smart city solutions.

124 citations
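The dynamic strategy lends itself to a small sketch with invented bins: visit only trashcans whose reported fill level exceeds a threshold, ordered here by a simple nearest-neighbour heuristic (the paper uses proper graph-optimization algorithms on Copenhagen Open Data).

```python
import math

# Hypothetical sketch of the dynamic collection strategy: visit only
# trashcans reporting a fill level above a threshold, routed by a
# nearest-neighbour heuristic. Coordinates and fill levels are invented.
bins = {  # id -> ((x, y), fill fraction reported by the IoT sensor)
    "b1": ((0.0, 0.0), 0.9),
    "b2": ((1.0, 2.0), 0.2),
    "b3": ((2.0, 1.0), 0.8),
    "b4": ((3.0, 3.0), 0.7),
}

def route(depot=(0.0, 0.0), threshold=0.6):
    todo = {k: p for k, (p, fill) in bins.items() if fill >= threshold}
    pos, order = depot, []
    while todo:                       # greedily pick the closest bin
        nxt = min(todo, key=lambda k: math.dist(pos, todo[k]))
        order.append(nxt)
        pos = todo.pop(nxt)
    return order

print(route())  # e.g. ['b1', 'b3', 'b4']
```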


Journal ArticleDOI
TL;DR: Optique overcomes problems in current ontology-based data access systems pertaining to installation overhead, usability, scalability, and scope by integrating a user-oriented query interface, semi-automated managing methods, new query rewriting techniques, and temporal and streaming data processing in one platform.

105 citations


Patent
07 Jan 2015
TL;DR: In this paper, a cellular communication device has one or more access modes which allow reading and writing of data, for example to change its settings, such as passwords or even the entire operating system, and which also permit access to personal information such as the user's telephone book.
Abstract: A cellular communication device has one or more access modes which allow reading and writing of data, for example to change its settings, such as passwords or even the entire operating system, and which also permit access to personal information such as the user's telephone book. To prevent cloning and similar illegal access activity, the device is configured to restrict access to such data access modes using a device-unique security setting. The setting may be a password, preferably a one-time password, or it may be a unique, dynamic, or one-time configuration of the codes for the read and write instructions of the data mode. There is also disclosed a server, which manages the security settings such that the data mode operates during an active connection between the device and the server, and a secure communication protocol for communicating between the server and the cellular device.

105 citations


Proceedings ArticleDOI
13 Apr 2015
TL;DR: The Oracle Database In-Memory Option allows Oracle to function as the industry-first dual-format in-memory database, maintaining data simultaneously in both formats with strict transactional consistency between them.
Abstract: The Oracle Database In-Memory Option allows Oracle to function as the industry-first dual-format in-memory database. Row formats are ideal for OLTP workloads which typically use indexes to limit their data access to a small set of rows, while column formats are better suited for Analytic operations which typically examine a small number of columns from a large number of rows. Since no single data format is ideal for all types of workloads, our approach was to allow data to be simultaneously maintained in both formats with strict transactional consistency between them.
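A toy sketch can make the row-versus-column trade-off concrete. The snippet below keeps one invented table in both layouts; keeping the two transactionally consistent under updates is the hard part the paper addresses and is not shown.

```python
# Toy illustration of the dual-format idea: the same table in a row
# layout (fast point lookups for OLTP) and a column layout (fast
# single-column scans for analytics). Data is invented.
rows = [  # row format: one record per entry
    {"id": 1, "name": "alice", "amount": 10.0},
    {"id": 2, "name": "bob",   "amount": 25.5},
]
columns = {  # column format: one array per attribute
    "id": [1, 2], "name": ["alice", "bob"], "amount": [10.0, 25.5],
}

def oltp_lookup(key):   # touches one row, all columns
    return next(r for r in rows if r["id"] == key)

def analytic_sum():     # touches one column, all rows
    return sum(columns["amount"])

print(oltp_lookup(2), analytic_sum())
```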

Journal ArticleDOI
TL;DR: This paper proposes a formal approach for SPARQL-to-SQL translation that generates efficient SQL by combining optimization techniques from the logic programming and SQL optimization fields, provides a well-defined specification of the SPARQL semantics used in the translation, and supports R2RML mappings over general relational schemas.
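As a rough illustration of the unfolding idea (not the paper's actual algorithm), the sketch below ties a class and its properties to a relational table with an invented R2RML-style mapping and translates a single triple pattern into SQL; the real translation handles full SPARQL and applies the optimizations described.

```python
# Minimal sketch of SPARQL-to-SQL unfolding: an R2RML-style mapping
# ties a class and its properties to a relational table, and a single
# triple pattern is rewritten into SQL. Mapping and query are invented.
mapping = {
    "ex:Employee": {"table": "employee",
                    "ex:name": "name", "ex:salary": "salary"},
}

def translate(cls, prop):
    m = mapping[cls]
    return f"SELECT {m[prop]} FROM {m['table']}"

# SPARQL: SELECT ?n WHERE { ?e a ex:Employee ; ex:name ?n }
print(translate("ex:Employee", "ex:name"))  # SELECT name FROM employee
```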

Journal ArticleDOI
TL;DR: A shared authority based privacy-preserving authentication protocol (SAPA) is proposed to address the above privacy issue for cloud storage, and a universal composability model is established to prove that the SAPA theoretically has design correctness.
Abstract: Cloud computing is an emerging data-interactive paradigm in which users' data is stored remotely on an online cloud server. Cloud services provide great convenience for users to enjoy on-demand cloud applications without considering local infrastructure limitations. During data access, different users may be in a collaborative relationship, and thus data sharing becomes significant to achieve productive benefits. Existing security solutions mainly focus on authentication to ensure that a user's private data cannot be illegally accessed, but neglect a subtle privacy issue that arises when a user challenges the cloud server to request data sharing from other users. The access request itself may reveal the user's privacy, regardless of whether or not it obtains the data access permissions. In this paper, we propose a shared authority based privacy-preserving authentication protocol (SAPA) to address the above privacy issue for cloud storage. In the SAPA, 1) shared access authority is achieved by an anonymous access request matching mechanism with security and privacy considerations (e.g., authentication, data anonymity, user privacy, and forward security); 2) attribute-based access control is adopted so that a user can only access its own data fields; 3) proxy re-encryption is applied to provide data sharing among multiple users. Meanwhile, a universal composability (UC) model is established to prove that the SAPA theoretically has design correctness. This indicates that the proposed protocol is attractive for multi-user collaborative cloud applications.

Proceedings ArticleDOI
Yanxiang Huang1, Bin Cui1, Wenyu Zhang2, Jie Jiang2, Ying Xu1 
27 May 2015
TL;DR: This paper proposes a general real-time stream recommender system built on Storm, named TencentRec, and presents a practical scalable item-based CF algorithm in detail, with characteristics such as robustness to the implicit feedback problem, incremental update, and real-time pruning.
Abstract: With the arrival of the big data era, opportunities as well as challenges arise in both industry and academia. As an important service in most web applications, accurate real-time recommendation in the context of big data is in high demand. Traditional recommender systems that analyze data and update models at regular time intervals cannot satisfy the requirements of modern web applications, calling for real-time recommender systems. In this paper, we tackle the "big", "real-time", and "accurate" challenges in real-time recommendation, and propose a general real-time stream recommender system built on Storm, named TencentRec, from three aspects: "system", "algorithm", and "data". We analyze the large amount of data streams from a wide range of applications leveraging the considerable computation ability of Storm, together with a data access component and a data storage component developed by us. To deal with various application-specific demands, we have implemented several classic practical recommendation algorithms in TencentRec, including item-based collaborative filtering, content-based, and demographic-based algorithms. Specifically, we present a practical scalable item-based CF algorithm in detail, with characteristics such as robustness to the implicit feedback problem, incremental update, and real-time pruning. With the enhancement of real-time data collection and processing, we can capture recommendation changes in real time. We deploy TencentRec in a series of production applications, and observe the superiority of TencentRec in providing accurate real-time recommendations for 10 billion user requests every day.
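The incremental flavour of item-based CF can be sketched in a few lines: each (user, item) event updates item counts and item-item co-occurrences in place, so similarities are always current. This is a single-process toy with invented events; the real system shards this state across Storm and adds the pruning and implicit-feedback handling described.

```python
# Sketch of incremental item-based CF: each stream event updates
# per-item counts and item-item co-occurrence on the fly, so cosine
# similarities are always current. Events below are invented.
from collections import defaultdict

seen = defaultdict(set)    # user -> items consumed
count = defaultdict(int)   # item -> number of users
co = defaultdict(int)      # (i, j) sorted pair -> co-occurrence

def on_event(user, item):
    if item in seen[user]:
        return
    for other in seen[user]:          # incremental update only
        co[tuple(sorted((item, other)))] += 1
    seen[user].add(item)
    count[item] += 1

def similarity(i, j):                 # cosine over co-occurrence counts
    c = co.get(tuple(sorted((i, j))), 0)
    return c / ((count[i] * count[j]) ** 0.5) if c else 0.0

for u, it in [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u2", "c")]:
    on_event(u, it)
print(similarity("a", "b"))           # 1.0 in this toy stream
```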

Patent
Phillip Porras1
20 May 2015
TL;DR: In this article, a data flow policy engine is proposed to evaluate data access requests made by security-wrapped software applications running on mobile devices and prevent them from violating the policy.
Abstract: A method and system for evaluating and enforcing a data flow policy at a mobile computing device includes a data flow policy engine to evaluate data access requests made by security-wrapped software applications running on the mobile device and prevent the security-wrapped software applications from violating the data flow policy. The data flow policy defines a number of security labels that are associated with data objects. A software application process may be associated with a security label if the process accesses data having the security label or the process is in communication with another process that has accessed data having the security label.
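A minimal sketch of the label-propagation rule the patent describes, with an invented label set and policy: a process accumulates the security labels of data it reads and of processes it communicates with, and a request is denied when the accumulated labels conflict with the policy.

```python
# Sketch of the label-propagation rule: a process carries the union of
# the labels of data it has accessed and of processes it communicates
# with; a request is denied if that set violates the policy. The
# labels and the forbidden (label, sink) pairs are invented.
class PolicyEngine:
    def __init__(self, forbidden=frozenset({("SECRET", "network")})):
        self.labels = {}            # process -> set of security labels
        self.forbidden = forbidden  # disallowed (label, sink) pairs

    def on_read(self, proc, data_labels):
        self.labels.setdefault(proc, set()).update(data_labels)

    def on_ipc(self, a, b):         # communicating processes share labels
        merged = self.labels.get(a, set()) | self.labels.get(b, set())
        self.labels[a], self.labels[b] = set(merged), set(merged)

    def allow(self, proc, sink):
        return all((lbl, sink) not in self.forbidden
                   for lbl in self.labels.get(proc, set()))

eng = PolicyEngine()
eng.on_read("app", {"SECRET"})
print(eng.allow("app", "network"))  # False: flow blocked
```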

Book ChapterDOI
11 Oct 2015
TL;DR: In this article, the authors present data access challenges in the data-intensive petroleum company Statoil and their experience in addressing these challenges with OBDA technology, and develop a deployment module to create ontologies and mappings from relational databases in a semi-automatic fashion, and a query processing module to perform and optimize the process of translating ontological queries into data queries and their execution.
Abstract: Ontology-Based Data Access (OBDA) is a prominent approach to querying databases which uses an ontology to expose data in a conceptually clear manner by abstracting away from the technical schema-level details of the underlying data. The ontology is 'connected' to the data via mappings that allow queries posed over the ontology to be automatically translated into data-level queries that can be executed by the underlying database management system. Despite a lot of attention from the research community, there are still few instances of real-world industrial use of OBDA systems. In this work we present data access challenges in the data-intensive petroleum company Statoil and our experience in addressing these challenges with OBDA technology. In particular, we have developed a deployment module to create ontologies and mappings from relational databases in a semi-automatic fashion, and a query processing module to perform and optimise the process of translating ontological queries into data queries and their execution. Our modules have been successfully deployed and evaluated for an OBDA solution in Statoil.

Book ChapterDOI
11 Apr 2015
TL;DR: This paper studies the energy impact of alternative data management choices by programmers, such as data access patterns, data precision choices, and data organization, and attempts to build a bridge between application-level energy management and hardware-level energy management.
Abstract: Empowering application programmers to make energy-aware decisions is a critical dimension of energy optimization for computer systems. In this paper, we study the energy impact of alternative data management choices by programmers, such as data access patterns, data precision choices, and data organization. Second, we attempt to build a bridge between application-level energy management and hardware-level energy management, by elucidating how various application-level data management features respond to Dynamic Voltage and Frequency Scaling (DVFS). Finally, we apply our findings to real-world applications, demonstrating their potential for guiding application-level energy optimization. The empirical study is particularly relevant in the Big Data era, where data-intensive applications are large energy consumers, and their energy efficiency is strongly correlated to how data are maintained and handled in programs.
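One of the data access pattern choices studied can be reproduced in miniature: traversing the same array sequentially versus in random order. The sketch below uses wall-clock time as a crude proxy (and CPython blunts the hardware effect); the paper measures energy directly and also varies precision and data organization.

```python
# Toy contrast of two data access patterns over the same array:
# sequential versus random traversal. Time stands in as a rough proxy
# for the energy cost the paper measures on real hardware.
import random
import time

N = 1 << 20
data = list(range(N))

def walk(indices):
    t0, s = time.perf_counter(), 0
    for i in indices:
        s += data[i]
    return time.perf_counter() - t0

seq_t = walk(range(N))                # cache-friendly sequential scan
rand_idx = list(range(N))
random.shuffle(rand_idx)
rand_t = walk(rand_idx)               # cache-hostile random access
print(f"sequential {seq_t:.3f}s vs random {rand_t:.3f}s")
```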

Proceedings ArticleDOI
12 Oct 2015
TL;DR: This work conducts the first study on VoLTE security before its full rollout, discovering several vulnerabilities in both its control-plane and data-plane functions, which can be exploited to disrupt both data and voice in operational networks.
Abstract: VoLTE (Voice-over-LTE) is the designated voice solution to the LTE mobile network, and its worldwide deployment is underway. It reshapes call services from the traditional circuit-switched telecom telephony to the packet-switched Internet VoIP. In this work, we conduct the first study on VoLTE security before its full rollout. We discover several vulnerabilities in both its control-plane and data-plane functions, which can be exploited to disrupt both data and voice in operational networks. In particular, we find that the adversary can easily gain free data access, shut down continuing data access, or subdue an ongoing call, etc. We validate these proof-of-concept attacks using commodity smartphones (rooted and unrooted) in two Tier-1 US mobile carriers. Our analysis reveals that, the problems stem from both the device and the network. The device OS and chipset fail to prohibit non-VoLTE apps from accessing and injecting packets into VoLTE control and data planes. The network infrastructure also lacks proper access control and runtime check.

Journal ArticleDOI
TL;DR: An RS data object-based parallel file system for remote sensing applications, implemented with the OrangeFS file system, provides application-aware data layout policies for efficient support of the various data access patterns of RS applications from the server side.
Abstract: Remote sensing applications in Digital Earth are overwhelmed with vast quantities of remote sensing (RS) image data. The intolerable I/O burden introduced by the massive amounts of RS data and the irregular RS data access patterns has made traditional cluster-based parallel I/O systems no longer applicable. We propose an RS data object-based parallel file system (HPGFS) for remote sensing applications and implement it with the OrangeFS file system. It provides application-aware data layout policies, together with RS data object-based I/O interfaces, for efficient support of the various data access patterns of RS applications from the server side. With prior knowledge of the desired RS data access patterns, HPGFS can offer relevant space-filling curves to organize the sliced 3-D data bricks and distribute them over I/O servers. In this way, data layouts consistent with expected data access patterns can be created to exploit data locality and achieve performance improvement. Moreover, multi-band RS data with complex structured geographical metadata can be accessed and managed as a single data object. Through experiments on remote sensing applications with different access patterns, we have achieved performance improvements of about 30 percent for I/O and 20 percent overall.
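One concrete space-filling curve of the kind mentioned is the 3-D Morton (Z-order) curve. The sketch below, with illustrative bit widths and server counts, shows how interleaving coordinate bits linearizes 3-D brick indices so that spatially adjacent bricks receive nearby keys, which can then be mapped to I/O servers; the paper selects curves matched to the expected access pattern.

```python
# Sketch of a 3-D Morton (Z-order) key, one space-filling curve that
# can linearize sliced 3-D data bricks so spatially adjacent bricks
# map to nearby offsets (and hence to neighbouring I/O servers).
# Bit width and server count are illustrative, not from the paper.
def morton3d(x, y, z, bits=10):
    key = 0
    for i in range(bits):  # interleave x, y, z bit by bit
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key

def server_for_brick(x, y, z, n_servers=8):
    return morton3d(x, y, z) % n_servers

print(morton3d(1, 0, 0), morton3d(0, 1, 0), morton3d(1, 1, 1))  # 1 2 7
print(server_for_brick(3, 2, 1))
```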

Journal ArticleDOI
TL;DR: This paper proposes a new user access control scheme with attribute-based encryption using elliptic curve cryptography in hierarchical WSNs and demonstrates that the scheme has the ability to tolerate the different known attacks relevant to user access control schemes designed for WSNs.
Abstract: For critical applications, real-time data access is essential from the nodes inside a wireless sensor network (WSN). Only authorized users with unique access privileges should access the specific, but not all, sensing information gathered by the cluster heads in a hierarchical WSN. Access rights to the correct information and resources for different services from the cluster heads to genuine users can be provided with the help of efficient user access control mechanisms. In this paper, we propose a new user access control scheme with attribute-based encryption using elliptic curve cryptography in hierarchical WSNs. In attribute-based encryption, the ciphertexts are labeled with sets of attributes, and the secret keys of the users are associated with their own access structures. Authorized users with the relevant set of attributes can decrypt the encrypted message coming from the cluster heads. Our scheme provides high security. Moreover, our scheme is efficient compared with other existing user access control schemes. Through both formal and informal security analysis, we show that our scheme has the ability to tolerate the different known attacks relevant to a user access control scheme designed for WSNs. Furthermore, we simulate our scheme for formal security verification using the widely accepted Automated Validation of Internet Security Protocols and Applications tool. The simulation results demonstrate that our scheme is secure. Copyright © 2014 John Wiley & Sons, Ltd.

Journal ArticleDOI
01 Jul 2015
TL;DR: A novel data hosting scheme (named CHARM) is proposed which integrates two desired key functions, selecting several suitable clouds and an appropriate redundancy strategy to store data with minimized monetary cost and guaranteed availability, and exhibits sound adaptability to data and price adjustments.
Abstract: Nowadays, more and more enterprises and organizations are hosting their data in the cloud, in order to reduce IT maintenance costs and enhance data reliability. However, facing the numerous cloud vendors as well as their heterogeneous pricing policies, customers may well be perplexed about which cloud(s) are suitable for storing their data and which hosting strategy is cheaper. The general status quo is that customers usually put their data into a single cloud (which is subject to the vendor lock-in risk) and then simply trust to luck. Based on a comprehensive analysis of various state-of-the-art cloud vendors, this paper proposes a novel data hosting scheme (named CHARM) which integrates two desired key functions. The first is selecting several suitable clouds and an appropriate redundancy strategy to store data with minimized monetary cost and guaranteed availability. The second is triggering a transition process to re-distribute data according to variations in data access patterns and cloud pricing. We evaluate the performance of CHARM using both trace-driven simulations and prototype experiments. The results show that, compared with the major existing schemes, CHARM not only saves around 20 percent of monetary cost but also exhibits sound adaptability to data and price adjustments.
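The cost side of such a placement decision can be sketched with invented prices: compare plain replication on the cheapest clouds against erasure coding (k data blocks plus m parity blocks) across several. The real scheme also models availability guarantees and access charges.

```python
# Toy sketch of the cost comparison behind a CHARM-like placement
# decision: replicate a file on the cheapest clouds, or erasure-code
# it across several. Prices are invented; the real scheme also models
# availability targets and access charges.
prices = {"cloudA": 0.023, "cloudB": 0.020,   # $ per GB-month
          "cloudC": 0.026, "cloudD": 0.024}   # (invented figures)

def replication_cost(size_gb, copies=2):
    cheapest = sorted(prices.values())[:copies]
    return size_gb * sum(cheapest)

def erasure_cost(size_gb, k=2, m=1):
    # k data blocks plus m parity blocks, each size/k, on cheapest clouds
    per_block = size_gb / k
    cheapest = sorted(prices.values())[:k + m]
    return per_block * sum(cheapest)

size = 100  # GB
print(f"2x replication: ${replication_cost(size):.2f}/month")
print(f"(2,1) erasure:  ${erasure_cost(size):.2f}/month")
```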

Journal ArticleDOI
TL;DR: This survey proposes a definition and a taxonomy of the Partitioned Global Address Space, revealing that today's PGAS languages focus on distributing regular data and distinguish only between local and remote data access cost, whereas the distribution of irregular data and the adoption of richer data access cost models remain open challenges.
Abstract: The Partitioned Global Address Space (PGAS) model is a parallel programming model that aims to improve programmer productivity while at the same time aiming for high performance. The main premise of PGAS is that a globally shared address space improves productivity, but that a distinction between local and remote data accesses is required to allow performance optimizations and to support scalability on large-scale parallel architectures. To this end, PGAS preserves the global address space while embracing awareness of nonuniform communication costs. Today, about a dozen languages exist that adhere to the PGAS model. This survey proposes a definition and a taxonomy along four axes: how parallelism is introduced, how the address space is partitioned, how data is distributed among the partitions, and finally, how data is accessed across partitions. Our taxonomy reveals that today's PGAS languages focus on distributing regular data and distinguish only between local and remote data access cost, whereas the distribution of irregular data and the adoption of richer data access cost models remain open challenges.

Proceedings ArticleDOI
16 Sep 2015
TL;DR: An approach that takes advantage of widely-accepted vocabularies, originally used to advertise services or datasets, to define how to access Web-based or other data sources is introduced, offering a granular solution for accessing and mapping data.
Abstract: The RDF data model allows the description of domain-level knowledge that is understandable by both humans and machines. RDF data can be derived from different source formats and diverse access points, ranging from databases or files in CSV format to data retrieved from Web APIs in JSON, Web Services in XML, or any other speciality formats. To this end, machine-interpretable mapping languages, such as RML, were introduced to uniformly define how data in multiple heterogeneous sources is mapped to the RDF data model, independently of their original format. However, the way in which this data is accessed and retrieved still remains hard-coded, as corresponding descriptions are often not available or not taken into account. In this paper, we introduce an approach that takes advantage of widely accepted vocabularies, originally used to advertise services or datasets, such as Hydra or DCAT, to define how to access Web-based or other data sources. Consequently, the generation of RDF representations is facilitated and further automated, while the machine-interpretable descriptions of the connectivity to the original data remain independent and interoperable, offering a granular solution for accessing and mapping data.

Proceedings ArticleDOI
27 May 2015
TL;DR: This paper analyzes an alternative architecture design for distributed relational databases that overcomes the limitations of partitioned databases and introduces techniques for scalable transaction processing in shared-data environments.
Abstract: Database scale-out is commonly implemented by partitioning data across several database instances. This approach, however, has several restrictions. In particular, partitioned databases are inflexible in large-scale deployments and assume a partition-friendly workload in order to scale. In this paper, we analyze an alternative architecture design for distributed relational databases that overcomes the limitations of partitioned databases. The architecture is based on two fundamental principles: We decouple query processing and transaction management from data storage, and we share data across query processing nodes. The combination of these design choices provides scalability, elasticity, and operational flexibility without making any assumptions on the workload. As a drawback, sharing data among multiple database nodes causes synchronization overhead. To address this limitation, we introduce techniques for scalable transaction processing in shared-data environments. Specifically, we describe mechanisms for efficient data access, concurrency control, and data buffering. In combination with new hardware trends, the techniques enable performance characteristics that top state-of-the-art partitioned databases.

Journal ArticleDOI
TL;DR: This paper presents a new ABE scheme called attribute-based encryption with attribute hierarchies (ABE-AH) to provide an efficient approach to implement comparison operations between attribute values on a poset derived from an attribute lattice and presents a practical construction of ABE-AH based on forward and backward derivation functions.
Abstract: This paper addresses how to construct an RBAC-compatible secure cloud storage service with a user-friendly and easy-to-manage attribute-based access control (ABAC) mechanism. Similar to role hierarchies in RBAC, attribute hierarchies (considered as partial ordering relations) are introduced into attribute-based encryption (ABE) in order to define a seniority relation among all values of an attribute, whereby a user holding senior attribute values acquires permissions of his/her juniors. Based on these notations, we present a new ABE scheme called attribute-based encryption with attribute hierarchies (ABE-AH) to provide an efficient approach to implement comparison operations between attribute values on a poset derived from an attribute lattice. By using bilinear groups of a composite order, we present a practical construction of ABE-AH based on forward and backward derivation functions. Compared with prior solutions, our scheme offers a compact policy representation approach that can significantly reduce the size of private-keys and ciphertexts. To demonstrate how to use the presented solution, we illustrate how to provide richer expressive access policies to facilitate flexible access control for data access services in clouds.
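The seniority relation itself (leaving the cryptography aside) is just a partial-order check. The sketch below uses an invented attribute hierarchy: a user holding a senior value satisfies a policy written for any of its juniors, which is the comparison ABE-AH enforces inside the encryption via derivation functions.

```python
# Sketch of the seniority relation only, with no cryptography: an
# attribute poset in which a senior value satisfies policies written
# for its juniors. The hierarchy is invented; ABE-AH enforces this
# comparison inside the scheme via forward/backward derivation.
juniors = {  # attribute value -> values directly below it in the poset
    "director": {"manager"},
    "manager": {"engineer", "accountant"},
    "engineer": set(), "accountant": set(),
}

def dominates(senior, junior):
    """True if `senior` equals `junior` or lies above it in the poset."""
    if senior == junior:
        return True
    return any(dominates(child, junior) for child in juniors[senior])

def can_decrypt(user_attrs, required_attr):
    return any(dominates(a, required_attr) for a in user_attrs)

print(can_decrypt({"director"}, "engineer"))    # True: senior dominates
print(can_decrypt({"accountant"}, "engineer"))  # False
```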

Journal ArticleDOI
TL;DR: The online system developed, OzTrack, offers a set of robust, up-to-date and accessible tools for managing, processing, visualising and analysing animal location data and linking these outputs with environmental datasets.
Abstract: Improvements in telemetry technology are allowing us to monitor animal movements with increasing accuracy, precision and frequency. The increased complexity of the data collections, however, demands additional software and programming skills to process, store and disseminate the datasets. Recent focus on data availability has also heightened the need for sustainable data management solutions to ensure data integrity and provide longer term access. In the last ten years, a number of online facilities have been developed for the archiving, processing and sharing of telemetry data. These facilities offer secure storage, multi-user support and analysis tools and are a step along the way to improving data access, long-term data preservation and science communication. While these software platforms promote data sharing, access to the majority of the data and to the software behind these systems remains restricted. In this paper, we present a comprehensive, highly accessible and fully transparent software facility for animal movement data. The online system we developed (http://oztrack.org) offers a set of robust, up-to-date and accessible tools for managing, processing, visualising and analysing animal location data and linking these outputs with environmental datasets. As OzTrack uses exclusively free and open-source software, and the source code is available online, the system promotes open access not only to data but also to the tools and software underpinning the system. We outline the capabilities and limitations of the infrastructure design and discuss the uptake of this platform by the Australasian biotelemetry community. We discuss whether an open approach to analysis tools and software encourages a more open approach to sharing data, information and knowledge. Finally, we discuss why a free and open approach enhances longer term sustainability and enables data storage facilities to evolve in parallel with the telemetry devices themselves.

Proceedings Article
31 Dec 2015
TL;DR: The authors build ViDa, a system which reads data in its raw format and processes queries using adaptive, just-in-time operators, and which features a language expressive enough to support heterogeneous data models and to which existing languages can be translated.
Abstract: As the size of data and its heterogeneity increase, traditional database system architecture becomes an obstacle to data analysis. Integrating and ingesting (loading) data into databases is quickly becoming a bottleneck in the face of massive data as well as increasingly heterogeneous data formats. Still, state-of-the-art approaches typically rely on copying and transforming data into one (or few) repositories. Queries, on the other hand, are often ad hoc and supported by pre-cooked operators which are not adaptive enough to optimize access to data. As data formats and queries increasingly vary, there is a need to depart from the current status quo of static query processing primitives and build dynamic, fully adaptive architectures. We build ViDa, a system which reads data in its raw format and processes queries using adaptive, just-in-time operators. Our key insight is the use of virtualization, i.e., abstracting data and manipulating it regardless of its original format, and dynamic generation of operators. ViDa's query engine is generated just in time; its caches and its query operators adapt to the current query and the workload, while also treating raw datasets as its native storage structures. Finally, ViDa features a language expressive enough to support heterogeneous data models, and to which existing languages can be translated. Users therefore have the power to choose the language best suited for an analysis.

Journal Article
TL;DR: The system model and security model of the scheme are described, the design goals and related assumptions are provided, and the cloud infrastructures are assumed to be more reliable and powerful than personal computers.
Abstract: In this research paper, we describe the system model and security model of our scheme and provide our design goals and related assumptions. We consider a cloud computing environment consisting of a cloud service provider (CSP), a data owner, and many users. The CSP maintains cloud infrastructures, which pool the bandwidth, storage space, and CPU power of many cloud servers to provide 24/7 services. We assume that the cloud infrastructures are more reliable and powerful than personal computers. In our system, the CSP mainly provides two services: data storage and re-encryption. After obtaining the encrypted data from the data owner, the CSP will store the data on several cloud servers, which can be chosen by the consistent hash function, where the input of the consistent hash function is the key of the data, and the outputs of the consistent hash function are the IDs of the servers that store the data. On receiving a data access request from a user, the CSP will re-encrypt the ciphertext based on its own time, and return the re-encrypted ciphertext.
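The server-selection step described above is standard consistent hashing, sketched below with an illustrative hash and replica count: the hash of a data key selects positions clockwise on a ring of server IDs, so membership changes remap only nearby keys.

```python
# Sketch of the server-selection step: a consistent hash ring maps the
# key of a data item to the IDs of the servers that store it, so
# adding or removing a server remaps only nearby keys. The hash choice
# and replica count are illustrative.
import bisect
import hashlib

def h(value: str) -> int:
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, server_ids):
        self.points = sorted((h(s), s) for s in server_ids)

    def servers_for(self, data_key, replicas=2):
        """IDs of the `replicas` servers clockwise from the key's point."""
        keys = [p for p, _ in self.points]
        start = bisect.bisect(keys, h(data_key)) % len(self.points)
        return [self.points[(start + i) % len(self.points)][1]
                for i in range(replicas)]

ring = Ring(["srv1", "srv2", "srv3", "srv4"])
print(ring.servers_for("patient-record-42"))
```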

Book ChapterDOI
31 Aug 2015
TL;DR: In this article, the authors propose a framework that supports domain experts in the extraction of XES event log information from legacy relational databases, and consequently enables the application of standard process mining tools on such data.
Abstract: Process mining is an emerging discipline whose aim is to discover, monitor and improve real processes by extracting knowledge from event logs representing actual process executions in a given organizational setting. In this light, it can be applied only if faithful event logs, adhering to accepted standards (such as XES), are available. In many real-world settings, though, such event logs are not explicitly given, but are instead implicitly represented inside legacy information systems of organizations, which are typically managed through relational technology. In this work, we devise a novel framework that supports domain experts in the extraction of XES event log information from legacy relational databases, and consequently enables the application of standard process mining tools on such data. Differently from previous work, the extraction is driven by a conceptual representation of the domain of interest in terms of an ontology. On the one hand, this ontology is linked to the underlying legacy data leveraging the well-established ontology-based data access (OBDA) paradigm. On the other hand, our framework allows one to enrich the ontology through user-oriented log extraction annotations, which can be flexibly used to provide different log-oriented views over the data. Different data access modes are then devised so as to view the legacy data through the lens of XES.

Patent
30 Oct 2015
TL;DR: In this article, the authors present a system for controlling the authentication or authorization of a mobile device user for enabling access to the resources or functionality associated with an application or service executable at the user's mobile device.
Abstract: Systems and methods are provided for controlling the authentication or authorization of a mobile device user to enable access to the resources or functionality associated with an application or service executable at the user's mobile device. The user or the user's mobile device may be automatically authenticated or authorized to access application or system resources at the device when the current geographic location of the user's mobile device is determined to be within a preauthorized zone, e.g., based on a predetermined geo-fence corresponding to the preauthorized zone. The security level or amount of authorization credentials required to authorize a user for data access may be varied according to any of a plurality of security levels when the current or last known geographic location of the user's mobile device is determined to be outside the preauthorized zone.
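The location-dependent step-up described can be sketched as a distance test against an invented fence: inside the preauthorized zone the device is auto-authorized, while outside it the required credentials escalate. Coordinates, radius, and credential levels are illustrative.

```python
# Sketch of the geo-fence rule: if the device's last known position
# lies inside a preauthorized zone, fewer credentials are required;
# outside it, the security level rises. Fence and levels are invented.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(a))

FENCE = (55.676, 12.568, 0.5)  # centre lat, lon, radius in km (invented)

def required_credentials(lat, lon):
    clat, clon, radius = FENCE
    if haversine_km(lat, lon, clat, clon) <= radius:
        return ["device_key"]                 # inside: auto-authorize
    return ["device_key", "password", "otp"]  # outside: step up

print(required_credentials(55.677, 12.570))   # inside the fence
```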