
Showing papers on "Data access published in 2017"


Proceedings ArticleDOI
05 Jun 2017
TL;DR: This paper proposes a blockchain platform architecture for clinical trials and precision medicine, discusses various design aspects, and provides some insights into the technology requirements and challenges.
Abstract: This paper proposes a blockchain platform architecture for clinical trials and precision medicine, discusses various design aspects, and provides some insights into the technology requirements and challenges. We identify four new system architecture components that must be built on top of a traditional blockchain and discuss their technology challenges in our blockchain platform: (a) a new blockchain-based general distributed and parallel computing paradigm component to devise and study parallel computing methodology for big data analytics, (b) a blockchain application data management component for data integrity, big data integration, and integration of disparate medical-related data, (c) a verifiable anonymous identity management component providing identity privacy for both persons and Internet of Things (IoT) devices and secure data access, making patient-centric medicine possible, and (d) a trusted data sharing management component to enable a trusted medical data ecosystem for collaborative research.

172 citations


Proceedings ArticleDOI
22 Oct 2017
TL;DR: The proposed architecture facilitates IoT communications on top of a software stack of blockchains and peer-to-peer data storage mechanisms, is designed with privacy built in, and is adaptable to various IoT use cases.
Abstract: Blockchain, the underlying technology of cryptocurrency networks like Bitcoin, can prove to be essential towards realizing the vision of a decentralized, secure, and open Internet of Things (IoT) revolution. There is growing interest among research groups in leveraging blockchains to provide IoT data privacy without the need for a centralized data access model. This paper proposes a decentralized access model for IoT data, using a network architecture that we call a modular consortium architecture for IoT and blockchains. The proposed architecture facilitates IoT communications on top of a software stack of blockchains and peer-to-peer data storage mechanisms. The architecture is designed with privacy built in and is adaptable to various IoT use cases. To understand the feasibility and deployment considerations of the proposed architecture, we conduct a performance analysis of existing blockchain development platforms, Ethereum and Monax.

145 citations


Journal ArticleDOI
01 Jul 2017
TL;DR: A scheme is proposed to control data access in cloud computing based on trust evaluated by the data owner and/or reputations generated by a number of reputation centers, in a flexible manner, by applying Attribute-Based Encryption and Proxy Re-Encryption.
Abstract: Cloud computing offers a new way of delivering services and has become a popular service platform. Storing user data at a cloud data center greatly relieves the storage burden of user devices and brings access convenience. Due to distrust in cloud service providers, users generally store their crucial data in an encrypted form. But in many cases, the data need to be accessed by other entities to fulfill an expected service, e.g., an eHealth service. How to control personal data access at the cloud is a critical issue. Various application scenarios call for flexible control of cloud data access based on data owner policies and application demands. Either data owners or some trusted third parties or both should flexibly participate in this control. However, existing work has not yet produced an effective and flexible solution that satisfies this demand. On the other hand, trust plays an important role in data sharing: it helps overcome uncertainty and avoid potential risks. But the literature still lacks a practical solution for controlling cloud data access based on trust and reputation. In this paper, we propose a scheme to control data access in cloud computing based on trust evaluated by the data owner and/or reputations generated by a number of reputation centers, in a flexible manner, by applying Attribute-Based Encryption and Proxy Re-Encryption. We integrate the concept of context-aware trust and reputation evaluation into a cryptographic system in order to support various control scenarios and strategies. The security and performance of our scheme are evaluated and justified through extensive analysis, security proof, comparison, and implementation. The results show the efficiency, flexibility, and effectiveness of our scheme for data access control in cloud computing.
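A minimal sketch of the access-decision logic such a trust/reputation scheme implies. All weights, thresholds, and function names here are invented for illustration; the actual scheme enforces decisions cryptographically via Attribute-Based Encryption and Proxy Re-Encryption rather than with a plain policy check:

```python
# Toy model of trust/reputation-gated access control (illustrative only;
# the real scheme enforces this cryptographically with ABE + proxy re-encryption).

def aggregate_reputation(scores):
    """Average the reputation values issued by independent reputation centers."""
    return sum(scores) / len(scores)

def access_decision(owner_trust, reputation_scores, policy):
    """Grant access if owner trust and/or aggregated reputation meets the policy.

    policy: dict with 'mode' in {'owner', 'reputation', 'both', 'either'}
    plus thresholds 'min_trust' and 'min_reputation'.
    """
    rep = aggregate_reputation(reputation_scores)
    trust_ok = owner_trust >= policy["min_trust"]
    rep_ok = rep >= policy["min_reputation"]
    return {
        "owner": trust_ok,            # data owner evaluates trust directly
        "reputation": rep_ok,         # trusted third parties (reputation centers)
        "both": trust_ok and rep_ok,  # both must agree
        "either": trust_ok or rep_ok, # flexible: one suffices
    }[policy["mode"]]

policy = {"mode": "both", "min_trust": 0.6, "min_reputation": 0.7}
print(access_decision(0.8, [0.65, 0.75, 0.8], policy))  # True
```

The "mode" switch mirrors the flexibility the abstract emphasises: data owners, reputation centers, or both can participate in the control decision.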

124 citations


Proceedings ArticleDOI
01 Oct 2017
TL;DR: This paper illustrates the specific problems and the benefits of blockchain technology for deploying a secure and scalable solution for medical data exchange with the best possible performance.
Abstract: eHealth is a technology that is growing in importance over time, varying from remote access to medical records, such as Electronic Health Records (EHR) or Electronic Medical Records (EMR), to real-time exchange of data from the on-body sensors of different patients. With this huge amount of critical data being exchanged, problems and challenges arise. Privacy and confidentiality of this critical medical data are of high concern to patients and to the persons authorized to use the data. On the other hand, scalability and interoperability are also important problems that should be considered in the final solution. This paper illustrates the specific problems and highlights the benefits of blockchain technology for the deployment of a secure and scalable solution for medical data exchange that achieves the best performance possible.

115 citations


Journal ArticleDOI
TL;DR: The Water Quality Portal (WQP) as mentioned in this paper is the largest standardized water quality data set available at the time of this writing, with more than 290 million records from more than 2.7 million sites in groundwater, inland, and coastal waters.
Abstract: Aquatic systems are critical to food, security, and society. But water data are collected by hundreds of research groups and organizations, many of which use nonstandard or inconsistent data descriptions and dissemination, and disparities across different types of water observation systems represent a major challenge for freshwater research. To address this issue, the Water Quality Portal (WQP) was developed by the U.S. Environmental Protection Agency, the U.S. Geological Survey, and the National Water Quality Monitoring Council to be a single point of access for water quality data dating back more than a century. The WQP is the largest standardized water quality data set available at the time of this writing, with more than 290 million records from more than 2.7 million sites in groundwater, inland, and coastal waters. The number of data contributors, data consumers, and third-party application developers making use of the WQP is growing rapidly. Here we introduce the WQP, including an overview of the data, the standardized data model, and data access and services, and we describe challenges and opportunities associated with using WQP data. We also demonstrate the value of the WQP data through an example, characterizing seasonal variation in lake water clarity for regions of the continental U.S. The code used to access, download, analyze, and display these WQP data as shown in the figures is included as supporting information.
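The paper's own access-and-analysis code ships as supporting information; as an independent illustration of programmatic access, here is a hedged Python sketch against the WQP's public REST service. The endpoint path and parameter names follow WQP web-service conventions, but treat the exact values as illustrative assumptions:

```python
# Sketch: download water quality monitoring sites from the Water Quality Portal
# web service as CSV (parameter values are illustrative; see the WQP service docs).
import requests

BASE = "https://www.waterqualitydata.us/data/Station/search"
params = {
    "statecode": "US:55",                        # Wisconsin (example)
    "characteristicName": "Secchi disk depth",   # a water-clarity measure
    "startDateLo": "01-01-2010",                 # MM-DD-YYYY per WQP conventions
    "mimeType": "csv",
}
resp = requests.get(BASE, params=params, timeout=120)
resp.raise_for_status()
with open("wqp_sites.csv", "wb") as f:
    f.write(resp.content)
print("saved", len(resp.content), "bytes")
```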

115 citations


Journal ArticleDOI
TL;DR: This work has developed a deployment module to create ontologies and mappings from relational databases in a semi-automatic fashion; a query processing module to perform and optimise the translation of ontological queries into data queries and their execution over either a single DB or federated DBs; and a query formulation module to support query construction for engineers with a limited IT background.

105 citations


Journal ArticleDOI
TL;DR: A novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications is presented, and 15 popular workflow management systems are classified in terms of workflow execution models, heterogeneous computing environments, and data access methods.

100 citations


Journal ArticleDOI
TL;DR: A data-oriented M2M messaging mechanism based on ZeroMQ is presented for ubiquitous data access in rich-sensing pervasive industrial applications, and the results demonstrate the feasibility of the proposed messaging mechanism.
Abstract: Machine-to-machine (M2M) communication is a key enabling technology for future industrial Internet of Things applications. It plays an important role in the connectivity and integration of computerized machines, such as sensors, actuators, controllers, and robots. The requirements in flexibility, efficiency, and cross-platform compatibility of the intermodule communication between connected machines raise challenges for the M2M messaging mechanism with respect to ubiquitous data access and event notification. This investigation identifies the challenges facing M2M communication in industrial systems and presents a data-oriented M2M messaging mechanism based on ZeroMQ for ubiquitous data access in rich-sensing pervasive industrial applications. To prove the feasibility of the proposed solution, the EU-funded PickNPack production line with a reference industrial network architecture is presented, and the communication between a microwave sensor device and the quality assessment and sensing module controller of the PickNPack line is illustrated as a case study. The evaluation is carried out through qualitative analysis and experimental studies, and the results demonstrate the feasibility of the proposed messaging mechanism. Owing to its flexibility in dealing with hierarchical system architectures and the cross-platform heterogeneity of industrial applications, this messaging mechanism merits extensive investigation and further evaluation.
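As a flavour of the data-oriented messaging style the abstract describes, here is a minimal ZeroMQ publish/subscribe sketch in Python using pyzmq. The topic names, port, and payloads are invented for illustration; the paper's PickNPack deployment is far richer:

```python
# Minimal ZeroMQ pub/sub sketch of data-oriented M2M messaging.
import threading, time
import zmq

def sensor_publisher(ctx):
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5556")
    time.sleep(0.2)  # allow the subscriber to connect (slow-joiner mitigation)
    for i in range(3):
        # topic frame + payload frame: subscribers filter by topic prefix
        pub.send_multipart([b"microwave.temperature", f"{20 + i}".encode()])
        time.sleep(0.05)
    pub.close()

ctx = zmq.Context()
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://127.0.0.1:5556")
sub.setsockopt(zmq.SUBSCRIBE, b"microwave.")  # subscribe by topic prefix

t = threading.Thread(target=sensor_publisher, args=(ctx,))
t.start()
for _ in range(3):
    topic, payload = sub.recv_multipart()
    print(topic.decode(), payload.decode())
t.join()
sub.close()
ctx.term()
```

Topic-prefix filtering is what makes the mechanism "data-oriented": consumers subscribe to the data they need rather than to a specific machine.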

96 citations


Journal ArticleDOI
TL;DR: The Optique platform is introduced as a suitable OBDA solution for Siemens, with a number of novel techniques and components including a deployment module, BootOX, for ontology and mapping bootstrapping; a query language, STARQL, that allows uniform querying of both streaming and static data; and a query formulation interface, OptiqueVQS, that allows users to formulate STARQL queries without prior knowledge of the language's formal syntax.

84 citations


Proceedings ArticleDOI
Kedar Dhamdhere, Kevin Snow McCurley, Ralfi Nahmias, Mukund Sundararajan, Qiqi Yan
07 Mar 2017
TL;DR: Analyza, a system that helps lay users explore data and discuss the key design decisions in implementing this system, including how to mix structured and natural language modalities, how to use conversation to disambiguate and simplify querying, and how to efficiently curate the data.
Abstract: We describe Analyza, a system that helps lay users explore data. Analyza has been used within two large real world systems. The first is a question-and-answer feature in a spreadsheet product. The second provides convenient access to a revenue/inventory database for a large sales force. Both user bases consist of users who do not necessarily have coding skills, demonstrating Analyza's ability to democratize access to data. We discuss the key design decisions in implementing this system. For instance, how to mix structured and natural language modalities, how to use conversation to disambiguate and simplify querying, how to rely on the "semantics" of the data to compensate for the lack of syntactic structure, and how to efficiently curate the data.

78 citations


Journal ArticleDOI
TL;DR: This paper employs attribute-based encryption with decryption outsourcing to encrypt the published data, so that publishers can control data access themselves and the major decryption overhead can be shifted from the subscribers' devices to the cloud server.

Proceedings ArticleDOI
12 Sep 2017
TL;DR: This paper illustrates an architecture based on blockchain technology, and a protocol for data access, using smart contracts and a publisher-subscriber mechanism.
Abstract: In the past few years, the number of wireless devices connected to the Internet has increased to a number that could reach billions in the next few years. While cloud computing is seen as the solution for processing this data, security challenges cannot be addressed solely with this technology. Security problems will continue to increase with such a model, especially for private and sensitive data such as the personal and medical data collected by increasingly sophisticated connected devices (forming the IoT). Hence the need for a fully decentralized, peer-to-peer, and secure technology to overcome these problems. Blockchain technology is a promising approach, given the properties it brings to the field. This paper illustrates an architecture based on blockchain technology, and a protocol for data access, using smart contracts and a publisher-subscriber mechanism.
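As a rough illustration of the protocol shape, here is a toy in-memory stand-in for an on-chain access-control contract with a publisher-subscriber flow. Every name is invented; the paper's actual contract code is not reproduced here:

```python
# Toy stand-in for an on-chain access-control contract with a
# publisher-subscriber flow (all names invented for illustration).
class AccessContract:
    def __init__(self, owner):
        self.owner = owner
        self.allowed = set()   # subscribers granted access by the owner
        self.log = []          # would be an immutable event log on a real chain

    def grant(self, caller, subscriber):
        assert caller == self.owner, "only the data owner may grant access"
        self.allowed.add(subscriber)
        self.log.append(("Granted", subscriber))

    def publish(self, caller, data_ref):
        assert caller == self.owner
        self.log.append(("Published", data_ref))
        # notify subscribers; off-chain they would fetch data_ref (e.g. a
        # pointer into peer-to-peer storage) and decrypt with their own key
        return [(s, data_ref) for s in self.allowed]

c = AccessContract(owner="alice")
c.grant("alice", "dr_bob")
print(c.publish("alice", "p2p://record-42"))  # [('dr_bob', 'p2p://record-42')]
```

On a real chain the `assert` guards become contract-enforced permissions and the log becomes tamper-evident by construction.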

Proceedings ArticleDOI
01 Jun 2017
TL;DR: A risk-based access control model for IoT technology that takes real-time contextual information about access requests into account, gives dynamic feedback, and uses smart contracts to provide adaptive features in which user behaviour is monitored to detect any abnormal actions from authorized users.
Abstract: The Internet of Things (IoT) is creating a revolution in the number of connected devices. Cisco reported that there were 25 billion IoT devices in 2015, and modest estimates suggest that this number will almost double by 2020. Society has become dependent on these billions of devices, which are connected and communicating with each other all the time, with information constantly shared between users, services, and internet providers. As a technology, emerging IoT devices are creating a huge security rift between users and usability; sacrificing usability for security has created a number of major issues. First, IoT devices are classified under Bring Your Own Device (BYOD), which dissolves any organization's security boundary and makes them a target for espionage or tracking. Second, the size of the data generated by the IoT makes big data problems pale in comparison, not to mention that IoT devices need real-time responses. Third, incorporating secure access and control for IoT devices, ranging from edge nodes to the application level (business intelligence reporting tools), is a challenge because it has to account for several hardware and application levels. Establishing a secure access control model between different IoT devices and services is a major milestone for the IoT. This is important because data leakage and unauthorized access to data have a high impact on IoT devices. However, traditional access control models, with their static and rigid infrastructure, cannot provide the required security for the IoT infrastructure. Therefore, this paper proposes a risk-based access control model for IoT technology that takes real-time contextual information about access requests into account and gives dynamic feedback. The proposed model uses IoT environment features to estimate the security risk associated with each access request, using user context, resource sensitivity, action severity, and risk history as inputs to a security risk estimation algorithm that is responsible for the access decision. The model then uses smart contracts to provide adaptive features in which user behaviour is monitored to detect any abnormal actions from authorized users.
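A minimal sketch of the risk-estimation step using the four inputs the abstract names. The weights, normalisation, and threshold are invented for illustration; the paper's estimation algorithm is not specified here:

```python
# Sketch of risk estimation for a risk-based access decision, using the four
# inputs named in the abstract. Weights and threshold are illustrative only.
WEIGHTS = {
    "user_context": 0.25,          # e.g. unusual location/time of the request
    "resource_sensitivity": 0.35,
    "action_severity": 0.25,       # read < write < actuate
    "risk_history": 0.15,          # past abnormal behaviour of this user
}
RISK_THRESHOLD = 0.5

def estimate_risk(factors):
    """Weighted sum of normalised risk factors, each in [0, 1]."""
    return sum(WEIGHTS[name] * value for name, value in factors.items())

def access_decision(factors):
    risk = estimate_risk(factors)
    return ("deny" if risk > RISK_THRESHOLD else "permit"), round(risk, 3)

request = {"user_context": 0.2, "resource_sensitivity": 0.9,
           "action_severity": 0.3, "risk_history": 0.1}
print(access_decision(request))  # ('permit', 0.455) — then keep monitoring
```

The "dynamic feedback" the paper describes would feed observed behaviour back into `risk_history`, so the same user's later requests score differently.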

Journal ArticleDOI
11 Sep 2017
TL;DR: This paper presents the design and implementation of ProtectMyPrivacy (PmP) for Android, which can detect critical contextual information at runtime when privacy-sensitive data accesses occur and infers the purpose of the data access, i.e., whether the data access is made by a third-party library or by the app itself for its functionality.
Abstract: The enormous popularity of smartphones, their rich sensing capabilities, and the data they have about their users have led to millions of apps being developed and used. However, these capabilities have also led to numerous privacy concerns. Platform manufacturers, as well as researchers, have proposed numerous ways of mitigating these concerns, primarily by providing fine-grained visibility and privacy controls to the user on a per-app basis. In this paper, we show that this per-app permission approach is suboptimal for many apps, primarily because most data accesses occur due to a small set of popular third-party libraries which are common across multiple apps. To address this problem, we present the design and implementation of ProtectMyPrivacy (PmP) for Android, which can detect critical contextual information at runtime when privacy-sensitive data accesses occur. In particular, PmP infers the purpose of the data access, i.e., whether the data access is made by a third-party library or by the app itself for its functionality. Based on crowdsourced data, we show that a set of 30 libraries is in fact responsible for more than half of private data accesses. Controlling the sensitive data accessed by these libraries can therefore be an effective mechanism for managing user privacy. We deployed our PmP app to 1,321 real users, showing that the number of privacy decisions that users have to make is significantly reduced. In addition, we show that our users are better protected against data leakage when using our new library-based blocking mechanism as compared to traditional app-level permission mechanisms.
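PmP itself instruments Android apps, but the attribution idea — walk the call stack at the moment of a sensitive access and check whether a known third-party library owns a frame — can be sketched in a few lines of Python. The library prefixes and function names are invented:

```python
# Sketch of purpose inference by attributing a sensitive data access to the
# caller on the stack: app code vs. a known third-party library.
import inspect

THIRD_PARTY_PREFIXES = ("ads_sdk", "analytics_lib")  # illustrative list

def attribute_access():
    """Walk the call stack; report the first frame owned by a third party."""
    for frame_info in inspect.stack()[1:]:
        module = inspect.getmodule(frame_info.frame)
        name = module.__name__ if module else \
            frame_info.frame.f_globals.get("__name__", "")
        if name.startswith(THIRD_PARTY_PREFIXES):
            return ("third_party", name)
    return ("app", "__main__")

def get_location():
    # a privacy-sensitive access point would call the attributor like this
    origin = attribute_access()
    print("location accessed by:", origin)
    return (51.5, -0.12)

get_location()  # ('app', '__main__') when called directly from app code
```

If `get_location` were instead invoked from inside an `ads_sdk` module, the walk would report `('third_party', 'ads_sdk...')`, which is exactly the signal a library-based blocking policy needs.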

Proceedings ArticleDOI
06 Nov 2017
TL;DR: This work proposes the first end-to-end framework to build an NL2API for a given web API, applies it to real-world APIs, and shows that it can collect high-quality training data at low cost and build NL2APIs with good performance from scratch.
Abstract: As the Web evolves towards a service-oriented architecture, application program interfaces (APIs) are becoming an increasingly important way to provide access to data, services, and devices. We study the problem of natural language interface to APIs (NL2APIs), with a focus on web APIs for web services. Such NL2APIs have many potential benefits, for example, facilitating the integration of web services into virtual assistants. We propose the first end-to-end framework to build an NL2API for a given web API. A key challenge is to collect training data, i.e., NL command-API call pairs, from which an NL2API can learn the semantic mapping from ambiguous, informal NL commands to formal API calls. We propose a novel approach to collect training data for NL2API via crowdsourcing, where crowd workers are employed to generate diversified NL commands. We optimize the crowdsourcing process to further reduce the cost. More specifically, we propose a novel hierarchical probabilistic model for the crowdsourcing process, which guides us to allocate budget to those API calls that have a high value for training NL2APIs. We apply our framework to real-world APIs, and show that it can collect high-quality training data at a low cost, and build NL2APIs with good performance from scratch. We also show that our modeling of the crowdsourcing process can improve its effectiveness, such that the training data collected via our approach leads to better performance of NL2APIs than a strong baseline.

Journal ArticleDOI
TL;DR: In this article, a behavioral biometric signature-based authentication mechanism is proposed to ensure the security of e-medical data access in a cloud-based healthcare management system, achieving a high accuracy rate for secure data access and retrieval.

Proceedings Article
01 Jan 2017
TL;DR: It is demonstrated by two real-world use cases that nonrecursive datalogMTL programs can express complex temporal concepts from typical user queries and thereby facilitate access to log data.
Abstract: We advocate datalogMTL, a datalog extension of a Horn fragment of the metric temporal logic MTL, as a language for ontology-based access to temporal log data. We show that datalogMTL is EXPSPACE-complete even with punctual intervals, in which case MTL is known to be undecidable. Nonrecursive datalogMTL turns out to be PSPACE-complete for combined complexity and in AC0 for data complexity. We demonstrate by two real-world use cases that nonrecursive datalogMTL programs can express complex temporal concepts from typical user queries and thereby facilitate access to log data. Our experiments with Siemens turbine data and MesoWest weather data show that datalogMTL ontology-mediated queries are efficient and scale on large datasets of up to 11GB.
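One way to read the temporal operators involved: a nonrecursive datalogMTL rule can say, for instance, "raise an alert if a reading stayed above a threshold throughout the last 10 minutes" (an "always in the past interval" condition). A hedged Python sketch of evaluating such a window condition over timestamped log data; the rule and data are invented, not from the paper's use cases:

```python
# Evaluate an "always within the past 10 minutes" temporal condition over
# timestamped sensor readings, in the spirit of a nonrecursive datalogMTL
# rule such as:  Alert(t) :- BOXMINUS[0,10m] HighTemp(t).  (Invented example.)
from datetime import datetime, timedelta

readings = [  # (timestamp, temperature) — invented turbine-style log data
    (datetime(2017, 1, 1, 12, 0), 96),
    (datetime(2017, 1, 1, 12, 4), 97),
    (datetime(2017, 1, 1, 12, 9), 98),
    (datetime(2017, 1, 1, 12, 12), 99),
]

def always_in_window(log, now, window, pred):
    """True iff pred holds for every reading in [now - window, now]
    and the window contains at least one reading."""
    in_window = [v for ts, v in log if now - window <= ts <= now]
    return bool(in_window) and all(pred(v) for v in in_window)

now = datetime(2017, 1, 1, 12, 12)
alert = always_in_window(readings, now, timedelta(minutes=10), lambda t: t > 95)
print("Alert:", alert)  # True — HighTemp held throughout the last 10 minutes
```

An ontology-mediated query engine generalises this idea: the temporal rule is part of the ontology, and evaluation is pushed down to the log data rather than hand-coded per query.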

Proceedings Article
Qingda Hu, Jinglei Ren, Anirudh Badam, Jiwu Shu, Thomas Moscibroda
12 Jul 2017
TL;DR: This paper presents a log-structured NVMM system that not only maintains NVMM in a compact manner but also reduces the write traffic and the number of persist barriers needed for executing transactions.
Abstract: Emerging non-volatile main memory (NVMM) unlocks the performance potential of applications by storing persistent data in the main memory. Such applications require a lightweight persistent transactional memory (PTM) system, instead of a heavyweight filesystem or database, to have fast access to data. In a PTM system, the memory usage, both capacity and bandwidth, plays a key role in dictating performance and efficiency. Existing memory management mechanisms for PTMs generate high memory fragmentation, high write traffic and a large number of persist barriers, since data is first written to a log and then to the main data store. In this paper, we present a log-structured NVMM system that not only maintains NVMM in a compact manner but also reduces the write traffic and the number of persist barriers needed for executing transactions. All data allocations and modifications are appended to the log which becomes the location of the data. Further, we address a unique challenge of log-structured memory management by designing a tree-based address translation mechanism where access granularities are flexible and different from allocation granularities. Our results show that the new system enjoys up to 89.9% higher transaction throughput and up to 82.8% lower write traffic than a traditional PTM system.
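The core memory-management idea can be sketched in a few lines: every allocation or update is an append to the log, and a translation structure redirects reads to the latest version. In this illustrative toy a plain dict stands in for the paper's tree-based address translation:

```python
# Toy log-structured store: all allocations and updates append to a log,
# and a translation table maps object ids to their latest log offset.
class LogStructuredStore:
    def __init__(self):
        self.log = bytearray()  # stands in for persistent NVMM
        self.translate = {}     # object id -> (offset, length) of latest version

    def write(self, obj_id, data: bytes):
        off = len(self.log)
        self.log += data                            # append: the log IS the data store
        self.translate[obj_id] = (off, len(data))   # redirect reads to the new version
        # On real NVMM a persist barrier would be issued here, once, after the
        # append — not once for a log entry and again for a separate home copy,
        # which is the double write traffic the paper is eliminating.

    def read(self, obj_id) -> bytes:
        off, length = self.translate[obj_id]
        return bytes(self.log[off:off + length])

s = LogStructuredStore()
s.write("row7", b"v1")
s.write("row7", b"v2-updated")  # old bytes become garbage for later compaction
print(s.read("row7"))           # b'v2-updated'
```

The paper's tree-based translation additionally lets read granularities differ from allocation granularities, which a flat dict cannot express.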

Proceedings ArticleDOI
01 Oct 2017
TL;DR: A blockchain-based data usage auditing architecture that ensures availability and accountability in a privacy-preserving fashion and, based on cryptographic mechanisms, preserves the privacy of data owners and ensures secrecy for data shared with multiple service providers.
Abstract: Recent years have witnessed the trend of increasingly relying on distributed infrastructures. This has increased the number of reported incidents of security breaches compromising users' privacy, where third parties massively collect, process, and manage users' personal data. To address these security and privacy challenges, we combine hierarchical identity-based cryptographic mechanisms with emerging blockchain infrastructures and propose a blockchain-based data usage auditing architecture that ensures availability and accountability in a privacy-preserving fashion. Our approach relies on the use of auditable contracts deployed in blockchain infrastructures. Thus, it offers transparent and controlled data access, sharing, and processing, so that unauthorized users or untrusted servers cannot process data without the client's authorization. Moreover, based on cryptographic mechanisms, our solution preserves the privacy of data owners and ensures secrecy for data shared with multiple service providers. It also provides auditing authorities with tamper-proof evidence of data usage compliance.

Journal ArticleDOI
TL;DR: A platform for sharing medical imaging data between clinicians and researchers that automates anonymisation of pixel data and metadata at the clinical site and maintains subject data groupings while preserving anonymity.

Journal ArticleDOI
TL;DR: The efficient cataloguing approach of the federated query processing system BioFed, its triple-pattern-wise source selection, and its semantic source normalisation form the core of the solution, which facilitates efficient query generation for data access and provides basic provenance information in combination with the retrieved data.
Abstract: Biomedical data, e.g. from knowledge bases and ontologies, is increasingly made available following open linked data principles, at best as RDF triple data. This is a necessary step towards unified access to biological data sets, but it still requires solutions for querying multiple endpoints for their heterogeneous data to eventually retrieve all the meaningful information. Suggested solutions are based on query federation approaches, which require the submission of SPARQL queries to endpoints. Due to the size and complexity of the available data, these solutions have to be optimised for efficient retrieval times and for users in life sciences research. Moreover, over time, the reliability of data resources in terms of access and quality has to be monitored. Our solution (BioFed) federates data over 130 SPARQL endpoints in life sciences and tailors query submission according to the provenance information. BioFed has been evaluated against the state-of-the-art solution FedX and forms an important benchmark for the life science domain. The efficient cataloguing approach of the federated query processing system BioFed, the triple-pattern-wise source selection, and the semantic source normalisation form the core of our solution. It gathers and integrates data from newly identified public endpoints for federated access, and basic provenance information is linked to the retrieved data. Finally, BioFed makes use of the latest SPARQL standard (i.e., 1.1) to leverage the full benefits of query federation. The evaluation is based on 10 simple and 10 complex queries, which address data in 10 major and very popular data sources (e.g., DrugBank, SIDER). BioFed is a solution for a single point of access to a large number of SPARQL endpoints providing life science data. It facilitates efficient query generation for data access and provides basic provenance information in combination with the retrieved data. BioFed fully supports SPARQL 1.1 and gives access to each endpoint's availability based on the EndpointData graph. Our evaluation of BioFed against FedX is based on 20 heterogeneous federated SPARQL queries and shows competitive execution performance in comparison to FedX, which can be attributed to the provision of provenance information for the source selection. Developing and testing federated query engines for life sciences data is still a challenging task. According to our findings, it is advantageous to optimise the source selection. The cataloguing of SPARQL endpoints, including type and property indexing, leads to efficient querying of data resources over the Web of Data. This could be further improved through the use of ontologies, e.g., for abstract normalisation of query terms.
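A hedged sketch of the central idea: triple-pattern-wise source selection against a precomputed catalogue, followed by direct endpoint queries. The catalogue contents and endpoint URL are invented placeholders; BioFed's real index and optimiser are far more elaborate:

```python
# Sketch of triple-pattern-wise source selection against a catalogue, then
# querying only the selected endpoint(s) with SPARQLWrapper.
from SPARQLWrapper import SPARQLWrapper, JSON

# catalogue: which endpoints can answer which predicates (toy index;
# the endpoint URL below is a placeholder, not a real service)
CATALOGUE = {
    "http://purl.org/dc/terms/title": ["https://example.org/sparql/drugbank"],
}

def select_sources(predicate):
    """Consult the catalogue instead of broadcasting ASK probes
    to all ~130 endpoints for every triple pattern."""
    return CATALOGUE.get(predicate, [])

def run(predicate, query):
    results = []
    for endpoint in select_sources(predicate):
        sw = SPARQLWrapper(endpoint)
        sw.setQuery(query)
        sw.setReturnFormat(JSON)
        # keep the endpoint alongside the bindings: basic provenance
        results.append((endpoint, sw.query().convert()))
    return results

query = "SELECT ?s ?t WHERE { ?s <http://purl.org/dc/terms/title> ?t } LIMIT 5"
# run("http://purl.org/dc/terms/title", query)  # requires a live endpoint
```

Keeping the source endpoint attached to each result set is the cheap form of the provenance information the abstract credits for the competitive source selection.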

Book ChapterDOI
28 Jun 2017
TL;DR: This work exploits a recently introduced framework and associated methodology for the extraction of XES event logs from relational data sources, and builds on the ontology-based data access (OBDA) paradigm for the actual log extraction.
Abstract: Process mining aims at discovering, monitoring, and improving business processes by extracting knowledge from event logs. In this respect, process mining can be applied only if there are proper event logs that are compatible with accepted standards, such as extensible event stream (XES). Unfortunately, in many real world set-ups, such event logs are not explicitly given, but instead are implicitly represented in legacy information systems. In this work, we exploit a framework and associated methodology for the extraction of XES event logs from relational data sources that we have recently introduced. Our approach is based on describing logs by means of suitable annotations of a conceptual model of the available data, and builds on the ontology-based data access (OBDA) paradigm for the actual log extraction. Making use of a real-world case study in the services domain, we compare our novel approach with a more traditional extract-transform-load based one, and are able to illustrate its added value. We also present a set of tools that we have developed and that support the OBDA-based log extraction framework. The tools are integrated as plugins of the ProM process mining suite.
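To make the extraction step concrete: the annotations over the conceptual model effectively define how query results over the legacy relational data populate XES traces and events. A hedged Python sketch of that final materialization step; the schema, rows, and case identifiers are invented, and real OBDA tooling sits between the database and this step:

```python
# Sketch: relational rows (as an OBDA query might return them) become
# XES events grouped into traces, using standard XES attribute keys.
import xml.etree.ElementTree as ET

rows = [  # (case_id, activity, timestamp) — invented example data
    ("order-1", "Create", "2017-06-28T09:00:00"),
    ("order-1", "Approve", "2017-06-28T10:30:00"),
]

log = ET.Element("log", {"xes.version": "1.0"})
traces = {}
for case_id, activity, ts in rows:
    trace = traces.get(case_id)
    if trace is None:  # one <trace> per process instance (case)
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        traces[case_id] = trace
    event = ET.SubElement(trace, "event")
    ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
    ET.SubElement(event, "date", {"key": "time:timestamp", "value": ts})

print(ET.tostring(log, encoding="unicode"))
```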

Proceedings ArticleDOI
04 Oct 2017
TL;DR: DPCM is designed to reduce data access latency through parallel processing approaches and by exploiting device-side state replicas; it is implemented and validated with extensive evaluations.
Abstract: Control-plane operations are indispensable to providing data access to mobile devices in the 4G LTE networks. They provision necessary control states at the device and network nodes to enable data access. However, the current design may suffer from long data access latency even under good radio conditions. The fundamental problem is that, data-plane packet delivery cannot start or resume until all control-plane procedures are completed, and these control procedures run sequentially by design. We show both are more than necessary under popular use cases. We design DPCM, which reduces data access latency through parallel processing approaches and exploiting device-side state replica. We implement DPCM and validate its effectiveness with extensive evaluations.

Book ChapterDOI
06 Dec 2017
TL;DR: This paper proposes using the trusted execution platform enabled by Intel SGX to provide accountability for data access, and a decentralized approach with blockchain technology to address the privacy concern.
Abstract: With the increasing development and adoption of wearable devices, people care more about their health conditions than ever before. Patients and doctors, as well as insurance agencies, benefit from this advanced technology. However, emerging wearable devices create a major concern over health data privacy, as the data collected from these devices can reflect patients' health conditions and habits and could increase data disclosure risks among healthcare providers and application vendors. In this paper, we propose using the trusted execution platform enabled by Intel SGX to provide accountability for data access, and we propose a decentralized approach with blockchain technology to address the privacy concern. By developing a web application for personal health data management (PHDM) systems, individuals can synchronize sensor data from wearable devices with an online account and control data access by any third parties. The protected personal health data and data access records are hashed and anchored to a permanent but secure ledger with platform dependency, ensuring data integrity and accountability. Analysis shows that our approach provides user privacy and accountability with acceptable overhead.
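The anchoring idea can be sketched directly: hash-chain the access records so that, once the head hash is written to a ledger, any later tampering with a record is detectable. The record format and ledger are invented placeholders:

```python
# Sketch: hash-chain access records; anchoring the head hash on a blockchain
# makes the whole history tamper-evident.
import hashlib, json

def record_hash(record, prev_hash):
    payload = json.dumps(record, sort_keys=True).encode() + prev_hash
    return hashlib.sha256(payload).hexdigest().encode()

chain, head = [], b"genesis"
for rec in [
    {"who": "dr_bob", "what": "heart_rate", "when": "2017-12-06T10:00Z"},
    {"who": "insurer_x", "what": "step_count", "when": "2017-12-06T11:30Z"},
]:
    head = record_hash(rec, head)  # each hash covers the record AND its predecessor
    chain.append((rec, head))

print("anchor this on-chain:", head.decode())
# Verification: recompute the chain from "genesis"; editing any record
# changes every later hash, so the anchored head no longer matches.
```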

Journal ArticleDOI
01 Jan 2017
TL;DR: In this article, the authors review the research challenges in building personal Databoxes that hold personal data and enable data access by, and thus potentially data sharing with, other parties.
Abstract: The Internet of Things is expected to generate large amounts of heterogeneous data from diverse sources, including physical sensors, user devices, and social media platforms. Over the last few years, significant attention has been focused on personal data, particularly data generated by smart wearable and smart home devices. Making personal data available for access and trade is expected to become a part of the data-driven digital economy. In this position paper, we review the research challenges in building personal Databoxes that hold personal data and enable data access by, and thus potentially data sharing with, other parties. These Databoxes are expected to become a core part of future data marketplaces.

Journal ArticleDOI
TL;DR: A text mining based method to infer the purpose of sensitive data access by Android apps is proposed; the key idea is to extract multiple features from app code and then use those features to train a machine learning classifier for purpose inference.
Abstract: Mobile apps frequently request access to sensitive data, such as location and contacts. Understanding the purpose of why sensitive data is accessed could help improve privacy as well as enable new kinds of access control. In this article, we propose a text mining based method to infer the purpose of sensitive data access by Android apps. The key idea we propose is to extract multiple features from app code and then use those features to train a machine learning classifier for purpose inference. We present the design, implementation, and evaluation of two complementary approaches to infer the purpose of permission use, first using purely static analysis, and then using primarily dynamic analysis. We also discuss the pros and cons of both approaches and the trade-offs involved.
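A toy version of the pipeline the article describes — identifier-style text features extracted from app code, fed to a classifier — using scikit-learn. The snippets, purpose labels, and feature choice are invented and far smaller than any real training set:

```python
# Sketch: identifier strings harvested from app code around a location access
# become text features for a purpose classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

code_snippets = [  # invented identifier bags around a location API call
    "getLastKnownLocation showNearbyRestaurants mapView",
    "getLastKnownLocation adRequest setTargetingParams bannerAd",
    "requestLocationUpdates geofence arrivalReminder notify",
    "getLatitude getLongitude trackEvent analyticsSession upload",
]
purposes = ["search_nearby", "advertising", "geofencing", "analytics"]

clf = make_pipeline(TfidfVectorizer(token_pattern=r"[A-Za-z]+"),
                    LogisticRegression())
clf.fit(code_snippets, purposes)
print(clf.predict(["getLastKnownLocation interstitialAd loadAd"]))
```

The article's static and dynamic approaches differ in *where* these features come from (decompiled code vs. runtime traces), but both feed a classifier of this general shape.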

Journal ArticleDOI
TL;DR: An overview of WebMeV is provided and two simple use cases are demonstrated that illustrate the value of putting data analysis in the hands of those looking to explore the underlying biology of the systems being studied.
Abstract: Although large, complex genomic datasets are increasingly easy to generate, and the number of publicly available datasets in cancer and other diseases is rapidly growing, the lack of intuitive, easy-to-use analysis tools has remained a barrier to the effective use of such data. WebMeV (http://mev.tm4.org) is an open-source, web-based tool that gives users access to sophisticated tools for analysis of RNA-Seq and other data in an interface designed to democratize data access. WebMeV combines cloud-based technologies with a simple user interface to allow users to access large public datasets, such as that from The Cancer Genome Atlas or to upload their own. The interface allows users to visualize data and to apply advanced data mining analysis methods to explore the data and draw biologically meaningful conclusions. We provide an overview of WebMeV and demonstrate two simple use cases that illustrate the value of putting data analysis in the hands of those looking to explore the underlying biology of the systems being studied.

Journal ArticleDOI
TL;DR: The increased availability of large remote sensing datasets is generating heightened interest within the geoscience community, and more generally within human society.

Journal ArticleDOI
01 Apr 2017
TL;DR: In this article, a framework for modular data access is presented, in which individual data accessors for simple data structures may be freely combined to obtain more complex data accessors for compound data structures.
Abstract: CONTEXT: Data accessors allow one to read and write components of a data structure, such as the fields of a record, the variants of a union, or the elements of a container. These data accessors are collectively known as optics; they are fundamental to programs that manipulate complex data. INQUIRY: Individual data accessors for simple data structures are easy to write, for example as pairs of "getter" and "setter" methods. However, it is not obvious how to combine data accessors, in such a way that data accessors for a compound data structure are composed out of smaller data accessors for the parts of that structure. Generally, one has to write a sequence of statements or declarations that navigate step by step through the data structure, accessing one level at a time - which is to say, data accessors are traditionally not first-class citizens, combinable in their own right. APPROACH: We present a framework for modular data access, in which individual data accessors for simple data structures may be freely combined to obtain more complex data accessors for compound data structures. Data accessors become first-class citizens. The framework is based around the notion of profunctors, a flexible generalization of functions. KNOWLEDGE: The language features required are higher-order functions ("lambdas" or "closures"), parametrized types ("generics" or "abstract types"), and some mechanism for separating interfaces from implementations ("abstract classes" or "modules"). We use Haskell as a vehicle in which to present our constructions, but languages such as Java, C#, or Scala that provide the necessary features should work just as well. GROUNDING: We provide implementations of all our constructions, in the form of a literate program: the manuscript file for the paper is also the source code for the program, and the extracted code is available separately for evaluation. We also prove the essential properties demonstrating that our profunctor-based representations are precisely equivalent to the more familiar concrete representations. IMPORTANCE: Our results should pave the way to simpler ways of writing programs that access the components of compound data structures.
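As a taste of what "first-class, combinable accessors" means in practice, here is a minimal get/put lens in Python with a composition operator. The paper's actual framework is profunctor-based and presented in Haskell; this sketch only mirrors the composition idea, and all names are invented:

```python
# A minimal get/put lens: data accessors as first-class, composable values.
class Lens:
    def __init__(self, get, put):
        self.get = get            # whole -> part
        self.put = put            # (whole, new part) -> new whole

    def __matmul__(self, inner):  # compose: outer @ inner focuses deeper
        return Lens(
            get=lambda w: inner.get(self.get(w)),
            put=lambda w, p: self.put(w, inner.put(self.get(w), p)),
        )

def key(k):
    """Lens onto one entry of a dict, updating immutably."""
    return Lens(get=lambda d: d[k],
                put=lambda d, v: {**d, k: v})

address_city = key("address") @ key("city")   # composed accessor, no plumbing
person = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
print(address_city.get(person))               # London
print(address_city.put(person, "Zurich"))     # new nested dict, original intact
```

The point of the paper is that this composition works uniformly across lenses, prisms, and traversals once accessors are represented profunctorially, rather than needing ad hoc navigation code per data structure.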

Journal ArticleDOI
TL;DR: A web-based application is developed that convincingly confirms the usefulness of the novel data integration methodology, based on a metamodel approach, for querying data individually from different relational and NoSQL database systems.