
Showing papers on "Upload" published in 2019


Proceedings ArticleDOI
20 May 2019
TL;DR: In this paper, a federated learning (FL) protocol for heterogeneous clients in a mobile edge computing (MEC) network is proposed. The authors consider the problem of efficiently aggregating client updates in the overall training process and propose a new protocol, FedCS, to solve it.
Abstract: We envision a mobile edge computing (MEC) framework for machine learning (ML) technologies, which leverages distributed client data and computation resources for training high-performance ML models while preserving client privacy. Toward this future goal, this work aims to extend Federated Learning (FL), a decentralized learning framework that enables privacy-preserving training of models, to work with heterogeneous clients in a practical cellular network. The FL protocol iteratively asks random clients to download a trainable model from a server, update it with their own data, and upload the updated model to the server, while asking the server to aggregate multiple client updates to further improve the model. While clients in this protocol are never asked to disclose their own private data, the overall training process can become inefficient when some clients have limited computational resources (i.e., requiring longer update time) or are under poor wireless channel conditions (longer upload time). Our new FL protocol, which we refer to as FedCS, mitigates this problem and performs FL efficiently while actively managing clients based on their resource conditions. Specifically, FedCS solves a client selection problem with resource constraints, which allows the server to aggregate as many client updates as possible and to accelerate performance improvement in ML models. We conducted an experimental evaluation using publicly available large-scale image datasets to train deep neural networks in MEC environment simulations. The experimental results show that FedCS is able to complete its training process in a significantly shorter time than the original FL protocol.
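The resource-constrained client selection at the heart of FedCS can be illustrated with a toy greedy heuristic (a hypothetical simplification of the paper's scheme; the deadline, timing model, and client names below are illustrative assumptions):

```python
def select_clients(clients, deadline):
    """clients: list of (client_id, update_time, upload_time).
    Updates run in parallel; uploads are serialized (a simplification)."""
    ranked = sorted(clients, key=lambda c: c[1] + c[2])   # fastest first
    selected, upload_total, slowest_update = [], 0.0, 0.0
    for cid, t_update, t_upload in ranked:
        new_upload = upload_total + t_upload
        new_update = max(slowest_update, t_update)
        if new_update + new_upload <= deadline:           # round must finish in time
            selected.append(cid)
            upload_total, slowest_update = new_upload, new_update
    return selected

clients = [("a", 2.0, 1.0), ("b", 5.0, 4.0), ("c", 1.0, 1.0), ("d", 9.0, 3.0)]
print(select_clients(clients, deadline=8.0))   # → ['c', 'a']
```

Under this model, slow or poorly connected clients ("b" and "d") are simply skipped for the round, which is the intuition behind aggregating as many updates as possible per deadline.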

1,044 citations


Proceedings Article
28 Sep 2019
TL;DR: FedPAQ is presented, a communication-efficient Federated Learning method with Periodic Averaging and Quantization that achieves near-optimal theoretical guarantees for strongly convex and non-convex loss functions and empirically demonstrate the communication-computation tradeoff provided by the method.
Abstract: Federated learning is a distributed framework according to which a model is trained over a set of devices, while keeping data localized. This framework faces several systems-oriented challenges, which include (i) a communication bottleneck, since a large number of devices upload their local updates to a parameter server, and (ii) scalability, as the federated network consists of millions of devices. Due to these systems challenges, as well as issues related to statistical heterogeneity of data and privacy concerns, designing a provably efficient federated learning method is of significant importance, yet it remains challenging. In this paper, we present FedPAQ, a communication-efficient Federated Learning method with Periodic Averaging and Quantization. FedPAQ relies on three key features: (1) periodic averaging, where models are updated locally at devices and only periodically averaged at the server; (2) partial device participation, where only a fraction of devices participate in each round of the training; and (3) quantized message-passing, where the edge nodes quantize their updates before uploading to the parameter server. These features address the communications and scalability challenges in federated learning. We also show that FedPAQ achieves near-optimal theoretical guarantees for strongly convex and non-convex loss functions and empirically demonstrate the communication-computation tradeoff provided by our method.
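The quantized message-passing ingredient can be sketched with a toy unbiased stochastic quantizer plus a plain server-side averaging step (the grid parameters and update values are illustrative assumptions, not the paper's construction):

```python
import random

def quantize(vec, levels=4, lo=-1.0, hi=1.0):
    """Unbiased stochastic quantizer onto a uniform grid of `levels` points."""
    step = (hi - lo) / (levels - 1)
    out = []
    for x in vec:
        k = (min(max(x, lo), hi) - lo) / step    # grid position in [0, levels-1]
        base = int(k)
        if random.random() < k - base:           # round up with prob (k - base)
            base += 1
        out.append(lo + base * step)
    return out

def average(updates):
    # server-side periodic averaging of the (quantized) local updates
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]

random.seed(0)
local_updates = [[0.3, -0.7], [0.5, 0.1], [-0.2, 0.9]]  # from sampled devices
server_model = average([quantize(u) for u in local_updates])
print(server_model)
```

Each transmitted coordinate now needs only log2(levels) bits instead of a full float, which is the source of the communication savings the paper quantifies.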

380 citations


Journal ArticleDOI
TL;DR: This paper proposes a cooperative UAV sense-and-send protocol to enable UAV-to-X communications, formulates the subchannel allocation and UAV speed optimization problem to maximize the uplink sum-rate, and shows that the proposed ISASOA can upload 10% more data than the greedy algorithm.
Abstract: In this paper, we consider a single-cell cellular network with a number of cellular users (CUs) and unmanned aerial vehicles (UAVs), in which multiple UAVs upload their collected data to the base station (BS). Two transmission modes are considered to support the multi-UAV communications, i.e., UAV-to-network (U2N) and UAV-to-UAV (U2U) communications. Specifically, the UAV with a high signal-to-noise ratio (SNR) for the U2N link uploads its collected data directly to the BS through U2N communication, while the UAV with a low SNR for the U2N link can transmit data to a nearby UAV through underlaying U2U communication for the sake of quality of service. We first propose a cooperative UAV sense-and-send protocol to enable the UAV-to-X communications, and then formulate the subchannel allocation and UAV speed optimization problem to maximize the uplink sum-rate. To solve this NP-hard problem efficiently, we decouple it into three sub-problems: U2N and cellular user (CU) subchannel allocation, U2U subchannel allocation, and UAV speed optimization. An iterative subchannel allocation and speed optimization algorithm (ISASOA) is proposed to solve these sub-problems jointly. The simulation results show that the proposed ISASOA can upload 10% more data than the greedy algorithm.

314 citations


Journal ArticleDOI
TL;DR: Experimental results showed that the predictors generated by BioSeq-Analysis even outperformed some state-of-the-art methods and will become a useful tool for biological sequence analysis.
Abstract: With the avalanche of biological sequences generated in the post-genomic age, one of the most challenging problems is how to computationally analyze their structures and functions. Machine learning techniques are playing key roles in this field. Typically, predictors based on machine learning techniques contain three main steps: feature extraction, predictor construction, and performance evaluation. Although several Web servers and stand-alone tools have been developed to facilitate biological sequence analysis, they focus on only an individual step. In this regard, in this study a powerful Web server called BioSeq-Analysis (http://bioinformatics.hitsz.edu.cn/BioSeq-Analysis/) is proposed to automatically complete the three main steps for constructing a predictor. The user only needs to upload the benchmark data set. BioSeq-Analysis can generate the optimized predictor based on the benchmark data set, and the performance measures can be reported as well. Furthermore, to maximize users' convenience, its stand-alone program was also released, which can be downloaded from http://bioinformatics.hitsz.edu.cn/BioSeq-Analysis/download/, and can be run directly on Windows, Linux, and UNIX. When applied to three sequence analysis tasks, the predictors generated by BioSeq-Analysis even outperformed some state-of-the-art methods. It is anticipated that BioSeq-Analysis will become a useful tool for biological sequence analysis.

260 citations


Journal ArticleDOI
TL;DR: A digital signature technique based on bilinear pairings over elliptic curves is used to ensure reliability and integrity when transmitting data to a node in the DSSCB system.
Abstract: A vehicular ad-hoc network (VANET) can improve the flow of traffic to facilitate intelligent transportation and to provide convenient information services, where the goal is to provide self-organizing data transmission capabilities for vehicles on the road to enable applications such as assisted vehicle driving and safety warnings. VANETs are affected by issues such as identity validity and message reliability when vehicle nodes share data with other nodes. The method of allowing vehicle nodes to upload sensor data to a trusted center for storage is susceptible to security risks, such as malicious tampering and data leakage. To address these security challenges, we propose a data security sharing and storage system based on the consortium blockchain (DSSCB). A digital signature technique based on bilinear pairings over elliptic curves is used to ensure reliability and integrity when transmitting data to a node. The emerging consortium blockchain technology provides a decentralized, secure, and reliable database, which is maintained by all nodes in the network. In DSSCB, smart contracts are used to limit the triggering conditions for preselected nodes when transmitting and storing data and for allocating data coins to vehicles that participate in the contribution of data. The security analysis and performance evaluations demonstrated that our DSSCB solution is more secure and reliable in terms of data sharing and storage. Compared with the traditional blockchain system, the time required to confirm the data block was reduced by nearly six times and the transmission efficiency was improved by 83.33%.

192 citations


Journal ArticleDOI
TL;DR: This work proposes an efficient and privacy-preserving carpooling scheme using blockchain-assisted vehicular fog computing to support conditional privacy, one-to-many matching, destination matching, and data auditability, and authenticates users in a conditionally anonymous way.
Abstract: Carpooling enables passengers to share a vehicle to reduce traveling time, vehicle carbon emissions, and traffic congestion. However, most passengers prefer to find local drivers, yet querying a remote cloud server incurs unnecessary communication overhead and an increased response delay. Recently, fog computing has been introduced to provide local data processing with low latency, but it also raises new security and privacy concerns, because users' private information (e.g., identity and location) could be disclosed when this information is shared during carpooling. While such information can be encrypted before transmission, encryption makes user matching a challenging task, and malicious users can upload false locations. Moreover, carpooling records should be kept in a distributed manner to guarantee reliable data auditability. To address these problems, we propose an efficient and privacy-preserving carpooling scheme using blockchain-assisted vehicular fog computing to support conditional privacy, one-to-many matching, destination matching, and data auditability. Specifically, we authenticate users in a conditionally anonymous way. Also, we adopt a private proximity test to achieve one-to-many proximity matching and extend it to efficiently establish a secret communication key between a passenger and a driver. We store all location grids in a tree and achieve get-off location matching using a range query technique. A private blockchain is built to store carpooling records. Finally, we analyze the security and privacy properties of the proposed scheme, and evaluate its performance in terms of computational costs and communication overhead.

181 citations


Journal ArticleDOI
18 Dec 2019-Symmetry
TL;DR: Comparison and analysis of the system with similar applications shows that although they have similar functions, the proposed system offers more practicability, better information accessibility, excellent user experience, and approximately the optimal balance (a kind of symmetry) of the important items of the interface design.
Abstract: Taiwan is a highly information-oriented country, and a robust traffic network is not only critical to the national economy, but is also an important infrastructure for economic development. This paper aims to integrate government open data and global positioning system (GPS) technology to build an instant image-based traffic assistant agent with user-friendly interfaces, thus providing more convenient real-time traffic information for users and relevant government units. The proposed system is expected to overcome the difficulty of accurately distinguishing traffic information and to solve the problem of some road sections not providing instant information. Taking the New Taipei City Government traffic open data as an example, the proposed system can display information pages at an optimal size on smartphones and other computer devices, and integrate database analysis to instantly view traffic information. Users can enter the system without downloading the application and can access the cross-platform services using device browsers. The proposed system also provides a user reporting mechanism, which informs vehicle drivers on congested road sections about road conditions. Comparison and analysis of the system with similar applications shows that although they have similar functions, the proposed system offers more practicability, better information accessibility, excellent user experience, and approximately the optimal balance (a kind of symmetry) of the important items of the interface design.

168 citations


Journal ArticleDOI
TL;DR: A blockchain-based secure data sharing platform that leverages the benefits of the InterPlanetary File System (IPFS); results show that Shamir's secret sharing (SSS) requires the least computational time compared with advanced encryption standard (AES) 128 and 256.
Abstract: In a research community, data sharing is an essential step to gain maximum knowledge from prior work. Existing data sharing platforms depend on a trusted third party (TTP). Due to the involvement of the TTP, such systems lack trust, transparency, security, and immutability. To overcome these issues, this paper proposes a blockchain-based secure data sharing platform that leverages the benefits of the InterPlanetary File System (IPFS). Metadata is uploaded to the IPFS server by the owner and then divided into n secret shares. The proposed scheme achieves security and access control by executing the access roles written in the smart contract by the owner. Users are first authenticated through RSA signatures and then submit the requested amount as the price of the digital content. After the successful delivery of the data, the user is encouraged to register reviews about the data. These reviews are validated through the Watson analyzer to filter out fake reviews. The customers registering valid reviews are given incentives. In this way, maximum reviews are submitted for every file. In this scenario, decentralized storage, the Ethereum blockchain, encryption, and an incentive mechanism are combined. To implement the proposed scenario, smart contracts are written in Solidity and deployed on a local Ethereum test network. The proposed scheme achieves transparency, security, access control, authenticity of the owner, and quality of data. In the simulation results, an analysis is performed on gas consumption and the actual cost required in terms of USD, so that a good price estimate can be made when deploying the implemented scenario in a real setup. Moreover, the computational time for different encryption schemes is plotted to represent the performance of the implemented scheme, which is Shamir's secret sharing (SSS). Results show that SSS requires the least computational time compared with advanced encryption standard (AES) 128 and 256.
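The Shamir's secret sharing scheme benchmarked above can be sketched in a few lines (a textbook construction over a prime field; the prime, secret, and (n, t) parameters are illustrative, not taken from the paper):

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime large enough for small secrets

def split(secret, n, t):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = split(123456789, n=5, t=3)
print(reconstruct(shares[:3]))   # → 123456789
```

Fewer than t shares reveal nothing about the secret, which is why splitting the metadata into n shares removes any single point of trust.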

145 citations


Proceedings ArticleDOI
01 Aug 2019
TL;DR: This work studies and evaluates a poisoning attack on a federated learning system based on generative adversarial nets (GAN), where an attacker first acts as a benign participant and stealthily trains a GAN to mimic prototypical samples of the other participants' training sets, which do not belong to the attacker.
Abstract: Federated learning is a novel distributed learning framework, where the deep learning model is trained in a collaborative manner among thousands of participants. Only model parameters are shared between the server and participants, which prevents the server from directly accessing the private training data. However, we notice that the federated learning architecture is vulnerable to an active attack from insider participants, called a poisoning attack, where the attacker can act as a benign participant in federated learning and upload a poisoned update to the server, easily affecting the performance of the global model. In this work, we study and evaluate a poisoning attack on a federated learning system based on generative adversarial nets (GAN). That is, an attacker first acts as a benign participant and stealthily trains a GAN to mimic prototypical samples of the other participants' training sets, which do not belong to the attacker. These generated samples are then fully controlled by the attacker to generate the poisoning updates, and the global model is compromised by the attacker uploading the scaled poisoning updates to the server. In our evaluation, we show that the attacker in our construction can successfully generate samples of other benign participants using the GAN, and that the global model achieves more than 80% accuracy on both the poisoning tasks and the main tasks.
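The final scaling step of such an attack can be illustrated with a toy numeric example (hypothetical model vectors and participant count; not the paper's code):

```python
def aggregate(updates):
    # server-side averaging of participant updates
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]

def scaled_poison(global_model, target_model, n_participants):
    # scale by n so that averaging with (n - 1) small benign updates
    # still pulls the aggregate toward the attacker's target
    return [n_participants * (t - g) for t, g in zip(target_model, global_model)]

global_model = [0.0, 0.0]
benign = [[0.01, -0.02], [0.02, 0.01]]          # small honest updates
poison = scaled_poison(global_model, [1.0, -1.0], n_participants=3)
new_model = [g + d for g, d in zip(global_model, aggregate(benign + [poison]))]
print(new_model)   # lands close to the attacker's target [1.0, -1.0]
```

The scaling compensates for the dilution caused by averaging, which is why a single insider can steer the global model despite contributing only one of n updates.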

120 citations


Journal ArticleDOI
TL;DR: It is proved that DADP can provide real-time crowd-sourced statistical data publishing with strong privacy protection under an untrusted server, using a distributed budget allocation mechanism and an agent-based dynamic grouping mechanism to realize global w-event ε-differential privacy in a distributed way.
Abstract: The continuous publication of aggregate statistics over crowd-sourced data to the public has enabled many data mining applications (e.g., real-time traffic analysis). Existing systems usually rely on a trusted server to aggregate the spatio-temporal crowd-sourced data and then apply a differential privacy mechanism to perturb the aggregate statistics before publishing to provide a strong privacy guarantee. However, the privacy of users will be exposed once the server is hacked or cannot be trusted. In this paper, we study the problem of real-time crowd-sourced statistical data publishing with strong privacy protection under an untrusted server. We propose a novel distributed agent-based privacy-preserving framework, called DADP, that introduces a new level of multiple agents between the users and the untrusted server. Instead of directly uploading the check-in information to the untrusted server, a user can randomly select one agent and upload the check-in information to it with anonymous connection technology. Each agent aggregates the received crowd-sourced data and perturbs the aggregated statistics locally with the Laplace mechanism. The perturbed statistics from all the agents are further combined together to form the entire perturbed statistics for publication. In particular, we propose a distributed budget allocation mechanism and an agent-based dynamic grouping mechanism to realize global w-event ε-differential privacy in a distributed way. We prove that DADP can provide w-event ε-differential privacy for real-time crowd-sourced statistical data publishing under the untrusted server. Extensive experiments on real-world datasets demonstrate the effectiveness of DADP.
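The agent-side Laplace perturbation can be sketched as follows (a simplification: the paper's distributed w-event budget allocation is replaced by a fixed per-publication epsilon, and the counts are illustrative):

```python
import math
import random

def laplace_noise(scale, rng):
    # inverse-CDF sampling of Laplace(0, scale)
    u = rng.random() - 0.5                       # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1, u) * math.log(max(1 - 2 * abs(u), 1e-300))

def perturb_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise with scale sensitivity/epsilon to each aggregate."""
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]

# two agents each perturb their own aggregates; the server only sees noisy sums
agents = [[10.0, 4.0, 7.0], [3.0, 6.0, 2.0]]
published = [sum(col)
             for col in zip(*(perturb_counts(a, epsilon=1.0) for a in agents))]
print(published)
```

Because each agent perturbs locally before anything reaches the server, the untrusted server never observes exact per-agent counts, which is the core of the trust model described above.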

101 citations


Journal ArticleDOI
TL;DR: This paper investigates the cloud-based road condition monitoring (RCoM) scenario, where the authority needs to monitor real-time road conditions with the help of a cloud server so that it can respond soundly and promptly to emergency cases.
Abstract: The connected vehicular ad hoc network (VANET) and cloud computing technology allow entities in a VANET to enjoy the advantageous storage and computing services offered by a cloud service provider. However, the advantages do not come free, since this combination brings many new security and privacy requirements for VANET applications. In this paper, we investigate the cloud-based road condition monitoring (RCoM) scenario, where the authority needs to monitor real-time road conditions with the help of a cloud server so that it can respond soundly and promptly to emergency cases. When a bad road condition is detected, e.g., a geologic hazard or accident happens, vehicles on site are able to report such information to a cloud server engaged by the authority. We focus on addressing three key issues in RCoM. First, the vehicles have to be authorized by some roadside unit before generating a road condition report in the domain and uploading it to the cloud server. Second, to guarantee privacy against the cloud server, the road condition information should be reported in ciphertext format, which requires that the cloud server be able to distinguish the reported data from different vehicles in ciphertext format for the same place without compromising their confidentiality. Third, the cloud server and authority should be able to validate the report source, i.e., to check whether the road conditions are reported by legitimate vehicles. To address these issues, we present an efficient RCoM scheme, analyze its efficiency theoretically, and demonstrate its practicality through experiments.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a terrestrial-satellite network (TSN) architecture to integrate the ultra-dense low earth orbit (LEO) networks and the terrestrial networks to achieve efficient data offloading.
Abstract: In this paper, we propose a terrestrial-satellite network (TSN) architecture to integrate the ultra-dense low earth orbit (LEO) networks and the terrestrial networks to achieve efficient data offloading. In TSN, each ground user can access the network over C-band via a macro cell, a traditional small cell, or a LEO-backhauled small cell (LSC). Each LSC is then scheduled to upload the received data via multiple satellites over Ka-band. We aim to maximize the sum data rate and the number of accessed users while satisfying the varying backhaul capacity constraints jointly determined by the LEO satellite-based backhaul links. The optimization problem is then decomposed into two closely connected subproblems and solved by our proposed matching algorithms. The simulation results show that the integrated network significantly outperforms the non-integrated ones in terms of the sum data rate. The influence of the traffic load and LEO constellation on the system performance is also discussed.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new capacity-achieving code for the private information retrieval (PIR) problem, and showed that it has the minimum message size and the minimum upload cost (being roughly linear in the number of messages).
Abstract: We propose a new capacity-achieving code for the private information retrieval (PIR) problem, and show that it has the minimum message size (being one less than the number of servers) and the minimum upload cost (being roughly linear in the number of messages) among a general class of capacity-achieving codes, and in particular, among all capacity-achieving linear codes. Different from existing code constructions, the proposed code is asymmetric, and this asymmetry appears to be the key factor leading to the optimal message size and the optimal upload cost. The converse results on the message size and the upload cost are obtained by an analysis of the information theoretic proof of the PIR capacity, from which a set of critical properties of any capacity-achieving code in the code class of interest is extracted. The symmetry structure of the PIR problem is then analyzed, which allows us to construct symmetric codes from asymmetric ones, yielding a meaningful bridge between the proposed code and existing ones in the literature.

Journal ArticleDOI
TL;DR: This paper proposes a novel privacy-preserving patient health information sharing scheme, which allows HSPs to access and search PHI files in a secure yet efficient manner and makes use of the searchable encryption technique with keyword range search and multikeyword search.
Abstract: The integration of wearable wireless devices and cloud computing in e-health systems has significantly improved their effectiveness and availability. Patients can upload their personal health information (PHI) files to the cloud, from where the health service providers (HSPs) can obtain appropriate information to determine the health state. This system not only reduces the costs associated with healthcare but also provides timely diagnosis to save lives. However, a number of privacy concerns arise while sharing sensitive information. In this paper, we propose a novel privacy-preserving patient health information sharing scheme, which allows HSPs to access and search PHI files in a secure yet efficient manner. We make use of the searchable encryption technique with keyword range search and multikeyword search. The proposed privacy-preserving equality test protocol allows different types of numeric comparison searches on encrypted data. We also use a variant of the Bloom filter and a message authentication code to classify PHI files, filter false data, and check the integrity of search results. Simulations on real-world and synthetic data show the feasibility and efficiency of the system, and security analysis proves the privacy-preservation properties.
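The Bloom-filter component used to classify files and filter false data can be sketched generically (a textbook construction with illustrative sizes m and k; not the paper's specific variant):

```python
import hashlib

class BloomFilter:
    """k hashed bit positions per item; false positives are possible,
    false negatives are not (illustrative parameters m and k)."""
    def __init__(self, m=1024, k=4):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
for keyword in ("blood-pressure", "glucose", "heart-rate"):
    bf.add(keyword)
print("glucose" in bf)   # → True (added items are always found)
```

The one-sided error is what makes the structure safe for pre-filtering: a match may occasionally be spurious, but a non-match is definitive, and the accompanying MAC catches any residual false positives.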

Proceedings ArticleDOI
14 Jul 2019
TL;DR: This paper complements IPFS with blockchain technology by proposing a new approach (BlockIPFS) that creates a clear audit trail, achieves improved trustworthiness of the data and authorship protection, and provides a clear route to trace all activities associated with a given file using blockchain as a service.
Abstract: The InterPlanetary File System (IPFS) is a distributed file system that seeks to decentralize the web and to make it faster and more efficient. It incorporates well-known technologies, including BitTorrent and Git, to create a swarm of computing systems that share information. Since its introduction in 2016, IPFS has seen great improvements and adoption from both individuals and enterprise organizations. Its distributed network allows users to share files and information across the globe. IPFS works well with large files that may consume or require large bandwidth to upload and/or download over the Internet. The rapid adoption of this distributed file system is in part because IPFS is designed to operate on top of different protocols, such as FTP and HTTP. However, there are underpinning concerns relating to security and access control, for example, a lack of traceability regarding how files are accessed. The aim of this paper is to complement IPFS with blockchain technology by proposing a new approach (BlockIPFS) to create a clear audit trail. BlockIPFS allows us to achieve improved trustworthiness of the data and authorship protection, and provides a clear route to trace all activities associated with a given file using blockchain as a service.

Posted Content
TL;DR: It is demonstrated that with the proposed framework, the simulator car agents can transfer knowledge to the RC cars in real time, with a 27% increase in the average distance from obstacles and a 42% decrease in the collision counts.
Abstract: Reinforcement learning (RL) is widely used in autonomous driving tasks, and training RL models typically involves a multi-step process: pre-training RL models on simulators, uploading the pre-trained model to real-life robots, and fine-tuning the weight parameters on robot vehicles. This sequential process is extremely time-consuming and, more importantly, knowledge from the fine-tuned model stays local and cannot be re-used or leveraged collaboratively. To tackle this problem, we present an online federated RL transfer process for real-time knowledge extraction, where all the participant agents make corresponding actions with the knowledge learned by others, even when they are acting in very different environments. To validate the effectiveness of the proposed approach, we constructed a real-life collision avoidance system with the Microsoft AirSim simulator and NVIDIA JetsonTX2 car agents, which cooperatively learn from scratch to avoid collisions in an indoor environment with obstacle objects. We demonstrate that with the proposed framework, the simulator car agents can transfer knowledge to the RC cars in real time, with a 27% increase in the average distance from obstacles and a 42% decrease in the collision counts.

Posted Content
TL;DR: In this article, the authors proposed a cooperative mechanism for mitigating the performance degradation due to non-independent and identically-distributed (non-IID) data in collaborative machine learning (ML), namely federated learning (FL), which trains an ML model using the rich data and computational resources of mobile clients without gathering their data to central systems.
Abstract: This paper proposes a cooperative mechanism for mitigating the performance degradation due to non-independent-and-identically-distributed (non-IID) data in collaborative machine learning (ML), namely federated learning (FL), which trains an ML model using the rich data and computational resources of mobile clients without gathering their data into central systems. The data of mobile clients is typically non-IID owing to diversity in mobile clients' interests and usage, and FL with non-IID data could degrade the model performance. Therefore, to mitigate the degradation induced by non-IID data, we assume that a limited number (e.g., less than 1%) of clients allow their data to be uploaded to a server, and we propose a hybrid learning mechanism referred to as Hybrid-FL, wherein the server updates the model using the data gathered from the clients and aggregates this model with the models trained by clients. Hybrid-FL solves both the client- and data-selection problems via heuristic algorithms, which try to select the optimal sets of clients who train models with their own data, clients who upload their data to the server, and data uploaded to the server. The algorithms increase the number of clients participating in FL and make the data gathered at the server more IID, thereby improving the prediction accuracy of the aggregated model. Evaluations, consisting of network simulations and ML experiments, demonstrate that the proposed scheme achieves a 13.5% higher classification accuracy than the previously proposed schemes for the non-IID case.

Journal ArticleDOI
TL;DR: The proposed protocol is secure against man-in-the-middle, replay, and impersonation attacks, and provides patient anonymity, the known-key security property, data confidentiality, data non-repudiation, message authentication, session-key security, and patient unlinkability; it is compared with existing related protocols in the same cloud-based TMIS.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed framework outperforms the existing privacy-preserving schemes, and can be used to collect multimedia data in various IoMT applications.
Abstract: The concept of the Internet of Multimedia Things (IoMT) is becoming popular nowadays and can be used in various smart city applications, e.g., traffic management, healthcare, and surveillance. In the IoMT, the devices, e.g., Multimedia Sensor Nodes (MSNs), are capable of generating both multimedia and non-multimedia data. The generated data are forwarded to a cloud server via a Base Station (BS). However, it is possible that the Internet connection between the BS and the cloud server may be temporarily down. Limited computational resources restrict the MSNs from holding the captured data for long periods. In this situation, mobile sinks can be utilized to collect data from MSNs and upload them to the cloud server. However, this data collection may create privacy issues, such as revealing identities and location information of MSNs. Therefore, there is a need to preserve the privacy of MSNs during mobile data collection. In this paper, we propose an efficient privacy-preserving-based data collection and analysis (P2DCA) framework for IoMT applications. The proposed framework partitions an underlying wireless multimedia sensor network into multiple clusters. Each cluster is represented by a Cluster Head (CH). The CHs are responsible for protecting the privacy of member MSNs through data and location coordinate aggregation. Later, the aggregated multimedia data are analyzed on the cloud server using a counter-propagation artificial neural network to extract meaningful information through segmentation. Experimental results show that the proposed framework outperforms existing privacy-preserving schemes, and can be used to collect multimedia data in various IoMT applications.

Book ChapterDOI
01 Jan 2019
TL;DR: This book chapter describes how to protect citizen data by securing a WiFi-based data transmission system that encrypts and encodes data before transfer from source to destination, where the data is finally decrypted and decoded.
Abstract: Cities are becoming more intelligent day by day as governments gradually work to make every component smarter. These cities are built with the goal of increasing liveability, safety, revitalization, and sustainability through smart services such as smart education, smart government, smart mobility, smart homes, and e-health, but it is important to build these services together with methods for securing and maintaining the privacy of citizens' data. Citizens can also build their own services that meet their requirements and needs. This book chapter discusses the Internet of Things and its applications in smart cities, the challenges that smart cities face, and how to protect citizen data by securing a WiFi-based data transmission system that encrypts and encodes data before transfer from source to destination, where the data is finally decrypted and decoded. The proposed system is embedded with an authentication method so that only authorized people can access the data. The system first compresses the data with run-length encoding, then encrypts it using AES with a rotated key; the source then transfers the encoded and encrypted data to the destination, where it is decrypted and decoded to restore the original data, which is finally uploaded to the destination's website.

Journal ArticleDOI
04 Apr 2019
TL;DR: Simulation results demonstrate that the proposed online algorithm based on Lyapunov optimization theory achieves high energy efficiency and significantly reduces queue backlogs compared with an offline formulation and a greedy “Big-Backlog-First” algorithm.
Abstract: As a critical supplement to terrestrial communication networks, low-Earth-orbit (LEO) satellite-based communication networks have been gaining growing attention in recent years. In this paper, we focus on data collection from geo-distributed Internet-of-Things (IoT) networks via LEO satellites. Normally, the power supply in IoT data-gathering gateways is a bottleneck resource that constrains the overall amount of data uploaded. Thus, the challenge is how to collect the data from IoT gateways through LEO satellites under time-varying uplinks in an energy-efficient way. To address this problem, we first formulate a novel optimization problem, and then propose an online algorithm based on Lyapunov optimization theory to aid green data upload for geo-distributed IoT networks. The proposed approach jointly maximizes the overall amount of data uploaded and minimizes the energy consumption, while maintaining queue stability even without knowledge of the arrival data at IoT gateways. We finally evaluate the performance of the proposed algorithm through simulations using both real-world and synthetic data traces. Simulation results demonstrate that the proposed approach achieves high energy efficiency and significantly reduces queue backlogs compared with an offline formulation and a greedy “Big-Backlog-First” algorithm.
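The drift-plus-penalty structure behind such Lyapunov-based online control can be sketched as follows. This is a minimal single-gateway illustration that assumes a linear energy cost per uploaded bit; the variable names and parameter values are illustrative, not the paper's actual joint formulation.

```python
import random

def upload_decision(Q, capacity, V, energy_per_bit):
    # Drift-plus-penalty: pick b in [0, capacity] minimizing
    # V * energy(b) - Q * b. With linear energy(b) = energy_per_bit * b,
    # the minimizer is bang-bang: upload at full capacity iff
    # Q > V * energy_per_bit, otherwise stay idle and save energy.
    return capacity if Q > V * energy_per_bit else 0.0

random.seed(0)
Q, V, e = 0.0, 50.0, 1.0          # queue backlog, tradeoff weight, energy/bit
backlog = []
for _ in range(1000):
    cap = random.uniform(0, 10)   # time-varying LEO uplink capacity this slot
    b = upload_decision(Q, cap, V, e)
    arrivals = random.uniform(0, 4)          # new IoT data at the gateway
    Q = max(Q - b, 0.0) + arrivals           # standard queue update
    backlog.append(Q)
```

A larger V saves more energy but lets the backlog grow roughly linearly in V, which is the standard O(1/V) penalty versus O(V) backlog tradeoff of Lyapunov optimization that the paper leverages.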

Proceedings ArticleDOI
06 Nov 2019
TL;DR: In this paper, the authors analyze the design space of perceptual ad-blockers and present a unified architecture that incorporates prior academic and commercial work, and explore a variety of attacks on the ad-blocker's detection pipeline that enable publishers or ad networks to evade or detect ad-blocking, and at times even abuse its high privilege level to bypass web security boundaries.
Abstract: Perceptual ad-blocking is a novel approach that detects online advertisements based on their visual content. Compared to traditional filter lists, the use of perceptual signals is believed to be less prone to an arms race with web publishers and ad networks. We demonstrate that this may not be the case. We describe attacks on multiple perceptual ad-blocking techniques, and unveil a new arms race that likely disfavors ad-blockers. Unexpectedly, perceptual ad-blocking can also introduce new vulnerabilities that let an attacker bypass web security boundaries and mount DDoS attacks. We first analyze the design space of perceptual ad-blockers and present a unified architecture that incorporates prior academic and commercial work. We then explore a variety of attacks on the ad-blocker's detection pipeline that enable publishers or ad networks to evade or detect ad-blocking, and at times even abuse its high privilege level to bypass web security boundaries. On one hand, we show that perceptual ad-blocking must visually classify rendered web content to escape an arms race centered on obfuscation of page markup. On the other, we present a concrete set of attacks on visual ad-blockers by constructing adversarial examples in a real web page context. For seven ad-detectors, we create perturbed ads, ad-disclosure logos, and native web content that mislead perceptual ad-blocking with 100% success rates. In one of our attacks, we demonstrate how a malicious user can upload adversarial content, such as a perturbed image in a Facebook post, that fools the ad-blocker into removing another user's non-ad content. Moving beyond the Web and the visual domain, we also build adversarial examples for AdblockRadio, an open-source radio client that uses machine learning to detect ads in raw audio streams.
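The adversarial-example idea can be illustrated against a toy linear "ad detector" using a fast-gradient-sign-style perturbation. The weights and inputs below are made up for illustration; the paper attacks real, much larger visual classifiers.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fgsm(x, w, b, y, eps):
    # Logistic-loss gradient wrt the input of a linear model w.x + b:
    # dL/dx_i = (sigmoid(z) - y) * w_i; step in its sign to raise the loss.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    g = sigmoid(z) - y
    return [xi + eps * (1 if g * wi > 0 else -1) for xi, wi in zip(x, w)]

# A toy "ad detector": predicts "ad" when w.x + b > 0 (weights are made up).
w, b = [2.0, -1.0], 0.0
x = [1.0, 0.5]                               # originally detected as an ad
x_adv = fgsm(x, w, b, y=1, eps=0.8)
z_adv = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
print(z_adv > 0)  # False: the perturbed input now evades the detector
```

The same sign-gradient principle, applied in pixel space under a visibility budget, underlies the perturbed ads and ad-disclosure logos the paper constructs.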

Proceedings ArticleDOI
01 Apr 2019
TL;DR: This paper presents the design, implementation, and evaluation of HideMe, a framework that preserves associated users’ privacy in online photo sharing and reduces system overhead through a carefully designed face-matching algorithm.
Abstract: Photo sharing on Online Social Networks (OSNs) has become one of the most popular social activities in our daily life. However, some associated friends or bystanders in the photos may not want to be viewed due to privacy concerns. In this paper, we present the design, implementation, and evaluation of HideMe, a framework that preserves the associated users’ privacy in online photo sharing. HideMe acts as a plugin to existing photo-sharing OSNs, and it enables the following: a) extraction of factors when users upload their photos; b) associated friends in the uploaded photos are able to set their own privacy policies based on scenarios, instead of a photo-by-photo setting; c) any user appearing in other friends’ uploaded photos can be hidden away from unwanted viewers based on one-time policy generation. We also design a distance-based algorithm to identify and protect the privacy of bystanders. Moreover, HideMe not only protects users’ privacy but also reduces the system overhead by a carefully designed face-matching algorithm. We have implemented a prototype of HideMe, and evaluation results have demonstrated its effectiveness and efficiency.
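The abstract does not spell out the face-matching algorithm, but a threshold-based nearest match over face embeddings, a common building block for this task, could look like the following sketch; all user names, embeddings, and the threshold are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_face(query, enrolled, threshold):
    # enrolled: user_id -> face embedding. Return the closest enrolled user
    # within the distance threshold; otherwise None (treated as a bystander).
    best_id, best_d = None, float("inf")
    for uid, emb in enrolled.items():
        d = euclidean(query, emb)
        if d < best_d:
            best_id, best_d = uid, d
    return best_id if best_d <= threshold else None

enrolled = {"alice": [0.1, 0.9], "bob": [0.8, 0.2]}
print(match_face([0.12, 0.88], enrolled, threshold=0.3))  # alice
print(match_face([0.5, 0.5], enrolled, threshold=0.3))    # None
```

A matched face triggers that user's scenario-based privacy policy, while an unmatched face can be handled by the paper's distance-based bystander protection.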

Journal ArticleDOI
TL;DR: A new security model is defined and a privacy-preserving traffic monitoring scheme is proposed that uses short group signatures to authenticate drivers in a conditionally anonymous way, adopts a range query technique to acquire driving information in a privacy-preserving way, and integrates it into the construction of a weighted proximity graph at each fog node through a WiFi challenge handshake to filter out false reports.
Abstract: A traffic monitoring system empowers the cloud server and drivers to collect real-time driving information and acquire traffic conditions. However, drivers are more interested in local traffic, and sending driving reports to a remote cloud server consumes heavy bandwidth and incurs an increased response delay. Recently, fog computing has been introduced to provide location-sensitive and latency-aware local data management in vehicular crowdsensing, but it also raises new privacy concerns because drivers' information could be disclosed. Although these messages are encrypted before transmission, malicious drivers can upload false reports to sabotage the systems, and filtering out false encrypted reports remains a challenging issue. To address these problems, we define a new security model and propose a privacy-preserving traffic monitoring scheme. Specifically, we utilize short group signatures to authenticate drivers in a conditionally anonymous way, adopt a range query technique to acquire driving information in a privacy-preserving way, and integrate it into the construction of a weighted proximity graph at each fog node through a WiFi challenge handshake to filter out false reports. Moreover, we use variant Bloom filters to achieve fast traffic-condition storage and retrieval. Finally, we prove the security and privacy of the scheme and evaluate its performance with real-world cloud servers.
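The fast storage-and-retrieval step rests on Bloom filters. A plain (non-variant) Bloom filter over traffic-condition keys can be sketched as follows; the sizes and key format are illustrative, and the paper's variant filters add further structure on top of this idea.

```python
import hashlib

class BloomFilter:
    """Standard Bloom filter: k hash positions over an m-bit array.
    Membership tests can yield false positives but never false negatives."""

    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m)

    def _positions(self, item):
        # Derive k independent positions by salting one cryptographic hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
bf.add("segment42:congested")       # hypothetical road-segment condition key
print("segment42:congested" in bf)  # True
print("segment7:clear" in bf)       # False (with overwhelming probability)
```

Queries touch only k bit positions regardless of how many conditions are stored, which is what makes retrieval at a fog node fast.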

Posted Content
17 May 2019
TL;DR: This work proposes a novel learning mechanism referred to as Hybrid-FL, in which the server updates the model using data gathered from the clients and merges it with the models trained by clients, achieving a significantly higher classification accuracy than previous schemes in the non-IID case.
Abstract: This paper proposes a cooperative mechanism for mitigating the performance degradation due to non-independent-and-identically-distributed (non-IID) data in collaborative machine learning (ML), namely federated learning (FL), which trains an ML model using the rich data and computational resources of mobile clients without gathering their data to central systems. The data of mobile clients is typically non-IID owing to diversity among mobile clients' interests and usage, and FL with non-IID data could degrade the model performance. Therefore, to mitigate the degradation induced by non-IID data, we assume that a limited number (e.g., less than 1%) of clients allow their data to be uploaded to a server, and we propose a hybrid learning mechanism referred to as Hybrid-FL, wherein the server updates the model using the data gathered from the clients and aggregates the model with the models trained by clients. Hybrid-FL solves both the client- and data-selection problems via heuristic algorithms, which try to select the optimal sets of clients who train models with their own data, clients who upload their data to the server, and data uploaded to the server. The algorithms increase the number of clients participating in FL and make the data gathered at the server closer to IID, thereby improving the prediction accuracy of the aggregated model. Evaluations, which consist of network simulations and ML experiments, demonstrate that the proposed scheme achieves a 13.5% higher classification accuracy than the previously proposed schemes in the non-IID case.
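The aggregation step, merging the server's model (trained on the uploaded data) with the client-trained models, can be sketched as a data-size-weighted average in the FedAvg style; the parameter vectors and dataset sizes below are toy values, not the paper's setup.

```python
def weighted_average(models, weights):
    # models: list of parameter vectors; weights: e.g. local dataset sizes.
    # Each merged parameter is the weight-proportional mean across models.
    total = sum(weights)
    dim = len(models[0])
    return [sum(w * m[j] for m, w in zip(models, weights)) / total
            for j in range(dim)]

# Two client-trained models plus the server model trained on uploaded data.
client_a = [1.0, 2.0]
client_b = [3.0, 4.0]
server   = [5.0, 6.0]
merged = weighted_average([client_a, client_b, server], weights=[100, 100, 50])
print(merged)  # [2.6, 3.6]
```

Because the server's share of the average is trained on data that is close to IID, it pulls the merged model away from the bias that non-IID client updates introduce.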

Proceedings ArticleDOI
01 Dec 2019
TL;DR: A privacy-enhanced federated learning (PEFL) scheme to protect the gradients over an untrusted server by encrypting participants' local gradients with the Paillier homomorphic cryptosystem, demonstrating that PEFL has low computation costs while reaching high accuracy in the settings of federated learning.
Abstract: Federated learning has emerged as a promising solution for big data analytics, which jointly trains a global model across multiple mobile devices. However, participants' sensitive data information may be leaked to an untrusted server through uploaded gradient vectors. To address this problem, we propose a privacy-enhanced federated learning (PEFL) scheme to protect the gradients over an untrusted server. This is mainly enabled by encrypting participants' local gradients with Paillier homomorphic cryptosystem. In order to reduce the computation costs of the cryptosystem, we utilize the distributed selective stochastic gradient descent (DSSGD) method in the local training phase to achieve the distributed encryption. Moreover, the encrypted gradients can be further used for secure sum aggregation at the server side. In this way, the untrusted server can only learn the aggregated statistics for all the participants' updates, while each individual's private information will be well-protected. For the security analysis, we theoretically prove that our scheme is secure under several cryptographic hard problems. Exhaustive experimental results demonstrate that PEFL has low computation costs while reaching high accuracy in the settings of federated learning.
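The secure-sum step relies on the additive homomorphism of Paillier: multiplying ciphertexts yields an encryption of the sum of the plaintexts, so the server can aggregate updates without decrypting any individual gradient. A toy Paillier implementation, with tiny fixed primes and integer "gradients" for illustration only, shows the mechanism:

```python
import math
import secrets

def keygen(p=19, q=23):
    # Toy primes for illustration; real Paillier uses primes of ~1024 bits.
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)          # valid here because gcd(lam, n) == 1
    return (n,), (n, lam, mu)     # public key, private key

def encrypt(pk, m):
    (n,) = pk
    n2 = n * n
    while True:                   # random r coprime to n for semantic security
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # g = n + 1 variant: Enc(m) = (1 + n)^m * r^n mod n^2
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(sk, c):
    n, lam, mu = sk
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n
    return L * mu % n

pk, sk = keygen()
# Additive homomorphism: the product of ciphertexts decrypts to the sum.
grads = [3, 4, 5]                 # e.g., quantized gradient components
agg = 1
for g in grads:
    agg = agg * encrypt(pk, g) % (pk[0] ** 2)
print(decrypt(sk, agg))  # 12
```

In PEFL the server performs exactly this ciphertext multiplication, so it learns only the aggregated statistic while each participant's update stays hidden.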

Journal ArticleDOI
27 Dec 2019-Animal
TL;DR: The ClassifyMe software tool is designed to fill the gap in computer software capable of filtering out false positives, automatically identifying detected animals, and sorting imagery, and gives users the opportunity to utilise state-of-the-art image recognition algorithms without the need for specialised computer programming skills.
Abstract: We present ClassifyMe a software tool for the automated identification of animal species from camera trap images. ClassifyMe is intended to be used by ecologists both in the field and in the office. Users can download a pre-trained model specific to their location of interest and then upload the images from a camera trap to a laptop or workstation. ClassifyMe will identify animals and other objects (e.g., vehicles) in images, provide a report file with the most likely species detections, and automatically sort the images into sub-folders corresponding to these species categories. False Triggers (no visible object present) will also be filtered and sorted. Importantly, the ClassifyMe software operates on the user’s local machine (own laptop or workstation)—not via internet connection. This allows users access to state-of-the-art camera trap computer vision software in situ, rather than only in the office. The software also incurs minimal cost on the end-user as there is no need for expensive data uploads to cloud services. Furthermore, processing the images locally on the users’ end-device allows them data control and resolves privacy issues surrounding transfer and third-party access to users’ datasets.

Journal ArticleDOI
15 Mar 2019-Trials
TL;DR: An indicator set to capture the maturity of the repositories’ procedures and their suitability for the hosting of IPD is developed and can help researchers to find a suitable repository for their datasets.
Abstract: Data repositories have the potential to play an important role in the effective and safe sharing of individual-participant data (IPD) from clinical studies. We analysed the current landscape of data repositories to create a detailed description of available repositories and assess their suitability for hosting data from clinical studies, from the perspective of the clinical researcher. We assessed repositories that enable storage, sharing, discoverability, and re-use of the IPD and associated documents from clinical studies using a pre-defined set of 34 items and publicly available information from April to June 2018. For this purpose, we developed an indicator set to capture the maturity of the repositories’ procedures and their suitability for the hosting of IPD. The indicators cover guidelines for data upload and data de-identification, data quality controls, contracts for upload and storage, flexibility of access, application of identifiers, availability of metadata, and long-term preservation. We analysed 25 repositories, from an initial set of 55 identified as possibly relevant. Half of the included repositories were generic, i.e. not limited to a specific disease or clinical area, and 13 were launched in the last 8 years. The sample was extremely heterogeneous and included repositories developed by research funders, infrastructures, universities, and editors. All but three repositories charge no fee for uploading, storage, or access to data. None of the repositories completely demonstrated all the items included in the indicator set, but three repositories (Dryad, Drum, EASY) met, fully or partially, all items. Flexibility of data-access modalities appears to be limited, and is lacking in half of the repositories. Our evaluation, though often hampered by the lack of sufficient information, can help researchers to find a suitable repository for their datasets.
Some repositories are more mature because of their support for clinical dataset preparation, contractual agreements, metadata and identifiers, different modalities of access, and long-term preservation of data. Further work is now required to achieve a more robust and accurate system for evaluation, which in turn may encourage the sharing of clinical study data. Study protocol available at https://zenodo.org/record/1438261#.W64kW9Egrcs.

Journal ArticleDOI
TL;DR: This paper proposes the novel concept of DT-PDP and the design of a concrete DT-PDP scheme based on bilinear pairings, and shows that the scheme is provably secure and efficient.
Abstract: With the rapid development of cloud computing, more and more enterprises would like to upload and store their data in the public cloud. When part of the business of an enterprise is purchased by another enterprise, the corresponding data will be transferred to the acquiring enterprise. In this setting, how can the computation cost of the data transfer be outsourced to the cloud? How can the integrity of the remotely stored purchased data be ensured? It is therefore important to study provable data possession with outsourced data transfer (DT-PDP). In this paper, for the first time, we propose this novel concept. By making use of DT-PDP, the following three security requirements can be satisfied: (1) the security of the acquired enterprise's other, un-purchased data is ensured; (2) the integrity and privacy of the purchased data are ensured; (3) the computation of the data transfer can be outsourced to public cloud servers. For the security concept of DT-PDP, we give its motivation, system model, and security model. We then design a concrete DT-PDP scheme based on bilinear pairings. Finally, we analyze the security, efficiency, and flexibility of the scheme and show that it is provably secure and efficient.

Book ChapterDOI
09 Dec 2019
TL;DR: A novel poisoning defense generative adversarial network (PDGAN) is proposed to defend against poisoning attacks in federated learning.
Abstract: Federated learning can complete an enormous training task efficiently by inviting participants to train a deep learning model collaboratively, and user privacy is well preserved because users only upload model parameters to the centralized server. However, attackers can initiate poisoning attacks by uploading malicious updates in federated learning, which can significantly degrade the accuracy of the global model. To address this vulnerability, we propose a novel poisoning defense generative adversarial network (PDGAN) to defend against poisoning attacks. PDGAN can reconstruct training data from model updates and audit the accuracy of each participant's model using the generated data. Specifically, a participant whose accuracy is lower than a predefined threshold is identified as an attacker, and the attacker's model parameters are removed from the training procedure in that iteration. Experiments conducted on the MNIST and Fashion-MNIST datasets demonstrate that our approach can indeed defend against poisoning attacks in federated learning.
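The audit-and-exclude step can be sketched as a simple accuracy threshold over participants; the identifiers, accuracy values, and threshold below are illustrative, and in PDGAN the accuracies would come from evaluating each participant's model on GAN-reconstructed data.

```python
def filter_poisoners(accuracies, threshold):
    # accuracies: participant_id -> accuracy of that participant's model
    # on the reconstructed audit data. Below-threshold participants are
    # excluded from this round's aggregation.
    kept = {pid: a for pid, a in accuracies.items() if a >= threshold}
    removed = sorted(set(accuracies) - set(kept))
    return kept, removed

# Hypothetical audit results for four participants in one round.
accs = {"c1": 0.91, "c2": 0.88, "c3": 0.35, "c4": 0.90}
kept, removed = filter_poisoners(accs, threshold=0.6)
print(removed)  # ['c3']
```

Only the models in `kept` would then be aggregated into the global model for that iteration, so a poisoned update never contaminates the round it is caught in.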