
Showing papers on "Upload" published in 2021


Journal ArticleDOI
Jiasi Weng1, Jian Weng1, Jilian Zhang1, Ming Li1, Yue Zhang1, Weiqi Luo1 
TL;DR: This paper presents a distributed, secure, and fair deep learning framework named DeepChain, which provides a value-driven incentive mechanism based on Blockchain to force the participants to behave correctly and guarantees data privacy for each participant and provides auditability for the whole training process.
Abstract: Deep learning can achieve higher accuracy than traditional machine learning algorithms in a variety of machine learning tasks. Recently, privacy-preserving deep learning has drawn tremendous attention from the information security community, in which neither the training data nor the trained model is expected to be exposed. Federated learning is a popular learning mechanism in which multiple parties upload local gradients to a server and the server updates model parameters with the collected gradients. However, many security problems are neglected in federated learning; for example, the participants may behave incorrectly during gradient collection or parameter updating, and the server may be malicious as well. In this article, we present a distributed, secure, and fair deep learning framework named DeepChain to solve these problems. DeepChain provides a value-driven incentive mechanism based on Blockchain to force the participants to behave correctly. Meanwhile, DeepChain guarantees data privacy for each participant and provides auditability for the whole training process. We implement a prototype of DeepChain and conduct experiments on a real dataset under different settings, and the results show that our DeepChain is promising.

208 citations


Journal ArticleDOI
TL;DR: To ensure client data privacy, a blockchain-based federated learning approach for device failure detection in IIoT is proposed, and a novel centroid distance weighted federated averaging algorithm taking into account the distance between positive class and negative class of each client data set is proposed.
Abstract: Device failure detection is one of the most essential problems in the Industrial Internet of Things (IIoT). However, in conventional IIoT device failure detection, client devices need to upload raw data to the central server for model training, which might lead to disclosure of sensitive business data. Therefore, in this article, to ensure client data privacy, we propose a blockchain-based federated learning approach for device failure detection in IIoT. First, we present a platform architecture of blockchain-based federated learning systems for failure detection in IIoT, which enables verifiable integrity of client data. In the architecture, each client periodically creates a Merkle tree, in which each leaf node represents a client data record, and stores the tree root on a blockchain. Furthermore, to address the data heterogeneity issue in IIoT failure detection, we propose a novel centroid distance weighted federated averaging (CDW_FedAvg) algorithm that takes into account the distance between the positive class and the negative class of each client data set. In addition, to motivate clients to participate in federated learning, a smart contract-based incentive mechanism is designed depending on the size and the centroid distance of the client data used in local model training. A prototype of the proposed architecture is implemented with our industry partner and evaluated in terms of feasibility, accuracy, and performance. The results show that the approach is feasible and has satisfactory accuracy and performance.
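The per-client integrity mechanism described above can be sketched as follows. This is a minimal illustration of a Merkle tree over client data records, not the paper's implementation; the record encoding and the choice of SHA-256 are assumptions.

```python
import hashlib

def merkle_root(records):
    """Compute the Merkle root of a list of byte-string records.

    Each leaf is the SHA-256 hash of one record; parent nodes hash the
    concatenation of their two children. An odd node is paired with itself.
    The hex root is what a client would periodically store on the blockchain.
    """
    if not records:
        raise ValueError("need at least one record")
    level = [hashlib.sha256(r).digest() for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

Any later tampering with a single record changes the root, so the on-chain root makes client data verifiable without storing the data itself on-chain.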

155 citations


Journal ArticleDOI
Abstract: Learning over massive data stored in different locations is essential in many real-world applications. However, sharing data is full of challenges due to the increasing demands for privacy and security with the growing use of smart mobile devices and Internet of Things (IoT) devices. Federated learning provides a potential solution to privacy-preserving and secure machine learning by jointly training a global model without uploading the data distributed on multiple devices to a central server. However, most existing work on federated learning adopts machine learning models with full-precision weights, and almost all these models contain a large number of redundant parameters that do not need to be transmitted to the server, incurring excessive communication costs. To address this issue, we propose a federated trained ternary quantization (FTTQ) algorithm, which optimizes the quantized networks on the clients through a self-learning quantization factor. Theoretical proofs of the convergence of the quantization factors, the unbiasedness of FTTQ, as well as a reduced weight divergence are given. On the basis of FTTQ, we propose a ternary federated averaging protocol (T-FedAvg) to reduce the upstream and downstream communication of federated learning systems. Empirical experiments are conducted to train widely used deep learning models on publicly available data sets, and our results demonstrate that the proposed T-FedAvg is effective in reducing communication costs and can even achieve slightly better performance on non-IID data in contrast to the canonical federated learning algorithms.
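The core idea of ternary weight quantization can be illustrated with a short sketch. The threshold rule and the mean-magnitude scaling factor below are common choices from the ternary-quantization literature, standing in for FTTQ's self-learned quantization factor, which the abstract does not fully specify.

```python
import numpy as np

def ternary_quantize(w, t=0.05):
    """Quantize a weight tensor to the three values {-alpha, 0, +alpha}.

    delta is a threshold proportional to the largest magnitude; alpha is the
    mean magnitude of the surviving weights. Transmitting signs plus one
    scalar per tensor is far cheaper than full-precision weights.
    """
    delta = t * np.max(np.abs(w))
    mask = np.abs(w) > delta               # weights that stay non-zero
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask       # ternary tensor
```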

82 citations


Journal ArticleDOI
TL;DR: An estimation of the model exchange time between each client and the server is proposed, based on which a fairness guaranteed algorithm termed RBCS-F for problem-solving is designed.
Abstract: The issue of potential privacy leakage during centralized AI’s model training has drawn intensive concern from the public. A Parallel and Distributed Computing (or PDC) scheme, termed Federated Learning (FL), has emerged as a new paradigm to cope with the privacy issue by allowing clients to perform model training locally, without the necessity to upload their personal sensitive data. In FL, the number of clients could be sufficiently large, but the bandwidth available for model distribution and re-upload is quite limited, making it sensible to involve only part of the volunteers in the training process. The client selection policy is critical to an FL process in terms of training efficiency, the final model’s quality, as well as fairness. In this article, we model fairness guaranteed client selection as a Lyapunov optimization problem, and a $\mathbf{C^2MAB}$-based method is then proposed for estimating the model exchange time between each client and the server, based on which we design a fairness guaranteed algorithm termed RBCS-F for problem-solving. The regret of RBCS-F is strictly bounded by a finite constant, justifying its theoretical feasibility. In addition to the theoretical results, further empirical evidence is provided by our real training experiments on public datasets.

78 citations


Journal ArticleDOI
TL;DR: A UDP algorithm using the proposed CRD method can effectively improve both the training efficiency and model quality for the given privacy protection levels, and reveals that there exists an optimal number of communication rounds to achieve the best learning performance.
Abstract: Federated learning (FL), as a type of collaborative machine learning framework, is capable of preserving private data from mobile terminals (MTs) while training the data into useful models. Nevertheless, it is still possible for a curious server to infer private information from the shared models uploaded by MTs. To address this problem, we first make use of the concept of local differential privacy (LDP), and propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers. According to our analysis, the UDP framework can realize $(\epsilon_{i}, \delta_{i})$-LDP for the $i$-th MT with adjustable privacy protection levels by varying the variances of the artificial noise processes. We then derive a theoretical convergence upper-bound for the UDP algorithm. It reveals that there exists an optimal number of communication rounds to achieve the best learning performance. More importantly, we propose a communication rounds discounting (CRD) method, which can achieve a much better trade-off between the computational complexity of searching and the convergence performance compared with the heuristic search method. Extensive experiments indicate that our UDP algorithm using the proposed CRD method can effectively improve both the training efficiency and model quality for the given privacy protection levels.
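The perturbation step, clipping a local update and adding artificial Gaussian noise before upload, can be sketched as below. The clipping rule and noise scale are illustrative assumptions in the style of the Gaussian mechanism; the paper's exact calibration of the noise variance to $(\epsilon_{i}, \delta_{i})$ is not reproduced here.

```python
import numpy as np

def perturb_update(update, clip=1.0, sigma=0.5, rng=None):
    """Clip a local model update in L2 norm, then add zero-mean Gaussian
    noise (standard deviation sigma * clip) before uploading it."""
    if rng is None:
        rng = np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / max(norm, 1e-12))  # bound sensitivity
    return clipped + rng.normal(0.0, sigma * clip, size=update.shape)
```

Larger sigma gives stronger privacy for that client but slows convergence, which is the trade-off the optimal number of communication rounds balances.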

68 citations


Journal ArticleDOI
TL;DR: In this paper, the authors design a rigorous testbed for measuring the one-way packet delays between a 5G end device via a radio access network (RAN) to a packet core with sub-microsecond precision as well as measuring the packet core delay with nanosecond precision.
Abstract: A 5G campus network is a 5G network for the users affiliated with the campus organization, e.g., an industrial campus, covering a prescribed geographical area. A 5G campus network can operate as a so-called 5G non-standalone (NSA) network (which requires 4G Long-Term Evolution (LTE) spectrum access) or as a 5G standalone (SA) network (without 4G LTE spectrum access). 5G campus networks are envisioned to enable new use cases, which require cyclic delay-sensitive industrial communication, such as robot control. We design a rigorous testbed for measuring the one-way packet delays between a 5G end device via a radio access network (RAN) to a packet core with sub-microsecond precision as well as for measuring the packet core delay with nanosecond precision. With our testbed design, we conduct detailed measurements of the one-way download (downstream, i.e., core to end device) as well as one-way upload (upstream, i.e., end device to core) packet delays and losses for both 5G SA and 5G NSA hardware and network operation. We also measure the corresponding 5G SA and 5G NSA packet core processing delays for download and upload. We find that typically 95% of the SA download packet delays are in the range of 4 to 10 ms, indicating a fairly wide spread of the packet delays. Also, existing packet core implementations regularly incur packet processing latencies up to 0.4 ms, with outliers above one millisecond. Our measurement results inform the further development and refinement of 5G SA and 5G NSA campus networks for industrial use cases. We make the measurement data traces publicly available as the IEEE DataPort 5G Campus Networks: Measurement Traces dataset (DOI 10.21227/xe3c-e968).

66 citations



Journal ArticleDOI
TL;DR: This paper proposes CREAT, a blockchain-assisted Compressed algoRithm of fEderated leArning applied for conTent caching, which predicts cached files.
Abstract: Edge computing architectures can help us quickly process the data collected by Internet of Things (IoT) and caching files to edge nodes can speed up the response speed of IoT devices requesting files. Blockchain architectures can help us ensure the security of data transmitted by IoT. Therefore, we have proposed a system which combines IoT devices, edge nodes, remote cloud and blockchain. In the system, we designed a new algorithm in which blockchain-assisted Compressed algoRithm of fEderated leArning is applied for conTent caching, called CREAT to predict cached files. In CREAT algorithm, each edge node uses local data to train a model and then uses the model to learn the features of users and files, so as to predict popular files to improve cache hit rate. In order to ensure the security of edge nodes’ data, we use federated learning (FL) to enable multiple edge nodes to cooperate in training without sharing data. In addition, for the purpose of reducing communication load in FL, we will compress gradients uploaded by edge nodes to reduce the time required for communication. What’s more, in order to ensure the security of the data transmitted in CREAT algorithm, we have incorporated blockchain technology in the algorithm. We design four smart contracts for decentralized entities to record and verify the transactions to ensure the security of data. We used MovieLens data sets for experiments and we can see that CREAT greatly improves the cache hit rate and reduces the time required to upload data.
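The abstract does not specify CREAT's compression scheme, so the sketch below uses top-k gradient sparsification, one common way to shrink the gradients that edge nodes upload; the function names are hypothetical.

```python
import numpy as np

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude gradient entries.

    Returns (indices, values): the sparse payload an edge node would
    upload instead of the full dense gradient vector.
    """
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def desparsify(idx, vals, size):
    """Server side: rebuild a dense gradient with zeros elsewhere."""
    out = np.zeros(size)
    out[idx] = vals
    return out
```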

64 citations


DOI
05 Jul 2021
TL;DR: In this article, a federated edge learning system is considered, where an edge server coordinates a set of edge devices to train a shared machine learning (ML) model based on their locally distributed data samples.
Abstract: This paper studies a federated edge learning system, in which an edge server coordinates a set of edge devices to train a shared machine learning (ML) model based on their locally distributed data samples. During the distributed training, we exploit the joint communication and computation design for improving the system energy efficiency, in which both the communication resource allocation for global ML-parameters aggregation and the computation resource allocation for locally updating ML-parameters are jointly optimized. In particular, we consider two transmission protocols for edge devices to upload ML-parameters to the edge server, based on non-orthogonal multiple access (NOMA) and time division multiple access (TDMA), respectively. Under both protocols, we minimize the total energy consumption at all edge devices over a particular finite training duration subject to a given training accuracy, by jointly optimizing the transmission power and rates at edge devices for uploading ML-parameters and their central processing unit (CPU) frequencies for local update. We propose efficient algorithms to solve the formulated energy minimization problems by using techniques from convex optimization. Numerical results show that, as compared to other benchmark schemes, our proposed joint communication and computation design can significantly improve the energy efficiency of the federated edge learning system, by properly balancing the energy tradeoff between communication and computation.

62 citations


Journal ArticleDOI
TL;DR: This paper proposes an IPFS-based (InterPlanetary File System-based) decentralized peer-to-peer image and video sharing platform built on top of blockchain technology and uses a perceptual hash (pHash) technique to detect copyright violations of images and videos.
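The perceptual-hash comparison mentioned above can be illustrated with a simpler relative of pHash, the average hash: near-duplicate images yield hashes with a small Hamming distance. The block-averaging scheme below is an assumption for illustration, not the paper's DCT-based pHash implementation.

```python
import numpy as np

def average_hash(img, hash_size=8):
    """Average-hash of a grayscale image given as a 2-D array.

    Downscale by block averaging to hash_size x hash_size, then set one
    bit per cell: 1 if the cell is brighter than the overall mean.
    """
    h, w = img.shape
    bh, bw = h // hash_size, w // hash_size
    small = img[:bh * hash_size, :bw * hash_size] \
        .reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def hamming(a, b):
    """Number of differing bits; small distances flag likely copies."""
    return int(np.count_nonzero(a != b))
```

A platform would compare the hash of an uploaded image against hashes of registered works and flag uploads whose distance falls below some threshold.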

50 citations


Journal ArticleDOI
04 Mar 2021
TL;DR: Appyters, presented in this paper, enable the rapid development of interactive web-based bioinformatics applications by allowing users to upload their data and set various parameters for a multitude of data analysis workflows.
Abstract: Jupyter Notebooks have transformed the communication of data analysis pipelines by facilitating a modular structure that brings together code, markdown text, and interactive visualizations. Here, we extended Jupyter Notebooks to broaden their accessibility with Appyters. Appyters turn Jupyter Notebooks into fully functional standalone web-based bioinformatics applications. Appyters present to users an entry form enabling them to upload their data and set various parameters for a multitude of data analysis workflows. Once the form is filled, the Appyter executes the corresponding notebook in the cloud, producing the output without requiring the user to interact directly with the code. Appyters were used to create many bioinformatics web-based reusable workflows, including applications to build customized machine learning pipelines, analyze omics data, and produce publishable figures. These Appyters are served in the Appyters Catalog at https://appyters.maayanlab.cloud. In summary, Appyters enable the rapid development of interactive web-based bioinformatics applications.

Proceedings ArticleDOI
Tong Qin1, Yuxin Zheng1, Tongqing Chen1, Yilun Chen1, Qing Su1 
30 May 2021
TL;DR: In this paper, the authors propose a light-weight localization solution that relies on low-cost cameras and compact visual semantic maps, which can be easily produced and updated by sensor-rich vehicles in a crowd-sourced way.
Abstract: Accurate localization is of crucial importance for autonomous driving tasks. Nowadays, many sensor-rich vehicles (e.g., Robo-taxis) drive on the street autonomously, relying on highly accurate sensors (e.g., Lidar and RTK GPS) and high-resolution maps. However, low-cost production cars cannot afford such high expenses on sensors and maps. How can costs be reduced? How can sensor-rich vehicles benefit low-cost cars? In this paper, we propose a light-weight localization solution that relies on low-cost cameras and compact visual semantic maps. The map is easily produced and updated by sensor-rich vehicles in a crowd-sourced way. Specifically, the map consists of several semantic elements, such as lane lines, crosswalks, ground signs, and stop lines on the road surface. We introduce the whole framework of on-vehicle mapping, on-cloud maintenance, and user-end localization. The map data is collected and preprocessed on vehicles. Then, the crowd-sourced data is uploaded to a cloud server. The mass data from multiple vehicles are merged on the cloud so that the semantic map is updated in time. Finally, the semantic map is compressed and distributed to production cars, which use this map for localization. We validate the performance of the proposed map in real-world experiments and compare it against other algorithms. The average size of the semantic map is 36 kb/km. We highlight that this framework is a reliable and practical localization solution for autonomous driving.

Journal ArticleDOI
TL;DR: The results show the proposed system addresses the issues of safety and environmental sustainability with an acceptable communication delay, compared to the baseline scenario where no advisory information is provided during the merging process.
Abstract: Ramp merging is considered as one of the most difficult driving scenarios due to the chaotic nature in both longitudinal and lateral driver behaviors (namely lack of effective coordination) in the merging area. In this study, we have designed a cooperative ramp merging system for connected vehicles, allowing merging vehicles to cooperate with others prior to arriving at the merging zone. Different from most of the existing studies that utilize dedicated short-range communication, we adopt a Digital Twin approach based on vehicle-to-cloud communication. On-board devices upload the data to the cloud server through the 4G/LTE cellular network. The server creates Digital Twins of vehicles and drivers whose parameters are synchronized in real time with their counterparts in the physical world, processes the data with the proposed models in the digital world, and sends advisory information back to the vehicles and drivers in the physical world. A real-world field implementation has been conducted in Riverside, California, with three passenger vehicles. The results show the proposed system addresses the issues of safety and environmental sustainability with an acceptable communication delay, compared to the baseline scenario where no advisory information is provided during the merging process.

Journal ArticleDOI
TL;DR: The optimizations of IoT-UAV data gathering and UAV-LEO data transmission are merged into an integrated optimization problem, which is solved with the aid of the successive convex approximation (SCA) and the block coordinate descent (BCD) techniques and achieves better performance than the benchmark algorithms in terms of both energy consumption and total upload data amount.
Abstract: With the advance of unmanned aerial vehicles (UAVs) and low earth orbit (LEO) satellites, the integration of space, air and ground networks has become a potential solution to the beyond fifth generation (B5G) Internet of remote things (IoRT) networks. However, due to the network heterogeneity and the high mobility of UAVs and LEOs, how to design an efficient UAV-LEO integrated data collection scheme without infrastructure support is very challenging. In this paper, we investigate the resource allocation problem for a two-hop uplink UAV-LEO integrated data collection for the B5G IoRT networks, where numerous UAVs gather data from IoT devices and transmit the IoT data to LEO satellites. In order to maximize the data gathering efficiency in the IoT-UAV data gathering process, we study the bandwidth allocation of IoT devices and the 3-dimensional (3D) trajectory design of UAVs. In the UAV-LEO data transmission process, we jointly optimize the transmit powers of UAVs and the selections of LEO satellites for the total uploaded data amount and the energy consumption of UAVs. Considering the relay role and the cache capacity limitations of UAVs, we merge the optimizations of IoT-UAV data gathering and UAV-LEO data transmission into an integrated optimization problem, which is solved with the aid of the successive convex approximation (SCA) and the block coordinate descent (BCD) techniques. Simulation results demonstrate that the proposed scheme achieves better performance than the benchmark algorithms in terms of both energy consumption and total upload data amount.

Journal ArticleDOI
TL;DR: This article proposes a vehicle cooperative positioning (CP) system based on federated learning (FedVCP), which makes full use of the potential of social Internet of Things (IoT) and collaborative edge computing (CEC) to provide high-precision positioning correction while ensuring user privacy.
Abstract: Intelligent vehicle applications, such as autonomous driving and collision avoidance, put forward a higher demand for precise positioning of vehicles. The current widely used global navigation satellite systems (GNSS) cannot meet the precision requirements of the submeter level. Due to the development of sensing techniques and vehicle-to-infrastructure (V2I) communications, some vehicles can interact with surrounding landmarks to achieve precise positioning. Existing work aims to realize the positioning correction of common vehicles by sharing the positioning data of sensor-rich vehicles. However, the privacy of trajectory data makes it difficult to collect and train data centrally. Moreover, uploading vehicle location data wastes network resources. To fill these gaps, this article proposes a vehicle cooperative positioning (CP) system based on federated learning (FedVCP), which makes full use of the potential of social Internet of Things (IoT) and collaborative edge computing (CEC) to provide high-precision positioning correction while ensuring user privacy. To the best of our knowledge, this article is the first attempt to solve the privacy of CP from a perspective of federated learning. In addition, we take the advantages of local cooperation through vehicle-to-vehicle (V2V) communications in data augmentation. For individual differences in vehicle positioning, we utilize transfer learning to eliminate the impact of such differences. Extensive experiments on real data demonstrate that our proposed model is superior to the baseline method in terms of effectiveness and convergence speed.

Journal ArticleDOI
TL;DR: A single keyword based searchable encryption scheme for the applications where multiple data owners upload their data and then multiple users can access the data that is proven adaptively secure against chosen-keyword attack in the random oracle model.
Abstract: Searchable encryption enables a cloud server to search over encrypted data without decrypting the data. Single keyword based searchable encryption enables a user to access the subset of documents that contains the keyword of the user’s interest. In this paper, we present a single keyword based searchable encryption scheme for applications where multiple data owners upload their data and then multiple users can access the data. The scheme uses attribute based encryption, which allows a user to access the selective subset of data from the cloud without revealing his/her access rights to the cloud server. The scheme is proven adaptively secure against chosen-keyword attack in the random oracle model. We have implemented the scheme on a Google Cloud instance, and its performance is found to be practical for real-world applications.
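For intuition only, single-keyword search over outsourced data can be sketched with HMAC-based trapdoors: the server matches deterministic keyword tags without ever seeing the keywords. This generic sketch is not the paper's scheme, which additionally uses attribute-based encryption and carries a formal security proof; all names below are hypothetical.

```python
import hmac
import hashlib

def trapdoor(key, keyword):
    """Deterministic keyword tag; the server sees tags, not keywords."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).hexdigest()

def build_index(key, docs):
    """docs: {doc_id: [keywords]} -> inverted index keyed by keyword tags.

    A data owner builds this index locally and uploads it alongside the
    (separately encrypted) documents.
    """
    index = {}
    for doc_id, keywords in docs.items():
        for kw in keywords:
            index.setdefault(trapdoor(key, kw), set()).add(doc_id)
    return index

def search(index, key, keyword):
    """A user derives the trapdoor; the server does a blind lookup."""
    return index.get(trapdoor(key, keyword), set())
```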

Journal ArticleDOI
TL;DR: This paper proposes an effective Android malware detection system, MobiTive, leveraging customized deep neural networks to provide a real-time and responsive detection environment on mobile devices and investigates the performance of different feature extraction methods based on source code or binary code and the potential based on the evolution of mobile devices’ specifications.
Abstract: Currently, Android malware detection is mostly performed on the server side against the increasing number of malware. Powerful computing resources provide more exhaustive protection for app markets than detection maintained by a single user. However, apart from the applications (apps) provided by the official market (i.e., Google Play Store), apps from unofficial markets and third-party resources are always causing serious security threats to end-users. Meanwhile, downloading an app and then uploading it to the server side for detection is time-consuming, because the network transmission incurs substantial overhead. In addition, the uploading process also suffers from the security threats of attackers. Consequently, a last line of defense on mobile devices is necessary and much-needed. In this paper, we propose an effective Android malware detection system, MobiTive, leveraging customized deep neural networks to provide a real-time and responsive detection environment on mobile devices. MobiTive is a pre-installed solution rather than an app scanning and monitoring engine used after installation, which is more practical and secure. Although a deep learning-based approach can be maintained on the server side efficiently for malware detection, original deep learning models cannot be directly deployed and executed on mobile devices due to various performance limitations, such as computation power, memory size, and energy.
Therefore, we evaluate and investigate the following key points: (1) the performance of different feature extraction methods based on source code or binary code; (2) the performance of different feature type selections for deep learning on mobile devices; (3) the detection accuracy of different deep neural networks on mobile devices; (4) the real-time detection performance and accuracy on different mobile devices; (5) the potential based on the evolution trend of mobile devices’ specifications; and finally we further propose a practical solution (MobiTive) to detect Android malware on mobile devices.

Journal ArticleDOI
TL;DR: A framework of edge-based communication optimization is studied to reduce the number of end devices directly connected to the server while avoiding uploading unnecessary local updates, and a model cleaning method based on cosine similarity is proposed to avoid unnecessary communication.
Abstract: Federated learning can achieve the purpose of distributed machine learning without sharing the private and sensitive data of end devices. However, high concurrent access to the server increases the transmission delay of model updates, and a local model may be an unnecessary model with a gradient opposite to that of the global model, thus incurring a large number of additional communication costs. To this end, we study a framework of edge-based communication optimization to reduce the number of end devices directly connected to the server while avoiding the upload of unnecessary local updates. Specifically, we cluster devices in the same network location and deploy mobile edge nodes in different network locations to serve as hubs for cloud and end device communications, thereby avoiding the latency associated with high server concurrency. Meanwhile, we propose a model cleaning method based on cosine similarity. If the value of similarity is less than a preset threshold, the local update will not be uploaded to the mobile edge nodes, thus avoiding unnecessary communication. Experimental results show that compared with traditional federated learning, the proposed scheme reduces the number of local updates by 60%, and accelerates the convergence speed of the regression model by 10.3%.
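The cosine-similarity cleaning rule described above can be sketched directly: a local update is uploaded only if its direction roughly agrees with the global update. The threshold value and function names are illustrative assumptions.

```python
import numpy as np

def should_upload(local_update, global_update, threshold=0.0):
    """Return True if the local update's cosine similarity with the
    global update meets the preset threshold; otherwise drop it to
    save communication."""
    denom = np.linalg.norm(local_update) * np.linalg.norm(global_update)
    if denom == 0:
        return False  # a zero update carries no information
    cos = float(np.dot(local_update, global_update)) / denom
    return cos >= threshold
```

An edge node would apply this filter before relaying each client's update toward the cloud.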

Journal ArticleDOI
TL;DR: A distributed access control system based on blockchain technology to secure IoT data, which avoids a single point of failure by providing dynamic and fine-grained access control for IoT data.
Abstract: With the development of the Internet of Things (IoT) field, more and more data are generated by IoT devices and transferred over the network. However, a large amount of IoT data is sensitive, and the leakage of such data is a privacy breach. The security of sensitive IoT data is a big issue, as the data is shared over an insecure network channel. Current solutions include symmetric encryption and access controls to secure the data transfer, but they have drawbacks such as a single point of failure. Blockchain is a promising distributed ledger technology that can prevent the malicious tampering of data, offering reliable data storage. This paper proposes a distributed access control system based on blockchain technology to secure IoT data. The proposed mechanism is based on fog computing and the concept of the alliance chain. The method uses mixed linear and nonlinear spatiotemporal chaotic systems (MLNCML) and the least significant bit (LSB) to encrypt the IoT data on an edge node and then upload the encrypted data to the cloud. The proposed mechanism avoids a single point of failure by providing dynamic and fine-grained access control for IoT data. The experimental results demonstrate that the method can protect the privacy of IoT data efficiently.

Journal ArticleDOI
TL;DR: An InterPlanetary File System (IPFS) storage-based double-blockchain solution for protecting agricultural sampled data in IoT networks; it consumes less time than cloud storage and blockchain-only storage, and the system is more robust than the other two systems.

Journal ArticleDOI
TL;DR: In this article, two bandwidth allocation schemes were proposed to maximize the number of active clients under the constraints of both latency and bandwidth in a federated learning network, where multiple mobile clients train their individual models with the help of one central server.
Abstract: This paper investigates a wireless federated learning (FL) network with limited communication bandwidth, where multiple mobile clients train their individual models with the help of one central server. We consider the practical communication scenarios, where the clients should complete the local computation and model upload within a defined latency. By jointly exploiting the dynamic characteristics of wireless channels and computational capability at the clients, we optimize the federated learning network by maximizing the number of active clients under the constraints of both latency and bandwidth. Specifically, we propose two bandwidth allocation (BA) schemes, where scheme I is based on the instantaneous channel state information (CSI), while scheme II employs the particle swarm optimization (PSO) method, based on the statistical CSI. Simulation results on the test accuracy and convergence rate are finally provided to demonstrate the advantages of the proposed optimization schemes for the considered FL network.
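A greedy sketch of admission under the latency and bandwidth constraints: each client's minimum bandwidth requirement follows from the model size, its achievable spectral efficiency, and the latency left after local computation, and admitting the cheapest clients first maximizes the count. This simplifies the paper's formulation considerably, and all parameter names are hypothetical.

```python
def max_active_clients(model_bits, latency, total_bw, clients):
    """Maximize the number of clients that can upload in time.

    clients: list of (spectral_eff_bits_per_s_per_hz, compute_time_s).
    A client needs bandwidth b such that model_bits / (eff * b) fits in
    the upload time left after local computation; sorting requirements
    ascending and admitting greedily maximizes the admitted count.
    """
    needs = []
    for eff, t_cmp in clients:
        t_up = latency - t_cmp
        if t_up > 0:
            needs.append(model_bits / (eff * t_up))  # required bandwidth (Hz)
    needs.sort()
    used, count = 0.0, 0
    for b in needs:
        if used + b > total_bw:
            break
        used += b
        count += 1
    return count
```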

Journal ArticleDOI
TL;DR: This article designs a layered index based on segment trees to dynamically organize messages containing spatial, temporal, and keyword information over a dynamic message data set in intelligent transportation system (ITS) scenarios, and presents a two-server privacy-preserving spatiotemporal keyword query scheme.
Abstract: The sixth-generation (6G) communication technology has been attracting great interest from both industry and academia, as it is regarded as a promising approach to achieve more stable and low-latency communication. These promising features of 6G make it an enabler for cybertwin, a technique to create digital representations of physical objects to implement various functionalities. In this article, we consider a cybertwin-based spatiotemporal keyword query service over a dynamic message data set in intelligent transportation system (ITS) scenarios. In this service, publishers upload messages to the cloud, and each cybertwin predictively launches queries to retrieve messages on behalf of the corresponding vehicle, so that each vehicle can timely receive messages of interest whenever it arrives at a location. Nevertheless, as the cloud is not fully trusted, there exist privacy concerns related to the messages and queries. Although many schemes have been proposed to handle privacy-preserving spatial, temporal, or keyword queries, none of them can simultaneously support queries combining spatial, temporal, and keyword criteria on dynamic data sets. To address this issue, we design a layered index based on segment trees to dynamically organize messages containing spatial, temporal, and keyword information. Moreover, based on a symmetric homomorphic encryption scheme, we encrypt the messages and queries and present a two-server privacy-preserving spatiotemporal keyword query scheme. We analyze the security of the proposed scheme and also conduct extensive experiments to evaluate its performance. The results show that our proposed scheme is indeed privacy preserving and computationally efficient.
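The layered-index idea rests on the standard segment-tree decomposition: a temporal range query touches only O(log n) canonical nodes, each holding the messages in its slot range. A minimal plaintext sketch of that building block (class and method names are illustrative; the paper's index additionally layers spatial and keyword structures and operates on encrypted data):

```python
class SegmentTree:
    """Segment tree over time slots. Each node covers a slot range and
    stores the ids of messages whose timestamp falls inside it, so a
    temporal range query touches only O(log n) canonical nodes."""

    def __init__(self, n):
        # iterative layout: leaves live at indices n..2n-1, root at 1
        self.n = n
        self.buckets = [[] for _ in range(2 * n)]

    def insert(self, slot, msg_id):
        # push the message into its leaf and every ancestor node
        i = slot + self.n
        while i >= 1:
            self.buckets[i].append(msg_id)
            i //= 2

    def query(self, lo, hi):
        """Collect messages with timestamp slot in the half-open [lo, hi)."""
        res, lo, hi = [], lo + self.n, hi + self.n
        while lo < hi:
            if lo & 1:
                res.extend(self.buckets[lo]); lo += 1
            if hi & 1:
                hi -= 1; res.extend(self.buckets[hi])
            lo //= 2; hi //= 2
        return sorted(res)

t = SegmentTree(8)
t.insert(2, "a"); t.insert(5, "b"); t.insert(7, "c")
print(t.query(2, 6))  # → ['a', 'b']
```

Because the canonical nodes of a query are disjoint, each matching message is reported exactly once; in the encrypted scheme the per-node buckets would hold ciphertexts rather than plain ids.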

Journal ArticleDOI
TL;DR: The simulation results show that the proposed priority-based service scheduling scheme not only increases the service ratio but also improves the fresh-data service ratio and the average request-serving latency, indicating a viable and efficient approach for satisfying service requests at the ZSP in the IoD.
Abstract: With the continuous miniaturization of sensors and processors and ubiquitous wireless connectivity, unmanned aerial vehicles (UAVs), also referred to as drones, are finding many new uses in enhancing our lives and paving the way to the realization of the Internet of Drones (IoD). In the IoD, a myriad of multisized and heterogeneous drones seamlessly interact with Zone Service Providers (ZSPs) to assist drones in accessing controlled airspace and to provide navigation services. However, due to the high mobility of drones and the limited communication bandwidth between drones and the ZSP, service scheduling becomes a critical issue when a set of drones wants to upload/download data to/from the ZSP. In this article, we propose a priority-based service scheduling scheme, named Psched, to provide efficient data upload/download service at the ZSP in the IoD. The basic idea is that Psched objectively and equitably assigns weights to multiple service scheduling parameters based on multiattribute decision-making theory, calculates the serving priority of each service request group, and then serves the groups in order of their calculated priority. In addition, Psched takes into account bandwidth competition between upload and download service requests, and provides service request balancing to achieve the maximum benefit of the scheduling scheme. In the experimental study, we choose request deadline, data size, and data popularity as service scheduling parameters, and conduct extensive simulation experiments using OMNeT++ for performance evaluation and comparison. The simulation results show that the proposed priority-based service scheduling scheme not only increases the service ratio but also improves the fresh-data service ratio and the average request-serving latency, indicating a viable and efficient approach for satisfying service requests at the ZSP in the IoD.
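The priority computation can be illustrated as a weighted sum over normalized attributes, with deadline and data size treated as cost attributes (smaller is better) and popularity as a benefit attribute. A minimal sketch with fixed illustrative weights; the paper derives its weights objectively from multiattribute decision-making theory, so the names and numbers here are assumptions:

```python
def normalize(values, benefit=True):
    """Min-max normalize an attribute column; benefit attributes reward
    large values, cost attributes (deadline, data size) reward small ones."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    if benefit:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]

def serving_priority(requests, weights):
    """Weighted-sum priority per request group: tighter deadlines, smaller
    data, and higher popularity score higher; serve in descending order."""
    deadlines = normalize([r["deadline"] for r in requests], benefit=False)
    sizes = normalize([r["size"] for r in requests], benefit=False)
    pops = normalize([r["popularity"] for r in requests], benefit=True)
    scores = [
        weights[0] * d + weights[1] * s + weights[2] * p
        for d, s, p in zip(deadlines, sizes, pops)
    ]
    order = sorted(range(len(requests)), key=lambda i: -scores[i])
    return order, scores

requests = [
    {"deadline": 2.0, "size": 10, "popularity": 5},
    {"deadline": 8.0, "size": 40, "popularity": 9},
    {"deadline": 5.0, "size": 20, "popularity": 1},
]
order, _ = serving_priority(requests, weights=(0.5, 0.3, 0.2))
print(order)  # → [0, 2, 1]
```

The urgent small request is served first even though it is less popular, which is the trade-off the weighting is meant to arbitrate.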

Journal ArticleDOI
TL;DR: This survey examines the issues involved in handling encrypted multimedia data, focusing on reversible data hiding in encrypted images (RDHEI) and on the birth and evolution of RDHEI methods over the last 12 years.

Proceedings ArticleDOI
Dakuo Wang1, Josh Andres1, Justin D. Weisz1, Erick Oduor1, Casey Dugan1 
06 May 2021
TL;DR: This paper introduces AutoDS, an automated machine learning (AutoML) system that leverages the latest ML automation techniques to support data science projects: data workers upload their dataset, and the system automatically suggests ML configurations, preprocesses data, selects an algorithm, and trains the model.
Abstract: Data science (DS) projects often follow a lifecycle that consists of laborious tasks for data scientists and domain experts (e.g., data exploration, model training, etc.). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces AutoDS, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess data, select an algorithm, and train the model. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface. Our goal is to offer a systematic investigation of user interaction and perceptions of using an AutoDS system to solve a data science task. We studied AutoDS with 30 professional data scientists, where one group used AutoDS, and the other did not, to complete a data science project. As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.

Journal ArticleDOI
TL;DR: A decentralized learning AMC (DecentAMC) method using model aggregation and a lightweight design, which substantially reduces the storage and computational requirements of the edge devices as well as the communication overhead, while markedly improving training efficiency.
Abstract: Due to the implementation and performance limitations of the centralized learning automatic modulation classification (CentAMC) method, this paper proposes a decentralized learning AMC (DecentAMC) method using model aggregation and a lightweight design. Specifically, model aggregation is performed by a central device (CD), while multiple edge devices (EDs) carry out the model training. The lightweight design is realized by a separable convolutional neural network (S-CNN), in which separable convolution layers replace the standard convolution layers and most of the fully connected layers are cut off. Simulation results show that the proposed method substantially reduces the storage and computational capacity requirements of the EDs as well as the communication overhead. The training efficiency also shows remarkable improvement. Compared with a standard convolutional neural network (CNN), the space complexity (i.e., model parameters and output feature maps) is decreased by about 94% and the time complexity (i.e., floating point operations) of S-CNN is decreased by about 96%, while the average correct classification probability degrades by less than 1%. Compared with S-CNN-based CentAMC, without considering model weight uploading and downloading, the training efficiency of our proposed method is about N times higher, where N is the number of EDs. Considering model weight uploading and downloading, the training efficiency of our proposed method can still be maintained at a high level (e.g., when the number of EDs is 12, the training efficiency of the proposed AMC method is about 4 times that of S-CNN-based CentAMC on dataset D1={2FSK, 4FSK, 8FSK, BPSK, QPSK, 8PSK, 16QAM} and about 5 times that of S-CNN-based CentAMC on dataset D2={2FSK, 4FSK, 8FSK, BPSK, QPSK, 8PSK, PAM2, PAM4, PAM8, 16QAM}), while the communication overhead is reduced by more than 35%.
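The parameter savings of separable convolutions can be verified by comparing weight counts: a standard k×k convolution needs k·k·C_in·C_out weights, while a depthwise-separable one needs k·k·C_in (depthwise) plus C_in·C_out (pointwise). A quick sanity check; the layer sizes are illustrative, not those of S-CNN:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution layer (bias omitted)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution mixing channels."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 64, 128)            # 9 * 64 * 128 = 73728
sep = separable_conv_params(3, 64, 128)  # 576 + 8192 = 8768
print(f"standard={std}, separable={sep}, reduction={1 - sep / std:.1%}")
```

Even for this single mid-sized layer the reduction is roughly 88%, so stacking such layers (and dropping most fully connected layers) makes the ~94% space-complexity reduction reported above plausible.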

Journal ArticleDOI
TL;DR: A distributed media transaction framework for DRM, based on digital watermarking and a scalable blockchain model, that allows only authorized users to use online content and provides original multimedia content.
Abstract: Although the Internet promotes data sharing and transparency, it does not protect digital content. In today's digital world, releasing a DRM (Digital Rights Management) system that can be considered well protected has become a difficult task. Digital content that becomes easily available in open-source environments will in time be worthless to its creator. There may only be a one-time payment to creators upon initial upload to a given platform, after which the rights to the intellectual property shift to the platform itself. However, because the content is available online, anyone can download it and make copies. The value of digital content slowly decreases, because that value is usually determined by the difficulty of accessing the content. There is no way to track the leakage of digital material, or its copyright, as it spreads. In this paper, a distributed media transaction framework for DRM is proposed, based on digital watermarking and a scalable blockchain model. Our focus is on improving classic blockchain systems to make them suitable for a DRM model. The DRM model in this paper allows only authorized users to use online content and provides original multimedia content, while digital watermarking is used to reclaim copyright ownership of offline content in the event that the content is leaked.
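The watermark-based ownership claim can be illustrated with a toy least-significant-bit scheme: identifying bits are embedded into pixel LSBs before release and extracted later to demonstrate provenance. This is only a fragile stand-in for the robust, imperceptible watermarking a real DRM framework would use; the function names are hypothetical:

```python
def embed_lsb(pixels, bits):
    """Embed watermark bits into the least-significant bit of the first
    len(bits) pixels; remaining pixels pass through unchanged."""
    marked = [(p & ~1) | b for p, b in zip(pixels, bits)]
    return marked + pixels[len(bits):]

def extract_lsb(pixels, n_bits):
    """Recover the embedded bits by reading the pixel LSBs back."""
    return [p & 1 for p in pixels[:n_bits]]

cover = [120, 121, 122, 123, 124]   # toy grayscale pixel values
mark = [1, 0, 1]                    # owner-identifying bit string
stego = embed_lsb(cover, mark)
print(extract_lsb(stego, 3))  # → [1, 0, 1]
```

Each pixel changes by at most 1, so the mark is visually imperceptible, but it would not survive recompression; robust watermarks trade capacity for exactly that resilience.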

Journal ArticleDOI
TL;DR: A lightweight, privacy-preserving convolutional neural network model, embedded in the designed secure computing protocols, that allows CAVs to exchange raw sensor data without leaking private information while maintaining lightweight execution efficiency.
Abstract: Collaborative perception enables autonomous vehicles to exchange sensor data with each other to achieve cooperative object classification, which is considered an effective means to improve the perception accuracy of connected autonomous vehicles (CAVs). To protect information privacy in cooperative perception, we propose a lightweight, privacy-preserving cooperative object classification framework that allows CAVs to exchange raw sensor data (e.g., images captured by an HD camera) without leaking private information. Leveraging chaotic encryption and an additive secret sharing technique, image data is first encrypted into two ciphertexts and processed, in encrypted form, by two separate edge servers. The use of chaotic mapping avoids information leakage during data uploading. The encrypted images are then processed by the proposed privacy-preserving convolutional neural network (P-CNN) model embedded in the designed secure computing protocols. Finally, the processed results are combined and decrypted on the receiving vehicles to realize cooperative object classification. We formally prove the correctness and security of the proposed framework and carry out intensive experiments to evaluate its performance. Experiment results indicate that P-CNN offers almost exactly the same object classification results as the original CNN model, while providing strong privacy protection of shared data and lightweight execution efficiency.
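The additive secret-sharing step can be sketched directly: each pixel x is split as x = (s1 + s2) mod 256, so each edge server sees only one uniformly random share and learns nothing on its own. A minimal sketch, omitting the chaotic-encryption stage and the secure CNN protocols of P-CNN; names are illustrative:

```python
import secrets

def share_pixel(x, modulus=256):
    """Split a pixel value into two additive shares, x = (s1 + s2) mod 256.
    Either share alone is uniformly random and reveals nothing about x."""
    s1 = secrets.randbelow(modulus)
    s2 = (x - s1) % modulus
    return s1, s2

def reconstruct(s1, s2, modulus=256):
    """Recombine the two shares to recover the original pixel value."""
    return (s1 + s2) % modulus

image = [12, 200, 77, 0, 255]
shares = [share_pixel(p) for p in image]
server_a = [s1 for s1, _ in shares]   # ciphertext sent to edge server A
server_b = [s2 for _, s2 in shares]   # ciphertext sent to edge server B
recovered = [reconstruct(a, b) for a, b in zip(server_a, server_b)]
print(recovered == image)  # → True
```

Additive sharing is linear, so the two servers can compute sums (and, with extra protocol steps, products) on their shares locally, which is what makes evaluating convolution layers under encryption feasible.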

Posted Content
TL;DR: In this article, a multi-exit-based federated edge learning (ME-FEEL) framework is proposed, where the deep model can be divided into several sub-models with different depths and output prediction from the exit in the corresponding submodel.
Abstract: In this paper, we investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks. In this system, the IoT devices can collaboratively train a shared model without compromising data privacy. However, due to limited resources in industrial IoT networks, including computational power, bandwidth, and channel state, it is challenging for many devices to accomplish local training and upload weights to the edge server in time. To address this issue, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework, where the deep model can be divided into several sub-models with different depths, and output predictions come from the exit of the corresponding sub-model. In this way, devices with insufficient computational power can choose earlier exits and avoid training the complete model, which helps reduce computational latency and enables devices to participate in aggregation as much as possible within a latency threshold. Moreover, we propose a greedy-approach-based exit selection and bandwidth allocation algorithm to maximize the total number of exits in each communication round. Simulation experiments are conducted on the classical Fashion-MNIST dataset under a non-independent and identically distributed (non-IID) setting, and the results show that the proposed strategy outperforms conventional FL. In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
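The exit-selection idea can be sketched as each device greedily taking the deepest exit whose cumulative training cost fits its latency budget, which maximizes the total number of exits trained in a round. A toy sketch under that assumption; the costs, budgets, and function names are illustrative, not the paper's joint exit-and-bandwidth algorithm:

```python
def choose_exit(exit_costs, budget):
    """Pick the deepest exit a device can train within its latency budget;
    return -1 if even the earliest exit does not fit (device skips the round)."""
    best = -1
    for e, cost in enumerate(exit_costs):  # costs grow with depth
        if cost <= budget:
            best = e
    return best

def greedy_round(exit_costs, device_budgets):
    """Per-round greedy selection: each device takes its deepest feasible
    exit; training up to exit e contributes e + 1 exits to the total."""
    chosen = [choose_exit(exit_costs, b) for b in device_budgets]
    total_exits = sum(e + 1 for e in chosen if e >= 0)
    return chosen, total_exits

exit_costs = [1.0, 2.5, 4.0]    # cumulative cost of training up to each exit
budgets = [0.5, 1.2, 3.0, 5.0]  # heterogeneous device latency budgets
print(greedy_round(exit_costs, budgets))  # → ([-1, 0, 1, 2], 6)
```

The weakest device drops out while the others contribute partial models of increasing depth, illustrating how multi-exit training lets resource-poor devices still participate in aggregation.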