
Showing papers on "Upload" published in 2022


Journal ArticleDOI
TL;DR: The DAVID Gene system was rebuilt to cover more organisms, increasing the taxonomy coverage from 17 399 to 55 464, and a species parameter was added for uploading lists of gene symbols to minimize ambiguity between species, which increases the efficiency of the list upload and eliminates confusion for users.
Abstract: DAVID is a popular bioinformatics resource system including a web server and web service for functional annotation and enrichment analyses of gene lists. It consists of a comprehensive knowledgebase and a set of functional analysis tools. Here, we report all updates made in 2021. The DAVID Gene system was rebuilt to gain coverage of more organisms, which increased the taxonomy coverage from 17 399 to 55 464. All existing annotation types have been updated, if available, based on the new DAVID Gene system. Compared with the last version, the number of gene-term records for most annotation types within the updated Knowledgebase has significantly increased. Moreover, we have incorporated new annotations in the Knowledgebase, including small molecule-gene interactions from PubChem, drug-gene interactions from DrugBank, tissue expression information from the Human Protein Atlas, disease information from DisGeNET, and pathways from WikiPathways and PathBank. Eight of ten subgroups split from the UniProt Keyword annotation were assigned to specific types. Finally, we added a species parameter for uploading a list of gene symbols to minimize the ambiguity between species, which increases the efficiency of the list upload and eliminates confusion for users. These current updates have significantly expanded the Knowledgebase and enhanced the discovery power of DAVID.

860 citations




Journal ArticleDOI
TL;DR: Federated learning allows several actors to collaborate on the development of a single, robust machine learning model without sharing data, allowing crucial issues such as data privacy, data security, data access rights, and access to heterogeneous data to be addressed.
Abstract: Federated learning (also known as collaborative learning) is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging those samples. This strategy differs from standard centralized machine learning techniques, in which all local datasets are uploaded to a single server, as well as from more traditional decentralized alternatives, which frequently presume that local data samples are identically distributed. Federated learning allows several actors to collaborate on the development of a single, robust machine learning model without sharing data, allowing crucial issues such as data privacy, data security, data access rights, and access to heterogeneous data to be addressed. Defence, telecommunications, Internet of Things, and pharmaceutical industries are just a few of the sectors where it has applications.
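The aggregation step this abstract contrasts with centralized training can be sketched in a few lines. This is a minimal, illustrative FedAvg-style round, assuming each client model is a plain list of floats; `local_update` is a stand-in for real local training, not any library's API.

```python
# Minimal sketch of federated averaging: clients train locally, the
# server combines models weighted by local dataset size. All names and
# the toy "training" rule are illustrative assumptions.

def local_update(weights, data, lr=0.1):
    # Placeholder local step: nudge every weight toward the local data mean.
    mean = sum(data) / len(data)
    return [w - lr * (w - mean) for w in weights]

def federated_average(client_weights, client_sizes):
    # Weighted average of client models, proportional to local dataset size.
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# One communication round: only models travel, never the raw data.
global_model = [0.0, 0.0]
clients = {"a": [1.0, 2.0, 3.0], "b": [5.0]}
updates = [local_update(global_model, d) for d in clients.values()]
global_model = federated_average(updates, [len(d) for d in clients.values()])
```

The key property is visible in the last two lines: the server sees only `updates`, while each client's `data` stays local.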

334 citations


Journal ArticleDOI
TL;DR: In this article, a joint content caching and user association optimization problem is formulated to minimize the content download latency, and a joint CC and UA optimization algorithm (JCC-UA) is proposed.
Abstract: Deploying small cell base stations (SBS) under the coverage area of a macro base station (MBS), and caching popular contents at the SBSs in advance, are effective means to provide high-speed and low-latency services in next generation mobile communication networks. In this paper, we investigate the problem of content caching (CC) and user association (UA) for edge computing. A joint CC and UA optimization problem is formulated to minimize the content download latency. We prove that the joint CC and UA optimization problem is NP-hard. Then, we propose a CC and UA algorithm (JCC-UA) to reduce the content download latency. JCC-UA includes a smart content caching policy (SCCP) and dynamic user association (DUA). SCCP utilizes the exponential smoothing method to predict content popularity and cache contents according to prediction results. DUA includes a rapid association (RA) method and a delayed association (DA) method. Simulation results demonstrate that the proposed JCC-UA algorithm can effectively reduce the latency of user content downloading and improve the hit rates of contents cached at the BSs as compared to several baseline schemes.
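The SCCP component above predicts content popularity with exponential smoothing and caches the predicted top contents. A hedged sketch of that idea follows; the smoothing factor, cache size, and content names are illustrative assumptions, not the paper's parameters.

```python
# Exponential smoothing of per-content request counts, then cache the
# top-k contents by smoothed popularity (sketch of the SCCP idea).

def smooth(prev, observed, alpha=0.5):
    # s_t = alpha * x_t + (1 - alpha) * s_{t-1}
    return alpha * observed + (1 - alpha) * prev

def update_and_cache(popularity, requests, cache_size=2):
    # Update smoothed popularity for every known content, then rank.
    for content in popularity:
        popularity[content] = smooth(popularity[content], requests.get(content, 0))
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    return ranked[:cache_size]

popularity = {"a": 10.0, "b": 4.0, "c": 1.0}   # smoothed history
cached = update_and_cache(popularity, {"b": 20, "c": 2})
```

Content "b" overtakes "a" because the new burst of requests outweighs the decayed history, which is exactly the responsiveness exponential smoothing buys over a raw cumulative count.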

85 citations


Journal ArticleDOI
TL;DR: Simulation results obtained under heterogeneous home environments indicate the advantage of the proposed approach in terms of convergence speed, appliance energy consumption, and number of agents.
Abstract: This article proposes a novel federated reinforcement learning (FRL) approach for the energy management of multiple smart homes with home appliances, a solar photovoltaic system, and an energy storage system. The novelty of the proposed FRL approach lies in the development of a distributed deep reinforcement learning (DRL) model that consists of local home energy management systems (LHEMSs) and a global server (GS). Using energy consumption data, DRL agents for LHEMSs construct and upload their local models to the GS. Then, the GS aggregates the local models to update a global model for LHEMSs and broadcasts it to the DRL agents. Finally, the DRL agents replace the previous local models with the global model and iteratively reconstruct their local models. Simulation results obtained under heterogeneous home environments indicate the advantage of the proposed approach in terms of convergence speed, appliance energy consumption, and number of agents.

70 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed FedCPF, an efficient-communication approach consisting of three parts: “Customized”, “Partial”, and “Flexible”.
Abstract: The sixth-generation network (6G) is expected to achieve a fully connected world, which makes full use of a large amount of sensitive data. Federated Learning (FL) is an emerging distributed computing paradigm. In Vehicular Edge Computing (VEC), FL is used to protect consumer data privacy. However, using FL in VEC leads to expensive communication overheads, thereby occupying regular communication resources. In traditional FL, the massive number of communication rounds before convergence leads to enormous communication costs. Furthermore, in each communication round, many clients upload large quantities of model parameters to the parameter server in the uplink communication phase, which increases communication overheads. Moreover, a few straggler links and clients may prolong training time in each round, which decreases the efficiency of FL and potentially increases the communication costs. In this work, we propose an efficient-communication approach, which consists of three parts, “Customized”, “Partial”, and “Flexible”, known as FedCPF. FedCPF provides a customized local training strategy for vehicular clients to achieve convergence quickly through a constraint item within fewer communication rounds. Moreover, considering uplink congestion, we introduce a partial client participation rule to avoid numerous vehicles uploading their updates simultaneously. In addition, regarding the diverse finishing times of federated training, we present a flexible aggregation policy for valid updates by constraining the upload time. Experimental results show that FedCPF outperforms the traditional FedAvg algorithm in terms of testing accuracy and communication optimization in various FL settings. Compared with the baseline, FedCPF achieves efficient communication with faster convergence and improves test accuracy by 6.31% on average. In addition, the average communication optimization rate is improved by 2.15 times.
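Two of the three FedCPF ingredients, partial client participation and deadline-constrained ("flexible") aggregation, can be sketched independently of the vehicular setting. The sampling fraction, deadline, and update values below are illustrative assumptions, not the paper's configuration.

```python
# Sketch of partial participation (only a sampled subset of clients
# uploads each round) and flexible aggregation (late updates dropped).

import random

def select_clients(clients, fraction, seed=0):
    # Partial participation: sample a fraction of clients per round,
    # which limits simultaneous uplink traffic.
    rng = random.Random(seed)
    k = max(1, int(len(clients) * fraction))
    return rng.sample(clients, k)

def aggregate_valid(updates, deadline):
    # Flexible aggregation: keep only updates that finish by the deadline,
    # so stragglers cannot stall the round.
    valid = [u for finish_time, u in updates if finish_time <= deadline]
    return sum(valid) / len(valid) if valid else None

chosen = select_clients(["v1", "v2", "v3", "v4"], fraction=0.5)
# (finish_time, update_value) pairs reported by the chosen vehicles:
avg = aggregate_valid([(0.8, 1.0), (1.5, 9.0), (0.9, 3.0)], deadline=1.0)
```

The straggler that finishes at 1.5 is simply excluded from the round's average, trading a little statistical efficiency for a bounded round time.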

56 citations


Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed an efficient privacy-preserving data aggregation mechanism for federated learning to resist the reverse attack, which can aggregate users' trained models secretly without leaking any user's model.
Abstract: Federated learning (FL) is a kind of privacy-aware machine learning in which the models are trained on the users' side and then the model updates are transmitted to the server for aggregation. As the data owners need not upload their data, FL is a privacy-preserving machine learning model. However, FL is vulnerable to a reverse attack, in which an adversary can recover users' data by analyzing the models they upload. Motivated by this, in this paper, based on secret sharing, we design EPPDA, an efficient privacy-preserving data aggregation mechanism for FL, to resist the reverse attack; it can aggregate users' trained models secretly without leaking any user's model. Moreover, EPPDA has efficient fault tolerance for user disconnection. Even if a large number of users are disconnected while the protocol runs, EPPDA will execute normally. Analysis shows that EPPDA can provide the sum of locally trained models to the server without leaking any single user's model. Moreover, an adversary cannot obtain any non-public information from the communication channel. Efficiency verification proves that EPPDA not only protects users' privacy but also needs fewer computing and communication resources.
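The core secret-sharing idea, that shares of many users can be summed so only the total is revealed, is easy to show in miniature. This is a toy additive-sharing sketch, a deliberate simplification of EPPDA (no fault tolerance, scalar "models", invented names), not the paper's protocol.

```python
# Additive secret sharing over a public modulus P: each user splits its
# private update into shares that individually look random, yet the
# share-wise sums reconstruct only the total of all updates.

import random

P = 2**31 - 1  # public modulus (illustrative choice)

def share(secret, n, rng):
    # Split `secret` into n additive shares summing to it mod P.
    parts = [rng.randrange(P) for _ in range(n - 1)]
    parts.append((secret - sum(parts)) % P)
    return parts

rng = random.Random(42)
updates = [7, 11, 5]                          # each user's private update
all_shares = [share(u, 3, rng) for u in updates]
# Each aggregator sums the shares it receives from every user...
partials = [sum(col) % P for col in zip(*all_shares)]
# ...and combining the partial sums reveals only the total, never any
# individual user's update.
total = sum(partials) % P
```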

52 citations


Journal ArticleDOI
TL;DR: In this article , a federated learning approach of POI recommendation is proposed to provide preferable POI recommendations while protecting user privacy of data communication in a distributed collaborative environment, where only calculated gradient information is uploaded from users to the FL server while all the users manage their rating and geographic preference data on their own devices for privacy protection during communications.
Abstract: With the popularity of the Internet of Things (IoT), Point-of-Interest (POI) recommendation has become an important application for location-based services (LBS). Meanwhile, there is an increasing requirement from IoT devices for the privacy of user-sensitive data in wireless communications. In order to provide preferable POI recommendations while protecting user privacy of data communication in a distributed collaborative environment, this paper proposes a federated learning (FL) approach to geographical POI recommendation. The POI recommendation is formulated as an optimization problem of matrix factorization, and the singular value decomposition (SVD) technique is applied for matrix decomposition. After proving the nonconvex property of the optimization problem, we further introduce stochastic gradient descent (SGD) into SVD and design an FL framework for solving the POI recommendation problem in a parallel manner. In our FL scheme, only calculated gradient information is uploaded from users to the FL server, while all users manage their rating and geographic preference data on their own devices for privacy protection during communications. Finally, a real-world dataset from a large-scale LBS enterprise is adopted for conducting extensive experiments, whose results validate the efficacy of our approach.
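The SGD-on-matrix-factorization step the abstract describes has a compact form: for each observed rating, move the user and item factor vectors along the gradient of the squared error. The sketch below is centralized for clarity (in the federated setting only the gradients would leave the device); the learning rate, regularization, and target rating are illustrative assumptions.

```python
# One SGD step of regularized matrix factorization: minimize
# (r_ui - p_u . q_i)^2 + reg * (|p_u|^2 + |q_i|^2) for one rating.

def sgd_step(p_u, q_i, r_ui, lr=0.01, reg=0.1):
    # Prediction error for one (user, item) rating.
    err = r_ui - sum(pu * qi for pu, qi in zip(p_u, q_i))
    # Gradient updates for both factor vectors (with L2 regularization).
    new_p = [pu + lr * (err * qi - reg * pu) for pu, qi in zip(p_u, q_i)]
    new_q = [qi + lr * (err * pu - reg * qi) for pu, qi in zip(p_u, q_i)]
    return new_p, new_q

# Repeatedly fitting a single rating drives the prediction toward it.
p, q = [0.1, 0.1], [0.1, 0.1]
for _ in range(200):
    p, q = sgd_step(p, q, r_ui=4.0)
pred = sum(pu * qi for pu, qi in zip(p, q))
```

With regularization on, `pred` converges slightly below the target rating of 4.0, which is the usual shrinkage effect of the L2 term.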

48 citations


Journal ArticleDOI
01 Jan 2022
TL;DR: FSD50K as mentioned in this paper is an open dataset containing over 51k audio clips and over 100 h of audio manually labeled using 200 classes drawn from the AudioSet Ontology, which is used for sound event recognition.
Abstract: Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on over 2M tracks from YouTube videos and encompassing over 500 sound classes. However, AudioSet is not an open dataset as its official release consists of pre-computed audio features. Downloading the original audio tracks can be problematic due to YouTube videos gradually disappearing and usage rights issues. To provide an alternative benchmark dataset and thus foster SER research, we introduce FSD50K, an open dataset containing over 51k audio clips totalling over 100 h of audio manually labeled using 200 classes drawn from the AudioSet Ontology. The audio clips are licensed under Creative Commons licenses, making the dataset freely distributable (including waveforms). We provide a detailed description of the FSD50K creation process, tailored to the particularities of Freesound data, including challenges encountered and solutions adopted. We include a comprehensive dataset characterization along with discussion of limitations and key factors to allow its audio-informed usage. Finally, we conduct sound event classification experiments to provide baseline systems as well as insight on the main factors to consider when splitting Freesound audio data for SER. Our goal is to develop a dataset to be widely adopted by the community as a new open benchmark for SER research.

43 citations


Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a digital forensics tool to protect end users in 5G heterogeneous networks, which is built based on deep learning and can realize the detection of attacks via classification.
Abstract: The upcoming 5G heterogeneous networks (HetNets) have attracted much attention worldwide. Large amounts of high-velocity data can be transported by using the bandwidth spectrum of HetNets, yielding both great benefits and several concerning issues. In particular, great harm to our community could occur if the main visual information channels, such as images and videos, are maliciously attacked and uploaded to the Internet, where they can be spread quickly. Therefore, we propose a novel framework as a digital forensics tool to protect end users. It is built based on deep learning and can realize the detection of attacks via classification. Compared with the conventional methods and justified by our experiments, the data collection efficiency, robustness, and detection performance of the proposed model are all refined. In addition, assisted by 5G HetNets, our proposed framework makes it possible to provide high-quality real-time forensics services on edge consumer devices such as cell phones and laptops, which brings colossal practical value. Some discussions are also carried out to outline potential future threats.

42 citations



Journal ArticleDOI
TL;DR: In this paper, the authors propose Overlap-FedAvg, an innovative framework that loosens the chain-like constraint of federated learning and parallelizes the model training phase with the model communication phase, so that the latter can be completely covered by the former.
Abstract: While petabytes of data are generated each day by a number of independent computing devices, only a few of them can ultimately be collected and used for deep learning (DL) due to concerns about data security and privacy leakage, seriously retarding the expansion of DL. In such a circumstance, federated learning (FL) was proposed to perform model training on multiple clients' combined data without dataset sharing within the cluster. Nevertheless, federated learning with periodic model averaging (FedAvg) introduces massive communication overhead, as the synchronized data in each iteration is about the same size as the model, leading to low communication efficiency. Consequently, variant proposals focusing on communication round reduction and data compression were put forward to decrease the communication overhead of FL. In this article, we propose Overlap-FedAvg, an innovative framework that loosens the chain-like constraint of federated learning and parallelizes the model training phase with the model communication phase (i.e., uploading local models and downloading the global model), so that the latter phase can be completely covered by the former. Compared with vanilla FedAvg, Overlap-FedAvg is further developed with a hierarchical computing strategy, a data compensation mechanism, and a Nesterov accelerated gradient (NAG) algorithm. In particular, Overlap-FedAvg is orthogonal to many other compression methods, so they can be applied together to maximize the utilization of the cluster. In addition, a theoretical analysis is provided to prove the convergence of the proposed framework. Extensive experiments conducted on both image classification and natural language processing tasks with multiple models and datasets also demonstrate that the proposed framework substantially reduces the communication overhead and boosts the federated learning process.
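The overlap idea, hiding the upload of round t behind the local training of round t+1, can be sketched with a background thread. This is a rough illustration of the scheduling pattern only, not Overlap-FedAvg itself; the `train`/`upload` stand-ins and sleep times are invented.

```python
# Overlap communication with computation: while round t's model uploads
# on a background thread, the main thread already trains round t+1.

import threading
import time

results = []

def upload(model):
    time.sleep(0.05)                  # simulated network transfer
    results.append(("uploaded", model))

def train(model):
    time.sleep(0.05)                  # simulated local computation
    return model + 1                  # toy "training": increment the model

model = 0
pending = None
for _ in range(3):
    new_model = train(model)          # compute round t+1...
    if pending is not None:
        pending.join()                # ...while round t's upload finished
    pending = threading.Thread(target=upload, args=(new_model,))
    pending.start()
    model = new_model
pending.join()                        # flush the last upload
```

Because training and uploading take comparable time here, each `join()` returns almost immediately: the communication cost is hidden behind computation, which is the framework's central claim.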


Journal ArticleDOI
TL;DR: Clinicians should be aware of the widespread dissemination of health misinformation on social media platforms and its potential impact on clinical care, as approximately half of the analyzed TikTok videos about ADHD were misleading.
Abstract: Objectives Social media platforms are increasingly being used to disseminate mental health information online. User-generated content about attention-deficit/hyperactivity disorder (ADHD) is one of the most popular health topics on the video-sharing social media platform TikTok. We sought to investigate the quality of TikTok videos about ADHD. Method The top 100 most popular videos about ADHD uploaded by TikTok video creators were classified as misleading, useful, or personal experience. Descriptive and quantitative characteristics of the videos were obtained. The Patient Education Materials Assessment Tool for Audiovisual Materials (PEMAT-A/V) and Journal of American Medical Association (JAMA) benchmark criteria were used to assess the overall quality, understandability, and actionability of the videos. Results Of the 100 videos meeting inclusion criteria, 52% (n = 52) were classified as misleading, 27% (n = 27) as personal experience, and 21% (n = 21) as useful. Classification agreement between clinician ratings was 86% (kappa statistic of 0.7766). Videos on the platform were highly understandable by viewers but had low actionability. Non-healthcare providers uploaded the majority of misleading videos. Healthcare providers uploaded higher quality and more useful videos, compared to non-healthcare providers. Conclusions Approximately half of the analyzed TikTok videos about ADHD were misleading. Clinicians should be aware of the widespread dissemination of health misinformation on social media platforms and its potential impact on clinical care.
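The study reports 86% raw inter-rater agreement alongside a kappa of 0.7766, the distinction being that kappa discounts agreement expected by chance. A small sketch of Cohen's kappa follows, using made-up labels rather than the study's data.

```python
# Cohen's kappa: (observed agreement - chance agreement) / (1 - chance).

def cohens_kappa(rater_a, rater_b):
    labels = sorted(set(rater_a) | set(rater_b))
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labelled the same.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    p_e = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    return (p_o - p_e) / (1 - p_e)

# Illustrative ratings using the study's three categories:
a = ["misleading", "useful", "misleading", "personal", "useful"]
b = ["misleading", "useful", "useful", "personal", "useful"]
kappa = cohens_kappa(a, b)
```

Here raw agreement is 0.8 but kappa is lower (about 0.69), showing why the paper reports both numbers.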

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed two novel schemes for outsourcing differential privacy: the first efficiently achieves outsourcing by using a preprocessing method and secure building blocks, and the second, which supports queries from multiple evaluators, employs a trusted execution environment to aggregately implement privacy mechanisms on multiple queries.
Abstract: Since big data becomes a main impetus to the next generation of IT industry, data privacy has received considerable attention in recent years. To deal with the privacy challenges, differential privacy has been widely discussed and related private mechanisms are proposed as privacy-enhancing techniques. However, with today’s differential privacy techniques, it is difficult to generate a sanitized dataset that can suit every machine learning task. In order to adapt to various tasks and budgets, different kinds of privacy mechanisms have to be implemented, which inevitably incur enormous costs for computation and interaction. To this end, in this article, we propose two novel schemes for outsourcing differential privacy. The first scheme efficiently achieves outsourcing differential privacy by using our preprocessing method and secure building blocks. To support the queries from multiple evaluators, we give the second scheme that employs a trusted execution environment to aggregately implement privacy mechanisms on multiple queries. During data publishing, our proposed schemes allow providers to go off-line after uploading their datasets, so that they achieve a low communication cost which is one of the critical requirements for a practical system. Finally, we report an experimental evaluation on UCI datasets, which confirms the effectiveness of our schemes.
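The "privacy mechanisms" the paper outsources are of the classic differential-privacy kind; the simplest is the Laplace mechanism, which adds noise scaled to sensitivity/epsilon to a numeric query. A minimal sketch follows; the query, sensitivity, and epsilon are illustrative, and this is textbook DP, not the paper's outsourced protocol.

```python
# Laplace mechanism: add Laplace(0, sensitivity/epsilon) noise to a
# query answer, drawn here by inverse-transform sampling.

import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    b = sensitivity / epsilon             # Laplace scale parameter
    u = rng.random() - 0.5                # uniform in (-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    noise = -b * sign * math.log(1 - 2 * abs(u))
    return true_value + noise

rng = random.Random(0)
# A counting query has sensitivity 1: one person changes it by at most 1.
noisy_count = laplace_mechanism(true_value=100, sensitivity=1, epsilon=0.5, rng=rng)
```

Smaller epsilon means a larger scale `b` and noisier answers; this one-parameter trade-off is what makes tailoring mechanisms to "various tasks and budgets" costly, motivating the outsourcing schemes above.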

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers and derived a theoretical convergence upper bound for the UDP algorithm.
Abstract: Federated learning (FL), as a type of collaborative machine learning framework, is capable of preserving private data from mobile terminals (MTs) while training the data into useful models. Nevertheless, from a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs. To address this problem, we first make use of the concept of local differential privacy (LDP), and propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers. According to our analysis, the UDP framework can realize $(\epsilon_{i}, \delta_{i})$-LDP for the $i$-th MT with adjustable privacy protection levels by varying the variances of the artificial noise processes. We then derive a theoretical convergence upper bound for the UDP algorithm. It reveals that there exists an optimal number of communication rounds to achieve the best learning performance. More importantly, we propose a communication rounds discounting (CRD) method. Compared with the heuristic search method, the proposed CRD method can achieve a much better trade-off between the computational complexity of searching and the convergence performance. Extensive experiments indicate that our UDP algorithm using the proposed CRD method can effectively improve both the training efficiency and model quality for the given privacy protection levels.
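The perturbation step described above, adding artificial noise to each model parameter before upload, is simple to sketch. The Gaussian noise, sigma value, and toy model below are illustrative assumptions; the paper's actual calibration ties the noise variance to the target $(\epsilon_{i}, \delta_{i})$ levels.

```python
# Perturb a local model before upload: larger sigma gives stronger
# privacy protection at the cost of noisier server-side aggregation.

import random

def perturb_model(weights, sigma, rng):
    # Add independent zero-mean Gaussian noise to every parameter.
    return [w + rng.gauss(0.0, sigma) for w in weights]

rng = random.Random(1)
local_model = [0.5, -1.2, 3.0]
shared_model = perturb_model(local_model, sigma=0.1, rng=rng)
# Only `shared_model` leaves the device; `local_model` stays private.
```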


Journal ArticleDOI
TL;DR: This work proposes a provably secure efficient data-sharing scheme without RSU for 5G-enabled vehicular networks that not only achieves privacy and security requirements but also withstands various security attacks on the vehicular network.
Abstract: The vehicles in fifth-generation (5G)-enabled vehicular networks exchange data about road conditions, as message transmission rates and download service rates have improved considerably. The data shared by vehicles are vulnerable to privacy and security issues. Notably, the existing schemes require expensive components, namely road-side units (RSUs), to authenticate the messages for the joining process. To cope with these issues, this paper proposes a provably secure, efficient data-sharing scheme without RSUs for 5G-enabled vehicular networks. Our work includes six phases, namely: the TA initialization (TASetup) phase, pseudonym-identity generation (PIDGen) phase, key generation (KeyGen) phase, message signing (MsgSign) phase, single verification (SigVerify) phase, and batch signature verification (BSigVerify) phase. A vehicle in our scheme has the ability to verify multiple signatures simultaneously. Our work not only achieves privacy and security requirements but also withstands various security attacks on the vehicular network. Ultimately, our work also achieves favourable performance compared with other existing schemes with regard to the costs of communication and computation.

Journal ArticleDOI
TL;DR: In this article , multiple reconfigurable intelligent surfaces (RISs) are used to achieve efficient and reliable learning-oriented wireless connectivity to solve the problem of model aggregation in federated learning systems.
Abstract: The fundamental communication paradigms in the next-generation mobile networks are shifting from connected things to connected intelligence. The potential result is that current communication-centric wireless systems are greatly stressed when supporting computation-centric intelligent services with distributed big data. This is one reason that makes federated learning come into being, it allows collaborative training over many edge devices while avoiding the transmission of raw data. To tackle the problem of model aggregation in federated learning systems, this article resorts to multiple reconfigurable intelligent surfaces (RISs) to achieve efficient and reliable learning-oriented wireless connectivity. The seamless integration of communication and computation is actualized by over-the-air computation (AirComp), which can be deemed as one of the uplink nonorthogonal multiple access (NOMA) techniques without individual information decoding. Since all local parameters are uploaded via noisy concurrent transmissions, the unfavorable propagation error inevitably deteriorates the accuracy of the aggregated global model. The goals of this work are to 1) alleviate the signal distortion of AirComp over shared wireless channels and 2) speed up the convergence rate of federated learning. More specifically, both the mean-square error (MSE) and the device set in the model uploading process are optimized by jointly designing transceivers, tuning reflection coefficients, and selecting clients. Compared to baselines, extensive simulation results show that 1) the proposed algorithms can aggregate model more accurately and accelerate convergence and 2) the training loss and inference accuracy of federated learning can be improved significantly with the aid of multiple RISs.

Journal ArticleDOI
TL;DR: An innovative learning framework, named DeEnAMC, is proposed for AMC, realized by combining decentralized learning and ensemble learning; results show that DeEnAMC reduces communication overhead while keeping a classification performance similar to that of DecentAMC.
Abstract: To deal with deep learning-based automatic modulation classification (AMC) in the scenario where the training dataset is distributed over a network without gathering the data at a centralized location, decentralized learning-based AMC (DecentAMC) has been presented. However, the frequent model parameter uploading and downloading in the DecentAMC method causes high communication overhead. In this paper, an innovative learning framework is proposed for AMC (named DeEnAMC), in which the framework is realized by utilizing a combination of decentralized learning and ensemble learning. Our results show that the proposed DeEnAMC reduces communication overhead while keeping a classification performance similar to that of DecentAMC.

Journal ArticleDOI
01 Jan 2022
TL;DR: In this paper, a secure distributed data storage scheme is proposed, which can be used in blockchain-enabled edge computing; performance analysis shows that the proposed scheme can be executed with low computational cost, which is practical for IoT environments in blockchain-enabled edge computing.
Abstract: The Internet of Things (IoT) connects distributed devices via the network to realize smart applications such as intelligent transportation, intelligent manufacturing, smart grid, and smart home. Edge computing integrates edge devices and provides efficient computation and storage services with low latency for users. By combining with blockchain, edge computing can provide more secure data storage and transmission, supporting tamper resistance and traceability for IoT. However, existing data storage schemes are not suitable for processing the gathered data in blockchain-enabled edge computing, and the efficiency of data error locating in previous schemes is very low. In this paper, a secure distributed data storage scheme is proposed, which can be used in blockchain-enabled edge computing. The design of the proposed scheme uses bilinear pairing and BLS-HLA (BLS Homomorphic Linear Authenticator), which allows end clients to check the integrity of uploaded data. In addition, error locating and data dynamics are supported through the use of a Counting Bloom Filter (CBF). Security analysis indicates that the proposed scheme is correct, that data-block errors are detectable, and that the false positive rate is negligible. Performance analysis shows that the proposed scheme can be executed with low computational cost, which makes it practical for IoT environments in blockchain-enabled edge computing.
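The error-locating structure in this abstract relies on a Counting Bloom Filter. A minimal sketch (with hypothetical sizing, not the paper's parameters) shows why counters, unlike plain Bloom bits, also support deletion as data blocks change:

```python
import hashlib

class CountingBloomFilter:
    """Counting Bloom filter: each slot holds a counter rather than a
    single bit, so items (here, data-block identifiers) can be removed
    as well as inserted -- enabling the data dynamics the scheme needs."""

    def __init__(self, size=1024, num_hashes=3):
        self.counters = [0] * size
        self.size = size
        self.num_hashes = num_hashes

    def _indexes(self, item):
        # Derive k independent slot indexes by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item):
        for idx in self._indexes(item):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def __contains__(self, item):
        # Possibly a false positive, never a false negative.
        return all(self.counters[idx] > 0 for idx in self._indexes(item))

cbf = CountingBloomFilter()
cbf.add("block-42")
```

The false-positive rate the security analysis calls "negligible" is tuned by choosing `size` and `num_hashes` relative to the number of stored blocks.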

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a new algorithm, called CREAT, in which a blockchain-assisted compressed federated learning algorithm is applied to content caching to predict which files to cache; each edge node uses local data to train a model and then uses the model to predict popular files to improve the cache hit rate.
Abstract: Edge computing architectures can help us quickly process the data collected by the Internet of Things (IoT), and caching files at edge nodes can speed up the responses to IoT devices requesting files. Blockchain architectures can help us ensure the security of data transmitted by IoT. Therefore, we propose a system that combines IoT devices, edge nodes, a remote cloud, and blockchain. In this system, we design a new algorithm, called CREAT, in which a blockchain-assisted compressed federated learning algorithm is applied to content caching to predict which files to cache. In the CREAT algorithm, each edge node uses local data to train a model and then uses the model to learn the features of users and files, so as to predict popular files and improve the cache hit rate. In order to ensure the security of edge nodes' data, we use federated learning (FL) to enable multiple edge nodes to cooperate in training without sharing data. In addition, to reduce the communication load in FL, we compress the gradients uploaded by edge nodes to reduce the time required for communication. Moreover, to ensure the security of the data transmitted in the CREAT algorithm, we incorporate blockchain technology into the algorithm and design four smart contracts for decentralized entities to record and verify transactions. We used the MovieLens datasets for experiments, and the results show that CREAT greatly improves the cache hit rate and reduces the time required to upload data.
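The gradient-compression step described above can be illustrated with top-k sparsification, one common scheme for cutting federated-learning upload cost: only the k largest-magnitude gradient entries are transmitted as (index, value) pairs. The abstract does not specify CREAT's exact compressor, so this is an illustrative stand-in.

```python
import numpy as np

def topk_compress(grad, k):
    """Keep only the k largest-magnitude entries of a gradient vector,
    so only (index, value) pairs travel over the network instead of
    the full dense vector."""
    idx = np.argsort(np.abs(grad))[-k:]
    return idx, grad[idx]

def topk_decompress(idx, values, length):
    """Server side: scatter the received entries back into a dense
    zero vector before aggregation."""
    out = np.zeros(length)
    out[idx] = values
    return out

g = np.array([0.1, -3.0, 0.02, 2.5, -0.4])
idx, vals = topk_compress(g, 2)
restored = topk_decompress(idx, vals, len(g))
```

With k = 2 out of 5 entries, the upload shrinks to 40% of the dense payload while the dominant gradient directions survive.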

Journal ArticleDOI
TL;DR: In this article, the authors investigate a novel scenario of computation offloading in an MEC-assisted architecture, where task upload coordination between multiple vehicles, task migration between MEC/cloud servers, and the heterogeneous computation capabilities of MEC and cloud servers are comprehensively investigated.
Abstract: Mobile edge computing (MEC) has been an effective paradigm for supporting computation-intensive applications by offloading resources to the network edge. In vehicular networks especially, the MEC server is deployed as a small-scale computation server at the roadside and offloads computation-intensive tasks to its local server. However, due to the unique characteristics of vehicular networks, including the high mobility of vehicles, the dynamic distribution of vehicle densities, and the heterogeneous capacities of MEC servers, it is still challenging to implement an efficient computation offloading mechanism in MEC-assisted vehicular networks. In this article, we investigate a novel scenario of computation offloading in an MEC-assisted architecture, where task upload coordination between multiple vehicles, task migration between MEC/cloud servers, and the heterogeneous computation capabilities of MEC/cloud servers are comprehensively investigated. On this basis, we formulate the cooperative computation offloading (CCO) problem by modeling the procedure of task upload, migration, and computation based on queuing theory, aiming to minimize the delay of task completion. To tackle the CCO problem, we propose a probabilistic computation offloading (PCO) algorithm, which enables the MEC server to independently make online scheduling decisions based on the derived allocation probability. Specifically, PCO transforms the objective function into an augmented Lagrangian and achieves the optimal solution iteratively, based on a convex framework called the Alternating Direction Method of Multipliers (ADMM). Finally, we implement the simulation model; comprehensive simulation results show the superiority of the proposed algorithm under a wide range of scenarios.
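Once the ADMM step has produced allocation probabilities, the online scheduling reduces to sampling a destination for each arriving task. A sketch of that dispatch loop follows; the probability values and server labels are invented for illustration, not derived from the paper's optimization.

```python
import numpy as np

rng = np.random.default_rng(3)

def dispatch_tasks(num_tasks, probs, rng=rng):
    """Route each arriving task to a server index drawn from the
    allocation probabilities (the quantity the PCO algorithm derives;
    here the values are simply assumed)."""
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()          # guard against rounding drift
    return rng.choice(len(probs), size=num_tasks, p=probs)

# Hypothetical destinations: local MEC server, neighboring MEC server,
# remote cloud, weighted toward the low-latency local server.
assignments = dispatch_tasks(1000, [0.6, 0.3, 0.1])
```

The appeal of the probabilistic form is that each MEC server can run this loop independently and online, without coordinating per-task decisions with its neighbors.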

Journal ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed a lightweight, privacy-preserving cooperative object classification framework that allows CAVs to exchange raw sensor data (e.g., images captured by an HD camera) without leaking private information.
Abstract: Collaborative perception enables autonomous vehicles to exchange sensor data with each other to achieve cooperative object classification, which is considered an effective means to improve the perception accuracy of connected autonomous vehicles (CAVs). To protect information privacy in cooperative perception, we propose a lightweight, privacy-preserving cooperative object classification framework that allows CAVs to exchange raw sensor data (e.g., images captured by an HD camera) without leaking private information. Leveraging chaotic encryption and the additive secret sharing technique, image data are first encrypted into two ciphertexts and processed, in encrypted form, by two separate edge servers. The use of chaotic mapping avoids information leakage during data uploading. The encrypted images are then processed by the proposed privacy-preserving convolutional neural network (P-CNN) model embedded in the designed secure computing protocols. Finally, the processed results are combined and decrypted on the receiving vehicles to realize cooperative object classification. We formally prove the correctness and security of the proposed framework and carry out intensive experiments to evaluate its performance. The experimental results indicate that P-CNN offers almost the same object classification results as the original CNN model, while offering strong privacy protection of shared data and lightweight execution efficiency.
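The additive secret-sharing half of this pipeline can be sketched for uint8 image data: the image is split into two shares, each uniformly random on its own, whose modular sum recovers the original. The chaotic-encryption stage is omitted, and the mod-256 arithmetic is an illustrative choice, not necessarily the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def share_image(image, rng=rng):
    """Split a uint8 image into two additive shares (mod 256). Each
    share alone is statistically independent of the image, so a single
    edge server learns nothing about the pixel values."""
    share1 = rng.integers(0, 256, size=image.shape, dtype=np.uint8)
    share2 = ((image.astype(np.int16) - share1) % 256).astype(np.uint8)
    return share1, share2

def reconstruct(share1, share2):
    """Recombine the two shares on the receiving vehicle."""
    return ((share1.astype(np.int16) + share2) % 256).astype(np.uint8)

img = np.arange(9, dtype=np.uint8).reshape(3, 3)
s1, s2 = share_image(img)
recovered = reconstruct(s1, s2)
```

In the framework, each of the two edge servers receives one share and runs its half of the P-CNN protocols; only the receiving vehicle, holding both results, can decrypt.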

Journal ArticleDOI
TL;DR: In this article , a joint algorithm of UAV placement, power control, transmission time, model accuracy, bandwidth allocation, and computing resources is proposed to minimize the total energy consumption of the aerial server and users.
Abstract: Since its invention in 2016, federated learning (FL) has been a key concept in artificial intelligence, in which the data of FL users need not be uploaded to the central server. However, performing FL tasks may not be feasible due to the unavailability of terrestrial communications and the battery limitations of FL users. To address these issues, we make use of unmanned aerial vehicles (UAVs) and wireless powered communications (WPC) for FL networks. In order to enable sustainable FL solutions, a UAV equipped with edge computing and WPC capabilities is deployed as an aerial energy source as well as an aerial server to perform FL tasks. We propose a joint algorithm of UAV placement, power control, transmission time, model accuracy, bandwidth allocation, and computing resources, namely energy-efficient FL (E2FL), aiming at minimizing the total energy consumption of the aerial server and users. E2FL overcomes the original nonconvex problem with an efficient algorithm. We show through various simulation results that sustainable FL solutions can be provided via UAV-enabled WPC. Moreover, the outperformance of E2FL in terms of energy efficiency over several benchmarks emphasizes the need for a joint resource allocation framework rather than optimizing a subset of the optimization factors.

Journal ArticleDOI
TL;DR: A transfer learning (TL)-enabled edge-CNN framework for 5G industrial edge networks with privacy-preserving characteristics that can achieve almost 85% of the baseline's prediction accuracy by uploading only about 1% of the model parameters, for an autoencoder compression ratio of 32.
Abstract: In this article, we propose a transfer learning (TL)-enabled edge convolutional neural network (CNN) framework for 5G industrial edge networks with privacy-preserving characteristics. In particular, the edge server can use an existing image dataset to train the CNN in advance, which is further fine-tuned based on the limited datasets uploaded from the devices. With the aid of TL, the devices that are not participating in the training only need to fine-tune the trained edge-CNN model without training from scratch. Due to the energy budget of the devices and the limited communication bandwidth, a joint energy and latency problem is formulated, which is solved by decomposing the original problem into an uploading decision subproblem and a wireless bandwidth allocation subproblem. Experiments using ImageNet demonstrate that the proposed TL-enabled edge-CNN framework can achieve almost 85% of the baseline's prediction accuracy by uploading only about 1% of the model parameters, for an autoencoder compression ratio of 32.

Journal ArticleDOI
01 Mar 2022
TL;DR: Li et al. as mentioned in this paper proposed a federated trained ternary quantization (FTTQ) algorithm, which optimizes the quantized networks on the clients through a self-learning quantization factor.
Abstract: Learning over massive data stored in different locations is essential in many real-world applications. However, sharing data is full of challenges due to the increasing demands for privacy and security with the growing use of smart mobile devices and IoT devices. Federated learning provides a potential solution for privacy-preserving and secure machine learning by jointly training a global model without uploading the data distributed on multiple devices to a central server. However, most existing work on federated learning adopts machine learning models with full-precision weights, and almost all these models contain a large number of redundant parameters that do not need to be transmitted to the server, incurring excessive communication costs. To address this issue, we propose a federated trained ternary quantization (FTTQ) algorithm, which optimizes the quantized networks on the clients through a self-learning quantization factor. Theoretical proofs of the convergence of the quantization factors, the unbiasedness of FTTQ, and a reduced weight divergence are given. On the basis of FTTQ, we propose a ternary federated averaging protocol (T-FedAvg) to reduce the upstream and downstream communication of federated learning systems. Empirical experiments are conducted to train widely used deep learning models on publicly available datasets, and our results demonstrate that the proposed T-FedAvg is effective in reducing communication costs and can even achieve slightly better performance on non-IID data in contrast to canonical federated learning algorithms.
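The core ternary-quantization idea can be sketched as mapping each weight to {-alpha, 0, +alpha}: small weights are pruned to zero and the survivors share one scaling factor. FTTQ learns its quantization factor during training; the closed-form mean-magnitude estimate and the threshold ratio below are simplifying assumptions for illustration.

```python
import numpy as np

def ternary_quantize(w, threshold_ratio=0.05):
    """Quantize a weight vector to {-alpha, 0, +alpha}. Entries below
    a magnitude threshold are zeroed; alpha is taken as the mean
    magnitude of the retained weights (a stand-in for the self-learned
    quantization factor in FTTQ)."""
    delta = threshold_ratio * np.max(np.abs(w))   # pruning threshold
    mask = np.abs(w) > delta                      # weights worth keeping
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.9, -0.7, 0.01, 0.5, -0.02])
q = ternary_quantize(w)
```

Since each quantized weight needs only about two bits (sign or zero) plus one shared alpha per tensor, the upstream payload shrinks dramatically relative to 32-bit floats, which is what T-FedAvg exploits in both directions.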

Journal ArticleDOI
TL;DR: In this article, a distributed quantized gradient approach is proposed, characterized by adaptive communication of the quantized gradients, which achieves the same linear convergence as gradient descent in the strongly convex case while effecting major savings in communication in terms of transmitted bits and communication rounds.
Abstract: This paper focuses on the communication-efficient federated learning problem and develops a novel distributed quantized gradient approach, which is characterized by adaptive communication of the quantized gradients. Specifically, the federated learning builds upon a server-worker infrastructure, where the workers calculate local gradients and upload them to the server; the server then obtains the global gradient by aggregating all the local gradients and utilizes it to update the model parameters. The key idea to save communication from the workers to the server is to quantize gradients as well as skip less informative quantized gradient communications by reusing previous gradients. Quantizing and skipping result in 'lazy' worker-server communication, which justifies the term Lazily Aggregated Quantized (LAQ) gradient. Theoretically, the LAQ algorithm achieves the same linear convergence as gradient descent in the strongly convex case, while effecting major savings in communication in terms of transmitted bits and communication rounds. Empirically, extensive experiments using realistic data corroborate a significant communication reduction compared with state-of-the-art gradient- and stochastic-gradient-based algorithms.
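The quantize-and-skip idea can be sketched from a single worker's perspective: quantize the fresh gradient, and transmit it only if it differs enough from the last uploaded copy, otherwise let the server reuse the stale gradient. LAQ's actual skipping rule is adaptive; the uniform quantizer and the fixed tolerance below are illustrative simplifications.

```python
import numpy as np

def quantize(g, step=0.1):
    """Uniform quantization of a gradient onto a grid of width `step`,
    so each entry can be sent with a few bits."""
    return step * np.round(g / step)

def lazy_upload(g, last_sent, tol=0.05):
    """Upload the quantized gradient only when it has changed enough
    since the last transmission; otherwise skip the round and let the
    server keep aggregating the previously sent copy."""
    q = quantize(g)
    if np.linalg.norm(q - last_sent) ** 2 > tol:
        return q, True          # transmit the new quantized gradient
    return last_sent, False     # skip: reuse the stale gradient

# The new gradient quantizes to the same grid point, so the worker skips.
last = quantize(np.array([0.52, -0.31]))
sent, uploaded = lazy_upload(np.array([0.53, -0.30]), last)
```

Both savings channels are visible here: `quantize` cuts the bits per transmitted round, while the skip branch removes entire communication rounds.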

Journal ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed a layer-based federated learning system with privacy preservation, which reduces the communication cost by selecting several layers of the model to upload for global averaging and enhances privacy protection by applying local differential privacy.
Abstract: In recent years, federated learning has attracted more and more attention, as it can collaboratively train a global model without gathering users' raw data; however, it also brings many challenges. In this paper, we propose a layer-based federated learning system with privacy preservation. We reduce the communication cost by selecting several layers of the model to upload for global averaging, and enhance privacy protection by applying local differential privacy. We evaluated our system on three datasets in a non-independently-and-identically-distributed (non-IID) scenario. Compared with existing works, our solution achieves better performance in both model accuracy and training time.
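The client-side upload step combines both ideas from this abstract: keep only the selected layers, and perturb each with Laplace noise, the standard local-differential-privacy mechanism. The layer names, sensitivity, and epsilon below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(2)

def privatize_layers(model, upload_layers, epsilon=1.0,
                     sensitivity=0.01, rng=rng):
    """Build a client's upload: only the chosen layers are included
    (cutting communication cost), and each is perturbed with Laplace
    noise scaled by sensitivity/epsilon (local differential privacy)."""
    scale = sensitivity / epsilon
    return {
        name: weights + rng.laplace(0.0, scale, size=weights.shape)
        for name, weights in model.items()
        if name in upload_layers
    }

# Hypothetical model: upload only the classifier head, not the backbone.
model = {"conv1": np.zeros(4), "fc1": np.zeros(4), "fc2": np.zeros(4)}
update = privatize_layers(model, upload_layers={"fc1", "fc2"})
```

The server then averages only the received layers across clients, leaving each client's unselected layers both unsent and unexposed.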

Journal ArticleDOI
TL;DR: In this article, the authors proposed three deep learning architectures, F-EDNC, FC-EDNC, and O-EDNC, to detect COVID-19 infections from chest computed tomography (CT) images.
Abstract: The automatic recognition of COVID-19 is critical in the present pandemic, since it relieves healthcare staff of the burden of screening for COVID-19 infection. Previous studies have proven that deep learning algorithms can aid in the diagnosis of patients with potential COVID-19 infection. However, the accuracy of current COVID-19 recognition models is relatively low. Motivated by this fact, we propose three deep learning architectures, F-EDNC, FC-EDNC, and O-EDNC, to quickly and accurately detect COVID-19 infections from chest computed tomography (CT) images. Sixteen deep learning neural networks were modified and trained to recognize COVID-19 patients using transfer learning and 2458 chest CT images. The proposed EDNC was then developed using three of the sixteen modified pre-trained models to improve the performance of COVID-19 recognition. The results suggest that the F-EDNC method significantly enhances the recognition of COVID-19 infections, with 97.75% accuracy, followed by FC-EDNC and O-EDNC (97.55% and 96.12%, respectively), which is superior to most current COVID-19 recognition models. Furthermore, a localhost web application has been built that enables users to easily upload their chest CT scans and obtain their COVID-19 results automatically. This accurate, fast, and automatic COVID-19 recognition system will relieve medical professionals of the stress of screening for COVID-19 infections.