scispace - formally typeset
Search or ask a question

Showing papers in "ACM Transactions on Intelligent Systems and Technology in 2022"


Journal ArticleDOI
TL;DR: A systematic literature review on current research about federated learning in the context of EHR data for healthcare applications and discusses a general architecture for FL applied to healthcare data based on the main insights obtained from the literature review.
Abstract: The use of machine learning (ML) with electronic health records (EHR) is growing in popularity as a means to extract knowledge that can improve the decision-making process in healthcare. Such methods require training of high-quality learning models based on diverse and comprehensive datasets, which are hard to obtain due to the sensitive nature of medical data from patients. In this context, federated learning (FL) is a methodology that enables the distributed training of machine learning models with remotely hosted datasets without the need to accumulate data and, therefore, compromise it. FL is a promising solution to improve ML-based systems, better aligning them to regulatory requirements, improving trustworthiness and data sovereignty. However, many open questions must be addressed before the use of FL becomes widespread. This article aims at presenting a systematic literature review on current research about FL in the context of EHR data for healthcare applications. Our analysis highlights the main research topics, proposed solutions, case studies, and respective ML methods. Furthermore, the article discusses a general architecture for FL applied to healthcare data based on the main insights obtained from the literature review. The collected literature corpus indicates that there is extensive research on the privacy and confidentiality aspects of training data and model sharing, which is expected given the sensitive nature of medical data. Studies also explore improvements to the aggregation mechanisms required to generate the learning model from distributed contributions and case studies with different types of medical data.

75 citations


Journal ArticleDOI
TL;DR: A taxonomy for text classification according to the text involved and the models used for feature extraction and classification is created, dealing with both the technical developments and benchmark datasets that support tests of predictions.
Abstract: Text classification is the most fundamental and essential task in natural language processing. The last decade has seen a surge of research in this area due to the unprecedented success of deep learning. Numerous methods, datasets, and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021, focusing on models from traditional models to deep learning. We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification. We then discuss each of these categories in detail, dealing with both the technical developments and benchmark datasets that support tests of predictions. A comprehensive comparison between different techniques, as well as identifying the pros and cons of various evaluation metrics are also provided in this survey. Finally, we conclude by summarizing key implications, future research directions, and the challenges facing the research area.

52 citations


Journal ArticleDOI
TL;DR: This work proposes a new learning approach, FedBERT, that takes advantage of the federated learning and split learning approaches, resorting to pre-training BERT in a federated way and can prevent sharing the raw data information and obtain excellent performance.
Abstract: The fast growth of pre-trained models (PTMs) has brought natural language processing to a new era, which has become a dominant technique for various natural language processing (NLP) applications. Every user can download the weights of PTMs, then fine-tune the weights for a task on the local side. However, the pre-training of a model relies heavily on accessing a large-scale of training data and requires a vast amount of computing resources. These strict requirements make it impossible for any single client to pre-train such a model. To grant clients with limited computing capability to participate in pre-training a large model, we propose a new learning approach, FedBERT, that takes advantage of the federated learning and split learning approaches, resorting to pre-training BERT in a federated way. FedBERT can prevent sharing the raw data information and obtain excellent performance. Extensive experiments on seven GLUE tasks demonstrate that FedBERT can maintain its effectiveness without communicating to the sensitive local data of clients.

20 citations


Journal ArticleDOI
TL;DR: A general framework is illustrated that formulates the trade-off between privacy loss and utility loss from a unified information-theoretic point of view, and delineates quantitative bounds of privacy-utility trade-offs when different protection mechanisms including Randomization, Sparsity, and Homomorphic Encryption are used.
Abstract: In a federated learning scenario where multiple parties jointly learn a model from their respective data, there exist two conflicting goals for the choice of appropriate algorithms. On one hand, private and sensitive training data must be kept secure as much as possible in the presence of semi-honest partners; on the other hand, a certain amount of information has to be exchanged among different parties for the sake of learning utility. Such a challenge calls for the privacy-preserving federated learning solution, which maximizes the utility of the learned model and maintains a provable privacy guarantee of participating parties’ private data. This article illustrates a general framework that (1) formulates the trade-off between privacy loss and utility loss from a unified information-theoretic point of view, and (2) delineates quantitative bounds of the privacy-utility trade-off when different protection mechanisms including randomization, sparsity, and homomorphic encryption are used. It was shown that in general there is no free lunch for the privacy-utility trade-off, and one has to trade the preserving of privacy with a certain degree of degraded utility. The quantitative analysis illustrated in this article may serve as the guidance for the design of practical federated learning algorithms.

19 citations


Journal ArticleDOI
TL;DR: A federated native ad CTR prediction method named FedCTR is proposed, which can learn user-interest representations from cross-platform user behaviors in a privacy-preserving way and proposes a federated framework for collaborative model training with distributed models and user behaviors.
Abstract: Native ad is a popular type of online advertisement that has similar forms with the native content displayed on websites. Native ad click-through rate (CTR) prediction is useful for improving user experience and platform revenue. However, it is challenging due to the lack of explicit user intent, and user behaviors on the platform with native ads may be insufficient to infer users’ interest in ads. Fortunately, user behaviors exist on many online platforms that can provide complementary information for user-interest mining. Thus, leveraging multi-platform user behaviors is useful for native ad CTR prediction. However, user behaviors are highly privacy-sensitive, and the behavior data on different platforms cannot be directly aggregated due to user privacy concerns and data protection regulations. Existing CTR prediction methods usually require centralized storage of user behavior data for user modeling, which cannot be directly applied to the CTR prediction task with multi-platform user behaviors. In this article, we propose a federated native ad CTR prediction method named FedCTR, which can learn user-interest representations from cross-platform user behaviors in a privacy-preserving way. On each platform a local user model learns user embeddings from the local user behaviors on that platform. The local user embeddings from different platforms are uploaded to a server for aggregation, and the aggregated ones are sent to the ad platform for CTR prediction. Besides, we apply local differential privacy and differential privacy to the local and aggregated user embeddings, respectively, for better privacy protection. Moreover, we propose a federated framework for collaborative model training with distributed models and user behaviors. Extensive experiments on real-world dataset show that FedCTR can effectively leverage multi-platform user behaviors for native ad CTR prediction in a privacy-preserving manner.

18 citations


Journal ArticleDOI
TL;DR: A novel fairness-aware sequential recommendation task, in which a new metric, interaction fairness, is defined to estimate how recommended items are fairly interacted by users with different protected attribute groups, is proposed.
Abstract: Sequential recommendation (SR) learns from the temporal dynamics of user-item interactions to predict the next ones. Fairness-aware recommendation mitigates a variety of algorithmic biases in the learning of user preferences. This article aims at bringing a marriage between SR and algorithmic fairness. We propose a novel fairness-aware sequential recommendation task, in which a new metric, interaction fairness, is defined to estimate how recommended items are fairly interacted by users with different protected attribute groups. We propose a multi-task learning-based deep end-to-end model, FairSR, which consists of two parts. One is to learn and distill personalized sequential features from the given user and her item sequence for SR. The other is fairness-aware preference graph embedding (FPGE). The aim of FPGE is two-fold: incorporating the knowledge of users’ and items’ attributes and their correlation into entity representations, and alleviating the unfair distributions of user attributes on items. Extensive experiments conducted on three datasets show FairSR can outperform state-of-the-art SR models in recommendation performance. In addition, the recommended items by FairSR also exhibit promising interaction fairness.

17 citations


Journal ArticleDOI
TL;DR: This article proposes a novel VFL framework which enables federated inference on non-overlapping data and distill the knowledge of privileged features and transfer them to the parties’ local model which only processes local features.
Abstract: Vertical Federated Learning (VFL) enables multiple parties to collaboratively train a machine learning model over vertically distributed datasets without data privacy leakage. However, there is a limitation of the current VFL solutions: current VFL models fail to conduct inference on non-overlapping samples during inference. This limitation seriously damages the VFL model’s availability because, in practice, overlapping samples may only take up a small portion of the whole data at each party which means a large part of inference tasks will fail. In this article, we propose a novel VFL framework which enables federated inference on non-overlapping data. Our framework regards the distributed features as privileged information which is available in the training period but disappears during inference. We distill the knowledge of such privileged features and transfer them to the parties’ local model which only processes local features. Furthermore, we adopt Oblivious Transfer (OT) to preserve data ID privacy during training and inference. Empirically, we evaluate the model on the real-world dataset collected from Criteo and Taobao. Besides, we also provide a security analysis of the proposed framework.

13 citations


Journal ArticleDOI
TL;DR: This article proposes a feature correlation-aware spatio-temporal graph convolutional networks named MC-STGCN to effectively address the novel problem of multivariate correlation- aware multi-scale traffic flow predicting, and demonstrates the superior performance of the proposal on both fine- and coarse-grained traffic predictions.
Abstract: Traffic flow prediction based on vehicle trajectories collected from the installed GPS devices is critically important to Intelligent Transportation Systems (ITS). One limitation of existing traffic prediction models is that they mostly focus on predicting road-segment level traffic conditions, which can be considered as a fine-grained prediction. In many scenarios, however, a coarse-grained prediction, such as predicting the traffic flows among different urban areas covering multiple road links, is also required to help government have a better understanding on traffic conditions from the macroscopic point of view. This is especially useful in the applications of urban planning and public transportation planning. Another limitation is that the correlations among different types of traffic-related features are largely ignored. For example, the traffic flow and traffic speed are usually negatively correlated. Existing works regard these traffic-related features as independent features without considering their correlations. In this article, we for the first time study the novel problem of multivariate correlation-aware multi-scale traffic flow predicting, and we propose a feature correlation-aware spatio-temporal graph convolutional networks named MC-STGCN to effectively address it. Specifically, given a road graph, we first construct a coarse-grained road graph based on both the topology closeness and the traffic flow similarity among the nodes (road links). Then a cross-scale spatial-temporal feature learning and fusion technique is proposed for dealing with both the fine- and coarse-grained traffic data. In the spatial domain, a cross-scale GCN is proposed to learn the multi-scale spatial features jointly and fuse them together. In the temporal domain, a cross-scale temporal network that is composed of a hierarchical attention is designed for effectively capturing intra- and inter-scale temporal correlations. To effectively capture the feature correlations, a feature correlation learning component is also designed. Finally, a structural constraint is introduced to make the predictions on the two scale traffic data consistent. We conduct extensive evaluations over two real traffic datasets, and the results demonstrate the superior performance of the proposal on both fine- and coarse-grained traffic predictions.

13 citations


Journal ArticleDOI
TL;DR: It is indicated that it is unrealistic to expect an FL algorithm to simultaneously provide excellent privacy, utility, and efficiency in certain scenarios, and a framework is proposed that reconciles horizontal and vertical federated learning.
Abstract: Federated learning (FL) enables participating parties to collaboratively build a global model with boosted utility without disclosing private data information. Appropriate protection mechanisms have to be adopted to fulfill the opposing requirements in preserving privacy and maintaining high model utility. In addition, it is a mandate for a federated learning system to achieve high efficiency in order to enable large-scale model training and deployment. We propose a unified federated learning framework that reconciles horizontal and vertical federated learning. Based on this framework, we formulate and quantify the trade-offs between privacy leakage, utility loss, and efficiency reduction, which leads us to the No-Free-Lunch (NFL) theorem for the federated learning system. NFL indicates that it is unrealistic to expect an FL algorithm to simultaneously provide excellent privacy, utility, and efficiency in certain scenarios. We then analyze the lower bounds for the privacy leakage, utility loss, and efficiency reduction for several widely-adopted protection mechanisms, including Randomization, Homomorphic Encryption, Secret Sharing and Compression. Our analysis could serve as a guide for selecting protection parameters to meet particular requirements.

11 citations


Journal ArticleDOI
TL;DR: In this paper , the authors present a comprehensive appraisal of trustworthy AI from a computational perspective, focusing on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Nondiscrimination, (iii) Explainability, (iv) Privacy, (v) Accountability andamp; Auditability, and (vi) Environmental Well-being.
Abstract: In the past few decades, artificial intelligence (AI) technology has experienced swift developments, changing everyone’s daily life and profoundly altering the course of human society. The intention behind developing AI was and is to benefit humans by reducing labor, increasing everyday conveniences, and promoting social good. However, recent research and AI applications indicate that AI can cause unintentional harm to humans by, for example, making unreliable decisions in safety-critical scenarios or undermining fairness by inadvertently discriminating against a group or groups. Consequently, trustworthy AI has recently garnered increased attention regarding the need to avoid the adverse effects that AI could bring to people, so people can fully trust and live in harmony with AI technologies. A tremendous amount of research on trustworthy AI has been conducted and witnessed in recent years. In this survey, we present a comprehensive appraisal of trustworthy AI from a computational perspective to help readers understand the latest technologies for achieving trustworthy AI. Trustworthy AI is a large and complex subject, involving various dimensions. In this work, we focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Nondiscrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems. We also discuss the accordant and conflicting interactions among different dimensions and discuss potential aspects for trustworthy AI to investigate in the future.

11 citations


Journal ArticleDOI
TL;DR: This work introduces Federated Dynamic Graph Neural Network (Feddy), a distributed and secured framework to learn the object representations from graph sequences that aggregates structural information from nearby objects in the current graph as well as dynamic information from those in the previous graph.
Abstract: Distributed surveillance systems have the ability to detect, track, and snapshot objects moving around in a certain space. The systems generate video data from multiple personal devices or street cameras. Intelligent video-analysis models are needed to learn dynamic representation of the objects for detection and tracking. Can we exploit the structural and dynamic information without storing the spatiotemporal video data at a central server that leads to a violation of user privacy? In this work, we introduce Federated Dynamic Graph Neural Network (Feddy), a distributed and secured framework to learn the object representations from graph sequences: (1) It aggregates structural information from nearby objects in the current graph as well as dynamic information from those in the previous graph. It uses a self-supervised loss of predicting the trajectories of objects. (2) It is trained in a federated learning manner. The centrally located server sends the model to user devices. Local models on the respective user devices learn and periodically send their learning to the central server without ever exposing the user’s data to server. (3) Studies showed that the aggregated parameters could be inspected though decrypted when broadcast to clients for model synchronizing, after the server performed a weighted average. We design an appropriate aggregation mechanism of secure aggregation primitives that can protect the security and privacy in federated learning with scalability. Experiments on four video camera datasets as well as simulation demonstrate that Feddy achieves great effectiveness and security.

Journal ArticleDOI
TL;DR: Personalized Tour Recommendation (PTR) as discussed by the authors is a technique to generate a sequence of point-of-interest (POIs) for a particular tourist, according to the user-specific constraints such as durat...
Abstract: The main objective of Personalized Tour Recommendation (PTR) is to generate a sequence of point-of-interest (POIs) for a particular tourist, according to the user-specific constraints such as durat...

Journal ArticleDOI
TL;DR: Existing works on FL applications in EHRs are surveyed and the performance of current state-of-the-art FL algorithms on two EHR machine learning tasks of significant clinical importance on a real world multi-center EHR dataset is evaluated.
Abstract: In data-driven medical research, multi-center studies have long been preferred over single-center ones due to a single institute sometimes not having enough data to obtain sufficient statistical power for certain hypothesis testings as well as predictive and subgroup studies. The wide adoption of electronic health records (EHRs) has made multi-institutional collaboration much more feasible. However, concerns over infrastructures, regulations, privacy, and data standardization present a challenge to data sharing across healthcare institutions. Federated Learning (FL), which allows multiple sites to collaboratively train a global model without directly sharing data, has become a promising paradigm to break the data isolation. In this study, we surveyed existing works on FL applications in EHRs and evaluated the performance of current state-of-the-art FL algorithms on two EHR machine learning tasks of significant clinical importance on a real world multi-center EHR dataset.

Journal ArticleDOI
TL;DR: A novel Spatiotemporal Adaptive Gated Graph Convolution Network (STAG-GCN) is proposed to predict traffic conditions for several time steps ahead and to seek both spatial neighbors and semantic neighbors to make more connections between road nodes.
Abstract: Urban traffic flow forecasting is a critical issue in intelligent transportation systems. Due to the complexity and uncertainty of urban road conditions, how to capture the dynamic spatiotemporal correlation and make accurate predictions is very challenging. In most of existing works, urban road network is often modeled as a fixed graph based on local proximity. However, such modeling is not sufficient to describe the dynamics of the road network and capture the global contextual information. In this paper, we consider constructing the road network as a dynamic weighted graph through attention mechanism. Furthermore, we propose to seek both spatial neighbors and semantic neighbors to make more connections between road nodes. We propose a novel Spatiotemporal Adaptive Gated Graph Convolution Network (STAG-GCN) to predict traffic conditions for several time steps ahead. STAG-GCN mainly consists of two major components: (1) multivariate self-attention Temporal Convolution Network (TCN) is utilized to capture local and long-range temporal dependencies across recent, daily-periodic and weekly-periodic observations; (2) mix-hop AG-GCN extracts selective spatial and semantic dependencies within multi-layer stacking through adaptive graph gating mechanism and mix-hop propagation mechanism. The output of different components are weighted fused to generate the final prediction results. Extensive experiments on two real-world large scale urban traffic dataset have verified the effectiveness, and the multi-step forecasting performance of our proposed models outperforms the state-of-the-art baselines.

Journal ArticleDOI
TL;DR: A comprehensive review on the graph neural networks can be found in this article , where the authors provide a taxonomy for graph neural network, and refer to up to 400 relevant literatures to show the panorama of the graph networks.
Abstract: Graph neural networks provide a powerful toolkit for embedding real-world graphs into low-dimensional spaces according to specific tasks. Up to now, there have been several surveys on this topic. However, they usually lay emphasis on different angles so that the readers can not see a panorama of the graph neural networks. This survey aims to overcome this limitation, and provide a comprehensive review on the graph neural networks. First of all, we provide a novel taxonomy for the graph neural networks, and then refer to up to 400 relevant literatures to show the panorama of the graph neural networks. All of them are classified into the corresponding categories. In order to drive the graph neural networks into a new stage, we summarize four future research directions so as to overcome the facing challenges. It is expected that more and more scholars can understand and exploit the graph neural networks, and use them in their research community.

Journal ArticleDOI
TL;DR: The Deep Spatio-temporal Adaptive 3D Convolution Neural Network (ST-A3DNet) is proposed, which is a new scheme to solve both spatio-tem temporal correlation and flexibility, and consider spatio/temporal complexity (complex external factors, such as weather and holidays).
Abstract: Traffic flow prediction is the upstream problem of path planning, intelligent transportation system, and other tasks. Many studies have been carried out on the traffic flow prediction of the spatio-temporal network, but the effects of spatio-temporal flexibility (historical data of the same type of time intervals in the same location will change flexibly) and spatio-temporal correlation (different road conditions have different effects at different times) have not been considered at the same time. We propose the Deep Spatio-temporal Adaptive 3D Convolution Neural Network (ST-A3DNet), which is a new scheme to solve both spatio-temporal correlation and flexibility, and consider spatio-temporal complexity (complex external factors, such as weather and holidays). Different from other traffic forecasting models, ST-A3DNet captures the spatio-temporal relationship at the same time through the Adaptive 3D convolution module, assigns different weights flexibly according to the influence of historical data, and obtains the impact of external factors on the flow through the ex-mask module. Considering the holidays and weather conditions, we train our model for experiments in Xi’an and Chengdu. We evaluate the ST-A3DNet and the results show that we have better results than the other 11 baselines.

Journal ArticleDOI
TL;DR: This article comprehensively considers various data distributions on end devices and edges, proposing a hierarchical federated learning framework FLEE, which can realize dynamical updates of models without redeploying them and can improve model performances under all kinds of data distributions.
Abstract: With the development of smart devices, the computing capabilities of portable end devices such as mobile phones have been greatly enhanced. Meanwhile, traditional cloud computing faces great challenges caused by privacy-leakage and time-delay problems, there is a trend to push models down to edges and end devices. However, due to the limitation of computing resource, it is difficult for end devices to complete complex computing tasks alone. Therefore, this article divides the model into two parts and deploys them on multiple end devices and edges, respectively. Meanwhile, an early exit is set to reduce computing resource overhead, forming a hierarchical distributed architecture. In order to enable the distributed model to continuously evolve by using new data generated by end devices, we comprehensively consider various data distributions on end devices and edges, proposing a hierarchical federated learning framework FLEE, which can realize dynamical updates of models without redeploying them. Through image and sentence classification experiments, we verify that it can improve model performances under all kinds of data distributions, and prove that compared with other frameworks, the models trained by FLEE consume less global computing resource in the inference stage.

Journal ArticleDOI
Liwei Deng, Hao Sun, Rui Sun, Yan Zhao, Han Su 
TL;DR: A Similar Subtrajectory Search with a Graph Neural Networks framework is proposed, which outperforms the state-of-the-art baselines consistently and significantly.
Abstract: Although many applications take subtrajectories as basic units for analysis, there is little research on the similar subtrajectory search problem aiming to return a portion of a trajectory (i.e., subtrajectory), which is the most similar to a query trajectory. We find that in some special cases, when a grid-based metric is used, this problem can be formulated as a reading comprehension problem, which has been studied extensively in the field of natural language processing (NLP). By this formulation, we can obtain faster models with better performance than existing methods. However, due to the difference between natural language and trajectory (e.g., spatial relationship), it is impossible to directly apply NLP models to this problem. Therefore, we propose a Similar Subtrajectory Search with a Graph Neural Networks framework. This framework contains four modules including a spatial-aware grid embedding module, a trajectory embedding module, a query-context trajectory fusion module, and a span prediction module. Specifically, in the spatial-aware grid embedding module, the spatial-based grid adjacency is constructed and delivered to the graph neural network to learn spatial-aware grid embedding. The trajectory embedding module aims to model the sequential information of trajectories. The purpose of the query-context trajectory fusion module is to fuse the information of the query trajectory to each grid of the context trajectories. Finally, the span prediction module aims to predict the start and the end of a subtrajectory for the context trajectory, which is the most similar to the query trajectory. We conduct comprehensive experiments on two real world datasets, where the proposed framework outperforms the state-of-the-art baselines consistently and significantly.

Journal ArticleDOI
TL;DR: The experimental results verify that cST-ML can significantly improve the urban traffic prediction performance and outperform all baseline models especially when obvious traffic dynamics and temporal uncertainties are presented.
Abstract: Urban traffic status (e.g., traffic speed and volume) is highly dynamic in nature, namely, varying across space and evolving over time. Thus, predicting such traffic dynamics is of great importance to urban development and transportation management. However, it is very challenging to solve this problem due to spatial-temporal dependencies and traffic uncertainties. In this article, we solve the traffic dynamics prediction problem from Bayesian meta-learning perspective and propose a novel continuous spatial-temporal meta-learner (cST-ML), which is trained on a distribution of traffic prediction tasks segmented by historical traffic data with the goal of learning a strategy that can be quickly adapted to related but unseen traffic prediction tasks. cST-ML tackles the traffic dynamics prediction challenges by advancing the Bayesian black-box meta-learning framework through the following new points: (1) cST-ML captures the dynamics of traffic prediction tasks using variational inference, and to better capture the temporal uncertainties within tasks, cST-ML performs as a rolling window within each task; (2) cST-ML has novel designs in architecture, where CNN and LSTM are embedded to capture the spatial-temporal dependencies between traffic status and traffic-related features; (3) novel training and testing algorithms for cST-ML are designed. We also conduct experiments on two real-world traffic datasets (taxi inflow and traffic speed) to evaluate our proposed cST-ML. The experimental results verify that cST-ML can significantly improve the urban traffic prediction performance and outperform all baseline models especially when obvious traffic dynamics and temporal uncertainties are presented.

Journal ArticleDOI
TL;DR: This work tackles the problem of extreme event detection on large scale transportation networks using origin-destination mobility data, which is now widely available, and proposes a Context augmented Graph Autoencoder (Con-GAE) model, which adopts an autoencoding framework and detects anomalies via semi-supervised learning.
Abstract: Accurate and timely detection of large events on urban transportation networks enables informed mobility management. This work tackles the problem of extreme event detection on large-scale transportation networks using origin-destination mobility data, which is now widely available. Such data is highly structured in time and space, but high dimensional and sparse. Current multivariate time series anomaly detection methods cannot fully address these challenges. To exploit the structure of mobility data, we formulate the event detection problem in a novel way, as detecting anomalies in a set of time-dependent directed weighted graphs. We further propose a Context augmented Graph Autoencoder (Con-GAE) model to solve the problem, which leverages graph embedding and context embedding techniques to capture the spatial and temporal patterns. Con-GAE adopts an autoencoder framework and detects anomalies via semi-supervised learning. The performance of the method is assessed on several city-scale travel-time datasets from Uber Movement, New York taxis, and Chicago taxis and compared to state-of-the-art approaches. The proposed Con-GAE can achieve an improvement in the area under the curve score as large as 0.15 over the second best method. We also discuss real-world traffic anomalies detected by Con-GAE.

Journal ArticleDOI
TL;DR: Axolotl (Automated crossLocation-network Transfer Learning), a novel method aimed at transferring location preference models learned in a data-rich region to significantly boost the quality of recommendations in aData-scarce region is presented.
Abstract: Variability in social app usage across regions results in a high skew of the quantity and the quality of check-in data collected, which in turn is a challenge for effective location recommender systems. In this article, we present Axolotl (Automated crossLocation-network Transfer Learning), a novel method aimed at transferring location preference models learned in a data-rich region to significantly boost the quality of recommendations in a data-scarce region. Axolotl predominantly deploys two channels for information transfer: (1) a meta-learning based procedure learned using location recommendation as well as social predictions, and (2) a lightweight unsupervised cluster-based transfer across users and locations with similar preferences. Both of these work together synergistically to achieve improved accuracy of recommendations in data-scarce regions without any prerequisite of overlapping users and with minimal fine-tuning. We build Axolotl on top of a twin graph-attention neural network model used for capturing the user- and location-conditioned influences in a user-mobility graph for each region. We conduct extensive experiments on 12 user mobility datasets across the US, Japan, and Germany, using three as source regions and nine of them (that have much sparsely recorded mobility data) as target regions. Empirically, we show that Axolotl achieves up to 18% better recommendation performance than the existing state-of-the-art methods across all metrics.

Journal ArticleDOI
TL;DR: A Federated Learning face forgery detection framework to train a global model collaboratively while keeping data on local devices and achieves competitive performance compared with existing methods that are trained with centralized data, with higher-level security and privacy guarantee.
Abstract: The spread of face forgery videos is a serious threat to information credibility, calling for effective detection algorithms to identify them. Most existing methods have assumed a shared or centralized training set. However, in practice, data may be distributed on devices of different enterprises that cannot be centralized to share due to security and privacy restrictions. In this article, we propose a Federated Learning face forgery detection framework to train a global model collaboratively while keeping data on local devices. In order to make the detection model more robust, we propose a novel Inconsistency-Capture module (ICM) to capture the dynamic inconsistencies between adjacent frames of face forgery videos. The ICM contains two parallel branches. The first branch takes the whole face of adjacent frames as input to calculate a global inconsistency representation. The second branch focuses only on the inter-frame variation of critical regions to capture the local inconsistency. To the best of our knowledge, this is the first work to apply federated learning to face forgery video detection, which is trained with decentralized data. Extensive experiments show that the proposed framework achieves competitive performance compared with existing methods that are trained with centralized data, with higher-level security and privacy guarantee.

Journal ArticleDOI
TL;DR: A badminton language is introduced to fully describe the process of the shot, and a deep learning model composed of a novel short-term extractor and a long-term encoder is proposed for capturing a shot-by-shot sequence in abadminton rally by framing the problem as predicting a rally result.
Abstract: Identifying significant shots in a rally is important for evaluating players’ performance in badminton matches. While there are several studies that have quantified player performance in other sports, analyzing badminton data has remained untouched. In this article, we introduce a badminton language to fully describe the process of the shot, and we propose a deep-learning model composed of a novel short-term extractor and a long-term encoder for capturing a shot-by-shot sequence in a badminton rally by framing the problem as predicting a rally result. Our model incorporates an attention mechanism to enable the transparency between the action sequence and the rally result, which is essential for badminton experts to gain interpretable predictions. Experimental evaluation based on a real-world dataset demonstrates that our proposed model outperforms the strong baselines. We also conducted case studies to show the ability to enhance players’ decision-making confidence and to provide advanced insights for coaching, which benefits the badminton analysis community and bridges the gap between the field of badminton and computer science.

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed an adaptive quantized gradient (AQG) to adaptively adjust the quantization level based on a local gradient update to fully utilize the heterogeneity of local data distribution for reducing unnecessary transmissions.
Abstract: Federated learning (FL) has attracted tremendous attentions in recent years due to its privacy-preserving measures and great potential in some distributed but privacy-sensitive applications, such as finance and health. However, high communication overloads for transmitting high-dimensional networks and extra security masks remain a bottleneck of FL. This article proposes a communication-efficient FL framework with an Adaptive Quantized Gradient (AQG), which adaptively adjusts the quantization level based on a local gradient’s update to fully utilize the heterogeneity of local data distribution for reducing unnecessary transmissions. In addition, client dropout issues are taken into account and an Augmented AQG is developed, which could limit the dropout noise with an appropriate amplification mechanism for transmitted gradients. Theoretical analysis and experiment results show that the proposed AQG leads to 18% to 50% of additional transmission reduction as compared with existing popular methods, including Quantized Gradient Descent (QGD) and Lazily Aggregated Quantized (LAQ) gradient-based methods without deteriorating convergence properties. Experiments with heterogenous data distributions corroborate a more significant transmission reduction compared with independent identical data distributions. The proposed AQG is robust to a client dropping rate up to 90% empirically, and the Augmented AQG manages to further improve the FL system’s communication efficiency with the presence of moderate-scale client dropouts commonly seen in practical FL scenarios.

Journal ArticleDOI
TL;DR: A deep reinforcement learning framework is proposed to tackle the dynamic pricing problem for ride-hailing platforms and demonstrates that the proposed method outperforms the baselines in terms of order response rate and total revenue.
Abstract: Dynamic pricing plays an important role in solving the problems such as traffic load reduction, congestion control, and revenue improvement. Efficient dynamic pricing strategies can increase capacity utilization, total revenue of service providers, and the satisfaction of both passengers and drivers. Many proposed dynamic pricing technologies focus on short-term optimization and face poor scalability in modeling long-term goals for the limitations of solution optimality and prohibitive computation. In this article, a deep reinforcement learning framework is proposed to tackle the dynamic pricing problem for ride-hailing platforms. A soft actor-critic (SAC) algorithm is adopted in the reinforcement learning framework. First, the dynamic pricing problem is translated into a Markov Decision Process (MDP) and is set up in continuous action spaces, which is no need for the discretization of action space. Then, a new reward function is obtained by the order response rate and the KL-divergence between supply distribution and demand distribution. Experiments and case studies demonstrate that the proposed method outperforms the baselines in terms of order response rate and total revenue.

Journal ArticleDOI
Fuxian Li, Jie Feng, Huan Yan, Depeng Jin, Yong Li 
TL;DR: A novel model to tackle the flow prediction task for irregular regions based on dynamic node attribute embedding and multi-view graph reconstruction and proposes a location-aware and time-aware graph attention mechanism named Semantic Graph Attention Network (Semantic-GAT).
Abstract: It is essential to predict crowd flow precisely in a city, which is practically partitioned into irregular regions based on road networks and functionality. However, prior works mainly focus on grid-based crowd flow prediction, where a city is divided into many regular grids. Although Convolutional Neural Netwok (CNN) is powerful to capture spatial dependence from grid-based Euclidean data, it fails to tackle non-Euclidean data, which reflect the correlations among irregular regions. Besides, prior works fail to jointly capture the hierarchical spatio-temporal dependence from both regular and irregular regions. Finally, the correlations among regions are time-varying and functionality-related. However, the combination of dynamic and semantic attributes of regions are ignored by related works. To address the above challenges, in this article, we propose a novel model to tackle the flow prediction task for irregular regions. First, we employ CNN and Graph Neural Network (GNN) to capture micro and macro spatial dependence among grid-based regions and irregular regions, respectively. Further, we think highly of the dynamic inter-region correlations and propose a location-aware and time-aware graph attention mechanism named Semantic Graph Attention Network (Semantic-GAT), based on dynamic node attribute embedding and multi-view graph reconstruction. Extensive experimental results based on two real-life datasets demonstrate that our model outperforms 10 baselines by reducing the prediction error around 8%.

Journal ArticleDOI
TL;DR: In this article, the timely prediction of passenger demands in different time scales is addressed for ride-hailing services, as they provide huge convenience for passengers, but the prediction of demand is difficult.
Abstract: In recent years, ride-hailing services have been increasingly prevalent, as they provide huge convenience for passengers. As a fundamental problem, the timely prediction of passenger demands in dif...

Journal ArticleDOI
TL;DR: This article proposes a tailor-made framework called CrimeTensor to predict the number of crime incidents belonging to different categories within each target region via tensor learning with spatiotemporal consistency and develops an enhanced framework which takes a set of pre-selected regions to conduct prediction so as to further improve the computational efficiency of the optimization algorithm.
Abstract: Crime poses a major threat to human life and property, which has been recognized as one of the most crucial problems in our society. Predicting the number of crime incidents in each region of a city before they happen is of great importance to fight against crime. There has been a great deal of research focused on crime prediction, ranging from introducing diversified data sources to exploring various prediction models. However, most of the existing approaches fail to offer fine-scale prediction results and take little notice of the intricate spatial-temporal-categorical correlations contained in crime incidents. In this article, we propose a tailor-made framework called CrimeTensor to predict the number of crime incidents belonging to different categories within each target region via tensor learning with spatiotemporal consistency. In particular, we model the crime data as a tensor and present an objective function which tries to take full advantage of the spatial, temporal, and categorical correlations contained in crime incidents. Moreover, a well-designed optimization algorithm which transforms the objective into a compact form and then applies CP decomposition to find the optimal solution is elaborated to solve the objective function. Furthermore, we develop an enhanced framework which takes a set of pre-selected regions to conduct prediction so as to further improve the computational efficiency of the optimization algorithm. Finally, extensive experiments are performed on both proprietary and public datasets and our framework significantly outperforms all the baselines in terms of each evaluation metric.


Journal ArticleDOI
TL;DR: A unified WSVOS model by adopting a two-branch architecture with a multi-level cross-br branch fusion strategy, named as dual-attention cross-BRanch fusion network (DACF-Net), which demonstrates the effectiveness of proposed approach compared with the state of thearts on two challenging datasets.
Abstract: Recently, concerning the challenge of collecting large-scale explicitly annotated videos, weakly supervised video object segmentation (WSVOS) using video tags has attracted much attention. Existing WSVOS approaches follow a general pipeline including two phases, i.e., a pseudo masks generation phase and a refinement phase. To explore the intrinsic property and correlation buried in the video frames, most of them focus on the later phase by introducing optical flow as temporal information to provide more supervision. However, these optical flow-based studies are greatly affected by illumination and distortion and lack consideration of the discriminative capacity of multi-level deep features. In this article, with the goal of capturing more effective temporal information and investigating a temporal information fusion strategy accordingly, we propose a unified WSVOS model by adopting a two-branch architecture with a multi-level cross-branch fusion strategy, named as dual-attention cross-branch fusion network (DACF-Net). Concretely, the two branches of DACF-Net, i.e., a temporal prediction subnetwork (TPN) and a spatial segmentation subnetwork (SSN), are used for extracting temporal information and generating predicted segmentation masks, respectively. To perform the cross-branch fusion between TPN and SSN, we propose a dual-attention fusion module that can be plugged into the SSN flexibly. We also pose a cross-frame coherence loss (CFCL) to achieve smooth segmentation results by exploiting the coherence of masks produced by TPN and SSN. Extensive experiments demonstrate the effectiveness of proposed approach compared with the state-of-the-arts on two challenging datasets, i.e., Davis-2016 and YouTube-Objects.