
Showing papers on "Task (computing) published in 2020"


Proceedings Article
28 May 2020
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
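Code sketch (illustrative, not from the paper): GPT-3 receives tasks and few-shot demonstrations purely as text, so a prompt can be assembled as below. The Q:/A: layout, the task description wording, and the build_few_shot_prompt helper are assumptions made for illustration only.

def build_few_shot_prompt(description, demos, query):
    """Assemble a few-shot prompt from (input, output) demonstration pairs;
    the Q:/A: layout is only one of many possible text formats."""
    lines = [description, ""]
    for source, target in demos:
        lines += [f"Q: {source}", f"A: {target}", ""]
    lines += [f"Q: {query}", "A:"]        # the model continues the text from here
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Unscramble the letters to form an English word.",
    demos=[("tca", "cat"), ("dgo", "dog"), ("esuoh", "house")],
    query="rdib",
)
print(prompt)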

10,132 citations


Posted Content
TL;DR: This article showed that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

1,886 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: This paper investigates the relationships between vision-and-language tasks by developing a large-scale, multi-task model, culminating in a single model trained on 12 datasets from four broad categories of task: visual question answering, caption-based image retrieval, grounding referring expressions, and multimodal verification.
Abstract: Much of vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets often studied in isolation; however, the visually-grounded language understanding skills required for success at these tasks overlap significantly. In this work, we investigate these relationships between vision-and-language tasks by developing a large-scale, multi-task model. Our approach culminates in a single model on 12 datasets from four broad categories of task including visual question answering, caption-based image retrieval, grounding referring expressions, and multimodal verification. Compared to independently trained single-task models, this represents a reduction from approximately 3 billion parameters to 270 million while simultaneously improving performance by 2.05 points on average across tasks. We use our multi-task framework to perform in-depth analysis of the effect of joint training diverse tasks. Further, we show that finetuning task-specific models from our single multi-task model can lead to further improvements, achieving performance at or above the state-of-the-art.

267 citations


Posted Content
TL;DR: This work proposes AdapterFusion, a new two-stage learning algorithm that leverages knowledge from multiple tasks by separating knowledge extraction from knowledge composition, so that the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner.
Abstract: Sequential fine-tuning and multi-task learning are methods aiming to incorporate knowledge from multiple tasks; however, they suffer from catastrophic forgetting and difficulties in dataset balancing. To address these shortcomings, we propose AdapterFusion, a new two stage learning algorithm that leverages knowledge from multiple tasks. First, in the knowledge extraction stage we learn task specific parameters called adapters, that encapsulate the task-specific information. We then combine the adapters in a separate knowledge composition step. We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner. We empirically evaluate AdapterFusion on 16 diverse NLU tasks, and find that it effectively combines various types of knowledge at different layers of the model. We show that our approach outperforms traditional strategies such as full fine-tuning as well as multi-task learning. Our code and adapters are available at this http URL.
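Code sketch (a minimal PyTorch sketch of the two-stage idea, not the authors' implementation): stage one trains a small bottleneck adapter per task; stage two learns an attention-style fusion over the adapters' outputs. The class names, hidden sizes, and the omission of a separate value projection are assumptions.

import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Stage 1 (knowledge extraction): a small bottleneck module per task."""
    def __init__(self, hidden=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))  # residual adapter

class AdapterFusionLayer(nn.Module):
    """Stage 2 (knowledge composition): attention over the adapters' outputs."""
    def __init__(self, adapters, hidden=768):
        super().__init__()
        self.adapters = nn.ModuleList(adapters)
        self.query = nn.Linear(hidden, hidden)
        self.key = nn.Linear(hidden, hidden)

    def forward(self, h):                                           # h: (batch, seq, hidden)
        outs = torch.stack([a(h) for a in self.adapters], dim=2)    # (b, s, n, d)
        q = self.query(h).unsqueeze(2)                              # (b, s, 1, d)
        attn = (q * self.key(outs)).sum(-1).softmax(dim=-1)         # (b, s, n)
        return (attn.unsqueeze(-1) * outs).sum(dim=2)               # (b, s, d)

fusion = AdapterFusionLayer([Adapter() for _ in range(3)])
print(fusion(torch.randn(2, 10, 768)).shape)  # torch.Size([2, 10, 768])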

265 citations


Proceedings Article
20 May 2020
TL;DR: This work identifies two key properties related to the contrastive loss: alignment (closeness) of features from positive pairs, and uniformity of the induced distribution of the normalized features on the hypersphere.
Abstract: Contrastive representation learning has been outstandingly successful in practice. In this work, we identify two key properties related to the contrastive loss: (1) alignment (closeness) of features from positive pairs, and (2) uniformity of the induced distribution of the (normalized) features on the hypersphere. We prove that, asymptotically, the contrastive loss optimizes these properties, and analyze their positive effects on downstream tasks. Empirically, we introduce an optimizable metric to quantify each property. Extensive experiments on standard vision and language datasets confirm the strong agreement between both metrics and downstream task performance. Remarkably, directly optimizing for these two metrics leads to representations with comparable or better performance at downstream tasks than contrastive learning. Project Page: https://ssnl.github.io/hypersphere Code: https://github.com/SsnL/align_uniform
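Code sketch (assumed, following the two properties described in the abstract): alignment measured as the mean distance between features of positive pairs, and uniformity as the log of the average pairwise Gaussian potential on the hypersphere. The constants alpha=2 and t=2 and the exact normalization are assumptions and may differ from the released code.

import numpy as np

def align_metric(x, y, alpha=2):
    """Alignment: mean distance between features of positive pairs (x[i], y[i])."""
    return float(np.mean(np.linalg.norm(x - y, axis=1) ** alpha))

def uniform_metric(x, t=2):
    """Uniformity: log of the average pairwise Gaussian potential on the hypersphere."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(x), k=1)              # distinct pairs only
    return float(np.log(np.mean(np.exp(-t * sq_dists[iu]))))

# features are assumed L2-normalized, i.e. they lie on the unit hypersphere
feats = np.random.randn(128, 64)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
pos = feats + 0.05 * np.random.randn(*feats.shape)    # noisy positives
pos /= np.linalg.norm(pos, axis=1, keepdims=True)
print(align_metric(feats, pos), uniform_metric(feats))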

179 citations


Proceedings ArticleDOI
Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song
14 Jun 2020
TL;DR: SpineNet is proposed, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search, and can transfer to classification tasks, achieving 5% top-1 accuracy improvement on a challenging iNaturalist fine-grained dataset.
Abstract: Convolutional neural networks typically encode an input image into a series of intermediate features with decreasing resolutions. While this structure is suited to classification tasks, it does not perform well for tasks requiring simultaneous recognition and localization (e.g., object detection). The encoder-decoder architectures are proposed to resolve this by applying a decoder network onto a backbone model designed for classification tasks. In this paper, we argue encoder-decoder architecture is ineffective in generating strong multi-scale features because of the scale-decreased backbone. We propose SpineNet, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search. Using similar building blocks, SpineNet models outperform ResNet-FPN models by 3%+ AP at various scales while using 10-20% fewer FLOPs. In particular, SpineNet-190 achieves 52.1% AP on COCO, attaining the new state-of-the-art performance for single model object detection without test-time augmentation. SpineNet can transfer to classification tasks, achieving 5% top-1 accuracy improvement on a challenging iNaturalist fine-grained dataset. Code is at: https://github.com/tensorflow/tpu/tree/master/models/official/detection.

149 citations


Proceedings ArticleDOI
15 Apr 2020
TL;DR: Google's Autopilot automatically configures job resources (horizontal and vertical scaling), cutting slack to 23% versus 46% for manually-managed jobs; ensuring it was widely adopted took significant effort, including making potential recommendations easily visible to customers who had yet to opt in, automatically migrating certain categories of jobs, and adding support for custom recommenders.
Abstract: In many public and private Cloud systems, users need to specify a limit for the amount of resources (CPU cores and RAM) to provision for their workloads. A job that exceeds its limits might be throttled or killed, resulting in delaying or dropping end-user requests, so human operators naturally err on the side of caution and request a larger limit than the job needs. At scale, this results in massive aggregate resource wastage. To address this, Google uses Autopilot to configure resources automatically, adjusting both the number of concurrent tasks in a job (horizontal scaling) and the CPU/memory limits for individual tasks (vertical scaling). Autopilot walks the same fine line as human operators: its primary goal is to reduce slack - the difference between the limit and the actual resource usage - while minimizing the risk that a task is killed with an out-of-memory (OOM) error or its performance degraded because of CPU throttling. Autopilot uses machine learning algorithms applied to historical data about prior executions of a job, plus a set of finely-tuned heuristics, to walk this line. In practice, Autopiloted jobs have a slack of just 23%, compared with 46% for manually-managed jobs. Additionally, Autopilot reduces the number of jobs severely impacted by OOMs by a factor of 10. Despite its advantages, ensuring that Autopilot was widely adopted took significant effort, including making potential recommendations easily visible to customers who had yet to opt in, automatically migrating certain categories of jobs, and adding support for custom recommenders. At the time of writing, Autopiloted jobs account for over 48% of Google's fleet-wide resource usage.
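Code sketch (illustrative only, not Google's algorithm): the abstract defines slack as the gap between the configured limit and actual usage. The snippet below computes a relative slack and uses a naive percentile-plus-margin limit recommender as a stand-in for Autopilot's machine-learned models and heuristics; all numbers are made up.

import numpy as np

def recommend_limit(usage_history, percentile=98, margin=1.15):
    """Naive vertical-scaling heuristic: a high percentile of recent usage plus
    a safety margin against OOMs/throttling (illustrative, not Autopilot's logic)."""
    return float(np.percentile(usage_history, percentile)) * margin

def mean_relative_slack(limit, usage_history):
    """Slack: the gap between the configured limit and actual usage, here
    expressed relative to the limit and averaged over the usage samples."""
    usage = np.asarray(usage_history, dtype=float)
    return float(np.mean((limit - usage) / limit))

usage = np.random.gamma(shape=4.0, scale=0.5, size=1000)   # fake memory usage samples, GiB
manual_limit = 8.0                                          # hand-picked, cautious limit
auto_limit = recommend_limit(usage)
print(f"manual slack: {mean_relative_slack(manual_limit, usage):.0%}, "
      f"auto slack: {mean_relative_slack(auto_limit, usage):.0%}")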

136 citations


Journal ArticleDOI
18 Feb 2020
TL;DR: RLBench is a challenging new benchmark and learning environment for robot learning, featuring 100 unique, hand-designed tasks ranging from simple target reaching and door opening to longer multi-stage tasks such as opening an oven and placing a tray in it, and it proposes the first large-scale few-shot challenge in robotics.
Abstract: We present a challenging new benchmark and learning-environment for robot learning: RLBench. The benchmark features 100 completely unique, hand-designed tasks, ranging in difficulty from simple target reaching and door opening to longer multi-stage tasks, such as opening an oven and placing a tray in it. We provide an array of both proprioceptive observations and visual observations, which include rgb, depth, and segmentation masks from an over-the-shoulder stereo camera and an eye-in-hand monocular camera. Uniquely, each task comes with an infinite supply of demos through the use of motion planners operating on a series of waypoints given during task creation time; enabling an exciting flurry of demonstration-based learning possibilities. RLBench has been designed with scalability in mind; new tasks, along with their motion-planned demos, can be easily created and then verified by a series of tools, allowing users to submit their own tasks to the RLBench task repository. This large-scale benchmark aims to accelerate progress in a number of vision-guided manipulation research areas, including: reinforcement learning, imitation learning, multi-task learning, geometric computer vision, and in particular, few-shot learning. With the benchmark's breadth of tasks and demonstrations, we propose the first large-scale few-shot challenge in robotics. We hope that the scale and diversity of RLBench offers unparalleled research opportunities in the robot learning community and beyond. Benchmarking code and videos can be found at https://sites.google.com/view/rlbench .

124 citations


Posted Content
Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
TL;DR: This paper presents the first pre-training and fine-tuning paradigm for vision-and-language navigation (VLN) tasks, which leads to significant improvement over existing methods, achieving a new state of the art.
Abstract: Learning to navigate in a visual environment following natural-language instructions is a challenging task, because the multimodal inputs to the agent are highly variable, and the training data on a new task is often limited. In this paper, we present the first pre-training and fine-tuning paradigm for vision-and-language navigation (VLN) tasks. By training on a large amount of image-text-action triplets in a self-supervised learning manner, the pre-trained model provides generic representations of visual environments and language instructions. It can be easily used as a drop-in for existing VLN frameworks, leading to the proposed agent called Prevalent. It learns more effectively in new tasks and generalizes better in a previously unseen environment. The performance is validated on three VLN tasks. On the Room-to-Room benchmark, our model improves the state-of-the-art from 47% to 51% on success rate weighted by path length. Further, the learned representation is transferable to other VLN tasks. On two recent tasks, vision-and-dialog navigation and "Help, Anna!" the proposed Prevalent leads to significant improvement over existing methods, achieving a new state of the art.

120 citations


Journal ArticleDOI
05 Sep 2020
TL;DR: A dynamic scheduling policy based on Deep Reinforcement Learning (DRL) with the Deep Deterministic Policy Gradient (DDPG) method is proposed to solve the problem of dynamic caching, computation offloading, and resource allocation in cache-assisted multi-user MEC systems with stochastic task arrivals.
Abstract: Mobile Edge Computing (MEC) is one of the most promising techniques for next-generation wireless communication systems. In this paper, we study the problem of dynamic caching, computation offloading, and resource allocation in cache-assisted multi-user MEC systems with stochastic task arrivals. There are multiple computationally intensive tasks in the system, and each Mobile User (MU) needs to execute a task either locally or remotely in one or more MEC servers by offloading the task data. Popular tasks can be cached in MEC servers to avoid duplicates in offloading. The cached contents can be either obtained through user offloading, fetched from a remote cloud, or fetched from another MEC server. The objective is to minimize the long-term average of a cost function, which is defined as a weighted sum of energy consumption, delay, and cache contents' fetching costs. The weighting coefficients associated with the different metrics in the objective function can be adjusted to balance the tradeoff among them. The optimum design is performed with respect to four decision parameters: whether to cache a given task, whether to offload a given uncached task, how much transmission power should be used during offloading, and how much MEC resources to be allocated for executing a task. We propose to solve the problems by developing a dynamic scheduling policy based on Deep Reinforcement Learning (DRL) with the Deep Deterministic Policy Gradient (DDPG) method. A new decentralized DDPG algorithm is developed to obtain the optimum designs for multi-cell MEC systems by leveraging on the cooperations among neighboring MEC servers. Simulation results demonstrate that the proposed algorithm outperforms other existing strategies, such as Deep Q-Network (DQN).
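Code sketch (assumed): the objective is the long-term average of a weighted sum of energy consumption, delay, and cache-fetching cost. The per-step cost below uses placeholder weights to illustrate how the tradeoff between the metrics is adjusted; none of the values come from the paper.

def step_cost(energy_j, delay_s, fetch_cost, w_energy=0.4, w_delay=0.4, w_fetch=0.2):
    """Weighted sum of energy consumption, delay, and cache-fetching cost; the
    weights are tunable to balance the tradeoff (values here are placeholders)."""
    return w_energy * energy_j + w_delay * delay_s + w_fetch * fetch_cost

# e.g. offloading a task whose data is already cached at the MEC server:
# no fetching cost and lower local energy consumption
print(step_cost(energy_j=0.8, delay_s=0.12, fetch_cost=0.0))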

117 citations


Journal ArticleDOI
TL;DR: This paper proposes an online Rewards-optimal Auction (RoA) to optimize the long-term sum-of-rewards for processing offloaded tasks while adapting to the highly dynamic energy harvesting process and computation task arrivals.
Abstract: Utilizing the intelligence at the network edge, edge computing paradigm emerges to provide time-sensitive computing services for Internet of Things. In this paper, we investigate sustainable computation offloading in an edge-computing system that consists of energy harvesting-enabled mobile devices (MDs) and a dispatcher. The dispatcher collects computation tasks generated by IoT devices with limited computation power, and offloads them to resourceful MDs in exchange for rewards. We propose an online Rewards-optimal Auction (RoA) to optimize the long-term sum-of-rewards for processing offloaded tasks, meanwhile adapting to the highly dynamic energy harvesting (EH) process and computation task arrivals. RoA is designed based on Lyapunov optimization and Vickrey-Clarke-Groves auction, the operation of which does not require a prior knowledge of the energy harvesting, task arrivals, or wireless channel statistics. Our analytical results confirm the optimality of tasks assignment. Furthermore, simulation results validate the analytical analysis, and verify the efficacy of the proposed RoA.

Posted Content
TL;DR: A novel method for lane detection is presented that takes as input an image from a forward-looking camera mounted in the vehicle and outputs polynomials representing each lane marking in the image via deep polynomial regression; it is shown to be competitive with existing state-of-the-art methods on the TuSimple dataset.
Abstract: One of the main factors that contributed to the large advances in autonomous driving is the advent of deep learning. For safer self-driving vehicles, one of the problems that has yet to be solved completely is lane detection. Since methods for this task have to work in real-time (+30 FPS), they not only have to be effective (i.e., have high accuracy) but they also have to be efficient (i.e., fast). In this work, we present a novel method for lane detection that uses as input an image from a forward-looking camera mounted in the vehicle and outputs polynomials representing each lane marking in the image, via deep polynomial regression. The proposed method is shown to be competitive with existing state-of-the-art methods in the TuSimple dataset while maintaining its efficiency (115 FPS). Additionally, extensive qualitative results on two additional public datasets are presented, alongside with limitations in the evaluation metrics used by recent works for lane detection. Finally, we provide source code and trained models that allow others to replicate all the results shown in this paper, which is surprisingly rare in state-of-the-art lane detection methods. The full source code and pretrained models are available at this https URL.
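Code sketch (assumed): the network regresses polynomial coefficients per lane marking; the helper below evaluates such coefficients into image-space lane points. The polynomial degree, the x = f(y) parameterization, and the sample coefficients are assumptions, not outputs of the actual model.

import numpy as np

def lane_points(coeffs, y_top, y_bottom, n=20):
    """Evaluate one lane's polynomial x = p(y) at n vertical positions.
    `coeffs` are highest-degree-first, as np.polyval expects."""
    ys = np.linspace(y_top, y_bottom, n)
    xs = np.polyval(coeffs, ys)
    return np.stack([xs, ys], axis=1)          # (n, 2) pixel coordinates

# hypothetical coefficients for one lane marking
pts = lane_points(coeffs=[1e-6, -2e-3, 1.1, 80.0], y_top=250, y_bottom=710)
print(pts[:3])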

Proceedings ArticleDOI
01 Dec 2020
TL;DR: LEOPARD, trained with a state-of-the-art transformer architecture, generalizes better to tasks not seen at all during training than self-supervised pre-training or multi-task training, with as few as 4 examples per label.
Abstract: Pre-trained transformer models have shown enormous success in improving performance on several downstream tasks. However, fine-tuning on a new task still requires large amounts of task-specific labeled data to achieve good performance. We consider this problem of learning to generalize to new tasks, with a few examples, as a meta-learning problem. While meta-learning has shown tremendous progress in recent years, its application is still limited to simulated problems or problems with limited diversity across tasks. We develop a novel method, LEOPARD, which enables optimization-based meta-learning across tasks with a different number of classes, and evaluate different methods on generalization to diverse NLP classification tasks. LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during training, with as few as 4 examples per label. Across 17 NLP tasks, including diverse domains of entity typing, natural language inference, sentiment analysis, and several other text classification tasks, we show that LEOPARD learns better initial parameters for few-shot learning than self-supervised pre-training or multi-task training, outperforming many strong baselines, for example, yielding 14.6% average relative gain in accuracy on unseen tasks with only 4 examples per label.

Journal ArticleDOI
TL;DR: Simulation results show that the proposed model-free deep reinforcement learning-based distributed algorithm can better exploit the processing capacities of the edge nodes and significantly reduce the ratio of dropped tasks and average delay when compared with several existing algorithms.
Abstract: In mobile edge computing systems, an edge node may have a high load when a large number of mobile devices offload their tasks to it. Those offloaded tasks may experience large processing delay or even be dropped when their deadlines expire. Due to the uncertain load dynamics at the edge nodes, it is challenging for each device to determine its offloading decision (i.e., whether to offload or not, and which edge node it should offload its task to) in a decentralized manner. In this work, we consider non-divisible and delay-sensitive tasks as well as edge load dynamics, and formulate a task offloading problem to minimize the expected long-term cost. We propose a model-free deep reinforcement learning-based distributed algorithm, where each device can determine its offloading decision without knowing the task models and offloading decision of other devices. To improve the estimation of the long-term cost in the algorithm, we incorporate the long short-term memory (LSTM), dueling deep Q-network (DQN), and double-DQN techniques. Simulation results show that our proposed algorithm can better exploit the processing capacities of the edge nodes and significantly reduce the ratio of dropped tasks and average delay when compared with several existing algorithms.

Posted Content
TL;DR: This paper proposes a model-free deep reinforcement learning-based distributed algorithm to minimize the expected long-term cost of task offloading in mobile edge computing systems, in which each device determines its offloading decision without knowing the task models or offloading decisions of other devices.
Abstract: In mobile edge computing systems, an edge node may have a high load when a large number of mobile devices offload their tasks to it. Those offloaded tasks may experience large processing delay or even be dropped when their deadlines expire. Due to the uncertain load dynamics at the edge nodes, it is challenging for each device to determine its offloading decision (i.e., whether to offload or not, and which edge node it should offload its task to) in a decentralized manner. In this work, we consider non-divisible and delay-sensitive tasks as well as edge load dynamics, and formulate a task offloading problem to minimize the expected long-term cost. We propose a model-free deep reinforcement learning-based distributed algorithm, where each device can determine its offloading decision without knowing the task models and offloading decision of other devices. To improve the estimation of the long-term cost in the algorithm, we incorporate the long short-term memory (LSTM), dueling deep Q-network (DQN), and double-DQN techniques. Simulation results with 50 mobile devices and five edge nodes show that the proposed algorithm can reduce the ratio of dropped tasks and average task delay by 86.4%-95.4% and 18.0%-30.1%, respectively, when compared with several existing algorithms.
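Code sketch (a minimal PyTorch sketch, not the paper's full algorithm): of the three techniques mentioned (LSTM, dueling DQN, double DQN), the snippet shows only the dueling head, which combines a state-value stream and an advantage stream with a mean-subtracted advantage. The state and action dimensions are assumptions.

import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    """Dueling DQN head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)   # per-action advantage A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=-1, keepdim=True)

# e.g. state = local queue state plus observed edge-node load histories,
# actions = {process locally, offload to edge node 1..N} (hypothetical encoding)
q = DuelingHead(state_dim=32, n_actions=6)(torch.randn(4, 32))
print(q.shape)  # torch.Size([4, 6])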

Proceedings ArticleDOI
TL;DR: This work proposes CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes; models using CC2Vec outperform the state-of-the-art techniques on log message generation, bug-fixing patch identification, and just-in-time defect prediction.
Abstract: Existing work on software patches often use features specific to a single task. These works often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes. CC2Vec models the hierarchical structure of a code change with the help of the attention mechanism and uses multiple comparison functions to identify the differences between the removed and added code. To evaluate if CC2Vec can produce a distributed representation of code changes that is general and useful for multiple tasks on software patches, we use the vectors produced by CC2Vec for three tasks: log message generation, bug fixing patch identification, and just-in-time defect prediction. In all tasks, the models using CC2Vec outperform the state-of-the-art techniques.
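Code sketch (assumed): CC2Vec compares the embedding of removed code against the embedding of added code with multiple comparison functions; the snippet shows a few standard ones (absolute difference, element-wise product, distance, cosine), which may not match the paper's exact set.

import numpy as np

def compare_embeddings(removed, added):
    """A few standard vector comparison features between the embedding of
    removed code and the embedding of added code (illustrative set)."""
    cosine = removed @ added / (np.linalg.norm(removed) * np.linalg.norm(added) + 1e-8)
    return np.concatenate([
        np.abs(removed - added),      # element-wise absolute difference
        removed * added,              # element-wise product
        [np.linalg.norm(removed - added), cosine],
    ])

removed_vec = np.random.randn(64)     # hypothetical embedding of removed lines
added_vec = np.random.randn(64)       # hypothetical embedding of added lines
print(compare_embeddings(removed_vec, added_vec).shape)  # (130,)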

Journal ArticleDOI
TL;DR: This work proposes a unified neural network named DLT-Net to detect drivable areas, lane lines, and traffic objects simultaneously, and constructs context tensors between sub-task decoders to share designated influence among tasks.
Abstract: Perception is an essential task for self-driving cars, but most perception tasks are usually handled independently. We propose a unified neural network named DLT-Net to detect drivable areas, lane lines, and traffic objects simultaneously. These three tasks are most important for autonomous driving, especially when a high-definition map and accurate localization are unavailable. Instead of separating tasks in the decoder, we construct context tensors between sub-task decoders to share designate influence among tasks. Therefore, each task can benefit from others during multi-task learning. Experiments show that our model outperforms the conventional multi-task network in terms of the task-wise accuracy and the overall computational efficiency, in the challenging BDD dataset.

Proceedings ArticleDOI
27 Jun 2020
TL;DR: CC2Vec is a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes; it uses multiple comparison functions to identify the differences between the removed and added code.
Abstract: Existing work on software patches often use features specific to a single task. These works often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes. CC2Vec models the hierarchical structure of a code change with the help of the attention mechanism and uses multiple comparison functions to identify the differences between the removed and added code. To evaluate if CC2Vec can produce a distributed representation of code changes that is general and useful for multiple tasks on software patches, we use the vectors produced by CC2Vec for three tasks: log message generation, bug fixing patch identification, and just-in-time defect prediction. In all tasks, the models using CC2Vec outperform the state-of-the-art techniques.

Journal ArticleDOI
TL;DR: This article introduces a selective integration module that allows each task to dynamically choose features at different levels from the shared backbone based on its own characteristics, and designs a task-adaptive attention module aimed at intelligently allocating information for different tasks according to the image content priors.
Abstract: Salient object segmentation, edge detection, and skeleton extraction are three contrasting low-level pixel-wise vision problems, where existing works mostly focused on designing tailored methods for each individual task. However, it is inconvenient and inefficient to store a pre-trained model for each task and perform multiple different tasks in sequence. There are methods that solve specific related tasks jointly but require datasets with different types of annotations supported at the same time. In this article, we first show some similarities shared by these tasks and then demonstrate how they can be leveraged for developing a unified framework that can be trained end-to-end. In particular, we introduce a selective integration module that allows each task to dynamically choose features at different levels from the shared backbone based on its own characteristics. Furthermore, we design a task-adaptive attention module, aiming at intelligently allocating information for different tasks according to the image content priors. To evaluate the performance of our proposed network on these tasks, we conduct exhaustive experiments on multiple representative datasets. We will show that though these tasks are naturally quite different, our network can work well on all of them and even perform better than current single-purpose state-of-the-art methods. In addition, we also conduct adequate ablation analyses that provide a full understanding of the design principles of the proposed framework.

Proceedings Article
30 Mar 2020
TL;DR: This work introduces an explicit modularization technique on policy representation to alleviate the optimization issues of multi-task reinforcement learning (unclear parameter sharing and interfering gradients), and designs a routing network that estimates different routing strategies to reconfigure the base policy network for each task.
Abstract: Multi-task learning is a very challenging problem in reinforcement learning. While training multiple tasks jointly allow the policies to share parameters across different tasks, the optimization problem becomes non-trivial: It remains unclear what parameters in the network should be reused across tasks, and how the gradients from different tasks may interfere with each other. Thus, instead of naively sharing parameters across tasks, we introduce an explicit modularization technique on policy representation to alleviate this optimization issue. Given a base policy network, we design a routing network which estimates different routing strategies to reconfigure the base network for each task. Instead of directly selecting routes for each task, our task-specific policy uses a method called soft modularization to softly combine all the possible routes, which makes it suitable for sequential tasks. We experiment with various robotics manipulation tasks in simulation and show our method improves both sample efficiency and performance over strong baselines by a large margin.
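Code sketch (a minimal PyTorch sketch, not the authors' implementation): a routing network maps a task embedding to soft weights that recombine the modules of one base-policy layer, so all routes are softly combined rather than one being selected. Module counts, sizes, and the task-embedding input are assumptions.

import torch
import torch.nn as nn

class SoftModularLayer(nn.Module):
    """One layer of a modular base policy: the routing network turns a task
    embedding into soft weights over modules, and the layer output is the
    weighted sum of all module outputs (instead of picking a single route)."""
    def __init__(self, in_dim, out_dim, n_modules=4, task_dim=16):
        super().__init__()
        self.blocks = nn.ModuleList([nn.Linear(in_dim, out_dim) for _ in range(n_modules)])
        self.router = nn.Linear(task_dim, n_modules)   # simplified routing network

    def forward(self, x, task_emb):
        weights = self.router(task_emb).softmax(dim=-1)             # (batch, n_modules)
        outs = torch.stack([blk(x) for blk in self.blocks], dim=1)  # (batch, n_modules, out_dim)
        return torch.relu((weights.unsqueeze(-1) * outs).sum(dim=1))

layer = SoftModularLayer(in_dim=32, out_dim=64)
y = layer(torch.randn(8, 32), task_emb=torch.randn(8, 16))
print(y.shape)  # torch.Size([8, 64])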

Proceedings ArticleDOI
25 Jul 2020
TL;DR: This paper develops a parameter-efficient transfer learning architecture, termed as PeterRec, which can be configured on-the-fly to various downstream tasks, and shows that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance relative to fine-tuning the entire model parameters.
Abstract: Inductive transfer learning has had a big impact on computer vision and NLP domains but has not been used in the area of recommender systems. Even though there has been a large body of research on generating recommendations based on modeling user-item interaction sequences, few of them attempt to represent and transfer these models for serving downstream tasks where only limited data exists. In this paper, we delve on the task of effectively learning a single user representation that can be applied to a diversity of tasks, from cross-domain recommendations to user profile predictions. Fine-tuning a large pre-trained network and adapting it to downstream tasks is an effective way to solve such tasks. However, fine-tuning is parameter inefficient considering that an entire model needs to be re-trained for every new task. To overcome this issue, we develop a parameter-efficient transfer learning architecture, termed as PeterRec, which can be configured on-the-fly to various downstream tasks. Specifically, PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks, which are small but as expressive as learning the entire network. We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks. Moreover, we show that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance relative to fine-tuning the entire model parameters. Codes and datasets are available at https://github.com/fajieyuan/sigir2020_peterrec.
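Code sketch (illustrative of the pattern, not PeterRec itself): the pre-trained parameters remain unaltered while only small injected residual networks and a new task head are trained. The toy backbone, patch placement, and sizes are assumptions.

import torch
import torch.nn as nn

class ModelPatch(nn.Module):
    """Small injected network; only these parameters (and the task head) are trained."""
    def __init__(self, dim, bottleneck=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU(),
                                 nn.Linear(bottleneck, dim))

    def forward(self, h):
        return h + self.net(h)                         # residual patch

backbone = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))  # toy stand-in for the pre-trained network
for p in backbone.parameters():
    p.requires_grad = False                            # pre-trained parameters stay unaltered

patch, head = ModelPatch(64), nn.Linear(64, 10)        # 10 = classes of one downstream task
trainable = list(patch.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(trainable, lr=1e-3)       # only the patch and head are optimized

logits = head(patch(backbone(torch.randn(32, 64))))
print(logits.shape, sum(p.numel() for p in trainable), "trainable parameters")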

Journal ArticleDOI
TL;DR: Experimental results show that in the proposed CIBV model, where each task is performed by multiple participants and each participant can perform multiple tasks, the model greatly improves the participants' execution-ability value and provides the platform with the ability to control the process of selecting participants.

Journal ArticleDOI
TL;DR: Simulation results show that the proposed framework maximizes the aggregated quality of information, reduces the budget and response time to perform a task and increases the average recommenders’ reputation and their payment.
Abstract: Thanks to the capabilities of the built-in sensors of smart devices, mobile crowd-sensing (MCS) has become a promising technique for massive data collection. In this paradigm, the service provider recruits workers (i.e., common people with smart devices) to perform sensing tasks requested by the consumers. To efficiently handle workers’ recruitment and task allocation, several factors have to be considered such as the quality of the sensed data that the workers can deliver and the different tasks locations. This allocation becomes even more challenging when the MCS tries to efficiently allocate multiple tasks under limited budget, time constraints, and the uncertainty that selected workers will not be able to perform the tasks. In this paper, we propose a service computing framework for time constrained-task allocation in location based crowd-sensing systems. This framework relies on (1) a recruitment algorithm that implements a multi-objective task allocation algorithm based on Particle Swarm Optimization, (2) queuing schemes to handle efficiently the incoming sensing tasks in the server side and at the end-user side, (3) a task delegation mechanism to avoid delaying or declining the sensing requests due to unforeseen user context, and (4) a reputation management component to manage the reputation of users based on their sensing activities and task delegation. The platform goal is to efficiently determine the most appropriate set of workers to assign to each incoming task so that high quality results are returned within the requested response time. Simulations are conducted using real datasets from Foursquare and the Enron email social network. Simulation results show that the proposed framework maximizes the aggregated quality of information, reduces the budget and response time to perform a task and increases the average recommenders’ reputation and their payment.
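Code sketch (generic, not the paper's recruitment algorithm): the multi-objective task allocation is based on Particle Swarm Optimization; the snippet shows a standard PSO velocity/position update, without the paper's worker-allocation encoding or objectives.

import numpy as np

def pso_step(positions, velocities, personal_best, global_best, w=0.7, c1=1.5, c2=1.5):
    """One standard PSO update: inertia plus attraction toward each particle's
    personal best and the swarm's global best (encoding/objective not modeled)."""
    r1 = np.random.rand(*positions.shape)
    r2 = np.random.rand(*positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (personal_best - positions)
                  + c2 * r2 * (global_best - positions))
    return positions + velocities, velocities

# toy usage: 20 particles searching a 5-dimensional space
pos = np.random.rand(20, 5)
vel = np.zeros((20, 5))
pos, vel = pso_step(pos, vel, personal_best=pos.copy(), global_best=pos[0])
print(pos.shape)  # (20, 5)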

Posted Content
TL;DR: This paper proposes a parameter-efficient transfer learning architecture, termed PeterRec, which can be configured on-the-fly to various downstream tasks, from cross-domain recommendations to user profile predictions.
Abstract: Inductive transfer learning has had a big impact on computer vision and NLP domains but has not been used in the area of recommender systems. Even though there has been a large body of research on generating recommendations based on modeling user-item interaction sequences, few of them attempt to represent and transfer these models for serving downstream tasks where only limited data exists. In this paper, we delve on the task of effectively learning a single user representation that can be applied to a diversity of tasks, from cross-domain recommendations to user profile predictions. Fine-tuning a large pre-trained network and adapting it to downstream tasks is an effective way to solve such tasks. However, fine-tuning is parameter inefficient considering that an entire model needs to be re-trained for every new task. To overcome this issue, we develop a parameter efficient transfer learning architecture, termed as PeterRec, which can be configured on-the-fly to various downstream tasks. Specifically, PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks, which are small but as expressive as learning the entire network. We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks. Moreover, we show that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance relative to fine-tuning the entire model parameters. Codes and datasets are available at this https URL.

Proceedings ArticleDOI
06 Jul 2020
TL;DR: This paper formally defines the problem of offloading dependent tasks with service caching (ODT-SC), proves that there exists no algorithm with constant approximation for this hard problem, and designs an efficient convex programming based algorithm (CP) to solve it.
Abstract: In Mobile Edge Computing (MEC), many tasks require specific service support for execution and in addition, have a dependent order of execution among the tasks. However, previous works often ignore the impact of having limited services cached at the edge nodes on (dependent) task offloading, thus may lead to an infeasible offloading decision or a longer completion time. To bridge the gap, this paper studies how to efficiently offload dependent tasks to edge nodes with limited (and predetermined) service caching. We formally define the problem of offloading dependent tasks with service caching (ODT-SC), and prove that there exists no algorithm with constant approximation for this hard problem. Then, we design an efficient convex programming based algorithm (CP) to solve this problem. Moreover, we study a special case with a homogeneous MEC and propose a favorite successor based algorithm (FS) to solve this special case with a competitive ratio of O(1). Extensive simulation results using Google data traces show that our proposed algorithms can significantly reduce applications’ completion time by about 27-51% compared with other alternatives.

Journal ArticleDOI
TL;DR: This paper proposes a unified framework for salient object segmentation, edge detection, and skeleton extraction that allows each task to dynamically choose features at different levels from the shared backbone based on its own characteristics, and designs a task-adaptive attention module to intelligently allocate information for different tasks according to the image content priors.
Abstract: In this paper, we solve three low-level pixel-wise vision problems, including salient object segmentation, edge detection, and skeleton extraction, within a unified framework. We first show some similarities shared by these tasks and then demonstrate how they can be leveraged for developing a unified framework that can be trained end-to-end. In particular, we introduce a selective integration module that allows each task to dynamically choose features at different levels from the shared backbone based on its own characteristics. Furthermore, we design a task-adaptive attention module, aiming at intelligently allocating information for different tasks according to the image content priors. To evaluate the performance of our proposed network on these tasks, we conduct exhaustive experiments on multiple representative datasets. We will show that though these tasks are naturally quite different, our network can work well on all of them and even perform better than current single-purpose state-of-the-art methods. In addition, we also conduct adequate ablation analyses that provide a full understanding of the design principles of the proposed framework. To facilitate future research, source code will be released.

Journal ArticleDOI
TL;DR: An improved firework algorithm (I-FA), which introduces an explosion radius detection mechanism, is proposed to reduce task processing time and ensure better overall load balancing of fog devices in a cloud-fog computing system.
Abstract: As an emerging computing model close to the end-user, fog computing can move tasks from the cloud to fog devices for processing, making up for the shortcomings of cloud computing in mobility support, delay, and location awareness. Because fog devices have limited computing resources and weak processing ability, lightweight computing is the primary choice, and stricter requirements are imposed on time and load. Therefore, for the cloud-fog computing architecture, a new task scheduling method (I-FASC) is proposed that targets the characteristics of tasks and resources, including an improved firework algorithm (I-FA) that introduces an explosion radius detection mechanism. Two sets of experiments show that this method reduces the processing time of tasks and ensures better overall load balancing of fog devices in a cloud-fog computing system.

Journal ArticleDOI
TL;DR: The statistical analysis of experimental results illustrates the superiority of the proposed MFEA/D-DRA algorithm on a variety of benchmark MO-MFO problems.

Journal ArticleDOI
TL;DR: An adaptive task offloading algorithm (ATOA) is proposed that adaptively categorizes all tasks into four types of pending lists by considering their dynamic requirements and resource constraints; tasks in each list are then cooperatively offloaded to different nodes based on their features.
Abstract: With the recent development of wireless communication, sensing, and computing technologies, Internet of Vehicles (IoV) has attracted great attention in both academia and industry. Nevertheless, it is challenging to process time-critical tasks due to unique characteristics of IoV, including heterogeneous computation and communication capacities of network nodes, intermittent wireless connections, unevenly distributed workload, massive data transmission, intensive computation demands, and high mobility of vehicles. In this article, we propose a two-layer vehicular fog computing (VFC) architecture to explore the synergistic effect of the cloud, the static fog, and the mobile fog on processing time-critical tasks in IoV. Then, we give a motivational case study by implementing a prototype of a traffic abnormity detection and warning system, which demonstrates the necessity and urgency of developing adaptive task offloading mechanisms in such a scenario and gives insight into the problem formulation. Furthermore, we formulate the offloading model, aiming at maximizing the completion ratio of time-critical tasks. On this basis, we propose an adaptive task offloading algorithm (ATOA). Specifically, it adaptively categorizes all tasks into four types of pending lists by considering the dynamic requirements and resource constraints, and then tasks in each list will be cooperatively offloaded to different nodes based on their features. Finally, we build the simulation model and give a comprehensive performance evaluation. The results demonstrate the superiority of ATOA.

Journal ArticleDOI
TL;DR: The structural properties of the problem are characterized, the existence of a generalized Nash equilibrium (GNE) is proven via the fixed-point theorem, and the corresponding distributed task offloading algorithm is developed via a Gauss–Seidel-type method.
Abstract: Fog computing has been promoted to support delay-sensitive applications in future Internet of Things (IoT). For a general heterogeneous fog network consisting of many dispersive fog nodes (FNs), it may well happen that some of them have delay-sensitive tasks to process, i.e., task nodes (TNs), and some have spare resources to help the TNs to process tasks, i.e., helper nodes (HNs). It remains a fundamental challenge to effectively map multiple tasks or TNs into multiple HNs to minimize every task’s service delay in a distributed manner, i.e., the multitask multihelper (MTMH) problem. The problem becomes more challenging as tasks are splittable, i.e., tasks can be divided into multiple subtasks and offloaded to multiple HNs to further reduce the service delay via the scheme similar to distributed computing, because it introduces the more complicated task division problem which results in a much larger and more complex solution space. To tackle this challenge, in this article, a generalized Nash equilibrium problem (GNEP), called parallel offloading of splittable tasks (POST), is formulated and studied thoroughly. The structural properties of the problem are characterized and thus the existence of generalized Nash equilibrium (GNE) is proven via the fixed-point theorem. Furthermore, the corresponding distributed task offloading algorithm is developed via the Gauss–Seidel-type method. The simulation results show that the proposed POST algorithm can offer much better performance in terms of the system average delay, individual delay, delay reduction ratio (DRR), and number of beneficial TNs, compared with the existing solution to the counterpart problem for nonsplittable tasks.