
Showing papers on "Task (computing) published in 2018"


Proceedings ArticleDOI
Ashvin Nair1, Bob McGrew1, Marcin Andrychowicz1, Wojciech Zaremba1, Pieter Abbeel1 
21 May 2018
TL;DR: This work uses demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm.
Abstract: Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL). Many tasks are natural to specify with a sparse reward, and manually shaping a reward function can result in suboptimal performance. However, finding a non-zero reward is exponentially more difficult with increasing task horizon or action dimensionality. This puts many real-world tasks out of practical reach of RL methods. In this work, we use demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm. Our method, which builds on top of Deep Deterministic Policy Gradients and Hindsight Experience Replay, provides an order of magnitude of speedup over RL on simulated robotics tasks. It is simple to implement and makes only the additional assumption that we can collect a small set of demonstrations. Furthermore, our method is able to solve tasks not solvable by either RL or behavior cloning alone, and often ends up outperforming the demonstrator policy.
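
As a rough illustration of the replay mechanics described above, the sketch below mixes a small, fixed fraction of demonstration transitions into each minibatch and adds a behavior-cloning term to the actor objective. The actor, critic, and buffer objects are hypothetical stand-ins, and the fraction and weight are illustrative, not the paper's settings.

```python
import random

def sample_mixed_batch(agent_buffer, demo_buffer, batch_size=128, demo_frac=0.1):
    """Draw most transitions from the agent's own experience and a small,
    fixed fraction from the demonstration buffer."""
    n_demo = int(batch_size * demo_frac)
    batch = random.sample(agent_buffer, batch_size - n_demo)
    batch += random.sample(demo_buffer, n_demo)
    return batch

def actor_loss(actor, critic, batch, demo_batch, bc_weight=1.0):
    # Standard DDPG-style actor objective: maximize Q(s, pi(s)) ...
    rl_loss = -sum(critic(s, actor(s)) for (s, a, r, s2) in batch) / len(batch)
    # ... plus a behavior-cloning term pulling the policy toward demonstrated
    # actions (the paper additionally gates this term with a Q-filter).
    bc_loss = sum((actor(s) - a) ** 2 for (s, a, r, s2) in demo_batch) / len(demo_batch)
    return rl_loss + bc_weight * bc_loss
```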

590 citations


Book ChapterDOI
08 Sep 2018
TL;DR: In this article, a CNN is used to detect individual keypoints and predict their relative displacements, allowing them to group keypoints into person pose instances and then associate semantic person pixels with their corresponding person instance, delivering instance-level person segmentations.
Abstract: We present a box-free bottom-up approach for the tasks of pose estimation and instance segmentation of people in multi-person images using an efficient single-shot model. The proposed PersonLab model tackles both semantic-level reasoning and object-part associations using part-based modeling. Our model employs a convolutional network which learns to detect individual keypoints and predict their relative displacements, allowing us to group keypoints into person pose instances. Further, we propose a part-induced geometric embedding descriptor which allows us to associate semantic person pixels with their corresponding person instance, delivering instance-level person segmentations. Our system is based on a fully-convolutional architecture and allows for efficient inference, with runtime essentially independent of the number of people present in the scene. Trained on COCO data alone, our system achieves COCO test-dev keypoint average precision of 0.665 using single-scale inference and 0.687 using multi-scale inference, significantly outperforming all previous bottom-up pose estimation systems. We are also the first bottom-up method to report competitive results for the person class in the COCO instance segmentation task, achieving a person category average precision of 0.417.
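
The final association step can be pictured as nearest-embedding assignment of person pixels to pose instances; a numpy sketch with invented shapes (the actual model predicts the embeddings convolutionally and uses the part-induced geometric descriptor):

```python
import numpy as np

def assign_pixels(pixel_embeddings, instance_embeddings, person_mask):
    """pixel_embeddings: (H, W, D); instance_embeddings: (K, D) one per
    detected pose instance; person_mask: (H, W) bool semantic person mask.
    Returns an (H, W) map of instance ids, with -1 for background."""
    dists = np.linalg.norm(
        pixel_embeddings[:, :, None, :] - instance_embeddings[None, None, :, :],
        axis=-1)                      # (H, W, K) distance to each instance
    ids = dists.argmin(axis=-1)       # nearest instance per pixel
    ids[~person_mask] = -1            # keep only semantic person pixels
    return ids
```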

487 citations


Proceedings Article
15 Feb 2018
TL;DR: DEN as discussed by the authors dynamically expands network capacity upon arrival of each task with only the necessary number of units, and effectively prevents semantic drift by splitting/duplicating units and timestamping them.
Abstract: We propose a novel deep network architecture for lifelong learning which we refer to as Dynamically Expandable Network (DEN), that can dynamically decide its network capacity as it trains on a sequence of tasks, to learn a compact overlapping knowledge sharing structure among tasks. DEN is efficiently trained in an online manner by performing selective retraining, dynamically expands network capacity upon arrival of each task with only the necessary number of units, and effectively prevents semantic drift by splitting/duplicating units and timestamping them. We validate DEN on multiple public datasets under lifelong learning scenarios, on which it not only significantly outperforms existing lifelong learning methods for deep networks, but also achieves the same level of performance as the batch counterparts with substantially fewer number of parameters. Further, the obtained network fine-tuned on all tasks obtained significantly better performance over the batch models, which shows that it can be used to estimate the optimal network structure even when all tasks are available in the first place.
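
The expansion step can be pictured with a toy helper that widens a layer while preserving learned weights; a minimal PyTorch sketch with an invented expansion rule and sizes (the actual DEN also performs selective retraining and unit splitting/timestamping):

```python
import torch
import torch.nn as nn

def expand_linear(layer: nn.Linear, extra_units: int) -> nn.Linear:
    """Return a wider copy of `layer` that keeps the old weights and adds
    freshly initialized output units for the new task."""
    new = nn.Linear(layer.in_features, layer.out_features + extra_units)
    with torch.no_grad():
        new.weight[: layer.out_features] = layer.weight
        new.bias[: layer.out_features] = layer.bias
    return new

def maybe_expand(layer, val_loss, threshold, k=16):
    # Hypothetical trigger: widen only when the new task cannot be fit
    # by (selective) retraining alone.
    return expand_linear(layer, k) if val_loss > threshold else layer
```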

452 citations


Posted Content
TL;DR: This work shows that using experience replay buffers for all past events with a mixture of on- and off-policy learning can still learn new tasks quickly yet can substantially reduce catastrophic forgetting in both Atari and DMLab domains, even matching the performance of methods that require task identities.
Abstract: Continual learning is the problem of learning new tasks or knowledge while protecting old knowledge and ideally generalizing from old experience to learn new tasks faster. Neural networks trained by stochastic gradient descent often degrade on old tasks when trained successively on new tasks with different data distributions. This phenomenon, referred to as catastrophic forgetting, is considered a major hurdle to learning with non-stationary data or sequences of new tasks, and prevents networks from continually accumulating knowledge and skills. We examine this issue in the context of reinforcement learning, in a setting where an agent is exposed to tasks in a sequence. Unlike most other work, we do not provide an explicit indication to the model of task boundaries, which is the most general circumstance for a learning agent exposed to continuous experience. While various methods to counteract catastrophic forgetting have recently been proposed, we explore a straightforward, general, and seemingly overlooked solution - that of using experience replay buffers for all past events - with a mixture of on- and off-policy learning, leveraging behavioral cloning. We show that this strategy can still learn new tasks quickly yet can substantially reduce catastrophic forgetting in both Atari and DMLab domains, even matching the performance of methods that require task identities. When buffer storage is constrained, we confirm that a simple mechanism for randomly discarding data allows a limited size buffer to perform almost as well as an unbounded one.
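
When buffer storage is constrained, the random-discard mechanism can be realized with reservoir sampling, which keeps a uniform sample of everything seen so far. A minimal sketch, assuming reservoir-style discarding (the paper specifies only random discarding):

```python
import random

class ReservoirBuffer:
    """Fixed-capacity buffer that retains each past item with equal
    probability by randomly overwriting slots once full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        elif random.random() < self.capacity / self.seen:
            self.items[random.randrange(self.capacity)] = item

    def sample(self, k):
        return random.sample(self.items, min(k, len(self.items)))
```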

408 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper proposes a novel multi-task guided prediction-and-distillation network (PAD-Net), which first predicts a set of intermediate auxiliary tasks ranging from low level to high level, and then the predictions from these intermediate Auxiliary tasks are utilized as multi-modal input via the authors' proposed multi- modal distillation modules for the final tasks.
Abstract: Depth estimation and scene parsing are two particularly important tasks in visual scene understanding. In this paper we tackle the problem of simultaneous depth estimation and scene parsing in a joint CNN. The task can be typically treated as a deep multi-task learning problem [42]. Different from previous methods directly optimizing multiple tasks given the input training data, this paper proposes a novel multi-task guided prediction-and-distillation network (PAD-Net), which first predicts a set of intermediate auxiliary tasks ranging from low level to high level, and then the predictions from these intermediate auxiliary tasks are utilized as multi-modal input via our proposed multi-modal distillation modules for the final tasks. During the joint learning, the intermediate tasks not only act as supervision for learning more robust deep representations but also provide rich multi-modal information for improving the final tasks. Extensive experiments are conducted on two challenging datasets (i.e. NYUD-v2 and Cityscapes) for both the depth estimation and scene parsing tasks, demonstrating the effectiveness of the proposed approach.
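
The distillation idea can be sketched as a tiny PyTorch module that re-encodes each intermediate auxiliary prediction and fuses the results as input to a final-task head; module names, channel sizes, and summation fusion are illustrative, not the authors' architecture:

```python
import torch
import torch.nn as nn

class NaiveDistillation(nn.Module):
    """Fuse single-channel auxiliary predictions into one feature map."""
    def __init__(self, n_aux: int, channels: int = 64):
        super().__init__()
        # One lightweight encoder per intermediate auxiliary prediction.
        self.encoders = nn.ModuleList(
            nn.Conv2d(1, channels, kernel_size=3, padding=1) for _ in range(n_aux)
        )

    def forward(self, aux_preds):
        # Fusion by summation; attention-guided fusion is a natural variant.
        return sum(enc(p) for enc, p in zip(self.encoders, aux_preds))

aux = [torch.randn(2, 1, 32, 32) for _ in range(4)]  # 4 auxiliary predictions
features = NaiveDistillation(n_aux=4)(aux)           # -> (2, 64, 32, 32)
```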

288 citations


Proceedings Article
15 Feb 2018
TL;DR: This work uses a generator network to propose tasks for the agent to try to achieve, specified as goal states, and shows that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment.
Abstract: Reinforcement learning is a powerful technique to train an agent to perform a task. However, an agent that is trained using reinforcement learning is only capable of achieving the single task that is specified via its reward function. Such an approach does not scale well to settings in which an agent needs to perform a diverse set of tasks, such as navigating to varying positions in a room or moving objects to varying locations. Instead, we propose a method that allows an agent to automatically discover the range of tasks that it is capable of performing. We use a generator network to propose tasks for the agent to try to achieve, specified as goal states. The generator network is optimized using adversarial training to produce tasks that are always at the appropriate level of difficulty for the agent. Our method thus automatically produces a curriculum of tasks for the agent to learn. We show that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment. Our method can also learn to achieve tasks with sparse rewards, which traditionally pose significant challenges.
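
The "appropriate level of difficulty" can be made concrete by keeping only goals whose empirical success rate falls in an intermediate band, which is what the generator is trained to produce; a minimal sketch with illustrative thresholds:

```python
def goals_of_intermediate_difficulty(success_rates, r_min=0.1, r_max=0.9):
    """success_rates: dict mapping a goal to the agent's empirical success
    rate on it. Returns the goals that are neither trivial nor hopeless."""
    return [g for g, rate in success_rates.items() if r_min <= rate <= r_max]

# Hypothetical outer loop: the generator proposes goals, the agent attempts
# each goal several times to estimate success rates, and the GAN is then
# updated adversarially to produce more goals inside the [r_min, r_max] band.
```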

186 citations


Proceedings ArticleDOI
21 May 2018
TL;DR: Neural Task Programming (NTP) as mentioned in this paper takes as input a task specification (e.g., video demonstration of a task) and recursively decomposes it into finer sub-task specifications, where bottom-level programs are callable subroutines that interact with the environment.
Abstract: In this work, we propose a novel robot learning framework called Neural Task Programming (NTP), which bridges the idea of few-shot learning from demonstration and neural program induction. NTP takes as input a task specification (e.g., a video demonstration of a task) and recursively decomposes it into finer sub-task specifications. These specifications are fed to a hierarchical neural program, where bottom-level programs are callable subroutines that interact with the environment. We validate our method in three robot manipulation tasks. NTP achieves strong generalization across sequential tasks that exhibit hierarchical and compositional structures. The experimental results show that NTP learns to generalize well to unseen tasks with increasing lengths, variable topologies, and changing objectives. stanfordvl.github.io/ntp/
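
The recursive decomposition can be pictured as a small interpreter; a toy sketch where program(), action.is_primitive, action.sub_specs, and env.execute() are all hypothetical stand-ins for learned components:

```python
def run(program, spec, env, depth=0, max_depth=5):
    """Recursively decompose a task specification until primitive
    subroutines are reached, then execute them in the environment."""
    if depth >= max_depth:
        return
    action = program(spec)                 # hypothetical learned module
    if action.is_primitive:
        env.execute(action)                # bottom-level callable subroutine
    else:
        for sub_spec in action.sub_specs:  # finer sub-task specifications
            run(program, sub_spec, env, depth + 1, max_depth)
```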

174 citations


Proceedings ArticleDOI
01 Jan 2018
TL;DR: Experimental results show that SyntaxSQLNet can handle a significantly greater number of complex SQL examples than prior work, outperforming the previous state-of-the-art model by 9.5% in exact matching accuracy.
Abstract: Most existing studies in text-to-SQL tasks do not require generating complex SQL queries with multiple clauses or sub-queries, and generalizing to new, unseen databases. In this paper we propose SyntaxSQLNet, a syntax tree network to address the complex and cross-domain text-to-SQL generation task. SyntaxSQLNet employs a SQL specific syntax tree-based decoder with SQL generation path history and table-aware column attention encoders. We evaluate SyntaxSQLNet on a new large-scale text-to-SQL corpus containing databases with multiple tables and complex SQL queries containing multiple SQL clauses and nested queries. We use a database split setting where databases in the test set are unseen during training. Experimental results show that SyntaxSQLNet can handle a significantly greater number of complex SQL examples than prior work, outperforming the previous state-of-the-art model by 9.5% in exact matching accuracy. To our knowledge, we are the first to study this complex text-to-SQL task. Our task and models with the latest updates are available at https://yale-lily.github.io/seq2sql/spider.

156 citations


Posted Content
TL;DR: In this paper, a multi-task guided prediction-and-distillation network (PAD-Net) is proposed to jointly perform depth estimation and scene parsing in a joint CNN, where the intermediate tasks not only act as supervision for learning more robust deep representations but also provide rich multi-modal information for improving the final tasks.
Abstract: Depth estimation and scene parsing are two particularly important tasks in visual scene understanding. In this paper we tackle the problem of simultaneous depth estimation and scene parsing in a joint CNN. The task can be typically treated as a deep multi-task learning problem [42]. Different from previous methods directly optimizing multiple tasks given the input training data, this paper proposes a novel multi-task guided prediction-and-distillation network (PAD-Net), which first predicts a set of intermediate auxiliary tasks ranging from low level to high level, and then the predictions from these intermediate auxiliary tasks are utilized as multi-modal input via our proposed multi-modal distillation modules for the final tasks. During the joint learning, the intermediate tasks not only act as supervision for learning more robust deep representations but also provide rich multi-modal information for improving the final tasks. Extensive experiments are conducted on two challenging datasets (i.e. NYUD-v2 and Cityscapes) for both the depth estimation and scene parsing tasks, demonstrating the effectiveness of the proposed approach.

133 citations


Journal ArticleDOI
TL;DR: This paper presents an experimental system that can exploit a variety of online QoS aware adaptive task allocation schemes, and three such schemes are designed and compared.
Abstract: The increasingly wide application of Cloud Computing enables the consolidation of tens of thousands of applications in shared infrastructures. Thus, meeting the QoS requirements of so many diverse applications in such shared resource environments has become a real challenge, especially since the characteristics and workload of applications differ widely and may change over time. This paper presents an experimental system that can exploit a variety of online QoS-aware adaptive task allocation schemes, and three such schemes are designed and compared. These are a measurement-driven algorithm that uses reinforcement learning, a "sensible" allocation algorithm that assigns tasks to the sub-systems observed to provide a lower response time, and an algorithm that splits the task arrival stream into sub-streams at rates computed from the hosts' processing capabilities. All of these schemes are compared via measurements among themselves and with a simple round-robin scheduler, on two experimental test-beds with homogeneous and heterogeneous hosts having different processing capacities.
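
A minimal sketch of the "sensible" scheme, simplified to a greedy rule (route each task to the host with the best smoothed response-time estimate); the paper's exact decision rule may differ, and the smoothing factor is an assumption:

```python
class SensibleAllocator:
    """Track exponentially smoothed response times per host and send each
    task to the host currently observed to respond fastest."""
    def __init__(self, hosts, alpha=0.2):  # alpha: assumed smoothing factor
        self.estimates = {h: 0.0 for h in hosts}
        self.alpha = alpha

    def pick_host(self):
        return min(self.estimates, key=self.estimates.get)

    def record(self, host, response_time):
        old = self.estimates[host]
        self.estimates[host] = (1 - self.alpha) * old + self.alpha * response_time
```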

117 citations


Journal ArticleDOI
TL;DR: The consumer range camera Kinect is used to monitor drivers and identify driving tasks in a real vehicle, and the FFNN tasks detector is proved to be an efficient model that can be implemented for real-time driver distraction and dangerous behavior recognition.
Abstract: Driver decisions and behaviors regarding the surrounding traffic are critical to traffic safety. It is important for an intelligent vehicle to understand driver behavior and assist in driving tasks according to their status. In this paper, the consumer range camera Kinect is used to monitor drivers and identify driving tasks in a real vehicle. Specifically, seven common tasks performed by multiple drivers during driving are identified in this paper. The tasks include normal driving, left-, right-, and rear-mirror checking, mobile phone answering, texting using a mobile phone with one or both hands, and the setup of in-vehicle video devices. The first four tasks are considered safe driving tasks, while the other three tasks are regarded as dangerous and distracting tasks. The driver behavior signals collected from the Kinect consist of a color and depth image of the driver inside the vehicle cabin. In addition, 3-D head rotation angles and the upper body (hand and arm at both sides) joint positions are recorded. Then, the importance of these features for behavior recognition is evaluated using random forests and maximal information coefficient methods. Next, a feedforward neural network (FFNN) is used to identify the seven tasks. Finally, the model performance for task recognition is evaluated with different features (body only, head only, and combined). The final detection result for the seven driving tasks among five participants achieved an average of greater than 80% accuracy, and the FFNN tasks detector is proved to be an efficient model that can be implemented for real-time driver distraction and dangerous behavior recognition.
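
The final recognition stage is a plain feedforward classifier; a minimal scikit-learn sketch, using random placeholder features in place of the extracted Kinect head/body signals and illustrative layer sizes:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X = np.random.rand(500, 15)            # placeholder head + body features
y = np.random.randint(0, 7, size=500)  # the seven driving tasks

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```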

23 Oct 2018
TL;DR: This work proposes using GPU-accelerated RL simulations as an alternative to CPU ones for speeding up Deep RL training, and shows promising speed-ups of learning various continuous-control, locomotion tasks.
Abstract: Most Deep Reinforcement Learning (Deep RL) algorithms require a prohibitively large number of training samples for learning complex tasks. Many recent works on speeding up Deep RL have focused on distributed training and simulation. While distributed training is often done on the GPU, simulation is not. In this work, we propose using GPU-accelerated RL simulations as an alternative to CPU ones. Using NVIDIA Flex, a GPU-based physics engine, we show promising speed-ups of learning various continuous-control, locomotion tasks. With one GPU and CPU core, we are able to train the Humanoid running task in less than 20 minutes, using 10-1000x fewer CPU cores than previous works. We also demonstrate the scalability of our simulator to multi-GPU settings to train more challenging locomotion tasks.

Posted Content
Yabo Ni1, Dan Ou1, Shichen Liu1, Xiang Li1, Wenwu Ou1, Anxiang Zeng1, Luo Si1 
TL;DR: This work proposes to learn universal user representations across multiple tasks for more effective personalization, referred to as the Deep User Perception Network (DUPN), and conducts an extensive set of offline and online experiments.
Abstract: Tasks such as search and recommendation have become increasingly important for E-commerce to deal with the information overload problem. To meet the diverse needs of different users, personalization plays an important role. In many large portals such as Taobao and Amazon, there are many different types of search and recommendation tasks operating simultaneously for personalization. However, most current techniques address each task separately. This is suboptimal, as no information about users is shared across different tasks. In this work, we propose to learn universal user representations across multiple tasks for more effective personalization. In particular, user behavior sequences (e.g., click, bookmark or purchase of products) are modeled by an LSTM and an attention mechanism, integrating all the corresponding content, behavior and temporal information. User representations are shared and learned in an end-to-end setting across multiple tasks. Benefiting from better information utilization across multiple tasks, the user representations are more effective at reflecting users' interests and are more general, transferring well to new tasks. We refer to this work as the Deep User Perception Network (DUPN) and conduct an extensive set of offline and online experiments. Across all five tested tasks, DUPN consistently achieves better results by providing more effective user representations. Moreover, we deploy DUPN in large-scale operational tasks in Taobao. Detailed implementations, e.g., incremental model updating, are also provided to address practical issues in real-world applications.

Proceedings Article
25 Apr 2018
TL;DR: This paper investigates the ride-sharing assignment problem as a combinatorial optimization problem, shows that it is NP-hard, and designs an approximation algorithm which guarantees to output a solution with at most 2.5 times the optimal cost.
Abstract: We investigate the ride-sharing assignment problem from an algorithmic resource allocation point of view. Given a number of requests with source and destination locations, and a number of available car locations, the task is to assign cars to requests with two requests sharing one car. We formulate this as a combinatorial optimization problem, and show that it is NP-hard. We then design an approximation algorithm which guarantees to output a solution with at most 2.5 times the optimal cost. Experiments are conducted showing that our algorithm actually has a much better approximation ratio (around 1.2) on synthetically generated data.
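
For intuition, a naive baseline (not the paper's 2.5-approximation) pairs the two requests whose sources are closest, then assigns each pair to the nearest free car; all distances are Euclidean for illustration:

```python
from itertools import combinations
from math import dist

def greedy_assign(requests, cars):
    """requests: list of (source, destination) points; cars: list of points.
    Returns ((request_i, request_j), car_index) assignments."""
    unpaired = set(range(len(requests)))
    pairs = []
    # Pair requests greedily by source-to-source distance.
    for i, j in sorted(combinations(range(len(requests)), 2),
                       key=lambda ij: dist(requests[ij[0]][0], requests[ij[1]][0])):
        if i in unpaired and j in unpaired:
            unpaired -= {i, j}
            pairs.append((i, j))
    free_cars = set(range(len(cars)))
    assignments = []
    for i, j in pairs:
        if not free_cars:
            break
        c = min(free_cars, key=lambda c: dist(cars[c], requests[i][0]))
        free_cars.remove(c)
        assignments.append(((i, j), c))
    return assignments
```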

Proceedings Article
03 Jul 2018
TL;DR: In this paper, a framework for meta-learning that is based on generalization error bounds is presented, allowing to extend various PAC-Bayes bounds to meta learning, where prior knowledge is incorporated through setting an experience-dependent prior for novel tasks.
Abstract: In meta-learning an agent extracts knowledge from observed tasks, aiming to facilitate learning of novel future tasks. Under the assumption that future tasks are 'related' to previous tasks, the accumulated knowledge should be learned in a way which captures the common structure across learned tasks, while allowing the learner sufficient flexibility to adapt to novel aspects of new tasks. We present a framework for meta-learning that is based on generalization error bounds, allowing us to extend various PAC-Bayes bounds to meta-learning. Learning takes place through the construction of a distribution over hypotheses based on the observed tasks, and its utilization for learning a new task. Thus, prior knowledge is incorporated through setting an experience-dependent prior for novel tasks. We develop a gradient-based algorithm which minimizes an objective function derived from the bounds and demonstrate its effectiveness numerically with deep neural networks. In addition to establishing the improved performance available through meta-learning, we demonstrate the intuitive way by which prior information is manifested at different levels of the network.
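
For orientation, the kind of single-task bound that such frameworks extend to the meta level has, up to the usual constants, the classical PAC-Bayes form (this is the standard bound, not the paper's meta-learning bound): with probability at least 1 - delta over a sample of size n, for every posterior Q over hypotheses,

```latex
\mathbb{E}_{h \sim Q}\big[L(h)\big] \;\le\; \mathbb{E}_{h \sim Q}\big[\hat{L}(h)\big]
  \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\big(2\sqrt{n}/\delta\big)}{2n}},
```

where P is a prior fixed before seeing the data; the meta-learning extension makes P itself experience-dependent, learned from previously observed tasks.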

Proceedings ArticleDOI
19 Mar 2018
TL;DR: Experimental evaluations show that the proposed approach clearly outperforms IBM's own mapping solution with respect to runtime as well as resulting costs.
Abstract: In March 2017, IBM launched the project IBM Q with the goal to provide access to quantum computers for a broad audience. This allowed users to conduct quantum experiments on a 5-qubit and, since June 2017, also on a 16-qubit quantum computer (called IBM QX2 and IBM QX3, respectively). In order to use these, the desired quantum functionality (e.g. provided in terms of a quantum circuit) has to be properly mapped so that the underlying physical constraints are satisfied, which is a complex task. This demands solutions that automatically and efficiently conduct this mapping process. In this paper, we propose such an approach, which satisfies all constraints given by the architecture and, at the same time, aims to keep the overhead in terms of additionally required quantum gates minimal. The proposed approach is generic and can easily be configured for future architectures. Experimental evaluations show that the proposed approach clearly outperforms IBM's own mapping solution with respect to runtime as well as resulting costs.

Proceedings ArticleDOI
16 Apr 2018
TL;DR: This paper perturb the locations of both tasks and workers based on geo-indistinguishability and then devise techniques to quantify the probability of reachability between a task and a worker, given their perturbed locations.
Abstract: With spatial crowdsourcing (SC), requesters outsource their spatiotemporal tasks (tasks associated with location and time) to a set of workers, who will perform the tasks by physically traveling to the tasks' locations. However, current solutions require the locations of the workers and/or the tasks to be disclosed to untrusted parties (SC server) for effective assignments of tasks to workers. In this paper we propose a framework for assigning tasks to workers in an online manner without compromising the location privacy of workers and tasks. We perturb the locations of both tasks and workers based on geo-indistinguishability and then devise techniques to quantify the probability of reachability between a task and a worker, given their perturbed locations. We investigate both analytical and empirical models for quantifying the worker-task pair reachability and propose task assignment strategies that strike a balance among various metrics such as the number of completed tasks, worker travel distance and system overhead. Extensive experiments on real-world datasets show that our proposed techniques result in minimal disclosure of task locations and no disclosure of worker locations without significantly sacrificing the total number of assigned tasks.
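
Geo-indistinguishability is typically achieved with the planar Laplace mechanism, which the perturbation step builds on; a minimal sketch of the standard mechanism (Andrés et al.), not necessarily the paper's exact implementation:

```python
import math
import random
from scipy.special import lambertw

def planar_laplace(x, y, epsilon):
    """Perturb a 2-D location with noise calibrated to privacy level epsilon."""
    theta = random.uniform(0, 2 * math.pi)  # uniform direction
    p = random.uniform(0, 1)
    # Radial distance via the inverse CDF, which involves the Lambert W function.
    r = -(lambertw((p - 1) / math.e, k=-1).real + 1) / epsilon
    return x + r * math.cos(theta), y + r * math.sin(theta)

print(planar_laplace(34.05, -118.24, epsilon=0.1))
```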

Proceedings ArticleDOI
Yabo Ni1, Dan Ou1, Shichen Liu1, Xiang Li1, Wenwu Ou1, Anxiang Zeng1, Luo Si1 
19 Jul 2018
TL;DR: In this paper, a Deep User Perception Network (DUPN) is proposed to learn universal user representations across multiple tasks for more effective personalization, where user behavior sequences are modeled by LSTM and attention mechanism by integrating all the corresponding content, behavior and temporal information.
Abstract: Tasks such as search and recommendation have become increasingly important for E-commerce to deal with the information overload problem. To meet the diverse needs of different users, personalization plays an important role. In many large portals such as Taobao and Amazon, there are many different types of search and recommendation tasks operating simultaneously for personalization. However, most current techniques address each task separately. This is suboptimal, as no information about users is shared across different tasks. In this work, we propose to learn universal user representations across multiple tasks for more effective personalization. In particular, user behavior sequences (e.g., click, bookmark or purchase of products) are modeled by an LSTM and an attention mechanism, integrating all the corresponding content, behavior and temporal information. User representations are shared and learned in an end-to-end setting across multiple tasks. Benefiting from better information utilization across multiple tasks, the user representations are more effective at reflecting users' interests and are more general, transferring well to new tasks. We refer to this work as the Deep User Perception Network (DUPN) and conduct an extensive set of offline and online experiments. Across all five tested tasks, DUPN consistently achieves better results by providing more effective user representations. Moreover, we deploy DUPN in large-scale operational tasks in Taobao. Detailed implementations, e.g., incremental model updating, are also provided to address practical issues in real-world applications.
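
The shared-representation idea can be sketched as an LSTM plus attention pooling over the behavior sequence, producing one user vector that several task heads consume; the dimensions and heads below are illustrative, not DUPN's actual configuration:

```python
import torch
import torch.nn as nn

class SharedUserEncoder(nn.Module):
    def __init__(self, item_dim=64, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(item_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)

    def forward(self, behavior_seq):            # (batch, time, item_dim)
        h, _ = self.lstm(behavior_seq)           # (batch, time, hidden)
        w = torch.softmax(self.attn(h), dim=1)   # attention over time steps
        return (w * h).sum(dim=1)                # (batch, hidden) user vector

encoder = SharedUserEncoder()
user_vec = encoder(torch.randn(8, 20, 64))
# Hypothetical task heads sharing the same representation:
ctr_head = nn.Linear(128, 1)    # e.g. click-through-rate prediction
rank_head = nn.Linear(128, 1)   # e.g. search ranking score
```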

Journal ArticleDOI
TL;DR: A software-defined network (SDN) architecture is developed and a star topology is considered where a centered AV outsources its computing tasks to the surrounding AVs for its autonomous driving, and a market mechanism is developed in which the surroundingAVs sell their computing power at a cost based on their local idle computing resources.
Abstract: Autonomous vehicles (AVs), like the one in Knight Rider, were pure science fiction just a few years ago but are now practical, with real-world commercial deployments. A salient challenge of AVs, however, is the intensive computing that must be carried out on board for real-time traffic detection and driving decision making; this imposes a heavy load on AVs due to their limited computing power. To explore more computing power and enable scalable autonomous driving, in this paper we propose a collaborative task computing scheme for AVs, in which AVs in proximity dynamically share idle computing power among each other. This, however, raises another fundamental problem: how to incentivize AVs to contribute their computing power, and how to fully utilize the pooled group computing power in an optimal way. This paper studies the problem by modeling the issue as a market-based optimal computing resource allocation problem. Specifically, we develop a software-defined network (SDN) architecture and consider a star topology where a centered AV outsources its computing tasks to the surrounding AVs for its autonomous driving. A market mechanism is developed in which the surrounding AVs sell their computing power at a cost based on their local idle computing resources. We then classify the tasks requested by the centered AV into two types: tasks with a time to live (TTL) and tasks without a TTL. For each task type, we define a corresponding cost model for the centered AV and formulate it as a minimization problem. The optimal solutions of these problems guide the centered AV to wisely allocate computing tasks to surrounding AVs at minimal cost. Finally, the performance of the proposed scheme is evaluated using simulations, which show that the proposed scheme achieves guaranteed computing performance at the lowest cost compared with conventional schemes.

Posted Content
TL;DR: This work introduces several reinforcement learning tasks with multiple commercially available robots that present varying levels of learning difficulty, setup, and repeatability and test the learning performance of off-the-shelf implementations of four reinforcement learning algorithms and analyzes sensitivity to their hyper-parameters to determine their readiness for applications in various real-world tasks.
Abstract: Through many recent successes in simulation, model-free reinforcement learning has emerged as a promising approach to solving continuous control robotic tasks. The research community is now able to reproduce, analyze and build quickly on these results due to open source implementations of learning algorithms and simulated benchmark tasks. To carry forward these successes to real-world applications, it is crucial to withhold utilizing the unique advantages of simulations that do not transfer to the real world and experiment directly with physical robots. However, reinforcement learning research with physical robots faces substantial resistance due to the lack of benchmark tasks and supporting source code. In this work, we introduce several reinforcement learning tasks with multiple commercially available robots that present varying levels of learning difficulty, setup, and repeatability. On these tasks, we test the learning performance of off-the-shelf implementations of four reinforcement learning algorithms and analyze sensitivity to their hyper-parameters to determine their readiness for applications in various real-world tasks. Our results show that with a careful setup of the task interface and computations, some of these implementations can be readily applicable to physical robots. We find that state-of-the-art learning algorithms are highly sensitive to their hyper-parameters and their relative ordering does not transfer across tasks, indicating the necessity of re-tuning them for each task for best performance. On the other hand, the best hyper-parameter configuration from one task may often result in effective learning on held-out tasks even with different robots, providing a reasonable default. We make the benchmark tasks publicly available to enhance reproducibility in real-world reinforcement learning.

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This article proposed multi-task label embedding to convert labels in text classification into semantic vectors, thereby turning the original tasks into vector matching tasks, which can effectively improve the performances of related tasks with semantic representations of labels and additional information from each other.
Abstract: Multi-task learning in text classification leverages implicit correlations among related tasks to extract common features and yield performance gains. However, a large body of previous work treats the labels of each task as independent and meaningless one-hot vectors, which causes a loss of potential label information. In this paper, we propose Multi-Task Label Embedding to convert labels in text classification into semantic vectors, thereby turning the original tasks into vector matching tasks. Our model utilizes semantic correlations among tasks and makes it convenient to scale or transfer when new tasks are involved. Extensive experiments on five benchmark datasets for text classification show that our model can effectively improve the performance of related tasks with semantic representations of labels and additional information from each other.
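
Once labels live in the same space as texts, classification reduces to nearest-label retrieval; a minimal sketch of the matching step, with placeholder encoded vectors and cosine similarity as an assumed matching function:

```python
import torch
import torch.nn.functional as F

def classify_by_label_matching(text_vecs, label_vecs):
    """text_vecs: (batch, dim) encoded inputs; label_vecs: (labels, dim)
    encoded label embeddings. Returns the best-matching label per input."""
    sims = F.normalize(text_vecs, dim=1) @ F.normalize(label_vecs, dim=1).T
    return sims.argmax(dim=1)

preds = classify_by_label_matching(torch.randn(4, 32), torch.randn(5, 32))
```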

20 Sep 2018
TL;DR: In this paper, the authors introduce several reinforcement learning tasks with multiple commercially available robots that present varying levels of learning difficulty, setup, and repeatability, and test the learning performance of off-the-shelf implementations of four reinforcement learning algorithms and analyze sensitivity to their hyper-parameters to determine their readiness for applications in various real-world tasks.
Abstract: Through many recent successes in simulation, model-free reinforcement learning has emerged as a promising approach to solving continuous control robotic tasks. The research community is now able to reproduce, analyze and build quickly on these results due to open source implementations of learning algorithms and simulated benchmark tasks. To carry forward these successes to real-world applications, it is crucial to withhold utilizing the unique advantages of simulations that do not transfer to the real world and experiment directly with physical robots. However, reinforcement learning research with physical robots faces substantial resistance due to the lack of benchmark tasks and supporting source code. In this work, we introduce several reinforcement learning tasks with multiple commercially available robots that present varying levels of learning difficulty, setup, and repeatability. On these tasks, we test the learning performance of off-the-shelf implementations of four reinforcement learning algorithms and analyze sensitivity to their hyper-parameters to determine their readiness for applications in various real-world tasks. Our results show that with a careful setup of the task interface and computations, some of these implementations can be readily applicable to physical robots. We find that state-of-the-art learning algorithms are highly sensitive to their hyper-parameters and their relative ordering does not transfer across tasks, indicating the necessity of re-tuning them for each task for best performance. On the other hand, the best hyper-parameter configuration from one task may often result in effective learning on held-out tasks even with different robots, providing a reasonable default. We make the benchmark tasks publicly available to enhance reproducibility in real-world reinforcement learning.

Journal ArticleDOI
TL;DR: The proposed algorithm reduces energy consumption by decreasing the number of active PMs while also minimizing the makespan and task rejection rate; experiments demonstrate its effectiveness over several existing standard algorithms.

Proceedings Article
03 Jul 2018
TL;DR: NetGAN as discussed by the authors is the first implicit generative model for graphs able to mimic real-world networks, which is based on a stochastic neural network that generates discrete output samples and is trained using the Wasserstein GAN objective.
Abstract: We propose NetGAN - the first implicit generative model for graphs able to mimic real-world networks. We pose the problem of graph generation as learning the distribution of biased random walks over the input graph. The proposed model is based on a stochastic neural network that generates discrete output samples and is trained using the Wasserstein GAN objective. NetGAN is able to produce graphs that exhibit well-known network patterns without explicitly specifying them in the model definition. At the same time, our model exhibits strong generalization properties, as highlighted by its competitive link prediction performance, despite not being trained specifically for this task. Being the first approach to combine both of these desirable properties, NetGAN opens exciting avenues for further research.
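
The walk-based view of graph generation can be illustrated by the assembly step: score edges by how often they appear in generated walks and keep the top scorers. The sketch below uses hand-written stand-ins for generator output and illustrates the general idea, not NetGAN's trained generator:

```python
from collections import Counter

def graph_from_walks(walks, n_edges):
    """Score undirected edges by their frequency in the walks and return
    the n_edges most frequent ones."""
    counts = Counter()
    for walk in walks:
        for u, v in zip(walk, walk[1:]):
            counts[(min(u, v), max(u, v))] += 1
    return [edge for edge, _ in counts.most_common(n_edges)]

walks = [[0, 1, 2, 1], [2, 1, 0, 3]]  # placeholder "generated" walks
print(graph_from_walks(walks, n_edges=3))
```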

Posted Content
TL;DR: This work proposes a method that learns both how to learn primitive behaviors from video demonstrations and how to dynamically compose these behaviors to perform multi-stage tasks by "watching" a human demonstrator.
Abstract: We consider the problem of learning multi-stage vision-based tasks on a real robot from a single video of a human performing the task, while leveraging demonstration data of subtasks with other objects. This problem presents a number of major challenges. Video demonstrations without teleoperation are easy for humans to provide, but do not provide any direct supervision. Learning policies from raw pixels enables full generality but calls for large function approximators with many parameters to be learned. Finally, compound tasks can require impractical amounts of demonstration data, when treated as a monolithic skill. To address these challenges, we propose a method that learns both how to learn primitive behaviors from video demonstrations and how to dynamically compose these behaviors to perform multi-stage tasks by "watching" a human demonstrator. Our results on a simulated Sawyer robot and real PR2 robot illustrate our method for learning a variety of order fulfillment and kitchen serving tasks with novel objects and raw pixel inputs.

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This work proposes a new system, called FlexPS, which introduces a novel multi-stage abstraction to support flexible parallelism control and achieves significant speedups and resource saving compared with the state-of-the-art PS systems such as Petuum and Multiverso.
Abstract: As a general abstraction for coordinating the distributed storage and access of model parameters, the parameter server (PS) architecture enables distributed machine learning to handle large datasets and high dimensional models. Many systems, such as Parameter Server and Petuum, have been developed based on the PS architecture and are widely used in practice. However, none of these systems supports changing parallelism during runtime, which is crucial for the efficient execution of machine learning tasks with dynamic workloads. We propose a new system, called FlexPS, which introduces a novel multi-stage abstraction to support flexible parallelism control. With the multi-stage abstraction, a machine learning task can be mapped to a series of stages, and the parallelism for a stage can be set according to its workload. Optimizations such as a stage scheduler, a stage-aware consistency controller, and direct model transfer are proposed for the efficiency of multi-stage machine learning in FlexPS. As a general and complete PS system, FlexPS also incorporates many optimizations that are not limited to multi-stage machine learning. We conduct extensive experiments using a variety of machine learning workloads, showing that FlexPS achieves significant speedups and resource savings compared with state-of-the-art PS systems such as Petuum and Multiverso.
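
The multi-stage abstraction can be pictured as a pipeline where each stage declares its own degree of parallelism; a toy local sketch with thread pools (real FlexPS coordinates distributed workers and parameter servers, not local threads):

```python
from concurrent.futures import ThreadPoolExecutor

def run_stages(stages, data):
    """stages: list of (stage_fn, parallelism). Each stage maps over the
    data shards with its own worker-pool size."""
    for fn, parallelism in stages:
        with ThreadPoolExecutor(max_workers=parallelism) as pool:
            data = list(pool.map(fn, data))
    return data

shards = [list(range(i, i + 4)) for i in range(0, 16, 4)]
stages = [(lambda s: [x * 2 for x in s], 8),   # compute-heavy stage: wide
          (lambda s: sum(s), 2)]               # aggregation stage: narrow
print(run_stages(stages, shards))
```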

Proceedings ArticleDOI
23 Sep 2018
TL;DR: An attention aware system that issues Take-Over Requests at emerging task boundaries and directly on consumer devices such as smartphones or tablets is proposed, which has the potential to reduce stress, increase Take- over performance, and can further raise user acceptance/trust.
Abstract: A major promise of automated vehicles is to render it possible for drivers to engage in non-driving related tasks, a setting where the execution pattern will switch from concurrent to sequential multitasking. To allow drivers to safely and efficiently switch between multiple activities (including vehicle control in case of Take-Over situations), we postulate that future vehicles should incorporate capabilities of attentive user interfaces that precisely plan the timing of interruptions based on driver availability. We propose an attention-aware system that issues Take-Over Requests (1) at emerging task boundaries and (2) directly on consumer devices such as smartphones or tablets. Results of a driving simulator study (N=18), where we evaluated objective, physiological, and subjective measurements, confirm our assumption: attention-aware Take-Over Requests have the potential to reduce stress, increase Take-Over performance, and can further raise user acceptance and trust. Consequently, we advocate implementing attentive user interfaces in future vehicles.

Journal ArticleDOI
01 Jul 2018
TL;DR: Rheem is presented, a general-purpose cross-platform data processing system that decouples applications from the underlying platforms and allows users to focus on the business logic of their applications rather than on the mechanics of how to compose and execute them.
Abstract: Solving business problems increasingly requires going beyond the limits of a single data processing platform (platform for short), such as Hadoop or a DBMS. As a result, organizations typically perform tedious and costly tasks to juggle their code and data across different platforms. Addressing this pain and achieving automatic cross-platform data processing is quite challenging: finding the most efficient platform for a given task requires quite good expertise for all the available platforms. We present Rheem, a general-purpose cross-platform data processing system that decouples applications from the underlying platforms. It not only determines the best platform to run an incoming task, but also splits the task into subtasks and assigns each subtask to a specific platform to minimize the overall cost (e.g., runtime or monetary cost). It features (i) an interface to easily compose data analytic tasks; (ii) a novel cost-based optimizer able to find the most efficient platform in almost all cases; and (iii) an executor to efficiently orchestrate tasks over different platforms. As a result, it allows users to focus on the business logic of their applications rather than on the mechanics of how to compose and execute them. Using different real-world applications with Rheem, we demonstrate how cross-platform data processing can accelerate performance by more than one order of magnitude compared to single-platform data processing.
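
The cost-based optimizer's core decision can be pictured as an argmin over per-platform cost estimates; a toy sketch with invented cost models (Rheem's real optimizer also splits tasks into subtasks and accounts for cross-platform data movement):

```python
def choose_platform(task, cost_models):
    """cost_models: dict mapping platform name -> callable(task) -> cost."""
    return min(cost_models, key=lambda p: cost_models[p](task))

cost_models = {  # illustrative linear cost models, not Rheem's
    "Spark":    lambda t: 5.0 + 0.01 * t["rows"],  # high startup, cheap per row
    "Postgres": lambda t: 1.0 + 0.05 * t["rows"],  # cheap startup, dearer per row
}
print(choose_platform({"rows": 50}, cost_models))     # small input -> Postgres
print(choose_platform({"rows": 10000}, cost_models))  # large input -> Spark
```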

Posted Content
TL;DR: This work accomplishes pseudo-rehearsal by using a Generative Adversarial Network to generate items so that the deep network can learn to sequentially classify the CIFAR-10, SVHN and MNIST datasets.
Abstract: In general, neural networks are not currently capable of learning tasks in a sequential fashion. When a novel, unrelated task is learnt by a neural network, it substantially forgets how to solve previously learnt tasks. One of the original solutions to this problem is pseudo-rehearsal, which involves learning the new task while rehearsing generated items representative of the previous task/s. This is very effective for simple tasks. However, pseudo-rehearsal has not yet been successfully applied to very complex tasks because in these tasks it is difficult to generate representative items. We accomplish pseudo-rehearsal by using a Generative Adversarial Network to generate items so that our deep network can learn to sequentially classify the CIFAR-10, SVHN and MNIST datasets. After training on all tasks, our network loses only 1.67% absolute accuracy on CIFAR-10 and gains 0.24% absolute accuracy on SVHN. Our model's performance is a substantial improvement compared to the current state of the art solution.
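
The rehearsal step itself is simple once a generator exists: label generated items with the network's current outputs and mix them into each new-task batch; a minimal PyTorch sketch where generator and net are hypothetical modules and latent_dim is an assumed attribute:

```python
import torch

def rehearsal_batch(generator, net, new_x, new_y, n_pseudo=64):
    """Build a training batch that mixes new-task data with pseudo-items
    labeled by the network's current (pre-update) behavior."""
    with torch.no_grad():
        pseudo_x = generator(torch.randn(n_pseudo, generator.latent_dim))
        pseudo_y = net(pseudo_x).argmax(dim=1)  # the old answers to preserve
    return torch.cat([new_x, pseudo_x]), torch.cat([new_y, pseudo_y])
```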

Posted Content
TL;DR: An energy efficient virtualized IoT P2P network heuristic (EEVIPN), based on the principles of the MILP model, is developed to maximize the number of processing tasks served by the system and minimize the total power consumed by the IoT network.
Abstract: In this paper, an energy efficient IoT virtualization framework with P2P networking and edge computing is proposed. In this network, the IoT task processing requests are served by peers. The peers in our work are represented by IoT objects and relays that host virtual machines (VMs). We have considered three scenarios to investigate the saving in power consumption and the system capabilities in terms of task processing. The first scenario is the relays only scenario, where the task requests are processed using relays only. The second scenario is the objects only scenario, where the task requests are processed using the IoT objects only. The last scenario is a hybrid scenario, where the task requests are processed using both IoT objects and VMs. We have developed a mixed integer linear programming (MILP) model to maximize the number of processing tasks served by the system and minimize the total power consumed by the IoT network. We investigated our framework under the impact of VMs placement constraints, fairness constraints between the objects, tasks number limitations, uplink and downlink limited capacities, and processing capability limitations. Based on the MILP model principles, we developed an energy efficient virtualized IoT P2P networks heuristic (EEVIPN). The heuristic results were comparable to those of the MILP in terms of energy efficiency and tasks processing. Our results show that the hybrid scenario serves up to 77% (57% on average) processing task requests, but with higher energy consumption compared to the other scenarios. The relays only scenario can serve 74% (57% on average) of the processing task requests with 8% saving in power consumption compared to the hybrid scenario. In contrast, 28% (22% on average) of task requests can be successfully handled by applying the objects only scenario with up to 62% power saving compared to the hybrid scenario.
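
The shape of such a MILP can be sketched with a toy instance in PuLP: binary assignment variables, a capacity constraint per peer, and a weighted objective trading served tasks against power. All numbers and the scalarization weight are invented, not the paper's model:

```python
import pulp

tasks = range(4)
peers = range(2)
power = {0: 1.0, 1: 1.5}  # assumed power cost for a peer to serve one task
cap = {0: 2, 1: 2}        # assumed max tasks per peer

prob = pulp.LpProblem("iot_alloc", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", (tasks, peers), cat="Binary")
served = pulp.lpSum(x[t][p] for t in tasks for p in peers)
power_used = pulp.lpSum(power[p] * x[t][p] for t in tasks for p in peers)
prob += served - 0.1 * power_used  # weighted multi-objective scalarization
for t in tasks:
    prob += pulp.lpSum(x[t][p] for p in peers) <= 1  # each task served once
for p in peers:
    prob += pulp.lpSum(x[t][p] for t in tasks) <= cap[p]
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("served:", pulp.value(served), "power:", pulp.value(power_used))
```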